[1/3] net/bnx2x: fix to use required mem barriers in Rx path

Message ID 20200114003920.17705-1-rmody@marvell.com (mailing list archive)
State Changes Requested, archived
Delegated to: Jerin Jacob
Series [1/3] net/bnx2x: fix to use required mem barriers in Rx path

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/Intel-compilation success Compilation OK

Commit Message

Rasesh Mody Jan. 14, 2020, 12:39 a.m. UTC
  When handling the Rx completion queue, the PMD does not use the required
read/write barriers before reading completion queue element (CQE) indices
and before updating/writing the hardware consumer and producer.
This patch adds the appropriate read/write memory barriers at the places
where the driver and adapter need to read or update the indices.

Fixes: 540a211084a7 ("bnx2x: driver core")
Cc: stable@dpdk.org

Signed-off-by: Rasesh Mody <rmody@marvell.com>
---
 drivers/net/bnx2x/bnx2x.c      |  5 +++++
 drivers/net/bnx2x/bnx2x_rxtx.c | 22 ++++++++++++++++++++++
 2 files changed, 27 insertions(+)
  

Comments

Jerin Jacob Jan. 14, 2020, 1:49 p.m. UTC | #1
+ Gavin

On Tue, Jan 14, 2020 at 6:09 AM Rasesh Mody <rmody@marvell.com> wrote:
>
> When handling the Rx completion queue, the PMD does not use the required
> read/write barriers before reading completion queue element (CQE) indices
> and before updating/writing the hardware consumer and producer.
> This patch adds the appropriate read/write memory barriers at the places
> where the driver and adapter need to read or update the indices.
>
> Fixes: 540a211084a7 ("bnx2x: driver core")
> Cc: stable@dpdk.org
>
> Signed-off-by: Rasesh Mody <rmody@marvell.com>
> ---
>  drivers/net/bnx2x/bnx2x.c      |  5 +++++
>  drivers/net/bnx2x/bnx2x_rxtx.c | 22 ++++++++++++++++++++++
>  2 files changed, 27 insertions(+)
>
> diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
> index ed31335ac..9c5e7995d 100644
> --- a/drivers/net/bnx2x/bnx2x.c
> +++ b/drivers/net/bnx2x/bnx2x.c
> @@ -1255,6 +1255,11 @@ static uint8_t bnx2x_rxeof(struct bnx2x_softc *sc, struct bnx2x_fastpath *fp)
>                 return 0;
>         }
>
> +       /* Add memory barrier as status block fields can change. This memory
> +        * barrier will flush out all the read/write operations to status block
> +        * generated before the barrier. It will ensure stale data is not read
> +        */
> +       mb();

# Do you need full barriers here?
# On which architecture did you see this issue?
# rte_cio_* barriers are more performance friendly. Have you checked
# whether rte_cio_* would suffice for the requirements?
See the discussion in http://patches.dpdk.org/patch/64038/

I assume the 2/3 and 3/3 patches are for the slow path. If so, it is fine
to use full barriers in those patches.
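
For readers unfamiliar with the distinction being raised: rte_mb()/rte_wmb()
are full barriers, while the rte_cio_* family (present in DPDK releases of
this period, later replaced by rte_io_*) only orders accesses to coherent
I/O memory and is cheaper on weakly ordered CPUs such as armv8. Below is a
minimal sketch of the two flavors around a producer-index write; the
ring_doorbell_* helpers and the doorbell pointer are illustrative names,
not bnx2x code.

#include <stdint.h>
#include <rte_atomic.h>  /* rte_wmb(); rte_cio_wmb() in DPDK of this era */

/* Hypothetical fast-path producer update, full-barrier flavor. */
static inline void
ring_doorbell_full(volatile uint32_t *doorbell, uint32_t prod)
{
	/* Orders all prior stores, including to device memory; on armv8
	 * this is a DSB-class barrier, so it stalls more than needed when
	 * the descriptors live in coherent host DRAM. */
	rte_wmb();
	*doorbell = prod;
}

/* The same update with the cheaper coherent-I/O barrier. */
static inline void
ring_doorbell_cio(volatile uint32_t *doorbell, uint32_t prod)
{
	/* Only orders stores to coherent memory (e.g. descriptor rings in
	 * host DRAM) ahead of the producer write; a DMB on armv8 and a
	 * plain compiler barrier on x86. */
	rte_cio_wmb();
	*doorbell = prod;
}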
  
Rasesh Mody Jan. 15, 2020, 1:57 a.m. UTC | #2
Hi Jerin,

>From: Jerin Jacob <jerinjacobk@gmail.com>
>Sent: Tuesday, January 14, 2020 5:49 AM
>To: Rasesh Mody <rmody@marvell.com>; Gavin Hu <gavin.hu@arm.com>
>Cc: dpdk-dev <dev@dpdk.org>; Jerin Jacob Kollanukkaran
><jerinj@marvell.com>; Ferruh Yigit <ferruh.yigit@intel.com>; GR-Everest-
>DPDK-Dev <GR-Everest-DPDK-Dev@marvell.com>; dpdk stable
><stable@dpdk.org>
>Subject: [EXT] Re: [dpdk-dev] [PATCH 1/3] net/bnx2x: fix to use required
>mem barriers in Rx path
>
>+ Gavin
>
>On Tue, Jan 14, 2020 at 6:09 AM Rasesh Mody <rmody@marvell.com> wrote:
>>
>> When handling the Rx completion queue, the PMD does not use the required
>> read/write barriers before reading completion queue element (CQE) indices
>> and before updating/writing the hardware consumer and producer.
>> This patch adds the appropriate read/write memory barriers at the places
>> where the driver and adapter need to read or update the indices.
>>
>> Fixes: 540a211084a7 ("bnx2x: driver core")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Rasesh Mody <rmody@marvell.com>
>> ---
>>  drivers/net/bnx2x/bnx2x.c      |  5 +++++
>>  drivers/net/bnx2x/bnx2x_rxtx.c | 22 ++++++++++++++++++++++
>>  2 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
>> index ed31335ac..9c5e7995d 100644
>> --- a/drivers/net/bnx2x/bnx2x.c
>> +++ b/drivers/net/bnx2x/bnx2x.c
>> @@ -1255,6 +1255,11 @@ static uint8_t bnx2x_rxeof(struct bnx2x_softc
>*sc, struct bnx2x_fastpath *fp)
>>                 return 0;
>>         }
>>
>> +       /* Add memory barrier as status block fields can change. This memory
>> +        * barrier will flush out all the read/write operations to status block
>> +        * generated before the barrier. It will ensure stale data is not read
>> +        */
>> +       mb();
>
># Do you need full barriers here?

Yes

># On which architecture did you see this issue?
># rte_cio_* barriers are more performance friendly. Have you checked
># whether rte_cio_* would suffice for the requirements?
>See the discussion in http://patches.dpdk.org/patch/64038/

This patch is to prevent a potential issue which can happen in the absence of the required barriers.
The above barrier is in the slow path. However, there is another full barrier in the fast path, added as part of this patch, which could use rte_cio_*.
I'll revise the patch and resend.
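
For illustration, here is a sketch of what that revision might look like,
based on the bnx2x_upd_rx_prod_fast() hunk posted in this patch. This is an
assumption about the follow-up, not code that was merged (the v2 series
ultimately dropped the barrier changes, see message #3 below), and whether
rte_cio_wmb() is strong enough before the MMIO producer write was exactly
the open question in the thread linked above.

/* Hedged sketch only: the fast-path hunk with the full wmb() pair
 * relaxed to coherent-I/O write barriers. */
static inline void
bnx2x_upd_rx_prod_fast(struct bnx2x_softc *sc, struct bnx2x_fastpath *fp,
		       uint16_t rx_bd_prod, uint16_t rx_cq_prod)
{
	struct ustorm_eth_rx_producers rx_prods = { 0 };
	uint32_t *val = NULL;

	/* Update producers */
	rx_prods.bd_prod  = rx_bd_prod;
	rx_prods.cqe_prod = rx_cq_prod;

	/* Make sure the BD and SGE writes are visible before the FW can
	 * observe the new producer values. */
	rte_cio_wmb();
	val = (uint32_t *)&rx_prods;
	REG_WR(sc, fp->ustorm_rx_prods_offset, val[0]);

	rte_cio_wmb();		/* keep prod updates ordered */
}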

>
>I assume the 2/3 and 3/3 patches are for the slow path. If so, it is fine to use
>full barriers in those patches.

Patch 2/3 is a slow path fix, and 3/3 fixes a race condition that can occur between the slow path and the fast path.

Thanks!
-Rasesh
  
Rasesh Mody Jan. 26, 2020, 10:55 p.m. UTC | #3
Hi Jerin,

>From: Rasesh Mody <rmody@marvell.com>
>Sent: Tuesday, January 14, 2020 5:57 PM
>
>Hi Jerin,
>
>>From: Jerin Jacob <jerinjacobk@gmail.com>
>>Sent: Tuesday, January 14, 2020 5:49 AM
>>To: Rasesh Mody <rmody@marvell.com>; Gavin Hu <gavin.hu@arm.com>
>>Cc: dpdk-dev <dev@dpdk.org>; Jerin Jacob Kollanukkaran
>><jerinj@marvell.com>; Ferruh Yigit <ferruh.yigit@intel.com>;
>>GR-Everest- DPDK-Dev <GR-Everest-DPDK-Dev@marvell.com>; dpdk stable
>><stable@dpdk.org>
>>Subject: [EXT] Re: [dpdk-dev] [PATCH 1/3] net/bnx2x: fix to use
>>required mem barriers in Rx path
>>
>>+ Gavin
>>
>>On Tue, Jan 14, 2020 at 6:09 AM Rasesh Mody <rmody@marvell.com> wrote:
>>>
>>> When handling the Rx completion queue, the PMD does not use the required
>>> read/write barriers before reading completion queue element (CQE)
>>> indices and before updating/writing the hardware consumer and producer.
>>> This patch adds the appropriate read/write memory barriers at the places
>>> where the driver and adapter need to read or update the indices.
>>>
>>> Fixes: 540a211084a7 ("bnx2x: driver core")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Rasesh Mody <rmody@marvell.com>
>>> ---
>>>  drivers/net/bnx2x/bnx2x.c      |  5 +++++
>>>  drivers/net/bnx2x/bnx2x_rxtx.c | 22 ++++++++++++++++++++++
>>>  2 files changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
>>> index ed31335ac..9c5e7995d 100644
>>> --- a/drivers/net/bnx2x/bnx2x.c
>>> +++ b/drivers/net/bnx2x/bnx2x.c
>>> @@ -1255,6 +1255,11 @@ static uint8_t bnx2x_rxeof(struct bnx2x_softc
>>*sc, struct bnx2x_fastpath *fp)
>>>                 return 0;
>>>         }
>>>
>>> +       /* Add memory barrier as status block fields can change. This memory
>>> +        * barrier will flush out all the read/write operations to status block
>>> +        * generated before the barrier. It will ensure stale data is not read
>>> +        */
>>> +       mb();
>>
>># Do you need full barriers here?
>
>Yes
>
>># On which architecture did you see this issue?
>># rte_cio_* barriers are more performance friendly. Have you checked
>># whether rte_cio_* would suffice for the requirements?
>>See the discussion in http://patches.dpdk.org/patch/64038/
>
>This patch is to prevent a potential issue which can happen in the absence of
>the required barriers.
>The above barrier is in the slow path. However, there is another full barrier in
>the fast path, added as part of this patch, which could use rte_cio_*.
>I'll revise the patch and resend.

Note that the memory barrier changes are a preventive fix; we did not observe any performance issue with the original changes.
If we change the read/write fast path barriers to rte_cio_*, the PMD will end up with mixed barriers for the slow path and the fast path.
We'll revisit these changes later. I have sent a v2 series without the memory barrier changes.

Thanks!
-Rasesh
>
>>
>>I assume the 2/3 and 3/3 patches are for the slow path. If so, it is fine
>>to use full barriers in those patches.
>
>Patch 2/3 is a slow path fix, and 3/3 fixes a race condition that can occur
>between the slow path and the fast path.
>
>Thanks!
>-Rasesh
>
  

Patch

diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index ed31335ac..9c5e7995d 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -1255,6 +1255,11 @@  static uint8_t bnx2x_rxeof(struct bnx2x_softc *sc, struct bnx2x_fastpath *fp)
 		return 0;
 	}
 
+	/* Add memory barrier as status block fields can change. This memory
+	 * barrier will flush out all the read/write operations to status block
+	 * generated before the barrier. It will ensure stale data is not read
+	 */
+	mb();
 	/* CQ "next element" is of the size of the regular element */
 	hw_cq_cons = le16toh(*fp->rx_cq_cons_sb);
 	if (unlikely((hw_cq_cons & USABLE_RCQ_ENTRIES_PER_PAGE) ==
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index ae97dfee3..b52f023ea 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -324,11 +324,22 @@  bnx2x_upd_rx_prod_fast(struct bnx2x_softc *sc, struct bnx2x_fastpath *fp,
 	struct ustorm_eth_rx_producers rx_prods = { 0 };
 	uint32_t *val = NULL;
 
+	/* Update producers */
 	rx_prods.bd_prod  = rx_bd_prod;
 	rx_prods.cqe_prod = rx_cq_prod;
 
+	/*
+	 * Make sure that the BD and SGE data is updated before updating the
+	 * producers since FW might read the BD/SGE right after the producer
+	 * is updated.
+	 * The following barrier is also mandatory since the FW assumes BDs
+	 * must have buffers.
+	 */
+	wmb();
 	val = (uint32_t *)&rx_prods;
 	REG_WR(sc, fp->ustorm_rx_prods_offset, val[0]);
+
+	wmb();			/* keep prod updates ordered */
 }
 
 static uint16_t
@@ -346,6 +357,11 @@  bnx2x_recv_pkts(void *p_rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	uint16_t len, pad;
 	struct rte_mbuf *rx_mb = NULL;
 
+	/* Add memory barrier as status block fields can change. This memory
+	 * barrier will flush out all the read/write operations to status block
+	 * generated before the barrier. It will ensure stale data is not read
+	 */
+	mb();
 	hw_cq_cons = le16toh(*fp->rx_cq_cons_sb);
 	if ((hw_cq_cons & USABLE_RCQ_ENTRIES_PER_PAGE) ==
 			USABLE_RCQ_ENTRIES_PER_PAGE) {
@@ -357,6 +373,12 @@  bnx2x_recv_pkts(void *p_rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	sw_cq_cons = rxq->rx_cq_head;
 	sw_cq_prod = rxq->rx_cq_tail;
 
+	/*
+	 * Memory barrier necessary as speculative reads of the Rx
+	 * buffer can be ahead of the index in the status block
+	 */
+	rmb();
+
 	if (sw_cq_cons == hw_cq_cons)
 		return 0;
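
For context, the consumer-side ordering that the last hunk enforces, reduced
to a self-contained sketch; the struct layout and names below are
illustrative, not the driver's. The device DMA-writes completion entries and
then advances an index in the status block, so the driver must order its
index load before its CQE loads.

#include <stdint.h>
#include <rte_atomic.h>

#define CQ_SIZE 512	/* illustrative ring size, power of two */

struct cq {
	volatile uint16_t hw_cons_sb;	/* index DMA-updated by the device */
	uint16_t sw_cons;		/* driver's consumer index */
	uint32_t cqe[CQ_SIZE];		/* completion entries */
};

static int
cq_poll_one(struct cq *q, uint32_t *out)
{
	uint16_t hw_cons = q->hw_cons_sb;	/* load device index first */

	if (q->sw_cons == hw_cons)
		return 0;			/* ring empty */

	/* Read barrier: without it, a speculative load of the CQE below
	 * could be satisfied before the index load above and return a
	 * stale entry; this is the hazard the rmb() in the patch avoids. */
	rte_rmb();

	*out = q->cqe[q->sw_cons & (CQ_SIZE - 1)];
	q->sw_cons++;
	return 1;
}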