[1/2] net/txgbe: add proper memory barriers in Rx
Commit Message
This fixes the same issue in txgbe that commit 85e46c532bc7 ("net/ixgbe:
add proper memory barriers in Rx") fixed in ixgbe.
A segmentation fault has been observed while running the
txgbe_recv_pkts_lro() function to receive packets on the Loongson 3A5000
processor. It is caused by out-of-order execution on the CPU. So add a
proper memory barrier to ensure that the reads are correctly ordered.
The same barrier is added to the txgbe_recv_pkts() function to make sure
the rxd data is valid, even though no segmentation fault was found in
this function.
Fixes: 0e484278c85f ("net/txgbe: support Rx")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/net/txgbe/txgbe_rxtx.c | 47 +++++++++++++++-------------------
1 file changed, 21 insertions(+), 26 deletions(-)
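The ordering requirement described in the commit message can be sketched in portable C11. This is only an illustration under stated assumptions, not the txgbe code: struct rx_desc, DESC_STAT_DD, device_write() and driver_poll() are hypothetical names, the descriptor layout is simplified, and the NIC is simulated by an ordinary function. In the driver itself the fence is issued through DPDK's rte_atomic_thread_fence() wrapper rather than atomic_thread_fence().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_STAT_DD 0x1u /* hypothetical "descriptor done" bit */

/* Hypothetical, simplified Rx descriptor; not the real txgbe layout. */
struct rx_desc {
	uint32_t pkt_len;        /* payload field filled in by the device */
	_Atomic uint32_t status; /* status word; the DD bit is set last */
};

/* Simulated device side: fill the payload, then publish the DD bit. */
static void device_write(struct rx_desc *d, uint32_t len)
{
	d->pkt_len = len;
	atomic_store_explicit(&d->status, DESC_STAT_DD, memory_order_release);
}

/*
 * Driver side: return 1 and store the length if the descriptor is done,
 * otherwise return 0 and leave *len untouched.
 */
static int driver_poll(struct rx_desc *d, uint32_t *len)
{
	uint32_t staterr;

	assert(d != NULL && len != NULL);

	/* Load the status word first. */
	staterr = atomic_load_explicit(&d->status, memory_order_relaxed);
	if (!(staterr & DESC_STAT_DD))
		return 0;

	/*
	 * Acquire fence: the loads of the other descriptor fields below
	 * may not be reordered before the status load above, neither by
	 * the compiler nor by the CPU.
	 */
	atomic_thread_fence(memory_order_acquire);

	*len = d->pkt_len;
	return 1;
}
```

Polling a zero-initialized descriptor returns 0; after device_write(&d, 64), the next poll returns 1 with the length set to 64. Without the fence, a weakly ordered CPU would be allowed to read pkt_len before the status word and hand a stale value to the caller.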
Comments
On 10/30/2023 10:51 AM, Jiawen Wu wrote:
> Refer to commit 85e46c532bc7 ("net/ixgbe: add proper memory barriers in
> Rx"). Fix the same issue as ixgbe.
>
> Segmentation fault has been observed while running the
> txgbe_recv_pkts_lro() function to receive packets on the Loongson 3A5000
> processor. It's caused by the out-of-order execution of CPU. So add a
> proper memory barrier to ensure the read ordering be correct.
>
> We also did the same thing in the txgbe_recv_pkts() function to make the
> rxd data be valid even though we did not find segmentation fault in this
> function.
>
> Fixes: 0e484278c85f ("net/txgbe: support Rx")
> Cc: stable@dpdk.org
>
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> ---
> drivers/net/txgbe/txgbe_rxtx.c | 47 +++++++++++++++-------------------
> 1 file changed, 21 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/net/txgbe/txgbe_rxtx.c b/drivers/net/txgbe/txgbe_rxtx.c
> index 834ada886a..24fc34d3c4 100644
> --- a/drivers/net/txgbe/txgbe_rxtx.c
> +++ b/drivers/net/txgbe/txgbe_rxtx.c
> @@ -1476,11 +1476,22 @@ txgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
> * of accesses cannot be reordered by the compiler. If they were
> * not volatile, they could be reordered which could lead to
> * using invalid descriptor fields when read from rxd.
> + *
> + * Meanwhile, to prevent the CPU from executing out of order, we
> + * need to use a proper memory barrier to ensure the memory
> + * ordering below.
> */
> rxdp = &rx_ring[rx_id];
> staterr = rxdp->qw1.lo.status;
> if (!(staterr & rte_cpu_to_le_32(TXGBE_RXD_STAT_DD)))
> break;
> +
> + /*
> + * Use acquire fence to ensure that status_error which includes
> + * DD bit is loaded before loading of other descriptor words.
> + */
> + rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
>
Hi Jiawen,
Can you please check the checkpatch warning:
Warning in drivers/net/txgbe/txgbe_rxtx.c:
Using __atomic_xxx/__ATOMIC_XXX built-ins, prefer
rte_atomic_xxx/rte_memory_order_xxx
In your case, please use 'rte_memory_order_xxx' instead of '__ATOMIC_XXX'.
The same applies to both patches in the set.
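diff --git a/drivers/net/txgbe/txgbe_rxtx.c b/drivers/net/txgbe/txgbe_rxtx.c
index 834ada886a..24fc34d3c4 100644
--- a/drivers/net/txgbe/txgbe_rxtx.c
+++ b/drivers/net/txgbe/txgbe_rxtx.c
@@ -1476,11 +1476,22 @@ txgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,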
* of accesses cannot be reordered by the compiler. If they were
* not volatile, they could be reordered which could lead to
* using invalid descriptor fields when read from rxd.
+ *
+ * Meanwhile, to prevent the CPU from executing out of order, we
+ * need to use a proper memory barrier to ensure the memory
+ * ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rxdp->qw1.lo.status;
if (!(staterr & rte_cpu_to_le_32(TXGBE_RXD_STAT_DD)))
break;
+
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
/*
@@ -1726,32 +1737,10 @@ txgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
next_desc:
/*
- * The code in this whole file uses the volatile pointer to
- * ensure the read ordering of the status and the rest of the
- * descriptor fields (on the compiler level only!!!). This is so
- * UGLY - why not to just use the compiler barrier instead? DPDK
- * even has the rte_compiler_barrier() for that.
- *
- * But most importantly this is just wrong because this doesn't
- * ensure memory ordering in a general case at all. For
- * instance, DPDK is supposed to work on Power CPUs where
- * compiler barrier may just not be enough!
- *
- * I tried to write only this function properly to have a
- * starting point (as a part of an LRO/RSC series) but the
- * compiler cursed at me when I tried to cast away the
- * "volatile" from rx_ring (yes, it's volatile too!!!). So, I'm
- * keeping it the way it is for now.
- *
- * The code in this file is broken in so many other places and
- * will just not work on a big endian CPU anyway therefore the
- * lines below will have to be revisited together with the rest
- * of the txgbe PMD.
- *
- * TODO:
- * - Get rid of "volatile" and let the compiler do its job.
- * - Use the proper memory barrier (rte_rmb()) to ensure the
- * memory ordering below.
+ * "Volatile" only prevents caching of the variable marked
+ * volatile. Most important, "volatile" cannot prevent the CPU
+ * from executing out of order. So, it is necessary to use a
+ * proper memory barrier to ensure the memory ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rte_le_to_cpu_32(rxdp->qw1.lo.status);
@@ -1759,6 +1748,12 @@ txgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
if (!(staterr & TXGBE_RXD_STAT_DD))
break;
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "