[1/2] net/ixgbe: remove barrier in vPMD for aarch64

Message ID 20190813100248.8000-2-ruifeng.wang@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang
Headers
Series IXGBE vPMD changes for aarch64 |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/iol-Compile-Testing success Compile Testing PASS
ci/intel-Performance-Testing success Performance Testing PASS
ci/mellanox-Performance-Testing success Performance Testing PASS

Commit Message

Ruifeng Wang Aug. 13, 2019, 10:02 a.m. UTC
  The memory barrier was intended for descriptor data integrity (see
comments in [1]). However, since NEON loads are atomic, there is
no need for the memory barrier. Remove it accordingly.

Corrected couple of code comments.

In terms of performance, observed slightly higher average throughput
in tests with 82599ES NIC.

[1] http://patches.dpdk.org/patch/18153/

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
  

Patch

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index edb138354..86fb3afdb 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -214,13 +214,13 @@  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		uint32_t var = 0;
 		uint32_t stat;
 
-		/* B.1 load 1 mbuf point */
+		/* B.1 load 2 mbuf point */
 		mbp1 = vld1q_u64((uint64_t *)&sw_ring[pos]);
 
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1);
 
-		/* B.1 load 1 mbuf point */
+		/* B.1 load 2 mbuf point */
 		mbp2 = vld1q_u64((uint64_t *)&sw_ring[pos + 2]);
 
 		/* A. load 4 pkts descs */
@@ -228,7 +228,6 @@  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		descs[1] =  vld1q_u64((uint64_t *)(rxdp + 1));
 		descs[2] =  vld1q_u64((uint64_t *)(rxdp + 2));
 		descs[3] =  vld1q_u64((uint64_t *)(rxdp + 3));
-		rte_smp_rmb();
 
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		vst1q_u64((uint64_t *)&rx_pkts[pos + 2], mbp2);