From patchwork Fri Mar 6 05:04:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Hu X-Patchwork-Id: 66313 X-Patchwork-Delegate: xiaolong.ye@intel.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id D2DF6A056A; Fri, 6 Mar 2020 06:05:15 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 63D771BFDB; Fri, 6 Mar 2020 06:05:10 +0100 (CET) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by dpdk.org (Postfix) with ESMTP id A11011BFDA; Fri, 6 Mar 2020 06:05:08 +0100 (CET) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2336430E; Thu, 5 Mar 2020 21:05:08 -0800 (PST) Received: from net-arm-thunderx2-04.shanghai.arm.com (net-arm-thunderx2-04.shanghai.arm.com [10.169.40.184]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 190B33F6CF; Thu, 5 Mar 2020 21:05:04 -0800 (PST) From: Gavin Hu To: dev@dpdk.org Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net, jerinj@marvell.com, xiaolong.ye@intel.com, Honnappa.Nagarahalli@arm.com, ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com, steve.capper@arm.com, stable@dpdk.org Date: Fri, 6 Mar 2020 13:04:25 +0800 Message-Id: <20200306050427.66114-2-gavin.hu@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200306050427.66114-1-gavin.hu@arm.com> References: <20200306050427.66114-1-gavin.hu@arm.com> Subject: [dpdk-dev] [PATCH v1 1/3] net/i40e: relax barrier in the Tx fastpath of vPMD X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" To keep ordering of mixed accesses, rte_cio is sufficient. The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1] This patch fixes by replacing with just sufficient barriers in the normal PMD and vPMD. It showed 7% performance uplift on ThunderX2 and 4% on Arm N1SDP. The test case is the RFC2544 zero-loss test running testpmd. [1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9 qf0Kpn89EMdGDajepKoZQ@mail.gmail.com Fixes: 4861cde46116 ("i40e: new poll mode driver") Cc: stable@dpdk.org Signed-off-by: Gavin Hu Acked-by: Jerin Jacob --- drivers/net/i40e/i40e_rxtx_vec_neon.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c index deb185fe2..4376d8911 100644 --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c @@ -72,8 +72,9 @@ i40e_rxq_rearm(struct i40e_rx_queue *rxq) rx_id = (uint16_t)((rxq->rxrearm_start == 0) ? (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1)); + rte_cio_wmb(); /* Update the tail pointer on the NIC */ - I40E_PCI_REG_WRITE(rxq->qrx_tail, rx_id); + I40E_PCI_REG_WRITE_RELAXED(rxq->qrx_tail, rx_id); } static inline void @@ -564,7 +565,8 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts, txq->tx_tail = tx_id; - I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail); + rte_cio_wmb(); + I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id); return nb_pkts; } From patchwork Fri Mar 6 05:04:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Hu X-Patchwork-Id: 66314 X-Patchwork-Delegate: xiaolong.ye@intel.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 07D54A056A; Fri, 6 Mar 2020 06:05:25 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DF5C11BFF0; Fri, 6 Mar 2020 06:05:12 +0100 (CET) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by dpdk.org (Postfix) with ESMTP id F1AB71BFEE for ; Fri, 6 Mar 2020 06:05:11 +0100 (CET) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7AA0730E; Thu, 5 Mar 2020 21:05:11 -0800 (PST) Received: from net-arm-thunderx2-04.shanghai.arm.com (net-arm-thunderx2-04.shanghai.arm.com [10.169.40.184]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 90A1F3F6CF; Thu, 5 Mar 2020 21:05:08 -0800 (PST) From: Gavin Hu To: dev@dpdk.org Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net, jerinj@marvell.com, xiaolong.ye@intel.com, Honnappa.Nagarahalli@arm.com, ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com, steve.capper@arm.com Date: Fri, 6 Mar 2020 13:04:26 +0800 Message-Id: <20200306050427.66114-3-gavin.hu@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200306050427.66114-1-gavin.hu@arm.com> References: <20200306050427.66114-1-gavin.hu@arm.com> Subject: [dpdk-dev] [PATCH v1 2/3] net/i40e: restrict pointer aliasing for neon vec X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" restrict pointer aliasing to optimize the code generated. The patch showed ~3% performnace uplift on Arm N1SDP platform, and no degradation on ThunderX2. The tet case is RFC2544 zero-loss L2 forwarding running testpmd. [1] https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcc/Restricted-Pointers.html Signed-off-by: Gavin Hu Reviewed-by: Steve Capper --- drivers/net/i40e/i40e_rxtx_vec_neon.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c index 4376d8911..405bd82df 100644 --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c @@ -172,8 +172,8 @@ desc_to_olflags_v(struct i40e_rx_queue *rxq, uint64x2_t descs[4], #define I40E_UINT16_BIT (CHAR_BIT * sizeof(uint16_t)) static inline void -desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **rx_pkts, - uint32_t *ptype_tbl) +desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **restrict rx_pkts, + uint32_t *restrict ptype_tbl) { int i; uint8_t ptype; @@ -194,8 +194,8 @@ desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **rx_pkts, * numbers of DD bits */ static inline uint16_t -_recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts, - uint16_t nb_pkts, uint8_t *split_packet) +_recv_raw_pkts_vec(struct i40e_rx_queue *restrict rxq, struct rte_mbuf + **restrict rx_pkts, uint16_t nb_pkts, uint8_t *split_packet) { volatile union i40e_rx_desc *rxdp; struct i40e_rx_entry *sw_ring; @@ -432,7 +432,7 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts, * numbers of DD bits */ uint16_t -i40e_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts, +i40e_recv_pkts_vec(void *restrict rx_queue, struct rte_mbuf **restrict rx_pkts, uint16_t nb_pkts) { return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL); @@ -494,8 +494,8 @@ vtx1(volatile struct i40e_tx_desc *txdp, } static inline void -vtx(volatile struct i40e_tx_desc *txdp, - struct rte_mbuf **pkt, uint16_t nb_pkts, uint64_t flags) +vtx(volatile struct i40e_tx_desc *txdp, struct rte_mbuf **pkt, + uint16_t nb_pkts, uint64_t flags) { int i; @@ -504,8 +504,8 @@ vtx(volatile struct i40e_tx_desc *txdp, } uint16_t -i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts, - uint16_t nb_pkts) +i40e_xmit_fixed_burst_vec(void *restrict tx_queue, + struct rte_mbuf **restrict tx_pkts, uint16_t nb_pkts) { struct i40e_tx_queue *txq = (struct i40e_tx_queue *)tx_queue; volatile struct i40e_tx_desc *txdp; From patchwork Fri Mar 6 05:04:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Hu X-Patchwork-Id: 66315 X-Patchwork-Delegate: xiaolong.ye@intel.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 5138BA056A; Fri, 6 Mar 2020 06:05:33 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 800421BFDA; Fri, 6 Mar 2020 06:05:17 +0100 (CET) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by dpdk.org (Postfix) with ESMTP id 79EB41BFE8 for ; Fri, 6 Mar 2020 06:05:15 +0100 (CET) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0557130E; Thu, 5 Mar 2020 21:05:15 -0800 (PST) Received: from net-arm-thunderx2-04.shanghai.arm.com (net-arm-thunderx2-04.shanghai.arm.com [10.169.40.184]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id E88723F6CF; Thu, 5 Mar 2020 21:05:11 -0800 (PST) From: Gavin Hu To: dev@dpdk.org Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net, jerinj@marvell.com, xiaolong.ye@intel.com, Honnappa.Nagarahalli@arm.com, ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com, steve.capper@arm.com Date: Fri, 6 Mar 2020 13:04:27 +0800 Message-Id: <20200306050427.66114-4-gavin.hu@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200306050427.66114-1-gavin.hu@arm.com> References: <20200306050427.66114-1-gavin.hu@arm.com> Subject: [dpdk-dev] [PATCH v1 3/3] net/i40e: auto-vectorization to speed up Tx free X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Tx mbuf free is a hotspot for i40e on aarch64, as there are no inter-loop dependencies, it is safe to enable auto-vectorization to speed up. This patch showed 2~3% performance lift on ThunderX2 and no degradation on Arm N1SDP. The test case is single core RFC2544 zero-loss test. Signed-off-by: Gavin Hu Reviewed-by: Steve Capper --- drivers/net/i40e/i40e_rxtx_vec_common.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h b/drivers/net/i40e/i40e_rxtx_vec_common.h index 0e6ffa007..fc0fa45d4 100644 --- a/drivers/net/i40e/i40e_rxtx_vec_common.h +++ b/drivers/net/i40e/i40e_rxtx_vec_common.h @@ -98,6 +98,11 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq) if (likely(m != NULL)) { free[0] = m; nb_free = 1; +#if defined(__clang__) +#pragma clang loop vectorize(assume_safety) +#elif defined(__GNUC__) +#pragma GCC ivdep +#endif for (i = 1; i < n; i++) { m = rte_pktmbuf_prefree_seg(txep[i].mbuf); if (likely(m != NULL)) {