From patchwork Wed Sep 15 08:33:38 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 98887
X-Patchwork-Delegate: qi.z.zhang@intel.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id C13D9A0C41;
 Wed, 15 Sep 2021 10:34:47 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id B0512410ED;
 Wed, 15 Sep 2021 10:34:47 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by mails.dpdk.org (Postfix) with ESMTP id DD8264003C;
 Wed, 15 Sep 2021 10:34:46 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 661066D;
 Wed, 15 Sep 2021 01:34:46 -0700 (PDT)
Received: from net-arm-n1amp-02.shanghai.arm.com
 (net-arm-n1amp-02.shanghai.arm.com [10.169.210.110])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 0C4273F719;
 Wed, 15 Sep 2021 01:34:42 -0700 (PDT)
From: Ruifeng Wang
To: dev@dpdk.org
Cc: beilei.xing@intel.com, qi.z.zhang@intel.com, bruce.richardson@intel.com,
 jerinj@marvell.com, hemant.agrawal@nxp.com, drc@linux.vnet.ibm.com,
 honnappa.nagarahalli@arm.com, stable@dpdk.org, nd@arm.com, Ruifeng Wang
Date: Wed, 15 Sep 2021 16:33:38 +0800
Message-Id: <20210915083339.2424369-2-ruifeng.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210915083339.2424369-1-ruifeng.wang@arm.com>
References: <20210906033201.1789796-1-ruifeng.wang@arm.com>
 <20210915083339.2424369-1-ruifeng.wang@arm.com>
MIME-Version: 1.0
Subject: [dpdk-dev] [PATCH v2 1/2] net/i40e: fix risk in Rx descriptor read
 in NEON vector path
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

An Rx descriptor is 16B or 32B in size. When the DD bit is set, the
rest of the descriptor words hold valid values. Hence, the word
containing the DD bit must be read before the rest of the descriptor
words.

In the NEON vector PMD, a vector load reads two contiguous 8B words of
descriptor data into a vector register. Because the vector load does
not guarantee 16B atomicity, the read of the word that includes the DD
field could be reordered after the reads of the other words. In that
case, some words could contain invalid data.

A read barrier is added after the read of qword1, which includes the DD
field, and qword0 is then reloaded to update the vector register. This
ensures that the fetched data is correct.

A testpmd single-core test on N1SDP/ThunderX2 showed no performance
drop.

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang
Reviewed-by: Honnappa Nagarahalli
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index b2683fda60..71191c7cc8 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -286,6 +286,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
 		descs[1] = vld1q_u64((uint64_t *)(rxdp + 1));
 		descs[0] = vld1q_u64((uint64_t *)(rxdp));
 
+		/* Use acquire fence to order loads of descriptor qwords */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+		/* A.2 reload qword0 to make it ordered after qword1 load */
+		descs[3] = vld1q_lane_u64((uint64_t *)(rxdp + 3), descs[3], 0);
+		descs[2] = vld1q_lane_u64((uint64_t *)(rxdp + 2), descs[2], 0);
+		descs[1] = vld1q_lane_u64((uint64_t *)(rxdp + 1), descs[1], 0);
+		descs[0] = vld1q_lane_u64((uint64_t *)(rxdp), descs[0], 0);
+
 		/* B.1 load 4 mbuf point */
 		mbp1 = vld1q_u64((uint64_t *)&sw_ring[pos]);
 		mbp2 = vld1q_u64((uint64_t *)&sw_ring[pos + 2]);

From patchwork Wed Sep 15 08:33:39 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 98888
X-Patchwork-Delegate: qi.z.zhang@intel.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 94D7BA0C41;
 Wed, 15 Sep 2021 10:34:55 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id 59D7A410FC;
 Wed, 15 Sep 2021 10:34:54 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by mails.dpdk.org (Postfix) with ESMTP id 8E7E34003C;
 Wed, 15 Sep 2021 10:34:52 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 19BE46D;
 Wed, 15 Sep 2021 01:34:52 -0700 (PDT)
Received: from net-arm-n1amp-02.shanghai.arm.com
 (net-arm-n1amp-02.shanghai.arm.com [10.169.210.110])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B43B13F719;
 Wed, 15 Sep 2021 01:34:48 -0700 (PDT)
From: Ruifeng Wang
To: dev@dpdk.org
Cc: beilei.xing@intel.com, qi.z.zhang@intel.com, bruce.richardson@intel.com,
 jerinj@marvell.com, hemant.agrawal@nxp.com, drc@linux.vnet.ibm.com,
 honnappa.nagarahalli@arm.com, stable@dpdk.org, nd@arm.com, Ruifeng Wang
Date: Wed, 15 Sep 2021 16:33:39 +0800
Message-Id: <20210915083339.2424369-3-ruifeng.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210915083339.2424369-1-ruifeng.wang@arm.com>
References: <20210906033201.1789796-1-ruifeng.wang@arm.com>
 <20210915083339.2424369-1-ruifeng.wang@arm.com>
MIME-Version: 1.0
Subject: [dpdk-dev] [PATCH v2 2/2] net/i40e: fix risk in Rx descriptor read
 in scalar path
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

An Rx descriptor is 16B or 32B in size. When the DD bit is set, the
rest of the descriptor words hold valid values. Hence, the word
containing the DD bit must be read before the rest of the descriptor
words.

Since the entire descriptor is not read atomically, on systems with
relaxed memory ordering such as AArch64, the read of the word
containing the DD field could be reordered after the reads of the other
words.

A read barrier is inserted between the read of the word with the DD
field and the reads of the other words. The barrier ensures that the
fetched data is correct.

A testpmd single-core test showed no performance drop on x86 or N1SDP.
On ThunderX2, a 22% performance regression was observed.

Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang
Reviewed-by: Honnappa Nagarahalli
---
 drivers/net/i40e/i40e_rxtx.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 8329cbdd4e..c4cd6b6b60 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -746,6 +746,12 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			break;
 		}
 
+		/**
+		 * Use acquire fence to ensure that qword1 which includes DD
+		 * bit is loaded before loading of other descriptor words.
+		 */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
 		rxd = *rxdp;
 		nb_hold++;
 		rxe = &sw_ring[rx_id];
@@ -862,6 +868,12 @@ i40e_recv_scattered_pkts(void *rx_queue,
 			break;
 		}
 
+		/**
+		 * Use acquire fence to ensure that qword1 which includes DD
+		 * bit is loaded before loading of other descriptor words.
+		 */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
 		rxd = *rxdp;
 		nb_hold++;
 		rxe = &sw_ring[rx_id];
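
For readers unfamiliar with the pattern, the ordering both patches rely on
can be sketched outside DPDK with plain C11 atomics. This is only an
illustration under stated assumptions: struct demo_desc, DEMO_DD_BIT,
demo_publish() and demo_poll() are hypothetical names, the layout is not
the real i40e descriptor, and the producer here is ordinary software
standing in for the NIC. The driver itself uses rte_atomic_thread_fence()
as shown in the diffs above.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16B descriptor; NOT the real i40e layout. */
struct demo_desc {
	uint64_t qword0;         /* payload written by the producer */
	_Atomic uint64_t qword1; /* status word; bit 0 stands in for DD */
};

#define DEMO_DD_BIT UINT64_C(1)

/* Producer side (the NIC in the real driver): payload first, DD last. */
static void demo_publish(struct demo_desc *d, uint64_t payload)
{
	d->qword0 = payload;
	atomic_store_explicit(&d->qword1, DEMO_DD_BIT, memory_order_release);
}

/* Consumer side: copies the payload out only once DD has been observed. */
static bool demo_poll(struct demo_desc *d, uint64_t *payload)
{
	uint64_t status = atomic_load_explicit(&d->qword1,
					       memory_order_relaxed);

	if (!(status & DEMO_DD_BIT))
		return false;

	/* Keep the qword0 load from being ordered before the DD check. */
	atomic_thread_fence(memory_order_acquire);

	*payload = d->qword0;
	return true;
}
```

Calling demo_poll() before demo_publish() returns false and leaves the
output untouched; after demo_publish() it returns true with the published
payload. Without the fence, a weakly ordered CPU would be free to satisfy
the qword0 load before the qword1 load, which is the risk the commit
messages describe.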