From patchwork Wed Jul 7 09:03:06 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 95474
X-Patchwork-Delegate: rasland@nvidia.com
From: Ruifeng Wang
To: rasland@nvidia.com, matan@nvidia.com, shahafs@nvidia.com,
 viacheslavo@nvidia.com
Cc: dev@dpdk.org, jerinj@marvell.com, nd@arm.com,
 honnappa.nagarahalli@arm.com, Ruifeng Wang, stable@dpdk.org
Date: Wed, 7 Jul 2021 17:03:06 +0800
Message-Id: <20210707090307.1650632-2-ruifeng.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210707090307.1650632-1-ruifeng.wang@arm.com>
References: <20210601083055.97261-1-ruifeng.wang@arm.com>
 <20210707090307.1650632-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [PATCH v2 1/2] net/mlx5: remove redundant operations
List-Id: DPDK patches and discussions
Mask of entries after the compressed CQE is covered by invalid mask
of non-compressed valid CQEs. Hence remove redundant calculation on
mask.
The change showed slight performance uplift on N1SDP.

Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 2234fbe6b2..ce50a3ccc4 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -767,16 +767,15 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 		comp_idx = __builtin_clzl(vget_lane_u64(vreinterpret_u64_u16(
 					  comp_mask), 0)) /
 					  (sizeof(uint16_t) * 8);
-		/* D.6 mask out entries after the compressed CQE. */
-		mask = vcreate_u16(comp_idx < MLX5_VPMD_DESCS_PER_LOOP ?
-				   -1UL >> (comp_idx * sizeof(uint16_t) * 8) :
-				   0);
-		invalid_mask = vorr_u16(invalid_mask, mask);
+		invalid_mask = vorr_u16(invalid_mask, comp_mask);
 		/* D.7 count non-compressed valid CQEs. */
 		n = __builtin_clzl(vget_lane_u64(vreinterpret_u64_u16(
 				   invalid_mask), 0)) / (sizeof(uint16_t) * 8);
 		nocmp_n += n;
-		/* D.2 get the final invalid mask. */
+		/*
+		 * D.2 mask out entries after the compressed CQE.
+		 *     get the final invalid mask.
+		 */
 		mask = vcreate_u16(n < MLX5_VPMD_DESCS_PER_LOOP ?
 				   -1UL >> (n * sizeof(uint16_t) * 8) : 0);
 		invalid_mask = vorr_u16(invalid_mask, mask);

From patchwork Wed Jul 7 09:03:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ruifeng Wang
X-Patchwork-Id: 95475
X-Patchwork-Delegate: rasland@nvidia.com
From: Ruifeng Wang
To: rasland@nvidia.com, matan@nvidia.com, shahafs@nvidia.com,
 viacheslavo@nvidia.com
Cc: dev@dpdk.org, jerinj@marvell.com, nd@arm.com,
 honnappa.nagarahalli@arm.com, Ruifeng Wang
Date: Wed, 7 Jul 2021 17:03:07 +0800
Message-Id: <20210707090307.1650632-3-ruifeng.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210707090307.1650632-1-ruifeng.wang@arm.com>
References: <20210601083055.97261-1-ruifeng.wang@arm.com>
 <20210707090307.1650632-1-ruifeng.wang@arm.com>
Subject: [dpdk-dev] [PATCH v2 2/2] net/mlx5: reduce unnecessary memory access
List-Id: DPDK patches and discussions
MR btree len is a constant during Rx replenish.
Moved retrieve of the value out of loop to reduce data loads.
Slight performance uplift was measured on both N1SDP and x86.

Suggested-by: Slava Ovsiienko
Signed-off-by: Ruifeng Wang
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5_rxtx_vec.c | 35 ++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index d5af2d91ff..e64ef70181 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.c
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
@@ -106,22 +106,27 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq)
 		rxq->stats.rx_nombuf += n;
 		return;
 	}
-	for (i = 0; i < n; ++i) {
-		void *buf_addr;
-
-		/*
-		 * In order to support the mbufs with external attached
-		 * data buffer we should use the buf_addr pointer
-		 * instead of rte_mbuf_buf_addr(). It touches the mbuf
-		 * itself and may impact the performance.
-		 */
-		buf_addr = elts[i]->buf_addr;
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
-					      RTE_PKTMBUF_HEADROOM);
-		/* If there's a single MR, no need to replace LKey. */
-		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh)
-			     > 1))
+	if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) {
+		for (i = 0; i < n; ++i) {
+			/*
+			 * In order to support the mbufs with external attached
+			 * data buffer we should use the buf_addr pointer
+			 * instead of rte_mbuf_buf_addr(). It touches the mbuf
+			 * itself and may impact the performance.
+			 */
+			void *buf_addr = elts[i]->buf_addr;
+
+			wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
+						      RTE_PKTMBUF_HEADROOM);
 			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
+		}
+	} else {
+		for (i = 0; i < n; ++i) {
+			void *buf_addr = elts[i]->buf_addr;
+
+			wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
+						      RTE_PKTMBUF_HEADROOM);
+		}
 	}
 	rxq->rq_ci += n;
 	/* Prevent overflowing into consumed mbufs. */
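
For readers without the driver tree at hand, the restructuring in patch 2/2 can be illustrated with a minimal standalone sketch. It is not the mlx5 code: demo_rxq, demo_buf, demo_wqe, demo_lookup_lkey() and DEMO_HEADROOM are hypothetical stand-ins, the byte-order conversion done by rte_cpu_to_be_64() is left out, and mr_count merely plays the role of the MR btree length. The only point shown is that the loop-invariant check is read once and selects one of two loops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver structures (not mlx5 definitions). */
struct demo_buf { void *buf_addr; uint32_t lkey; };
struct demo_wqe { uint64_t addr; uint32_t lkey; };
struct demo_rxq { unsigned int mr_count; };

/* Illustrative headroom value; plays the role of RTE_PKTMBUF_HEADROOM. */
#define DEMO_HEADROOM 128u

/* Hypothetical per-buffer key lookup; the real driver searches an MR cache. */
static uint32_t
demo_lookup_lkey(const struct demo_rxq *rxq, const struct demo_buf *buf)
{
	(void)rxq;
	return buf->lkey;
}

/*
 * Fill n work-queue entries from n buffers.
 * The MR count does not change while this function runs, so it is read
 * once before the loops instead of once per iteration.
 */
static void
demo_replenish(const struct demo_rxq *rxq, struct demo_buf *const *elts,
	       struct demo_wqe *wq, size_t n)
{
	size_t i;

	assert(rxq != NULL && elts != NULL && wq != NULL);
	if (rxq->mr_count > 1) {
		/* Several MRs: every entry needs its own key. */
		for (i = 0; i < n; ++i) {
			wq[i].addr = (uint64_t)(uintptr_t)elts[i]->buf_addr +
				     DEMO_HEADROOM;
			wq[i].lkey = demo_lookup_lkey(rxq, elts[i]);
		}
	} else {
		/* Single MR: the key already in the entry stays valid. */
		for (i = 0; i < n; ++i) {
			wq[i].addr = (uint64_t)(uintptr_t)elts[i]->buf_addr +
				     DEMO_HEADROOM;
		}
	}
}
```

Duplicating the address computation in both branches costs a few lines, but each loop body is then free of the per-iteration load and branch, which is the effect the commit message describes.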