From patchwork Tue Jun 1 08:30:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruifeng Wang X-Patchwork-Id: 93714 X-Patchwork-Delegate: rasland@nvidia.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 42BFBA0524; Tue, 1 Jun 2021 10:31:30 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2DD20410E3; Tue, 1 Jun 2021 10:31:30 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 4A58940E6E; Tue, 1 Jun 2021 10:31:29 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C2A571FB; Tue, 1 Jun 2021 01:31:28 -0700 (PDT) Received: from net-arm-n1amp-01.shanghai.arm.com (net-arm-n1amp-01.shanghai.arm.com [10.169.210.111]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A125B3F73D; Tue, 1 Jun 2021 01:31:25 -0700 (PDT) From: Ruifeng Wang To: rasland@nvidia.com, matan@nvidia.com, shahafs@nvidia.com, viacheslavo@nvidia.com Cc: dev@dpdk.org, jerinj@marvell.com, nd@arm.com, honnappa.nagarahalli@arm.com, Ruifeng Wang , stable@dpdk.org Date: Tue, 1 Jun 2021 08:30:54 +0000 Message-Id: <20210601083055.97261-2-ruifeng.wang@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210601083055.97261-1-ruifeng.wang@arm.com> References: <20210601083055.97261-1-ruifeng.wang@arm.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH 1/2] net/mlx5: remove redundant operations X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Some operations on mask are redundant and can be removed. The change yielded 1.6% performance gain on N1SDP. On ThunderX2, slight performance uplift was also observed. Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang --- drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h index 2234fbe6b2..98a75b09c6 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h @@ -768,18 +768,11 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq, comp_mask), 0)) / (sizeof(uint16_t) * 8); /* D.6 mask out entries after the compressed CQE. */ - mask = vcreate_u16(comp_idx < MLX5_VPMD_DESCS_PER_LOOP ? - -1UL >> (comp_idx * sizeof(uint16_t) * 8) : - 0); - invalid_mask = vorr_u16(invalid_mask, mask); + invalid_mask = vorr_u16(invalid_mask, comp_mask); /* D.7 count non-compressed valid CQEs. */ n = __builtin_clzl(vget_lane_u64(vreinterpret_u64_u16( invalid_mask), 0)) / (sizeof(uint16_t) * 8); nocmp_n += n; - /* D.2 get the final invalid mask. */ - mask = vcreate_u16(n < MLX5_VPMD_DESCS_PER_LOOP ? - -1UL >> (n * sizeof(uint16_t) * 8) : 0); - invalid_mask = vorr_u16(invalid_mask, mask); /* D.3 check error in opcode. */ opcode = vceq_u16(resp_err_check, opcode); opcode = vbic_u16(opcode, invalid_mask);