From patchwork Thu Feb 24 16:10:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula X-Patchwork-Id: 108315 X-Patchwork-Delegate: jerinj@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 40987A034C; Thu, 24 Feb 2022 17:10:25 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D1DF940696; Thu, 24 Feb 2022 17:10:24 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id E1EEF40040 for ; Thu, 24 Feb 2022 17:10:22 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 21OEtPg4008939 for ; Thu, 24 Feb 2022 08:10:22 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=pfpt0220; bh=zZO0ZkUrEyBiAK3tHPl+p6SxmLyMIlfoi8JhyM1bUFU=; b=DOsJeO2QCCvc+LPXcu5LfZrO890qQiA7A4KzqlamJ2ha9lJl+i+jRuS5q76rKQDsVmlD FKcdMGFvZ102Bj+pj71o8pPAecUUWjRLvLWlORaMXTXm0WAEzij4F0H+UyW+ORHRs5Rt LhfrhW4KaCzsUq/DAZqfqBDYBobj5Fh4SPas6xMATFMKAcufc+e4x8O7VoJ8fj295Lri PQZpflfFFJjethgLInsJH32GobwzI7mbAqZcRi/hsLOVPa1pEgekvyI6xIfLHnLYuQl1 tBMBKLpRxPZIb8wMspXOdG78hoyvFzvTXexGd2ovqAb/RpD3Gu/FBCWRlojUKhB/O8jE 2g== Received: from dc5-exch01.marvell.com ([199.233.59.181]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3edjerqg12-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT) for ; Thu, 24 Feb 2022 08:10:21 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Thu, 24 Feb 2022 08:10:19 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Thu, 24 Feb 2022 08:10:19 -0800 Received: from BG-LT7430.marvell.com (unknown [10.193.70.86]) by maili.marvell.com (Postfix) with ESMTP id 700DA5B6957; Thu, 24 Feb 2022 08:10:16 -0800 (PST) From: To: , Nithin Dabilpuram , "Kiran Kumar K" , Sunil Kumar Kori , Satha Rao CC: , Pavan Nikhilesh Subject: [PATCH v2 1/2] net/cnxk: optimize Rx pktsize extraction Date: Thu, 24 Feb 2022 21:40:11 +0530 Message-ID: <20220224161013.4566-1-pbhagavatula@marvell.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220224135243.4233-1-pbhagavatula@marvell.com> References: <20220224135243.4233-1-pbhagavatula@marvell.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: 4ltiLNKHLxEfCNsMWhPY7Usnr6S8gigg X-Proofpoint-GUID: 4ltiLNKHLxEfCNsMWhPY7Usnr6S8gigg X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.816,Hydra:6.0.425,FMLib:17.11.64.514 definitions=2022-02-24_03,2022-02-24_01,2022-02-23_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org From: Pavan Nikhilesh In vWQE mode, the mbuf address is calculated without using the iova list. Packet length can also be calculated by using NIX_PARSE_S by which we can completely eliminate reading 2nd cache line depending on the offloads enabled. Signed-off-by: Pavan Nikhilesh --- v2 Changes: - Reword commit message. drivers/net/cnxk/cn10k_rx.h | 75 +++++++++++++++++++++++++++---------- 1 file changed, 55 insertions(+), 20 deletions(-) -- 2.17.1 diff --git a/drivers/net/cnxk/cn10k_rx.h b/drivers/net/cnxk/cn10k_rx.h index abf280102b..65a08e379b 100644 --- a/drivers/net/cnxk/cn10k_rx.h +++ b/drivers/net/cnxk/cn10k_rx.h @@ -590,7 +590,7 @@ cn10k_nix_recv_pkts_vector(void *args, struct rte_mbuf **mbufs, uint16_t pkts, *(uint64_t *)args : rxq->mbuf_initializer; const uint64x2_t data_off = flags & NIX_RX_VWQE_F ? - vdupq_n_u64(0x80ULL) : + vdupq_n_u64(RTE_PKTMBUF_HEADROOM) : vdupq_n_u64(rxq->data_off); const uint32_t qmask = flags & NIX_RX_VWQE_F ? 0 : rxq->qmask; const uint64_t wdata = flags & NIX_RX_VWQE_F ? 0 : rxq->wdata; @@ -687,6 +687,12 @@ cn10k_nix_recv_pkts_vector(void *args, struct rte_mbuf **mbufs, uint16_t pkts, cq3_w8 = vld1q_u64(CQE_PTR_OFF(cq0, 3, 64, flags)); if (!(flags & NIX_RX_VWQE_F)) { + /* Get NIX_RX_SG_S for size and buffer pointer */ + cq0_w8 = vld1q_u64(CQE_PTR_OFF(cq0, 0, 64, flags)); + cq1_w8 = vld1q_u64(CQE_PTR_OFF(cq0, 1, 64, flags)); + cq2_w8 = vld1q_u64(CQE_PTR_OFF(cq0, 2, 64, flags)); + cq3_w8 = vld1q_u64(CQE_PTR_OFF(cq0, 3, 64, flags)); + /* Extract mbuf from NIX_RX_SG_S */ mbuf01 = vzip2q_u64(cq0_w8, cq1_w8); mbuf23 = vzip2q_u64(cq2_w8, cq3_w8); @@ -705,21 +711,24 @@ cn10k_nix_recv_pkts_vector(void *args, struct rte_mbuf **mbufs, uint16_t pkts, mbuf2 = (struct rte_mbuf *)vgetq_lane_u64(mbuf23, 0); mbuf3 = (struct rte_mbuf *)vgetq_lane_u64(mbuf23, 1); - /* Mask to get packet len from NIX_RX_SG_S */ - const uint8x16_t shuf_msk = { - 0xFF, 0xFF, /* pkt_type set as unknown */ - 0xFF, 0xFF, /* pkt_type set as unknown */ - 0, 1, /* octet 1~0, low 16 bits pkt_len */ - 0xFF, 0xFF, /* skip high 16 bits pkt_len, zero out */ - 0, 1, /* octet 1~0, 16 bits data_len */ - 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}; - - /* Form the rx_descriptor_fields1 with pkt_len and data_len */ - f0 = vqtbl1q_u8(cq0_w8, shuf_msk); - f1 = vqtbl1q_u8(cq1_w8, shuf_msk); - f2 = vqtbl1q_u8(cq2_w8, shuf_msk); - f3 = vqtbl1q_u8(cq3_w8, shuf_msk); - + if (!(flags & NIX_RX_VWQE_F)) { + /* Mask to get packet len from NIX_RX_SG_S */ + const uint8x16_t shuf_msk = { + 0xFF, 0xFF, /* pkt_type set as unknown */ + 0xFF, 0xFF, /* pkt_type set as unknown */ + 0, 1, /* octet 1~0, low 16 bits pkt_len */ + 0xFF, 0xFF, /* skip high 16 bits pkt_len, zero + out */ + 0, 1, /* octet 1~0, 16 bits data_len */ + 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}; + + /* Form the rx_descriptor_fields1 with pkt_len and + * data_len */ + f0 = vqtbl1q_u8(cq0_w8, shuf_msk); + f1 = vqtbl1q_u8(cq1_w8, shuf_msk); + f2 = vqtbl1q_u8(cq2_w8, shuf_msk); + f3 = vqtbl1q_u8(cq3_w8, shuf_msk); + } if (flags & NIX_RX_OFFLOAD_SECURITY_F) { /* Prefetch probable CPT parse header area */ rte_prefetch_non_temporal(RTE_PTR_ADD(mbuf0, d_off)); @@ -731,12 +740,42 @@ cn10k_nix_recv_pkts_vector(void *args, struct rte_mbuf **mbufs, uint16_t pkts, /* Load CQE word0 and word 1 */ const uint64_t cq0_w0 = *CQE_PTR_OFF(cq0, 0, 0, flags); const uint64_t cq0_w1 = *CQE_PTR_OFF(cq0, 0, 8, flags); + const uint64_t cq0_w2 = *CQE_PTR_OFF(cq0, 0, 16, flags); const uint64_t cq1_w0 = *CQE_PTR_OFF(cq0, 1, 0, flags); const uint64_t cq1_w1 = *CQE_PTR_OFF(cq0, 1, 8, flags); + const uint64_t cq1_w2 = *CQE_PTR_OFF(cq0, 1, 16, flags); const uint64_t cq2_w0 = *CQE_PTR_OFF(cq0, 2, 0, flags); const uint64_t cq2_w1 = *CQE_PTR_OFF(cq0, 2, 8, flags); + const uint64_t cq2_w2 = *CQE_PTR_OFF(cq0, 2, 16, flags); const uint64_t cq3_w0 = *CQE_PTR_OFF(cq0, 3, 0, flags); const uint64_t cq3_w1 = *CQE_PTR_OFF(cq0, 3, 8, flags); + const uint64_t cq3_w2 = *CQE_PTR_OFF(cq0, 3, 16, flags); + + if (flags & NIX_RX_VWQE_F) { + uint16_t psize0, psize1, psize2, psize3; + + psize0 = (cq0_w2 & 0xFFFF) + 1; + psize1 = (cq1_w2 & 0xFFFF) + 1; + psize2 = (cq2_w2 & 0xFFFF) + 1; + psize3 = (cq3_w2 & 0xFFFF) + 1; + + f0 = vdupq_n_u64(0); + f1 = vdupq_n_u64(0); + f2 = vdupq_n_u64(0); + f3 = vdupq_n_u64(0); + + f0 = vsetq_lane_u16(psize0, f0, 2); + f0 = vsetq_lane_u16(psize0, f0, 4); + + f1 = vsetq_lane_u16(psize1, f1, 2); + f1 = vsetq_lane_u16(psize1, f1, 4); + + f2 = vsetq_lane_u16(psize2, f2, 2); + f2 = vsetq_lane_u16(psize2, f2, 4); + + f3 = vsetq_lane_u16(psize3, f3, 2); + f3 = vsetq_lane_u16(psize3, f3, 4); + } if (flags & NIX_RX_OFFLOAD_RSS_F) { /* Fill rss in the rx_descriptor_fields1 */ @@ -805,10 +844,6 @@ cn10k_nix_recv_pkts_vector(void *args, struct rte_mbuf **mbufs, uint16_t pkts, } if (flags & NIX_RX_OFFLOAD_VLAN_STRIP_F) { - uint64_t cq0_w2 = *(uint64_t *)(cq0 + CQE_SZ(0) + 16); - uint64_t cq1_w2 = *(uint64_t *)(cq0 + CQE_SZ(1) + 16); - uint64_t cq2_w2 = *(uint64_t *)(cq0 + CQE_SZ(2) + 16); - uint64_t cq3_w2 = *(uint64_t *)(cq0 + CQE_SZ(3) + 16); ol_flags0 = nix_vlan_update(cq0_w2, ol_flags0, &f0); ol_flags1 = nix_vlan_update(cq1_w2, ol_flags1, &f1);