From patchwork Thu Feb 10 10:19:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula X-Patchwork-Id: 107223 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 075ECA00C2; Thu, 10 Feb 2022 11:19:52 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8B6F6411EE; Thu, 10 Feb 2022 11:19:49 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0a-0016f401.pphosted.com [67.231.148.174]) by mails.dpdk.org (Postfix) with ESMTP id F212741223 for ; Thu, 10 Feb 2022 11:19:47 +0100 (CET) Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 21AAIgG3008090 for ; Thu, 10 Feb 2022 02:19:47 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=pfpt0220; bh=Y9N6IzjwzvnzYBKY2CjCkx6JCZ9G6M5HubFr8qj8OeI=; b=e0e4uxN29Hc3rtlthGU45qRaFe+JVhc4e8AW46aJsuDe3IJZHVMCjwHXWUIskk2BBkOu fE91A+siplHLm0iL8X9S46s1yDtPTfKX/JWH7O+wCZEXxAeSDKPF5QVAnhwnR6OOynvQ Y6AS34AWF3Aa1j4Zry1h4wgGZxla1uHvwSAOEyzvibWu6hIOzHwKcdutZbeOHDBbbRES DhHAQNP9/pMe8PWjN+je+YHsVaqxEJvv/3yseRJQ1qpBP4HmljMeX4PRTxQj+YO+mxb9 feZnhpUTAlXGx6VNxgWPFJjcm/7Yi2VSb10pDWk+MK5Bcule9JfWdGkAizteMO4qV80U CA== Received: from dc5-exch02.marvell.com ([199.233.59.182]) by mx0a-0016f401.pphosted.com (PPS) with ESMTPS id 3e50uc803r-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT) for ; Thu, 10 Feb 2022 02:19:47 -0800 Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Thu, 10 Feb 2022 02:19:45 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.18 via Frontend Transport; Thu, 10 Feb 2022 02:19:45 -0800 Received: from BG-LT7430.marvell.com (BG-LT7430.marvell.com [10.28.177.176]) by maili.marvell.com (Postfix) with ESMTP id C8B853F703F; Thu, 10 Feb 2022 02:19:42 -0800 (PST) From: To: , Nithin Dabilpuram , "Kiran Kumar K" , Sunil Kumar Kori , Satha Rao , Pavan Nikhilesh , Shijith Thotton CC: Subject: [PATCH v3 1/3] event/cnxk: store and reuse workslot status Date: Thu, 10 Feb 2022 15:49:38 +0530 Message-ID: <20220210101940.1669-1-pbhagavatula@marvell.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220119071323.3650-1-pbhagavatula@marvell.com> References: <20220119071323.3650-1-pbhagavatula@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: PqODYHzVxKfN2dLlpuIx0asbkzhGRrEi X-Proofpoint-ORIG-GUID: PqODYHzVxKfN2dLlpuIx0asbkzhGRrEi X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.816,Hydra:6.0.425,FMLib:17.11.62.513 definitions=2022-02-10_03,2022-02-09_01,2021-12-02_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org From: Pavan Nikhilesh Store and reuse workslot status for TT, GRP and HEAD status instead of reading from GWC as reading from GWC imposes additional latency. Signed-off-by: Pavan Nikhilesh --- Depends-on: 21590 v3 Changes: - Split and rebase patches. v2 Changes: - Rebase. - Fix incorrect use of RoC API drivers/common/cnxk/roc_sso.h | 14 ++++++++------ drivers/event/cnxk/cn10k_worker.h | 16 +++++++++------- drivers/event/cnxk/cn9k_worker.h | 6 +++--- drivers/event/cnxk/cnxk_eventdev.h | 2 ++ drivers/event/cnxk/cnxk_worker.h | 11 +++++++---- drivers/net/cnxk/cn10k_tx.h | 12 ++++++------ 6 files changed, 35 insertions(+), 26 deletions(-) -- 2.17.1 diff --git a/drivers/common/cnxk/roc_sso.h b/drivers/common/cnxk/roc_sso.h index 27d49c6c68..ab7cee1c60 100644 --- a/drivers/common/cnxk/roc_sso.h +++ b/drivers/common/cnxk/roc_sso.h @@ -54,12 +54,13 @@ struct roc_sso { uint8_t reserved[ROC_SSO_MEM_SZ] __plt_cache_aligned; } __plt_cache_aligned; -static __plt_always_inline void -roc_sso_hws_head_wait(uintptr_t tag_op) +static __plt_always_inline uint64_t +roc_sso_hws_head_wait(uintptr_t base) { -#ifdef RTE_ARCH_ARM64 + uintptr_t tag_op = base + SSOW_LF_GWS_TAG; uint64_t tag; +#if defined(__aarch64__) asm volatile(PLT_CPU_FEATURE_PREAMBLE " ldr %[tag], [%[tag_op]] \n" " tbnz %[tag], 35, done%= \n" @@ -71,10 +72,11 @@ roc_sso_hws_head_wait(uintptr_t tag_op) : [tag] "=&r"(tag) : [tag_op] "r"(tag_op)); #else - /* Wait for the SWTAG/SWTAG_FULL operation */ - while (!(plt_read64(tag_op) & BIT_ULL(35))) - ; + do { + tag = plt_read64(tag_op); + } while (!(tag & BIT_ULL(35))); #endif + return tag; } /* SSO device initialization */ diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h index ff08b2d974..ada230ea1d 100644 --- a/drivers/event/cnxk/cn10k_worker.h +++ b/drivers/event/cnxk/cn10k_worker.h @@ -40,8 +40,7 @@ cn10k_sso_hws_fwd_swtag(struct cn10k_sso_hws *ws, const struct rte_event *ev) { const uint32_t tag = (uint32_t)ev->event; const uint8_t new_tt = ev->sched_type; - const uint8_t cur_tt = - CNXK_TT_FROM_TAG(plt_read64(ws->base + SSOW_LF_GWS_WQE0)); + const uint8_t cur_tt = CNXK_TT_FROM_TAG(ws->gw_rdata); /* CNXK model * cur_tt/new_tt SSO_TT_ORDERED SSO_TT_ATOMIC SSO_TT_UNTAGGED @@ -81,7 +80,7 @@ cn10k_sso_hws_forward_event(struct cn10k_sso_hws *ws, const uint8_t grp = ev->queue_id; /* Group hasn't changed, Use SWTAG to forward the event */ - if (CNXK_GRP_FROM_TAG(plt_read64(ws->base + SSOW_LF_GWS_WQE0)) == grp) + if (CNXK_GRP_FROM_TAG(ws->gw_rdata) == grp) cn10k_sso_hws_fwd_swtag(ws, ev); else /* @@ -211,6 +210,7 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct rte_event *ev, } while (gw.u64[0] & BIT_ULL(63)); mbuf = (uint64_t)((char *)gw.u64[1] - sizeof(struct rte_mbuf)); #endif + ws->gw_rdata = gw.u64[0]; gw.u64[0] = (gw.u64[0] & (0x3ull << 32)) << 6 | (gw.u64[0] & (0x3FFull << 36)) << 4 | (gw.u64[0] & 0xffffffff); @@ -405,7 +405,8 @@ NIX_RX_FASTPATH_MODES RTE_SET_USED(timeout_ticks); \ if (ws->swtag_req) { \ ws->swtag_req = 0; \ - cnxk_sso_hws_swtag_wait(ws->base + SSOW_LF_GWS_WQE0); \ + ws->gw_rdata = cnxk_sso_hws_swtag_wait( \ + ws->base + SSOW_LF_GWS_WQE0); \ return 1; \ } \ return cn10k_sso_hws_get_work(ws, ev, flags, ws->lookup_mem); \ @@ -424,7 +425,8 @@ NIX_RX_FASTPATH_MODES uint64_t iter; \ if (ws->swtag_req) { \ ws->swtag_req = 0; \ - cnxk_sso_hws_swtag_wait(ws->base + SSOW_LF_GWS_WQE0); \ + ws->gw_rdata = cnxk_sso_hws_swtag_wait( \ + ws->base + SSOW_LF_GWS_WQE0); \ return ret; \ } \ ret = cn10k_sso_hws_get_work(ws, ev, flags, ws->lookup_mem); \ @@ -507,8 +509,8 @@ cn10k_sso_tx_one(struct cn10k_sso_hws *ws, struct rte_mbuf *m, uint64_t *cmd, else pa = txq->io_addr | ((segdw - 1) << 4); - if (!sched_type) - roc_sso_hws_head_wait(ws->base + SSOW_LF_GWS_TAG); + if (!CNXK_TAG_IS_HEAD(ws->gw_rdata) && !sched_type) + ws->gw_rdata = roc_sso_hws_head_wait(ws->base); roc_lmt_submit_steorl(lmt_id, pa); } diff --git a/drivers/event/cnxk/cn9k_worker.h b/drivers/event/cnxk/cn9k_worker.h index 303b04c215..8455272005 100644 --- a/drivers/event/cnxk/cn9k_worker.h +++ b/drivers/event/cnxk/cn9k_worker.h @@ -700,7 +700,7 @@ cn9k_sso_hws_xmit_sec_one(const struct cn9k_eth_txq *txq, uint64_t base, /* Head wait if needed */ if (base) - roc_sso_hws_head_wait(base + SSOW_LF_GWS_TAG); + roc_sso_hws_head_wait(base); /* ESN */ outb_priv = roc_nix_inl_onf_ipsec_outb_sa_sw_rsvd((void *)sa); @@ -793,7 +793,7 @@ cn9k_sso_hws_event_tx(uint64_t base, struct rte_event *ev, uint64_t *cmd, flags); if (!CNXK_TT_FROM_EVENT(ev->event)) { cn9k_nix_xmit_mseg_prep_lmt(cmd, txq->lmt_addr, segdw); - roc_sso_hws_head_wait(base + SSOW_LF_GWS_TAG); + roc_sso_hws_head_wait(base); cn9k_sso_txq_fc_wait(txq); if (cn9k_nix_xmit_submit_lmt(txq->io_addr) == 0) cn9k_nix_xmit_mseg_one(cmd, txq->lmt_addr, @@ -806,7 +806,7 @@ cn9k_sso_hws_event_tx(uint64_t base, struct rte_event *ev, uint64_t *cmd, cn9k_nix_xmit_prepare_tstamp(txq, cmd, m->ol_flags, 4, flags); if (!CNXK_TT_FROM_EVENT(ev->event)) { cn9k_nix_xmit_prep_lmt(cmd, txq->lmt_addr, flags); - roc_sso_hws_head_wait(base + SSOW_LF_GWS_TAG); + roc_sso_hws_head_wait(base); cn9k_sso_txq_fc_wait(txq); if (cn9k_nix_xmit_submit_lmt(txq->io_addr) == 0) cn9k_nix_xmit_one(cmd, txq->lmt_addr, diff --git a/drivers/event/cnxk/cnxk_eventdev.h b/drivers/event/cnxk/cnxk_eventdev.h index b26df58588..ab58508590 100644 --- a/drivers/event/cnxk/cnxk_eventdev.h +++ b/drivers/event/cnxk/cnxk_eventdev.h @@ -47,6 +47,7 @@ #define CNXK_CLR_SUB_EVENT(x) (~(0xffu << 20) & x) #define CNXK_GRP_FROM_TAG(x) (((x) >> 36) & 0x3ff) #define CNXK_SWTAG_PEND(x) (BIT_ULL(62) & x) +#define CNXK_TAG_IS_HEAD(x) (BIT_ULL(35) & x) #define CN9K_SSOW_GET_BASE_ADDR(_GW) ((_GW)-SSOW_LF_GWS_OP_GET_WORK0) @@ -123,6 +124,7 @@ struct cnxk_sso_evdev { struct cn10k_sso_hws { uint64_t base; + uint64_t gw_rdata; /* PTP timestamp */ struct cnxk_timesync_info *tstamp; void *lookup_mem; diff --git a/drivers/event/cnxk/cnxk_worker.h b/drivers/event/cnxk/cnxk_worker.h index 9f9ceab8a1..7de03f3fbb 100644 --- a/drivers/event/cnxk/cnxk_worker.h +++ b/drivers/event/cnxk/cnxk_worker.h @@ -52,11 +52,11 @@ cnxk_sso_hws_swtag_flush(uint64_t tag_op, uint64_t flush_op) plt_write64(0, flush_op); } -static __rte_always_inline void +static __rte_always_inline uint64_t cnxk_sso_hws_swtag_wait(uintptr_t tag_op) { -#ifdef RTE_ARCH_ARM64 uint64_t swtp; +#ifdef RTE_ARCH_ARM64 asm volatile(PLT_CPU_FEATURE_PREAMBLE " ldr %[swtb], [%[swtp_loc]] \n" @@ -70,9 +70,12 @@ cnxk_sso_hws_swtag_wait(uintptr_t tag_op) : [swtp_loc] "r"(tag_op)); #else /* Wait for the SWTAG/SWTAG_FULL operation */ - while (plt_read64(tag_op) & BIT_ULL(62)) - ; + do { + swtp = plt_read64(tag_op); + } while (swtp & BIT_ULL(62)); #endif + + return swtp; } #endif diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h index 4ae6bbf517..ec6366168c 100644 --- a/drivers/net/cnxk/cn10k_tx.h +++ b/drivers/net/cnxk/cn10k_tx.h @@ -905,8 +905,8 @@ cn10k_nix_xmit_pkts(void *tx_queue, uint64_t *ws, struct rte_mbuf **tx_pkts, lnum++; } - if (flags & NIX_TX_VWQE_F) - roc_sso_hws_head_wait(ws[0]); + if ((flags & NIX_TX_VWQE_F) && !(ws[1] & BIT_ULL(35))) + ws[1] = roc_sso_hws_head_wait(ws[0]); left -= burst; tx_pkts += burst; @@ -1041,8 +1041,8 @@ cn10k_nix_xmit_pkts_mseg(void *tx_queue, uint64_t *ws, } } - if (flags & NIX_TX_VWQE_F) - roc_sso_hws_head_wait(ws[0]); + if ((flags & NIX_TX_VWQE_F) && !(ws[1] & BIT_ULL(35))) + ws[1] = roc_sso_hws_head_wait(ws[0]); left -= burst; tx_pkts += burst; @@ -2582,8 +2582,8 @@ cn10k_nix_xmit_pkts_vector(void *tx_queue, uint64_t *ws, if (flags & (NIX_TX_MULTI_SEG_F | NIX_TX_OFFLOAD_SECURITY_F)) wd.data[0] >>= 16; - if (flags & NIX_TX_VWQE_F) - roc_sso_hws_head_wait(ws[0]); + if ((flags & NIX_TX_VWQE_F) && !(ws[1] & BIT_ULL(35))) + ws[1] = roc_sso_hws_head_wait(ws[0]); left -= burst; From patchwork Thu Feb 10 10:19:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula X-Patchwork-Id: 107224 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1F1DFA00C2; Thu, 10 Feb 2022 11:19:57 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8C1D14122E; Thu, 10 Feb 2022 11:19:52 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id C717741223 for ; Thu, 10 Feb 2022 11:19:50 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 21A2i4nY014406 for ; Thu, 10 Feb 2022 02:19:50 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=pfpt0220; bh=RodfOBr5MMOdj8Uf1BHDb2747HBf4bTOl5AmtSracxk=; b=Yu0OXlMzPrcqtXatuZ0Y0UlmTuQMWr227F1vSov3MILKfd/jmr4sw4n5rhCCN41/LDm5 z+fMsNYon9H5ptkxtELyiuX13eDQIQd2YR6pN3p29qSasdTa4kpMGqgZx/cyq2UhdZ44 YpY6sWkMUYa502lB9QXCVpIWqNV9RYQ9bMJvC0+rzh3A4NWBP//m4o09wGUUS8iKYThf RzS+bPy6oRyRHLYekCsro5nrepWFxVKkkA2sRxyRoRfS19a6TXbEUWJKjit5Fgl0UTOC HALAGFn6mQNxlBB+m55ZEOriRVC/5XfrPKcA5QxelyNlW5NzxdELcMgRmgY+l7Z8aQx2 GA== Received: from dc5-exch02.marvell.com ([199.233.59.182]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3e4am95t95-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT) for ; Thu, 10 Feb 2022 02:19:49 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Thu, 10 Feb 2022 02:19:47 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Thu, 10 Feb 2022 02:19:47 -0800 Received: from BG-LT7430.marvell.com (BG-LT7430.marvell.com [10.28.177.176]) by maili.marvell.com (Postfix) with ESMTP id 4B2823F7043; Thu, 10 Feb 2022 02:19:46 -0800 (PST) From: To: , Pavan Nikhilesh , "Shijith Thotton" CC: Subject: [PATCH v3 2/3] event/cnxk: disable default wait time for dequeue Date: Thu, 10 Feb 2022 15:49:39 +0530 Message-ID: <20220210101940.1669-2-pbhagavatula@marvell.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220210101940.1669-1-pbhagavatula@marvell.com> References: <20220119071323.3650-1-pbhagavatula@marvell.com> <20220210101940.1669-1-pbhagavatula@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: T7VB5oULcVNmVhybW-0V9lFSJNZIL6BM X-Proofpoint-ORIG-GUID: T7VB5oULcVNmVhybW-0V9lFSJNZIL6BM X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.816,Hydra:6.0.425,FMLib:17.11.62.513 definitions=2022-02-10_03,2022-02-09_01,2021-12-02_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org From: Pavan Nikhilesh Setting WAITW bit enables default min dequeue timeout of 1us. Avoid the min dequeue timeout by setting WAITW only when dequeue_timeout is configured. Signed-off-by: Pavan Nikhilesh --- drivers/event/cnxk/cn10k_eventdev.c | 8 +++++-- drivers/event/cnxk/cn9k_eventdev.c | 9 ++++++- drivers/event/cnxk/cn9k_worker.h | 37 +++++++++++++---------------- drivers/event/cnxk/cnxk_eventdev.c | 2 +- drivers/event/cnxk/cnxk_eventdev.h | 2 ++ 5 files changed, 34 insertions(+), 24 deletions(-) diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c index 97a88feb13..26d65e3568 100644 --- a/drivers/event/cnxk/cn10k_eventdev.c +++ b/drivers/event/cnxk/cn10k_eventdev.c @@ -15,7 +15,10 @@ static uint32_t cn10k_sso_gw_mode_wdata(struct cnxk_sso_evdev *dev) { - uint32_t wdata = BIT(16) | 1; + uint32_t wdata = 1; + + if (dev->deq_tmo_ns) + wdata |= BIT(16); switch (dev->gw_mode) { case CN10K_GW_MODE_NONE: @@ -88,7 +91,8 @@ cn10k_sso_hws_setup(void *arg, void *hws, uintptr_t grp_base) ws->xaq_lmt = dev->xaq_lmt; /* Set get_work timeout for HWS */ - val = NSEC2USEC(dev->deq_tmo_ns) - 1; + val = NSEC2USEC(dev->deq_tmo_ns); + val = val ? val - 1 : 0; plt_write64(val, ws->base + SSOW_LF_GWS_NW_TIM); } diff --git a/drivers/event/cnxk/cn9k_eventdev.c b/drivers/event/cnxk/cn9k_eventdev.c index f8652d4fbc..6d3d03c97c 100644 --- a/drivers/event/cnxk/cn9k_eventdev.c +++ b/drivers/event/cnxk/cn9k_eventdev.c @@ -72,7 +72,8 @@ cn9k_sso_hws_setup(void *arg, void *hws, uintptr_t grp_base) uint64_t val; /* Set get_work tmo for HWS */ - val = dev->deq_tmo_ns ? NSEC2USEC(dev->deq_tmo_ns) - 1 : 0; + val = NSEC2USEC(dev->deq_tmo_ns); + val = val ? val - 1 : 0; if (dev->dual_ws) { dws = hws; dws->grp_base = grp_base; @@ -677,6 +678,9 @@ cn9k_sso_init_hws_mem(void *arg, uint8_t port_id) dws->hws_id = port_id; dws->swtag_req = 0; dws->vws = 0; + if (dev->deq_tmo_ns) + dws->gw_wdata = BIT_ULL(16); + dws->gw_wdata |= 1; data = dws; } else { @@ -695,6 +699,9 @@ cn9k_sso_init_hws_mem(void *arg, uint8_t port_id) ws->base = roc_sso_hws_base_get(&dev->sso, port_id); ws->hws_id = port_id; ws->swtag_req = 0; + if (dev->deq_tmo_ns) + ws->gw_wdata = BIT_ULL(16); + ws->gw_wdata |= 1; data = ws; } diff --git a/drivers/event/cnxk/cn9k_worker.h b/drivers/event/cnxk/cn9k_worker.h index 8455272005..79374b8d95 100644 --- a/drivers/event/cnxk/cn9k_worker.h +++ b/drivers/event/cnxk/cn9k_worker.h @@ -149,10 +149,8 @@ cn9k_wqe_to_mbuf(uint64_t wqe, const uint64_t mbuf, uint8_t port_id, static __rte_always_inline uint16_t cn9k_sso_hws_dual_get_work(uint64_t base, uint64_t pair_base, struct rte_event *ev, const uint32_t flags, - const void *const lookup_mem, - struct cnxk_timesync_info *const tstamp) + struct cn9k_sso_hws_dual *dws) { - const uint64_t set_gw = BIT_ULL(16) | 1; union { __uint128_t get_work; uint64_t u64[2]; @@ -161,7 +159,7 @@ cn9k_sso_hws_dual_get_work(uint64_t base, uint64_t pair_base, uint64_t mbuf; if (flags & NIX_RX_OFFLOAD_PTYPE_F) - rte_prefetch_non_temporal(lookup_mem); + rte_prefetch_non_temporal(dws->lookup_mem); #ifdef RTE_ARCH_ARM64 asm volatile(PLT_CPU_FEATURE_PREAMBLE "rty%=: \n" @@ -175,14 +173,14 @@ cn9k_sso_hws_dual_get_work(uint64_t base, uint64_t pair_base, : [tag] "=&r"(gw.u64[0]), [wqp] "=&r"(gw.u64[1]), [mbuf] "=&r"(mbuf) : [tag_loc] "r"(base + SSOW_LF_GWS_TAG), - [wqp_loc] "r"(base + SSOW_LF_GWS_WQP), [gw] "r"(set_gw), + [wqp_loc] "r"(base + SSOW_LF_GWS_WQP), [gw] "r"(dws->gw_wdata), [pong] "r"(pair_base + SSOW_LF_GWS_OP_GET_WORK0)); #else gw.u64[0] = plt_read64(base + SSOW_LF_GWS_TAG); while ((BIT_ULL(63)) & gw.u64[0]) gw.u64[0] = plt_read64(base + SSOW_LF_GWS_TAG); gw.u64[1] = plt_read64(base + SSOW_LF_GWS_WQP); - plt_write64(set_gw, pair_base + SSOW_LF_GWS_OP_GET_WORK0); + plt_write64(dws->gw_wdata, pair_base + SSOW_LF_GWS_OP_GET_WORK0); mbuf = (uint64_t)((char *)gw.u64[1] - sizeof(struct rte_mbuf)); #endif @@ -202,12 +200,13 @@ cn9k_sso_hws_dual_get_work(uint64_t base, uint64_t pair_base, gw.u64[0] = CNXK_CLR_SUB_EVENT(gw.u64[0]); cn9k_wqe_to_mbuf(gw.u64[1], mbuf, port, gw.u64[0] & 0xFFFFF, flags, - lookup_mem); + dws->lookup_mem); /* Extracting tstamp, if PTP enabled*/ tstamp_ptr = *(uint64_t *)(((struct nix_wqe_hdr_s *) gw.u64[1]) + CNXK_SSO_WQE_SG_PTR); - cnxk_nix_mbuf_to_tstamp((struct rte_mbuf *)mbuf, tstamp, + cnxk_nix_mbuf_to_tstamp((struct rte_mbuf *)mbuf, + dws->tstamp, flags & NIX_RX_OFFLOAD_TSTAMP_F, flags & NIX_RX_MULTI_SEG_F, (uint64_t *)tstamp_ptr); @@ -232,9 +231,7 @@ cn9k_sso_hws_get_work(struct cn9k_sso_hws *ws, struct rte_event *ev, uint64_t tstamp_ptr; uint64_t mbuf; - plt_write64(BIT_ULL(16) | /* wait for work. */ - 1, /* Use Mask set 0. */ - ws->base + SSOW_LF_GWS_OP_GET_WORK0); + plt_write64(ws->gw_wdata, ws->base + SSOW_LF_GWS_OP_GET_WORK0); if (flags & NIX_RX_OFFLOAD_PTYPE_F) rte_prefetch_non_temporal(lookup_mem); @@ -529,9 +526,9 @@ NIX_RX_FASTPATH_MODES SSOW_LF_GWS_TAG); \ return 1; \ } \ - gw = cn9k_sso_hws_dual_get_work( \ - dws->base[dws->vws], dws->base[!dws->vws], ev, flags, \ - dws->lookup_mem, dws->tstamp); \ + gw = cn9k_sso_hws_dual_get_work(dws->base[dws->vws], \ + dws->base[!dws->vws], ev, \ + flags, dws); \ dws->vws = !dws->vws; \ return gw; \ } @@ -554,14 +551,14 @@ NIX_RX_FASTPATH_MODES SSOW_LF_GWS_TAG); \ return ret; \ } \ - ret = cn9k_sso_hws_dual_get_work( \ - dws->base[dws->vws], dws->base[!dws->vws], ev, flags, \ - dws->lookup_mem, dws->tstamp); \ + ret = cn9k_sso_hws_dual_get_work(dws->base[dws->vws], \ + dws->base[!dws->vws], ev, \ + flags, dws); \ dws->vws = !dws->vws; \ for (iter = 1; iter < timeout_ticks && (ret == 0); iter++) { \ - ret = cn9k_sso_hws_dual_get_work( \ - dws->base[dws->vws], dws->base[!dws->vws], ev, \ - flags, dws->lookup_mem, dws->tstamp); \ + ret = cn9k_sso_hws_dual_get_work(dws->base[dws->vws], \ + dws->base[!dws->vws], \ + ev, flags, dws); \ dws->vws = !dws->vws; \ } \ return ret; \ diff --git a/drivers/event/cnxk/cnxk_eventdev.c b/drivers/event/cnxk/cnxk_eventdev.c index 6ad4e23e2b..be021d86c9 100644 --- a/drivers/event/cnxk/cnxk_eventdev.c +++ b/drivers/event/cnxk/cnxk_eventdev.c @@ -610,7 +610,7 @@ cnxk_sso_init(struct rte_eventdev *event_dev) } dev->is_timeout_deq = 0; - dev->min_dequeue_timeout_ns = USEC2NSEC(1); + dev->min_dequeue_timeout_ns = 0; dev->max_dequeue_timeout_ns = USEC2NSEC(0x3FF); dev->max_num_events = -1; dev->nb_event_queues = 0; diff --git a/drivers/event/cnxk/cnxk_eventdev.h b/drivers/event/cnxk/cnxk_eventdev.h index ab58508590..e3b5ffa7eb 100644 --- a/drivers/event/cnxk/cnxk_eventdev.h +++ b/drivers/event/cnxk/cnxk_eventdev.h @@ -144,6 +144,7 @@ struct cn10k_sso_hws { /* Event port a.k.a GWS */ struct cn9k_sso_hws { uint64_t base; + uint64_t gw_wdata; /* PTP timestamp */ struct cnxk_timesync_info *tstamp; void *lookup_mem; @@ -160,6 +161,7 @@ struct cn9k_sso_hws { struct cn9k_sso_hws_dual { uint64_t base[2]; /* Ping and Pong */ + uint64_t gw_wdata; /* PTP timestamp */ struct cnxk_timesync_info *tstamp; void *lookup_mem; From patchwork Thu Feb 10 10:19:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula X-Patchwork-Id: 107225 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0DDB0A00C2; Thu, 10 Feb 2022 11:20:02 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6CC2541C25; Thu, 10 Feb 2022 11:19:55 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0a-0016f401.pphosted.com [67.231.148.174]) by mails.dpdk.org (Postfix) with ESMTP id B452F41C25 for ; Thu, 10 Feb 2022 11:19:53 +0100 (CET) Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 21AAIgG7008090 for ; Thu, 10 Feb 2022 02:19:53 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=pfpt0220; bh=j+QMElHs+SwDGGFdGtpu6teq6fM3w75uGcGmsHqcyQE=; b=YQBcI+iER1jhBk7uivIm1rwJm/Ju4sw145lSccH1u8aqhlZc7AKu/MYG5uJifY+Z1+UE 91MZ8KDSkSYI4OvdOnf6qvxYNT3sGxXqXqx7cfKr+bPjBPc15ark+ijyiUqECOKx/UYF Kp8TlX5GC/eQ99TAtfj7O9fvfxW33x/DV09JD9aHA28mc4O5u/Di4T0HIcDzCzCcC3cj 1YfvnzXFy+/d6q0NkySjvXw8bkdI/WQScYpQZ03ZDkygDYTV6DpDuHz1vEKu8rl/ic6C d2lqQ/b2t/NPPWETnhv2FSm/SVlcvnZfqr5mXhs+ddwr/DPfJ9hHqmt3E9tYznkLBS6e /w== Received: from dc5-exch02.marvell.com ([199.233.59.182]) by mx0a-0016f401.pphosted.com (PPS) with ESMTPS id 3e50uc804h-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT) for ; Thu, 10 Feb 2022 02:19:52 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Thu, 10 Feb 2022 02:19:51 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Thu, 10 Feb 2022 02:19:51 -0800 Received: from BG-LT7430.marvell.com (BG-LT7430.marvell.com [10.28.177.176]) by maili.marvell.com (Postfix) with ESMTP id A6EBA3F703F; Thu, 10 Feb 2022 02:19:48 -0800 (PST) From: To: , Pavan Nikhilesh , "Shijith Thotton" , Nithin Dabilpuram , Kiran Kumar K , Sunil Kumar Kori , Satha Rao CC: Subject: [PATCH v3 3/3] net/cnxk: improve Rx performance Date: Thu, 10 Feb 2022 15:49:40 +0530 Message-ID: <20220210101940.1669-3-pbhagavatula@marvell.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220210101940.1669-1-pbhagavatula@marvell.com> References: <20220119071323.3650-1-pbhagavatula@marvell.com> <20220210101940.1669-1-pbhagavatula@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: RrBy5cEPlY-gVIr9Jb8g4Uvy_zbNcIe8 X-Proofpoint-ORIG-GUID: RrBy5cEPlY-gVIr9Jb8g4Uvy_zbNcIe8 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.816,Hydra:6.0.425,FMLib:17.11.62.513 definitions=2022-02-10_03,2022-02-09_01,2021-12-02_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org From: Pavan Nikhilesh Improve vWQE and CQ Rx performance by tuning perfetches to 64B cacheline size. Also, prefetch the vWQE array offsets at cacheline boundaries. Signed-off-by: Pavan Nikhilesh --- drivers/event/cnxk/cn10k_worker.h | 25 +++++++++++++++---------- drivers/net/cnxk/cn10k_rx.h | 8 ++++---- drivers/net/cnxk/cn9k_rx.h | 20 ++++++++++---------- 3 files changed, 29 insertions(+), 24 deletions(-) diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h index ada230ea1d..cfe729cef9 100644 --- a/drivers/event/cnxk/cn10k_worker.h +++ b/drivers/event/cnxk/cn10k_worker.h @@ -118,11 +118,17 @@ cn10k_process_vwqe(uintptr_t vwqe, uint16_t port_id, const uint32_t flags, uint8_t loff = 0; uint64_t sa_base; uint64_t **wqe; + int i; mbuf_init |= ((uint64_t)port_id) << 48; vec = (struct rte_event_vector *)vwqe; wqe = vec->u64s; + rte_prefetch_non_temporal(&vec->ptrs[0]); +#define OBJS_PER_CLINE (RTE_CACHE_LINE_SIZE / sizeof(void *)) + for (i = OBJS_PER_CLINE; i < vec->nb_elem; i += OBJS_PER_CLINE) + rte_prefetch_non_temporal(&vec->ptrs[i]); + nb_mbufs = RTE_ALIGN_FLOOR(vec->nb_elem, NIX_DESCS_PER_LOOP); nb_mbufs = cn10k_nix_recv_pkts_vector(&mbuf_init, vec->mbufs, nb_mbufs, flags | NIX_RX_VWQE_F, lookup_mem, @@ -191,15 +197,13 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct rte_event *ev, uint64_t u64[2]; } gw; uint64_t tstamp_ptr; - uint64_t mbuf; gw.get_work = ws->gw_wdata; #if defined(RTE_ARCH_ARM64) && !defined(__clang__) asm volatile( PLT_CPU_FEATURE_PREAMBLE - "caspl %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n" - "sub %[mbuf], %H[wdata], #0x80 \n" - : [wdata] "+r"(gw.get_work), [mbuf] "=&r"(mbuf) + "caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n" + : [wdata] "+r"(gw.get_work) : [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0) : "memory"); #else @@ -208,14 +212,12 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct rte_event *ev, roc_load_pair(gw.u64[0], gw.u64[1], ws->base + SSOW_LF_GWS_WQE0); } while (gw.u64[0] & BIT_ULL(63)); - mbuf = (uint64_t)((char *)gw.u64[1] - sizeof(struct rte_mbuf)); #endif ws->gw_rdata = gw.u64[0]; - gw.u64[0] = (gw.u64[0] & (0x3ull << 32)) << 6 | - (gw.u64[0] & (0x3FFull << 36)) << 4 | - (gw.u64[0] & 0xffffffff); - - if (CNXK_TT_FROM_EVENT(gw.u64[0]) != SSO_TT_EMPTY) { + if (gw.u64[1]) { + gw.u64[0] = (gw.u64[0] & (0x3ull << 32)) << 6 | + (gw.u64[0] & (0x3FFull << 36)) << 4 | + (gw.u64[0] & 0xffffffff); if ((flags & CPT_RX_WQE_F) && (CNXK_EVENT_TYPE_FROM_TAG(gw.u64[0]) == RTE_EVENT_TYPE_CRYPTODEV)) { @@ -223,7 +225,10 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct rte_event *ev, } else if (CNXK_EVENT_TYPE_FROM_TAG(gw.u64[0]) == RTE_EVENT_TYPE_ETHDEV) { uint8_t port = CNXK_SUB_EVENT_FROM_TAG(gw.u64[0]); + uint64_t mbuf; + mbuf = gw.u64[1] - sizeof(struct rte_mbuf); + rte_prefetch0((void *)mbuf); if (flags & NIX_RX_OFFLOAD_SECURITY_F) { struct rte_mbuf *m; uintptr_t sa_base; diff --git a/drivers/net/cnxk/cn10k_rx.h b/drivers/net/cnxk/cn10k_rx.h index 8b00fcc660..564e50f0af 100644 --- a/drivers/net/cnxk/cn10k_rx.h +++ b/drivers/net/cnxk/cn10k_rx.h @@ -610,10 +610,10 @@ cn10k_nix_recv_pkts_vector(void *args, struct rte_mbuf **mbufs, uint16_t pkts, } /* Prefetch N desc ahead */ - rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 8, 0, flags)); - rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 9, 0, flags)); - rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 10, 0, flags)); - rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 11, 0, flags)); + rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 4, 64, flags)); + rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 5, 64, flags)); + rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 6, 64, flags)); + rte_prefetch_non_temporal(CQE_PTR_OFF(cq0, 7, 64, flags)); /* Get NIX_RX_SG_S for size and buffer pointer */ cq0_w8 = vld1q_u64(CQE_PTR_OFF(cq0, 0, 64, flags)); diff --git a/drivers/net/cnxk/cn9k_rx.h b/drivers/net/cnxk/cn9k_rx.h index 1178f95317..d36f292c95 100644 --- a/drivers/net/cnxk/cn9k_rx.h +++ b/drivers/net/cnxk/cn9k_rx.h @@ -388,16 +388,16 @@ cn9k_nix_cqe_to_mbuf(const struct nix_cqe_hdr_s *cq, const uint32_t tag, ol_flags = nix_update_match_id(rx->cn9k.match_id, ol_flags, mbuf); - mbuf->pkt_len = len; - mbuf->data_len = len; - *(uint64_t *)(&mbuf->rearm_data) = val; - mbuf->ol_flags = ol_flags; + *(uint64_t *)(&mbuf->rearm_data) = val; + mbuf->pkt_len = len; - if (flag & NIX_RX_MULTI_SEG_F) + if (flag & NIX_RX_MULTI_SEG_F) { nix_cqe_xtract_mseg(rx, mbuf, val, flag); - else + } else { + mbuf->data_len = len; mbuf->next = NULL; + } } static inline uint16_t @@ -769,10 +769,6 @@ cn9k_nix_recv_pkts_vector(void *rx_queue, struct rte_mbuf **rx_pkts, vst1q_u64((uint64_t *)mbuf2->rearm_data, rearm2); vst1q_u64((uint64_t *)mbuf3->rearm_data, rearm3); - /* Store the mbufs to rx_pkts */ - vst1q_u64((uint64_t *)&rx_pkts[packets], mbuf01); - vst1q_u64((uint64_t *)&rx_pkts[packets + 2], mbuf23); - if (flags & NIX_RX_MULTI_SEG_F) { /* Multi segment is enable build mseg list for * individual mbufs in scalar mode. @@ -797,6 +793,10 @@ cn9k_nix_recv_pkts_vector(void *rx_queue, struct rte_mbuf **rx_pkts, mbuf3->next = NULL; } + /* Store the mbufs to rx_pkts */ + vst1q_u64((uint64_t *)&rx_pkts[packets], mbuf01); + vst1q_u64((uint64_t *)&rx_pkts[packets + 2], mbuf23); + /* Prefetch mbufs */ roc_prefetch_store_keep(mbuf0); roc_prefetch_store_keep(mbuf1);