From patchwork Tue Feb 27 08:11:53 2024
X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula
X-Patchwork-Id: 137329
X-Patchwork-Delegate: jerinj@marvell.com
To: Pavan Nikhilesh, "Shijith Thotton"
Subject: [PATCH v5] event/cnxk: use WFE LDP loop for getwork routine
Date: Tue, 27 Feb 2024 13:41:53 +0530
Message-ID: <20240227081153.20826-1-pbhagavatula@marvell.com>
X-Mailer: git-send-email 2.25.1
List-Id: DPDK patches and discussions

From: Pavan Nikhilesh

Use a WFE LDP loop while polling for GETWORK completion for better
power savings. The loop is disabled by default and can be enabled by
configuring meson with 'RTE_ARM_USE_WFE' enabled.

Signed-off-by: Pavan Nikhilesh
---
v4 Changes:
- Split patches
v5 Changes:
- Update release notes and documentation.

 doc/guides/eventdevs/cnxk.rst          |  9 +++++
 doc/guides/rel_notes/release_24_03.rst |  4 ++
 drivers/event/cnxk/cn10k_worker.h      | 52 +++++++++++++++++++++-----
 3 files changed, 56 insertions(+), 9 deletions(-)

-- 
2.25.1

diff --git a/doc/guides/eventdevs/cnxk.rst b/doc/guides/eventdevs/cnxk.rst
index cccb8a0304..49ba11c902 100644
--- a/doc/guides/eventdevs/cnxk.rst
+++ b/doc/guides/eventdevs/cnxk.rst
@@ -198,6 +198,15 @@ Runtime Config Options
 
     -a 0002:0e:00.0,tim_eclk_freq=122880000-1000000000-0
 
+Power Savings on CN10K
+----------------------
+
+ARM cores can additionally use WFE when polling for transactions on the SSO
+bus to save power, i.e., in the event dequeue call an ARM core can enter WFE
+and exit when either work has been scheduled or the dequeue timeout has been
+reached. This feature can be selected by configuring meson with
+``RTE_ARM_USE_WFE`` enabled.
+
 Debugging Options
 -----------------
 
diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 879bb4944c..7e68b697c2 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -138,6 +138,10 @@ New Features
     to support TLS v1.2, TLS v1.3 and DTLS v1.2.
   * Added PMD API to allow raw submission of instructions to CPT.
 
+* **Updated Marvell cnxk eventdev driver.**
+
+  * Added ARM WFE instruction in ``GETWORK(rte_event_dev_dequeue)`` routine
+    to save power while waiting for SSO to schedule work.
 
 Removed Items
 -------------
diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h
index 8aa916fa12..92d5190842 100644
--- a/drivers/event/cnxk/cn10k_worker.h
+++ b/drivers/event/cnxk/cn10k_worker.h
@@ -250,23 +250,57 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct rte_event *ev,
 
 	gw.get_work = ws->gw_wdata;
 #if defined(RTE_ARCH_ARM64)
-#if !defined(__clang__)
-	asm volatile(
-		PLT_CPU_FEATURE_PREAMBLE
-		"caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n"
-		: [wdata] "+r"(gw.get_work)
-		: [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
-		: "memory");
-#else
+#if defined(__clang__)
 	register uint64_t x0 __asm("x0") = (uint64_t)gw.u64[0];
 	register uint64_t x1 __asm("x1") = (uint64_t)gw.u64[1];
+#if defined(RTE_ARM_USE_WFE)
+	plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     " ldp %[x0], %[x1], [%[tag_loc]] \n"
+		     " tbz %[x0], %[pend_gw], done%= \n"
+		     " sevl \n"
+		     "rty%=: wfe \n"
+		     " ldp %[x0], %[x1], [%[tag_loc]] \n"
+		     " tbnz %[x0], %[pend_gw], rty%= \n"
+		     "done%=: \n"
+		     " dmb ld \n"
+		     : [x0] "+r" (x0), [x1] "+r" (x1)
+		     : [tag_loc] "r"(ws->base + SSOW_LF_GWS_WQE0),
+		       [pend_gw] "i"(SSOW_LF_GWS_TAG_PEND_GET_WORK_BIT)
+		     : "memory");
+#else
 	asm volatile(".arch armv8-a+lse\n"
 		     "caspal %[x0], %[x1], %[x0], %[x1], [%[dst]]\n"
-		     : [x0] "+r"(x0), [x1] "+r"(x1)
+		     : [x0] "+r" (x0), [x1] "+r" (x1)
 		     : [dst] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
 		     : "memory");
+#endif
 	gw.u64[0] = x0;
 	gw.u64[1] = x1;
+#else
+#if defined(RTE_ARM_USE_WFE)
+	plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     " ldp %[wdata], %H[wdata], [%[tag_loc]] \n"
+		     " tbz %[wdata], %[pend_gw], done%= \n"
+		     " sevl \n"
+		     "rty%=: wfe \n"
+		     " ldp %[wdata], %H[wdata], [%[tag_loc]] \n"
+		     " tbnz %[wdata], %[pend_gw], rty%= \n"
+		     "done%=: \n"
+		     " dmb ld \n"
+		     : [wdata] "=&r"(gw.get_work)
+		     : [tag_loc] "r"(ws->base + SSOW_LF_GWS_WQE0),
+		       [pend_gw] "i"(SSOW_LF_GWS_TAG_PEND_GET_WORK_BIT)
+		     : "memory");
+#else
+	asm volatile(
+		PLT_CPU_FEATURE_PREAMBLE
+		"caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n"
+		: [wdata] "+r"(gw.get_work)
+		: [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
+		: "memory");
+#endif
 #endif
 #else
 	plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
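
For readers less familiar with the inline assembly, the control flow of the
RTE_ARM_USE_WFE path can be sketched in portable C. This is only an
illustration under stated assumptions, not the driver code: `struct fake_gws`,
`fake_ldp()`, `poll_get_work()` and the bit position 63 are hypothetical names
and values chosen for the sketch, and the WFE sleep is modelled as a counter
because a plain C program has no SSO hardware to generate the wake-up event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. The patch uses
 * SSOW_LF_GWS_TAG_PEND_GET_WORK_BIT from the driver headers. */
#define PEND_GET_WORK_BIT 63

/* Hypothetical software model of the 128-bit tag/WQE register pair
 * that the patch reads at ws->base + SSOW_LF_GWS_WQE0. */
struct fake_gws {
	uint64_t tag;                  /* first word, carries the pending bit */
	uint64_t wqp;                  /* second word of the pair */
	unsigned int reads_until_done; /* simulation only: must be >= 1 */
};

/* Models one "ldp" of the register pair. The simulated hardware clears
 * the pending bit once reads_until_done reaches zero. */
static void
fake_ldp(struct fake_gws *g, uint64_t *tag, uint64_t *wqp)
{
	if (g->reads_until_done > 0 && --g->reads_until_done == 0) {
		g->tag &= ~(1ULL << PEND_GET_WORK_BIT);
		g->wqp = 0x1000; /* arbitrary non-zero work pointer */
	}
	*tag = g->tag;
	*wqp = g->wqp;
}

/* Mirrors the loop shape in the patch: read once, and if GETWORK is
 * still pending, wait and re-read until the pending bit clears.
 * Returns how many times the core would have executed "wfe". */
static unsigned int
poll_get_work(struct fake_gws *g, uint64_t *tag, uint64_t *wqp)
{
	unsigned int waits = 0;

	assert(g != NULL && tag != NULL && wqp != NULL);

	fake_ldp(g, tag, wqp);                        /* first ldp */
	while (*tag & (1ULL << PEND_GET_WORK_BIT)) {  /* tbz / tbnz */
		waits++;                              /* "wfe" would sleep here */
		fake_ldp(g, tag, wqp);                /* ldp after wake-up */
	}
	/* The real code issues "dmb ld" here to order subsequent loads. */
	return waits;
}
```

Calling `poll_get_work()` on a model whose pending bit is set and whose
`reads_until_done` is 3 returns after two simulated waits. In the real loop
each of those waits is a low-power WFE instead of a busy re-read, which is
where the power saving described in the commit message comes from.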