Message ID | 1568473196-34972-1-git-send-email-gavin.hu@arm.com (mailing list archive) |
---|---|
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 9B02D1DFEA; Sat, 14 Sep 2019 17:00:16 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by dpdk.org (Postfix) with ESMTP id 980731C2B9 for <dev@dpdk.org>; Sat, 14 Sep 2019 17:00:15 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A7B6328; Sat, 14 Sep 2019 08:00:14 -0700 (PDT) Received: from net-arm-thunderx2-01.test.ast.arm.com (net-arm-thunderx2-01.shanghai.arm.com [10.169.40.40]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 7E4FD3F67D; Sat, 14 Sep 2019 08:00:12 -0700 (PDT) From: Gavin Hu <gavin.hu@arm.com> To: dev@dpdk.org Cc: nd@arm.com, thomas@monjalon.net, stephen@networkplumber.org, hemant.agrawal@nxp.com, jerinj@marvell.com, pbhagavatula@marvell.com, Honnappa.Nagarahalli@arm.com, ruifeng.wang@arm.com, phil.yang@arm.com, steve.capper@arm.com Date: Sat, 14 Sep 2019 22:59:49 +0800 Message-Id: <1568473196-34972-1-git-send-email-gavin.hu@arm.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561911676-37718-1-git-send-email-gavin.hu@arm.com> References: <1561911676-37718-1-git-send-email-gavin.hu@arm.com> Subject: [dpdk-dev] [PATCH v6 0/7] use WFE for aarch64 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions <dev.dpdk.org> List-Unsubscribe: <https://mails.dpdk.org/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://mails.dpdk.org/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <https://mails.dpdk.org/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Series | use WFE for aarch64 |
Message
Gavin Hu
Sept. 14, 2019, 2:59 p.m. UTC
V6:
- squash the RTE_ARM_USE_WFE configuration entry patch into the new API patch
- move the new configuration to the end of EAL
- add doxygen comments to reflect the relaxed and acquire semantics
- correct the meson configuration
V5:
- add doxygen comments for the new APIs
- spinlock: exit early without WFE if the spinlock is not held by others
- add two patches on top for opdl and thunderx
V4:
- rename the config to CONFIG_RTE_ARM_USE_WFE to indicate it applies to Arm only
- introduce a macro for the assembly skeleton to reduce code duplication
- add one patch for nxp fslmc to address a compilation error
V3:
- convert RFCs to patches
V2:
- use inline functions instead of macros
- add load and compare at the beginning of the APIs
- fix some style errors in inline asm
V1:
- add the new APIs and use them for ring and locks

DPDK has multiple use cases where the core repeatedly polls a location in
memory. This polling results in many cache and memory transactions.

The Arm architecture provides the WFE (Wait For Event) instruction, which
allows the CPU core to enter a low-power state until woken up by an update
to the memory location being polled, thus reducing the cache and memory
transactions.

x86 has the PAUSE hint instruction to reduce such overhead.

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'.

For non-Arm platforms, these APIs are just wrappers around a do-while loop
with rte_pause, so there are no performance differences.

For Arm platforms, use of WFE can be configured using the
CONFIG_RTE_ARM_USE_WFE option. It is disabled by default.

Currently, use of WFE is supported only for aarch64 platforms. Armv7
platforms do support the WFE instruction, but they require explicit wake-up
events (SEV) and are less performant.

Testing shows that performance varies across platforms, with some showing
degradation.

CONFIG_RTE_ARM_USE_WFE should be enabled based on performance benchmarking
on the target platform. Power saving should be a bonus, but currently we
have no way to characterize it.

Gavin Hu (7):
  bus/fslmc: fix the conflicting dmb function
  eal: add the APIs to wait until equal
  spinlock: use wfe to reduce contention on aarch64
  ticketlock: use new API to reduce contention on aarch64
  ring: use wfe to wait for ring tail update on aarch64
  net/thunderx: use new API to save cycles on aarch64
  event/opdl: use new API to save cycles on aarch64

 config/arm/meson.build                             |   1 +
 config/common_base                                 |   5 +
 drivers/bus/fslmc/mc/fsl_mc_sys.h                  |  10 +-
 drivers/bus/fslmc/mc/mc_sys.c                      |   3 +-
 drivers/event/opdl/opdl_ring.c                     |   5 +-
 drivers/net/thunderx/nicvf_rxtx.c                  |   3 +-
 .../common/include/arch/arm/rte_pause_64.h         |  30 ++++++
 .../common/include/arch/arm/rte_spinlock.h         |  26 ++++++
 lib/librte_eal/common/include/generic/rte_pause.h  | 103 +++++++++++++++++++++
 .../common/include/generic/rte_ticketlock.h        |   3 +-
 lib/librte_ring/rte_ring_c11_mem.h                 |   4 +-
 lib/librte_ring/rte_ring_generic.h                 |   3 +-
 12 files changed, 180 insertions(+), 16 deletions(-)
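For readers who have not opened the patches, the wait-until-equal idea from the cover letter can be sketched roughly as below. This is a minimal standalone illustration in C11, not the code from the series: cpu_relax(), wait_until_equal_acquire_32(), the SKETCH_USE_WFE switch and struct ticket_lock are all hypothetical names chosen for this example, and the aarch64 branch assumes the usual sevl/wfe/load-exclusive pattern. The generic branch is the plain polling loop the cover letter describes for non-Arm platforms.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): a spin-loop hint where one exists. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#endif
}

/*
 * Block until *addr == expected, loading with acquire ordering.
 * Name and signature are illustrative, not the API added by the series.
 */
static inline void
wait_until_equal_acquire_32(_Atomic uint32_t *addr, uint32_t expected)
{
	assert(addr != NULL);

	/* Load and compare first, so a satisfied wait returns at once. */
	uint32_t value = atomic_load_explicit(addr, memory_order_acquire);
	if (value == expected)
		return;

#if defined(__aarch64__) && defined(SKETCH_USE_WFE) /* hypothetical switch */
	/*
	 * sevl sets the local event register so the first wfe falls through.
	 * ldaxr arms the exclusive monitor on addr; a later write to that
	 * location clears the monitor, which generates the wake-up event.
	 */
	__asm__ volatile("sevl" ::: "memory");
	do {
		__asm__ volatile("wfe" ::: "memory");
		__asm__ volatile("ldaxr %w[val], [%x[ptr]]"
				 : [val] "=&r"(value)
				 : [ptr] "r"(addr)
				 : "memory");
	} while (value != expected);
#else
	/* Generic path: a polling loop around the pause hint. */
	do {
		cpu_relax();
		value = atomic_load_explicit(addr, memory_order_acquire);
	} while (value != expected);
#endif
}

/* Hypothetical ticket lock built on the helper above. */
struct ticket_lock {
	_Atomic uint32_t next;    /* next ticket to hand out */
	_Atomic uint32_t current; /* ticket currently being served */
};

static inline void ticket_lock_acquire(struct ticket_lock *tl)
{
	uint32_t me = atomic_fetch_add_explicit(&tl->next, 1,
						memory_order_relaxed);
	wait_until_equal_acquire_32(&tl->current, me);
}

static inline void ticket_lock_release(struct ticket_lock *tl)
{
	uint32_t cur = atomic_load_explicit(&tl->current,
					    memory_order_relaxed);
	atomic_store_explicit(&tl->current, cur + 1, memory_order_release);
}
```

A waiter in the ticket lock simply takes a ticket and calls the helper on the "current" counter; whether it spins with a pause hint or sleeps in WFE is decided at build time, which mirrors the intent of making WFE a configuration option rather than a change to each caller.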
Comments
On Sat, Sep 14, 2019 at 8:30 PM Gavin Hu <gavin.hu@arm.com> wrote:
>
> [...]
>
> Gavin Hu (7):
>   bus/fslmc: fix the conflicting dmb function
>   eal: add the APIs to wait until equal
>   spinlock: use wfe to reduce contention on aarch64
>   ticketlock: use new API to reduce contention on aarch64
>   ring: use wfe to wait for ring tail update on aarch64
>   net/thunderx: use new API to save cycles on aarch64
>   event/opdl: use new API to save cycles on aarch64

There is a checkpatch failure.

### eal: add the APIs to wait until equal

WARNING:LONG_LINE_COMMENT: line over 80 characters
#123: FILE: lib/librte_eal/common/include/generic/rte_pause.h:29:
+ * Wait for *addr to be updated with a 16-bit expected value, with a relaxed memory

With checkpatch fixes:

Acked-by: Jerin Jacob <jerinj@marvell.com>
> There is a checkpatch failure.
> ### eal: add the APIs to wait until equal
>
> WARNING:LONG_LINE_COMMENT: line over 80 characters
> #123: FILE: lib/librte_eal/common/include/generic/rte_pause.h:29:
> + * Wait for *addr to be updated with a 16-bit expected value, with a
> relaxed memory
>
> With checkpatch fixes:
>
> Acked-by: Jerin Jacob <jerinj@marvell.com>

Thanks for the review, Jerin. Sorry for letting this slip through; it is fixed in the new version (v7, already posted).