From patchwork Tue May 26 09:01:04 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phil Yang
X-Patchwork-Id: 70580
X-Patchwork-Delegate: david.marchand@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id DFD6AA04A4;
 Tue, 26 May 2020 11:02:07 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 909801D91B;
 Tue, 26 May 2020 11:02:07 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by dpdk.org (Postfix) with ESMTP id 6243B1D91B for ;
 Tue, 26 May 2020 11:02:05 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CC40955D;
 Tue, 26 May 2020 02:02:04 -0700 (PDT)
Received: from phil-VirtualBox.shanghai.arm.com
 (phil-VirtualBox.shanghai.arm.com [10.169.109.151])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C90EB3F52E;
 Tue, 26 May 2020 02:01:57 -0700 (PDT)
From: Phil Yang
To: dev@dpdk.org
Cc: mattias.ronnblom@ericsson.com, mb@smartsharesystems.com, stephen@networkplumber.org,
 thomas@monjalon.net, bruce.richardson@intel.com, ferruh.yigit@intel.com,
 hemant.agrawal@nxp.com, jerinj@marvell.com, ktraynor@redhat.com,
 konstantin.ananyev@intel.com, maxime.coquelin@redhat.com, olivier.matz@6wind.com,
 harry.van.haaren@intel.com, erik.g.carrillo@intel.com, drc@linux.vnet.ibm.com,
 david.marchand@redhat.com, zhaoyan.chen@intel.com, ola.liljedahl@arm.com,
 honnappa.nagarahalli@arm.com, ruifeng.wang@arm.com, phil.yang@arm.com, nd@arm.com
Date: Tue, 26 May 2020 17:01:04 +0800
Message-Id: <1590483667-10318-2-git-send-email-phil.yang@arm.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1590483667-10318-1-git-send-email-phil.yang@arm.com>
References: <1589270586-4480-1-git-send-email-phil.yang@arm.com>
 <1590483667-10318-1-git-send-email-phil.yang@arm.com>
Subject: [dpdk-dev] [PATCH v5 1/4] doc: add generic atomic deprecation section
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Add a guide, with examples, for deprecating the generic rte_atomic_xx
APIs in favor of C11 atomic built-ins.

Signed-off-by: Phil Yang
Signed-off-by: Honnappa Nagarahalli
---
 doc/guides/prog_guide/writing_efficient_code.rst | 139 ++++++++++++++++++++++-
 1 file changed, 138 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/writing_efficient_code.rst b/doc/guides/prog_guide/writing_efficient_code.rst
index 849f63e..3bd2601 100644
--- a/doc/guides/prog_guide/writing_efficient_code.rst
+++ b/doc/guides/prog_guide/writing_efficient_code.rst
@@ -167,7 +167,13 @@ but with the added cost of lower throughput.
 Locks and Atomic Operations
 ---------------------------
 
-Atomic operations imply a lock prefix before the instruction,
+This section describes some key considerations when using locks and atomic
+operations in the DPDK environment.
+
+Locks
+~~~~~
+
+On x86, atomic operations imply a lock prefix before the instruction,
 causing the processor's LOCK# signal to be asserted during execution of the following instruction.
 This has a big impact on performance in a multicore environment.
 
@@ -176,6 +182,137 @@ It can often be replaced by other solutions like per-lcore variables.
 Also, some locking techniques are more efficient than others.
 For instance, the Read-Copy-Update (RCU) algorithm can frequently replace simple rwlocks.
 
+Atomic Operations: Use C11 Atomic Built-ins
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+DPDK `generic rte_atomic `_ operations are
+implemented by `__sync built-ins `_.
+These __sync built-ins result in full barriers on aarch64, which are unnecessary
+in many use cases.
+They can be replaced by `__atomic built-ins `_ that
+conform to the C11 memory model and provide finer memory order control.
+
+So replacing the rte_atomic operations with __atomic built-ins might improve
+performance for aarch64 machines. `More details `_.
+
+Some typical optimization cases are listed below:
+
+Atomicity
+^^^^^^^^^
+
+Some use cases require atomicity alone; the ordering of the memory operations
+does not matter. For example, the packet statistics in the `vhost `_ example application.
+
+It just updates the number of transmitted packets, and no subsequent logic depends
+on these counters. So the RELAXED memory ordering is sufficient:
+
+.. code-block:: c
+
+    static __rte_always_inline void
+    virtio_xmit(struct vhost_dev *dst_vdev, struct vhost_dev *src_vdev,
+            struct rte_mbuf *m)
+    {
+        ...
+        ...
+        if (enable_stats) {
+            __atomic_add_fetch(&dst_vdev->stats.rx_total_atomic, 1, __ATOMIC_RELAXED);
+            __atomic_add_fetch(&dst_vdev->stats.rx_atomic, ret, __ATOMIC_RELAXED);
+            ...
+        }
+    }
+
+One-way Barrier
+^^^^^^^^^^^^^^^
+
+Some use cases allow for memory reordering in one way while requiring memory
+ordering in the other direction.
+
+For example, the memory operations before the `lock `_ can move to the
+critical section, but the memory operations in the critical section cannot move
+above the lock. In this case, the full memory barrier in the CAS operation can
+be replaced with ACQUIRE. On the other hand, the memory operations after the
+`unlock `_ can move to the critical section, but the memory operations in the
+critical section cannot move below the unlock. So the full barrier in the STORE
+operation can be replaced with RELEASE.
+
+Reader-Writer Concurrency
+^^^^^^^^^^^^^^^^^^^^^^^^^
+Lock-free reader-writer concurrency is one of the common use cases in DPDK.
+
+The payload, or the data that the writer wants to communicate to the reader,
+can be written with RELAXED memory order. However, the guard variable should
+be written with RELEASE memory order.
+This ensures that the store to the guard
+variable is observable only after the store to the payload is observable.
+Refer to `rte_hash insert `_ for an example.
+
+.. code-block:: c
+
+    static inline int32_t
+    rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
+        ...
+        int32_t *ret_val)
+    {
+        ...
+        ...
+
+        /* Insert new entry if there is room in the primary
+         * bucket.
+         */
+        for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+            /* Check if slot is available */
+            if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
+                prim_bkt->sig_current[i] = sig;
+                /* Store to signature and key should not
+                 * leak after the store to key_idx. i.e.
+                 * key_idx is the guard variable for signature
+                 * and key.
+                 */
+                __atomic_store_n(&prim_bkt->key_idx[i],
+                         new_idx,
+                         __ATOMIC_RELEASE);
+                break;
+            }
+        }
+
+        ...
+    }
+
+Correspondingly, on the reader side, the guard variable should be read
+with ACQUIRE memory order. The payload, or the data the writer communicated,
+can be read with RELAXED memory order. This ensures that, if the store to the
+guard variable is observable, the store to the payload is also observable. Refer to `rte_hash lookup `_ for an example.
+
+.. code-block:: c
+
+    static inline int32_t
+    search_one_bucket_lf(const struct rte_hash *h, const void *key, uint16_t sig,
+        void **data, const struct rte_hash_bucket *bkt)
+    {
+        ...
+
+        for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
+            ....
+            if (bkt->sig_current[i] == sig) {
+                key_idx = __atomic_load_n(&bkt->key_idx[i],
+                        __ATOMIC_ACQUIRE);
+                if (key_idx != EMPTY_SLOT) {
+                    k = (struct rte_hash_key *) ((char *)keys +
+                        key_idx * h->key_entry_size);
+
+                    if (rte_hash_cmp_eq(key, k->key, h) == 0) {
+                        if (data != NULL) {
+                            *data = __atomic_load_n(&k->pdata,
+                                    __ATOMIC_ACQUIRE);
+                        }
+
+                        /*
+                         * Return index where key is stored,
+                         * subtracting the first dummy index
+                         */
+                        return key_idx - 1;
+                    }
+                    ...
+    }
+
 Coding Considerations
 ---------------------
 

From patchwork Tue May 26 09:01:05 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phil Yang
X-Patchwork-Id: 70581
X-Patchwork-Delegate: david.marchand@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 50E99A04A4;
 Tue, 26 May 2020 11:02:21 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 3275C1DA1E;
 Tue, 26 May 2020 11:02:14 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by dpdk.org (Postfix) with ESMTP id F41B61D977 for ;
 Tue, 26 May 2020 11:02:12 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7E3251042;
 Tue, 26 May 2020 02:02:12 -0700 (PDT)
Received: from phil-VirtualBox.shanghai.arm.com
 (phil-VirtualBox.shanghai.arm.com [10.169.109.151])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 67A5A3F52E;
 Tue, 26 May 2020 02:02:05 -0700 (PDT)
From: Phil Yang
To: dev@dpdk.org
Cc: mattias.ronnblom@ericsson.com, mb@smartsharesystems.com, stephen@networkplumber.org,
 thomas@monjalon.net, bruce.richardson@intel.com, ferruh.yigit@intel.com,
 hemant.agrawal@nxp.com, jerinj@marvell.com, ktraynor@redhat.com,
 konstantin.ananyev@intel.com, maxime.coquelin@redhat.com, olivier.matz@6wind.com,
 harry.van.haaren@intel.com, erik.g.carrillo@intel.com, drc@linux.vnet.ibm.com,
 david.marchand@redhat.com, zhaoyan.chen@intel.com, ola.liljedahl@arm.com,
 honnappa.nagarahalli@arm.com, ruifeng.wang@arm.com, phil.yang@arm.com, nd@arm.com
Date: Tue, 26 May 2020 17:01:05 +0800
Message-Id: <1590483667-10318-3-git-send-email-phil.yang@arm.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1590483667-10318-1-git-send-email-phil.yang@arm.com>
References:
 <1589270586-4480-1-git-send-email-phil.yang@arm.com>
 <1590483667-10318-1-git-send-email-phil.yang@arm.com>
Subject: [dpdk-dev] [PATCH v5 2/4] maintainers: claim maintainers of c11 atomics code
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Add the maintainership of c11 atomics code.

Signed-off-by: Phil Yang
Reviewed-by: Honnappa Nagarahalli
---
 MAINTAINERS | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d2b2867..3528907 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -265,6 +265,10 @@ F: lib/librte_eal/include/rte_random.h
 F: lib/librte_eal/common/rte_random.c
 F: app/test/test_rand_perf.c
 
+C11 Code Maintainer
+M: Honnappa Nagarahalli
+M: David Christensen
+
 ARM v7
 M: Jan Viktorin
 M: Ruifeng Wang

From patchwork Tue May 26 09:01:06 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phil Yang
X-Patchwork-Id: 70582
X-Patchwork-Delegate: david.marchand@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id BB41CA04A4;
 Tue, 26 May 2020 11:02:31 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id A27AB1DA29;
 Tue, 26 May 2020 11:02:22 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by dpdk.org (Postfix) with ESMTP id 819241DA1D for ;
 Tue, 26 May 2020 11:02:20 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 12C191FB;
 Tue, 26 May 2020 02:02:20 -0700 (PDT)
Received: from phil-VirtualBox.shanghai.arm.com
 (phil-VirtualBox.shanghai.arm.com [10.169.109.151])
 by usa-sjc-imap-foss1.foss.arm.com
 (Postfix) with ESMTPA id 12F583F52E;
 Tue, 26 May 2020 02:02:12 -0700 (PDT)
From: Phil Yang
To: dev@dpdk.org
Cc: mattias.ronnblom@ericsson.com, mb@smartsharesystems.com, stephen@networkplumber.org,
 thomas@monjalon.net, bruce.richardson@intel.com, ferruh.yigit@intel.com,
 hemant.agrawal@nxp.com, jerinj@marvell.com, ktraynor@redhat.com,
 konstantin.ananyev@intel.com, maxime.coquelin@redhat.com, olivier.matz@6wind.com,
 harry.van.haaren@intel.com, erik.g.carrillo@intel.com, drc@linux.vnet.ibm.com,
 david.marchand@redhat.com, zhaoyan.chen@intel.com, ola.liljedahl@arm.com,
 honnappa.nagarahalli@arm.com, ruifeng.wang@arm.com, phil.yang@arm.com, nd@arm.com
Date: Tue, 26 May 2020 17:01:06 +0800
Message-Id: <1590483667-10318-4-git-send-email-phil.yang@arm.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1590483667-10318-1-git-send-email-phil.yang@arm.com>
References: <1589270586-4480-1-git-send-email-phil.yang@arm.com>
 <1590483667-10318-1-git-send-email-phil.yang@arm.com>
Subject: [dpdk-dev] [PATCH v5 3/4] devtools: prevent use of rte atomic APIs in future patches
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

In order to deprecate the rte_atomic APIs, prevent new patches from
using rte_atomic APIs in the converted modules and compiler __sync
built-ins in all modules.

The converted modules:
lib/librte_distributor
lib/librte_hash
lib/librte_kni
lib/librte_lpm
lib/librte_rcu
lib/librte_ring
lib/librte_stack
lib/librte_vhost
lib/librte_timer
lib/librte_ipsec
drivers/event/octeontx
drivers/event/octeontx2
drivers/event/opdl
drivers/net/bnx2x
drivers/net/hinic
drivers/net/hns3
drivers/net/memif
drivers/net/thunderx
drivers/net/virtio
examples/l2fwd-event

On x86 the __atomic_thread_fence(__ATOMIC_SEQ_CST) is quite expensive for SMP case.
Flag new code that uses SEQ_CST memory ordering in the __atomic_thread_fence API.

Signed-off-by: Phil Yang
Reviewed-by: Ruifeng Wang
Reviewed-by: Honnappa Nagarahalli
---
 devtools/checkpatches.sh | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 158087f..5983f05 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -69,6 +69,38 @@ check_forbidden_additions() { #
 		-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
 		"$1" || res=1
 
+	# refrain from new additions of 16/32/64 bits rte_atomic_xxx()
+	# multiple folders and expressions are separated by spaces
+	awk -v FOLDERS="lib/librte_distributor lib/librte_hash lib/librte_kni
+			lib/librte_lpm lib/librte_rcu lib/librte_ring
+			lib/librte_stack lib/librte_vhost drivers/event/octeontx
+			drivers/event/octeontx2 drivers/event/opdl
+			drivers/net/bnx2x drivers/net/hinic drivers/net/hns3
+			drivers/net/memif drivers/net/thunderx
+			drivers/net/virtio examples/l2fwd-event" \
+		-v EXPRESSIONS="rte_atomic[0-9][0-9]_.*\\\(" \
+		-v RET_ON_FAIL=1 \
+		-v MESSAGE='Use of rte_atomicNN_xxx APIs not allowed, use __atomic_xxx APIs' \
+		-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
+		"$1" || res=1
+
+	# refrain from using compiler __sync built-ins
+	awk -v FOLDERS="lib drivers app examples" \
+		-v EXPRESSIONS="__sync_.*\\\(" \
+		-v RET_ON_FAIL=1 \
+		-v MESSAGE='Use of __sync_xxx built-ins not allowed, use __atomic_xxx APIs' \
+		-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
+		"$1" || res=1
+
+	# refrain from using compiler __atomic_thread_fence(__ATOMIC_SEQ_CST)
+	# It should be avoided on x86 for SMP case.
+	awk -v FOLDERS="lib drivers app examples" \
+		-v EXPRESSIONS="__atomic_thread_fence\\\(__ATOMIC_SEQ_CST" \
+		-v RET_ON_FAIL=1 \
+		-v MESSAGE='Use of __atomic_thread_fence with SEQ_CST ordering is not allowed, use rte_atomic_thread_fence' \
+		-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
+		"$1" || res=1
+
 	# svg figures must be included with wildcard extension
 	# because of png conversion for pdf docs
 	awk -v FOLDERS='doc' \

From patchwork Tue May 26 09:01:07 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phil Yang
X-Patchwork-Id: 70583
X-Patchwork-Delegate: david.marchand@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id AF555A04A4;
 Tue, 26 May 2020 11:02:41 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 31CA21DA2D;
 Tue, 26 May 2020 11:02:29 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by dpdk.org (Postfix) with ESMTP id 417A11DA2D for ;
 Tue, 26 May 2020 11:02:28 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B980B1FB;
 Tue, 26 May 2020 02:02:27 -0700 (PDT)
Received: from phil-VirtualBox.shanghai.arm.com
 (phil-VirtualBox.shanghai.arm.com [10.169.109.151])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A3B223F52E;
 Tue, 26 May 2020 02:02:20 -0700 (PDT)
From: Phil Yang
To: dev@dpdk.org
Cc: mattias.ronnblom@ericsson.com, mb@smartsharesystems.com, stephen@networkplumber.org,
 thomas@monjalon.net, bruce.richardson@intel.com, ferruh.yigit@intel.com,
 hemant.agrawal@nxp.com, jerinj@marvell.com, ktraynor@redhat.com,
 konstantin.ananyev@intel.com, maxime.coquelin@redhat.com, olivier.matz@6wind.com,
 harry.van.haaren@intel.com, erik.g.carrillo@intel.com,
 drc@linux.vnet.ibm.com, david.marchand@redhat.com, zhaoyan.chen@intel.com,
 ola.liljedahl@arm.com, honnappa.nagarahalli@arm.com, ruifeng.wang@arm.com,
 phil.yang@arm.com, nd@arm.com
Date: Tue, 26 May 2020 17:01:07 +0800
Message-Id: <1590483667-10318-5-git-send-email-phil.yang@arm.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1590483667-10318-1-git-send-email-phil.yang@arm.com>
References: <1589270586-4480-1-git-send-email-phil.yang@arm.com>
 <1590483667-10318-1-git-send-email-phil.yang@arm.com>
Subject: [dpdk-dev] [PATCH v5 4/4] eal/atomic: add wrapper for c11 atomic thread fence
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Provide a wrapper for __atomic_thread_fence built-in to support
optimized code for __ATOMIC_SEQ_CST memory order for x86 platforms.

Suggested-by: Honnappa Nagarahalli
Signed-off-by: Phil Yang
Reviewed-by: Ola Liljedahl
Acked-by: Konstantin Ananyev
---
 lib/librte_eal/arm/include/rte_atomic_32.h  |  6 ++++++
 lib/librte_eal/arm/include/rte_atomic_64.h  |  6 ++++++
 lib/librte_eal/include/generic/rte_atomic.h |  6 ++++++
 lib/librte_eal/ppc/include/rte_atomic.h     |  6 ++++++
 lib/librte_eal/x86/include/rte_atomic.h     | 17 +++++++++++++++++
 5 files changed, 41 insertions(+)

diff --git a/lib/librte_eal/arm/include/rte_atomic_32.h b/lib/librte_eal/arm/include/rte_atomic_32.h
index 7dc0d06..dbe7cc6 100644
--- a/lib/librte_eal/arm/include/rte_atomic_32.h
+++ b/lib/librte_eal/arm/include/rte_atomic_32.h
@@ -37,6 +37,12 @@ extern "C" {
 
 #define rte_cio_rmb() rte_rmb()
 
+static __rte_always_inline void
+rte_atomic_thread_fence(int mo)
+{
+	__atomic_thread_fence(mo);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/arm/include/rte_atomic_64.h b/lib/librte_eal/arm/include/rte_atomic_64.h
index 7b7099c..22ff8ec 100644
--- a/lib/librte_eal/arm/include/rte_atomic_64.h
+++ b/lib/librte_eal/arm/include/rte_atomic_64.h
@@ -41,6 +41,12 @@ extern "C" {
 
 #define rte_cio_rmb() asm volatile("dmb oshld" : : : "memory")
 
+static __rte_always_inline void
+rte_atomic_thread_fence(int mo)
+{
+	__atomic_thread_fence(mo);
+}
+
 /*------------------------ 128 bit atomic operations -------------------------*/
 
 #if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
diff --git a/lib/librte_eal/include/generic/rte_atomic.h b/lib/librte_eal/include/generic/rte_atomic.h
index e6ab15a..5b941db 100644
--- a/lib/librte_eal/include/generic/rte_atomic.h
+++ b/lib/librte_eal/include/generic/rte_atomic.h
@@ -158,6 +158,12 @@ static inline void rte_cio_rmb(void);
 	asm volatile ("" : : : "memory"); \
 } while(0)
 
+/**
+ * Synchronization fence between threads based on the specified
+ * memory order.
+ */
+static inline void rte_atomic_thread_fence(int mo);
+
 /*------------------------- 16 bit atomic operations -------------------------*/
 
 /**
diff --git a/lib/librte_eal/ppc/include/rte_atomic.h b/lib/librte_eal/ppc/include/rte_atomic.h
index 7e3e131..91c5f30 100644
--- a/lib/librte_eal/ppc/include/rte_atomic.h
+++ b/lib/librte_eal/ppc/include/rte_atomic.h
@@ -40,6 +40,12 @@ extern "C" {
 
 #define rte_cio_rmb() rte_rmb()
 
+static __rte_always_inline void
+rte_atomic_thread_fence(int mo)
+{
+	__atomic_thread_fence(mo);
+}
+
 /*------------------------- 16 bit atomic operations -------------------------*/
 /* To be compatible with Power7, use GCC built-in functions for 16 bit
  * operations */
diff --git a/lib/librte_eal/x86/include/rte_atomic.h b/lib/librte_eal/x86/include/rte_atomic.h
index b9dcd30..bd256e7 100644
--- a/lib/librte_eal/x86/include/rte_atomic.h
+++ b/lib/librte_eal/x86/include/rte_atomic.h
@@ -83,6 +83,23 @@ rte_smp_mb(void)
 
 #define rte_cio_rmb() rte_compiler_barrier()
 
+/**
+ * Synchronization fence between threads based on the specified
+ * memory order.
+ *
+ * On x86 the __atomic_thread_fence(__ATOMIC_SEQ_CST) generates
+ * full 'mfence' which is quite expensive. The optimized
+ * implementation of rte_smp_mb is used instead.
+ */
+static __rte_always_inline void
+rte_atomic_thread_fence(int mo)
+{
+	if (mo == __ATOMIC_SEQ_CST)
+		rte_smp_mb();
+	else
+		__atomic_thread_fence(mo);
+}
+
 /*------------------------- 16 bit atomic operations -------------------------*/
 
 #ifndef RTE_FORCE_INTRINSICS
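
As an illustration of how a fence wrapper like the one in patch 4/4 might be used, here is a minimal sketch that builds outside DPDK. It is not code from the series. The names `sketch_thread_fence`, `sketch_publish`, `sketch_consume`, `sketch_payload` and `sketch_ready` are hypothetical. The wrapper simply forwards to the GCC/Clang built-in, mirroring the arm and ppc hunks; the x86 special case for `__ATOMIC_SEQ_CST` is deliberately left out. The payload and guard are accessed with RELAXED ordering, and the RELEASE and ACQUIRE fences supply the ordering between them.

```c
#include <stdint.h>

/* Hypothetical stand-in for rte_atomic_thread_fence(): forwards to the
 * compiler built-in, as the arm/ppc versions in the patch do. */
static inline void
sketch_thread_fence(int mo)
{
	__atomic_thread_fence(mo);
}

static uint32_t sketch_payload;	/* data the writer wants to communicate */
static uint32_t sketch_ready;	/* guard variable: non-zero once published */

/* Writer: store the payload, then publish it through the guard. The
 * RELEASE fence keeps the payload store from moving below the guard store. */
static inline void
sketch_publish(uint32_t value)
{
	__atomic_store_n(&sketch_payload, value, __ATOMIC_RELAXED);
	sketch_thread_fence(__ATOMIC_RELEASE);
	__atomic_store_n(&sketch_ready, 1, __ATOMIC_RELAXED);
}

/* Reader: returns 1 and fills *out if the payload has been published,
 * 0 otherwise. The ACQUIRE fence keeps the payload load from moving
 * above the guard load. */
static inline int
sketch_consume(uint32_t *out)
{
	if (__atomic_load_n(&sketch_ready, __ATOMIC_RELAXED) == 0)
		return 0;
	sketch_thread_fence(__ATOMIC_ACQUIRE);
	*out = __atomic_load_n(&sketch_payload, __ATOMIC_RELAXED);
	return 1;
}
```

In real DPDK code the caller would presumably use `rte_atomic_thread_fence()` itself, so that x86 builds take the `rte_smp_mb()` path when `__ATOMIC_SEQ_CST` is requested.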