[v8,0/3] generic rte atomic APIs deprecate proposal

Message ID 1594875225-5850-1-git-send-email-phil.yang@arm.com

Phil Yang July 16, 2020, 4:53 a.m. UTC
DPDK provides generic rte_atomic APIs for several atomic operations.
These APIs use the deprecated __sync built-ins and enforce full memory
barriers on aarch64. However, full barriers are unnecessary in many use
cases. To address such cases, the C language offers the C11 atomic APIs,
which give finer control over memory barriers through a memory ordering
parameter supplied by the caller. Various patches submitted in the
past [2] and the patches in this series indicate significant performance
gains on multiple aarch64 CPUs and no performance loss on x86.
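
To illustrate the difference described above, here is a minimal sketch
of the same counter increment written with a __sync built-in and with a
C11-style __atomic built-in. The counter and the function names are
hypothetical and are not part of this series; the sketch assumes a GCC-
or Clang-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical statistics counter; not a DPDK structure. */
static uint32_t example_pkt_count;

/*
 * Legacy style: __sync built-ins take no memory ordering argument and
 * are documented as full barriers.
 */
static inline uint32_t example_count_inc_sync(void)
{
	return __sync_add_and_fetch(&example_pkt_count, 1);
}

/*
 * C11 style: the caller chooses the ordering. Relaxed is assumed to be
 * sufficient here because nothing else is published via this counter.
 */
static inline uint32_t example_count_inc_relaxed(void)
{
	return __atomic_add_fetch(&example_pkt_count, 1, __ATOMIC_RELAXED);
}

static inline uint32_t example_count_read(void)
{
	return __atomic_load_n(&example_pkt_count, __ATOMIC_RELAXED);
}

/* Single-threaded sanity check: both variants produce the same values. */
static inline void example_count_selfcheck(void)
{
	uint32_t before = example_count_read();

	assert(example_count_inc_sync() == before + 1);
	assert(example_count_inc_relaxed() == before + 2);
}
```

Both functions return the same values. The difference is only in the
ordering guarantees requested from the compiler and the CPU, which is
where the reported aarch64 gains are expected to come from.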

However, the existing rte_atomic API implementations cannot be changed,
because the APIs do not take a memory ordering parameter. The only
available choice is to replace uses of the rte_atomic APIs with C11
atomic APIs. To make this change, the following steps are proposed:

[1] Deprecate the rte_atomic APIs so that future patches do not use them
(a script is added to flag such usages).
[2] Refactor the code that uses rte_atomic APIs to use C11 atomic APIs.
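
As a rough idea of what refactored code in step [2] can look like, the
sketch below guards a payload with a flag using release/acquire ordering
instead of full barriers. The structure and function names are
hypothetical and only illustrate the pattern; they are not taken from
any DPDK library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared message; names are illustrative only. */
struct example_msg {
	uint32_t payload;
	uint32_t ready;	/* guard variable, 0 = empty, 1 = published */
};

/* Writer side: fill the payload, then publish it through the guard. */
static inline void example_msg_publish(struct example_msg *m, uint32_t value)
{
	m->payload = value;
	/* Release: the payload store cannot be reordered after this store. */
	__atomic_store_n(&m->ready, 1, __ATOMIC_RELEASE);
}

/* Reader side: returns 1 and fills *out if a message is available. */
static inline int example_msg_try_consume(struct example_msg *m, uint32_t *out)
{
	/* Acquire: the payload load cannot be reordered before this load. */
	if (__atomic_load_n(&m->ready, __ATOMIC_ACQUIRE) == 0)
		return 0;
	*out = m->payload;
	return 1;
}

/* Single-threaded sanity check of the publish/consume sequence. */
static inline void example_msg_selfcheck(void)
{
	struct example_msg m = { 0, 0 };
	uint32_t v = 0;

	assert(example_msg_try_consume(&m, &v) == 0);
	example_msg_publish(&m, 7);
	assert(example_msg_try_consume(&m, &v) == 1 && v == 7);
}
```

In a real conversion the writer and reader would run on different
lcores; the pairing of one release store with one acquire load on the
guard variable is the part that replaces the full barrier.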

This patchset contains:
1) changes to the programmer's guide describing how to write efficient
code for aarch64.
2) checkpatch script changes to flag rte_atomicNN_xxx API usage in patches.
3) a wrapper for __atomic_thread_fence that takes an explicit memory
ordering parameter.
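
For item 3, a wrapper of this kind could look roughly like the sketch
below. The names example_thread_fence and EXAMPLE_ARCH_SEQ_CST_FENCE are
assumptions made for illustration; the actual wrapper and its
per-architecture variants are defined by the patches in this series.

```c
#include <assert.h>

/*
 * Hypothetical fence wrapper. An architecture header may define
 * EXAMPLE_ARCH_SEQ_CST_FENCE() (an assumed hook, not a DPDK macro) to
 * supply a cheaper sequentially consistent fence; every other ordering
 * is passed straight to the compiler built-in.
 */
static inline void example_thread_fence(int memorder)
{
#ifdef EXAMPLE_ARCH_SEQ_CST_FENCE
	if (memorder == __ATOMIC_SEQ_CST) {
		EXAMPLE_ARCH_SEQ_CST_FENCE();
		return;
	}
#endif
	__atomic_thread_fence(memorder);
}

/* Single-threaded sanity check: fences must not change program values. */
static inline void example_fence_selfcheck(void)
{
	int x = 1;

	example_thread_fence(__ATOMIC_RELEASE);
	example_thread_fence(__ATOMIC_ACQUIRE);
	example_thread_fence(__ATOMIC_SEQ_CST);
	assert(x == 1);
}
```

Keeping the memory order as a parameter lets call sites state the
ordering they need, while the wrapper remains the single place where an
architecture can substitute an optimized sequence.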

v8:
Make descriptions more general. (Honnappa)

v7:
1. Remove code blocks in the guidance.
2. Remove code reference links in the guidance.
3. Remove the update of C11 atomics maintainers.

v6:
Add check for rte_smp barriers APIs in the new code.

v5:
1. Wrap __atomic_thread_fence to support optimized code for the
__ATOMIC_SEQ_CST memory order.
2. Flag __atomic_thread_fence with __ATOMIC_SEQ_CST in new patches.
3. Fix email address typo in patch 2/4.

v4:
1. add a description of the reader-writer concurrency case.
2. claim maintainership of C11 atomics code for each platform.
3. flag rte_atomicNN_xxx in new patches for modules that have been converted to
C11 style.
4. flag __sync_xxx built-ins in new patches.
5. wrap compiler atomic built-ins.
6. move the changes to libraries that make use of C11 atomic APIs out of this
patchset.

v3:
add libatomic dependency for 32-bit clang

v2:
1. fix Clang '-Wincompatible-pointer-types' warning.
2. fix typos.

Phil Yang (3):
  doc: add optimizations using C11 atomic built-ins
  devtools: prevent use of rte atomic APIs in future patches
  eal/atomic: add wrapper for C11 atomic thread fence

 devtools/checkpatches.sh                         | 40 ++++++++++++++++
 doc/guides/prog_guide/writing_efficient_code.rst | 59 +++++++++++++++++++++++-
 lib/librte_eal/arm/include/rte_atomic_32.h       |  6 +++
 lib/librte_eal/arm/include/rte_atomic_64.h       |  6 +++
 lib/librte_eal/include/generic/rte_atomic.h      |  6 +++
 lib/librte_eal/ppc/include/rte_atomic.h          |  6 +++
 lib/librte_eal/x86/include/rte_atomic.h          | 17 +++++++
 7 files changed, 139 insertions(+), 1 deletion(-)