[v6] eal/x86: add 128-bit atomic compare exchange
Checks
Commit Message
This operation can be used for non-blocking algorithms, such as a
non-blocking stack or ring.
It is available only for x86_64.
Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
This patch addresses x86-64 only; other architectures can/will be supported
in the future. The __atomic intrinsic was considered for the implementation,
however libatomic was found[1] to use locks to implement the 128-bit CAS on at
least one architecture and so is eschewed here. The interface is modeled after
the __atomic_compare_exchange_16 (which itself is based on the C++11 memory
model) to best support weak consistency architectures.
This patch was originally part of a series that introduces a non-blocking stack
mempool handler[2], and is required by a non-blocking ring patchset. This
patch was spun off so that the the NB ring depends only on this patch
and not on the entire non-blocking stack patchset.
[1] http://mails.dpdk.org/archives/dev/2019-January/124002.html
[2] http://mails.dpdk.org/archives/dev/2019-January/123653.html
v6:
- Use @code and @endcode to clean up pseudo-code generation
- Add note that the function is currently limited to x86-64
v5:
- Move declaration in the generic file for doxygen only (Thomas)
v4:
- Move function declaration from generic/rte_atomic.h to x86-64 header file
v3:
- Rename function to ISA-neutral rte_atomic128_cmp_exchange()
- Fix two pseudocode bugs in function documentation
v2:
- Rename function to rte_atomic128_cmpxchg()
- Replace "=A" output constraint with "=a" and "=d" to prevent GCC from using
the al register for the sete destination
- Extend 'weak' definition to allow non-atomic 'exp' updates.
- Add const keyword to 'src' and remove volatile keyword from 'dst'
- Put __int128 in a union in rte_int128_t and move the structure definition
inside the RTE_ARCH_x86_64 ifdef
- Drop enum rte_atomic_memmodel_t in favor of compiler-defined __ATOMIC_*
- Drop unnecessary comment relating to X86_64
- Tweak the pseudocode to reflect the 'exp' update on failure.
.../common/include/arch/x86/rte_atomic_64.h | 47 +++++++++++++++++++
lib/librte_eal/common/include/generic/rte_atomic.h | 52 ++++++++++++++++++++++
2 files changed, 99 insertions(+)
Comments
03/04/2019 21:44, Gage Eads:
> This operation can be used for non-blocking algorithms, such as a
> non-blocking stack or ring.
>
> It is available only for x86_64.
>
> Signed-off-by: Gage Eads <gage.eads@intel.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
> This patch addresses x86-64 only; other architectures can/will be supported
> in the future. The __atomic intrinsic was considered for the implementation,
> however libatomic was found[1] to use locks to implement the 128-bit CAS on at
> least one architecture and so is eschewed here. The interface is modeled after
> the __atomic_compare_exchange_16 (which itself is based on the C++11 memory
> model) to best support weak consistency architectures.
>
> This patch was originally part of a series that introduces a non-blocking stack
> mempool handler[2], and is required by a non-blocking ring patchset. This
> patch was spun off so that the the NB ring depends only on this patch
> and not on the entire non-blocking stack patchset.
>
> [1] http://mails.dpdk.org/archives/dev/2019-January/124002.html
> [2] http://mails.dpdk.org/archives/dev/2019-January/123653.html
>
> v6:
> - Use @code and @endcode to clean up pseudo-code generation
> - Add note that the function is currently limited to x86-64
>
> v5:
> - Move declaration in the generic file for doxygen only (Thomas)
>
> v4:
> - Move function declaration from generic/rte_atomic.h to x86-64 header file
>
> v3:
> - Rename function to ISA-neutral rte_atomic128_cmp_exchange()
> - Fix two pseudocode bugs in function documentation
>
> v2:
> - Rename function to rte_atomic128_cmpxchg()
> - Replace "=A" output constraint with "=a" and "=d" to prevent GCC from using
> the al register for the sete destination
> - Extend 'weak' definition to allow non-atomic 'exp' updates.
> - Add const keyword to 'src' and remove volatile keyword from 'dst'
> - Put __int128 in a union in rte_int128_t and move the structure definition
> inside the RTE_ARCH_x86_64 ifdef
> - Drop enum rte_atomic_memmodel_t in favor of compiler-defined __ATOMIC_*
> - Drop unnecessary comment relating to X86_64
> - Tweak the pseudocode to reflect the 'exp' update on failure.
>
> .../common/include/arch/x86/rte_atomic_64.h | 47 +++++++++++++++++++
> lib/librte_eal/common/include/generic/rte_atomic.h | 52 ++++++++++++++++++++++
> 2 files changed, 99 insertions(+)
Applied, thanks
On 4/3/2019 8:44 PM, Gage Eads wrote:
> This operation can be used for non-blocking algorithms, such as a
> non-blocking stack or ring.
>
> It is available only for x86_64.
>
> Signed-off-by: Gage Eads <gage.eads@intel.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
> This patch addresses x86-64 only; other architectures can/will be supported
> in the future. The __atomic intrinsic was considered for the implementation,
> however libatomic was found[1] to use locks to implement the 128-bit CAS on at
> least one architecture and so is eschewed here. The interface is modeled after
> the __atomic_compare_exchange_16 (which itself is based on the C++11 memory
> model) to best support weak consistency architectures.
>
> This patch was originally part of a series that introduces a non-blocking stack
> mempool handler[2], and is required by a non-blocking ring patchset. This
> patch was spun off so that the the NB ring depends only on this patch
> and not on the entire non-blocking stack patchset.
>
> [1] http://mails.dpdk.org/archives/dev/2019-January/124002.html
> [2] http://mails.dpdk.org/archives/dev/2019-January/123653.html
>
> v6:
> - Use @code and @endcode to clean up pseudo-code generation
> - Add note that the function is currently limited to x86-64
>
> v5:
> - Move declaration in the generic file for doxygen only (Thomas)
>
> v4:
> - Move function declaration from generic/rte_atomic.h to x86-64 header file
>
> v3:
> - Rename function to ISA-neutral rte_atomic128_cmp_exchange()
> - Fix two pseudocode bugs in function documentation
>
> v2:
> - Rename function to rte_atomic128_cmpxchg()
> - Replace "=A" output constraint with "=a" and "=d" to prevent GCC from using
> the al register for the sete destination
> - Extend 'weak' definition to allow non-atomic 'exp' updates.
> - Add const keyword to 'src' and remove volatile keyword from 'dst'
> - Put __int128 in a union in rte_int128_t and move the structure definition
> inside the RTE_ARCH_x86_64 ifdef
> - Drop enum rte_atomic_memmodel_t in favor of compiler-defined __ATOMIC_*
> - Drop unnecessary comment relating to X86_64
> - Tweak the pseudocode to reflect the 'exp' update on failure.
>
<...>
I am getting a build error, I guess generated from this patch, while building
mlx drivers, when MLX._DEBUG enabled [1].
This looks because of the "-pedantic" parameter mlx drivers have [2].
Gage, Shahaf, Matan, Yongseok,
Can you please check this?
[1]
.../dpdk/x86_64-native-linuxapp-gcc/include/rte_atomic_64.h:223:3: error: ISO C
does not support ‘__int128’ types [-Werror=pedantic]
__int128 int128;
^~~~~~~~
cc1: all warnings being treated as errors
[2]
Indeed I would like to discuss this distinct behavior of the mlx drivers, it is
many time "-pedantic" caused problem, I will bring the discussion on separate
thread.
04/04/2019 13:47, Ferruh Yigit:
> .../dpdk/x86_64-native-linuxapp-gcc/include/rte_atomic_64.h:223:3: error: ISO C
> does not support ‘__int128’ types [-Werror=pedantic]
We can try this kind of workaround (disable pedantic locally):
https://github.com/HowardHinnant/date/pull/38/commits/177032852d5b4614112ca1ab3ef42d6b41824816
04/04/2019 14:08, Thomas Monjalon:
> 04/04/2019 13:47, Ferruh Yigit:
> > .../dpdk/x86_64-native-linuxapp-gcc/include/rte_atomic_64.h:223:3: error: ISO C
> > does not support ‘__int128’ types [-Werror=pedantic]
>
> We can try this kind of workaround (disable pedantic locally):
> https://github.com/HowardHinnant/date/pull/38/commits/177032852d5b4614112ca1ab3ef42d6b41824816
Or better:
__extension__ typedef __int128 int128;
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, April 4, 2019 7:13 AM
> To: Yigit, Ferruh <ferruh.yigit@intel.com>; Eads, Gage
> <gage.eads@intel.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; Matan Azrad
> <matan@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>;
> dev@dpdk.org; olivier.matz@6wind.com; arybchenko@solarflare.com;
> Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; gavin.hu@arm.com;
> Honnappa.Nagarahalli@arm.com; nd@arm.com;
> chaozhu@linux.vnet.ibm.com; jerinj@marvell.com;
> hemant.agrawal@nxp.com
> Subject: Re: [dpdk-dev] [PATCH v6] eal/x86: add 128-bit atomic compare
> exchange
>
> 04/04/2019 14:08, Thomas Monjalon:
> > 04/04/2019 13:47, Ferruh Yigit:
> > > .../dpdk/x86_64-native-linuxapp-gcc/include/rte_atomic_64.h:223:3:
> > > error: ISO C does not support ‘__int128’ types [-Werror=pedantic]
> >
> > We can try this kind of workaround (disable pedantic locally):
> >
> https://github.com/HowardHinnant/date/pull/38/commits/177032852d5b46
> 14
> > 112ca1ab3ef42d6b41824816
>
> Or better:
> __extension__ typedef __int128 int128;
>
Taking that one step further -- RTE_STD_C11 evaluates to __extension__ (when the STD C version is sufficiently old).
04/04/2019 14:14, Eads, Gage:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 04/04/2019 14:08, Thomas Monjalon:
> > > 04/04/2019 13:47, Ferruh Yigit:
> > > > .../dpdk/x86_64-native-linuxapp-gcc/include/rte_atomic_64.h:223:3:
> > > > error: ISO C does not support ‘__int128’ types [-Werror=pedantic]
> > >
> > > We can try this kind of workaround (disable pedantic locally):
> > >
> > https://github.com/HowardHinnant/date/pull/38/commits/177032852d5b46
> > 14
> > > 112ca1ab3ef42d6b41824816
> >
> > Or better:
> > __extension__ typedef __int128 int128;
> >
>
> Taking that one step further -- RTE_STD_C11 evaluates to __extension__ (when the STD C version is sufficiently old).
I don't think __int128 is part of C11. Is it?
Ferruh, I cannot reproduce the compiler error. What is your compiler?
Please, could you test this patch?
--- a/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
@@ -220,7 +220,7 @@ typedef struct {
RTE_STD_C11
union {
uint64_t val[2];
- __int128 int128;
+ __extension__ __int128 int128;
};
} __rte_aligned(16) rte_int128_t;
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, April 4, 2019 7:18 AM
> To: Eads, Gage <gage.eads@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; Matan Azrad
> <matan@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>;
> dev@dpdk.org; olivier.matz@6wind.com; arybchenko@solarflare.com;
> Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; gavin.hu@arm.com;
> Honnappa.Nagarahalli@arm.com; nd@arm.com;
> chaozhu@linux.vnet.ibm.com; jerinj@marvell.com;
> hemant.agrawal@nxp.com
> Subject: Re: [dpdk-dev] [PATCH v6] eal/x86: add 128-bit atomic compare
> exchange
>
> 04/04/2019 14:14, Eads, Gage:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > 04/04/2019 14:08, Thomas Monjalon:
> > > > 04/04/2019 13:47, Ferruh Yigit:
> > > > > .../dpdk/x86_64-native-linuxapp-gcc/include/rte_atomic_64.h:223:3:
> > > > > error: ISO C does not support ‘__int128’ types [-Werror=pedantic]
> > > >
> > > > We can try this kind of workaround (disable pedantic locally):
> > > >
> > >
> https://github.com/HowardHinnant/date/pull/38/commits/177032852d5b46
> > > 14
> > > > 112ca1ab3ef42d6b41824816
> > >
> > > Or better:
> > > __extension__ typedef __int128 int128;
> > >
> >
> > Taking that one step further -- RTE_STD_C11 evaluates to __extension__
> (when the STD C version is sufficiently old).
>
> I don't think __int128 is part of C11. Is it?
You're right, this is a compiler extension. Using '__extension__' makes more sense.
> -----Original Message-----
> From: Eads, Gage
> Sent: Thursday, April 4, 2019 7:23 AM
> To: 'Thomas Monjalon' <thomas@monjalon.net>; Yigit, Ferruh
> <ferruh.yigit@intel.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; Matan Azrad
> <matan@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>;
> dev@dpdk.org; olivier.matz@6wind.com; arybchenko@solarflare.com;
> Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; gavin.hu@arm.com;
> Honnappa.Nagarahalli@arm.com; nd@arm.com;
> chaozhu@linux.vnet.ibm.com; jerinj@marvell.com;
> hemant.agrawal@nxp.com
> Subject: RE: [dpdk-dev] [PATCH v6] eal/x86: add 128-bit atomic compare
> exchange
>
> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Thursday, April 4, 2019 7:18 AM
> > To: Eads, Gage <gage.eads@intel.com>; Yigit, Ferruh
> > <ferruh.yigit@intel.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; Matan Azrad
> > <matan@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>;
> dev@dpdk.org;
> > olivier.matz@6wind.com; arybchenko@solarflare.com; Richardson, Bruce
> > <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; gavin.hu@arm.com;
> > Honnappa.Nagarahalli@arm.com; nd@arm.com;
> chaozhu@linux.vnet.ibm.com;
> > jerinj@marvell.com; hemant.agrawal@nxp.com
> > Subject: Re: [dpdk-dev] [PATCH v6] eal/x86: add 128-bit atomic compare
> > exchange
> >
> > 04/04/2019 14:14, Eads, Gage:
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > 04/04/2019 14:08, Thomas Monjalon:
> > > > > 04/04/2019 13:47, Ferruh Yigit:
> > > > > > .../dpdk/x86_64-native-linuxapp-
> gcc/include/rte_atomic_64.h:223:3:
> > > > > > error: ISO C does not support ‘__int128’ types
> > > > > > [-Werror=pedantic]
> > > > >
> > > > > We can try this kind of workaround (disable pedantic locally):
> > > > >
> > > >
> >
> https://github.com/HowardHinnant/date/pull/38/commits/177032852d5b46
> > > > 14
> > > > > 112ca1ab3ef42d6b41824816
> > > >
> > > > Or better:
> > > > __extension__ typedef __int128 int128;
> > > >
> > >
> > > Taking that one step further -- RTE_STD_C11 evaluates to
> > > __extension__
> > (when the STD C version is sufficiently old).
> >
> > I don't think __int128 is part of C11. Is it?
>
> You're right, this is a compiler extension. Using '__extension__' makes more
> sense.
FYI, __int128 was added in GCC 4.6 (and DPDK's minimum version is currently 4.9).
On 4/4/2019 1:18 PM, Thomas Monjalon wrote:
> 04/04/2019 14:14, Eads, Gage:
>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>>> 04/04/2019 14:08, Thomas Monjalon:
>>>> 04/04/2019 13:47, Ferruh Yigit:
>>>>> .../dpdk/x86_64-native-linuxapp-gcc/include/rte_atomic_64.h:223:3:
>>>>> error: ISO C does not support ‘__int128’ types [-Werror=pedantic]
>>>>
>>>> We can try this kind of workaround (disable pedantic locally):
>>>>
>>> https://github.com/HowardHinnant/date/pull/38/commits/177032852d5b46
>>> 14
>>>> 112ca1ab3ef42d6b41824816
>>>
>>> Or better:
>>> __extension__ typedef __int128 int128;
>>>
>>
>> Taking that one step further -- RTE_STD_C11 evaluates to __extension__ (when the STD C version is sufficiently old).
>
> I don't think __int128 is part of C11. Is it?
>
> Ferruh, I cannot reproduce the compiler error. What is your compiler?
> Please, could you test this patch?
It is gcc [1]. Sure will test it.
[1]
gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
>
> --- a/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> @@ -220,7 +220,7 @@ typedef struct {
> RTE_STD_C11
> union {
> uint64_t val[2];
> - __int128 int128;
> + __extension__ __int128 int128;
> };
> } __rte_aligned(16) rte_int128_t;
>
>
>
@@ -34,6 +34,7 @@
/*
* Inspired from FreeBSD src/sys/amd64/include/atomic.h
* Copyright (c) 1998 Doug Rabson
+ * Copyright (c) 2019 Intel Corporation
* All rights reserved.
*/
@@ -46,6 +47,7 @@
#include <stdint.h>
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_atomic.h>
/*------------------------- 64 bit atomic operations -------------------------*/
@@ -208,4 +210,49 @@ static inline void rte_atomic64_clear(rte_atomic64_t *v)
}
#endif
+/*------------------------ 128 bit atomic operations -------------------------*/
+
+/**
+ * 128-bit integer structure.
+ */
+RTE_STD_C11
+typedef struct {
+ RTE_STD_C11
+ union {
+ uint64_t val[2];
+ __int128 int128;
+ };
+} __rte_aligned(16) rte_int128_t;
+
+static inline int __rte_experimental
+rte_atomic128_cmp_exchange(rte_int128_t *dst,
+ rte_int128_t *exp,
+ const rte_int128_t *src,
+ unsigned int weak,
+ int success,
+ int failure)
+{
+ RTE_SET_USED(weak);
+ RTE_SET_USED(success);
+ RTE_SET_USED(failure);
+ uint8_t res;
+
+ asm volatile (
+ MPLOCKED
+ "cmpxchg16b %[dst];"
+ " sete %[res]"
+ : [dst] "=m" (dst->val[0]),
+ "=a" (exp->val[0]),
+ "=d" (exp->val[1]),
+ [res] "=r" (res)
+ : "b" (src->val[0]),
+ "c" (src->val[1]),
+ "a" (exp->val[0]),
+ "d" (exp->val[1]),
+ "m" (dst->val[0])
+ : "memory");
+
+ return res;
+}
+
#endif /* _RTE_ATOMIC_X86_64_H_ */
@@ -1079,4 +1079,56 @@ static inline void rte_atomic64_clear(rte_atomic64_t *v)
}
#endif
+/*------------------------ 128 bit atomic operations -------------------------*/
+
+#ifdef __DOXYGEN__
+
+/**
+ * An atomic compare and set function used by the mutex functions.
+ * (Atomically) Equivalent to:
+ * @code
+ * if (*dst == *exp)
+ * *dst = *src
+ * else
+ * *exp = *dst
+ * @endcode
+ *
+ * @note This function is currently only available for the x86-64 platform.
+ *
+ * @note The success and failure arguments must be one of the __ATOMIC_* values
+ * defined in the C++11 standard. For details on their behavior, refer to the
+ * standard.
+ *
+ * @param dst
+ * The destination into which the value will be written.
+ * @param exp
+ * Pointer to the expected value. If the operation fails, this memory is
+ * updated with the actual value.
+ * @param src
+ * Pointer to the new value.
+ * @param weak
+ * A value of true allows the comparison to spuriously fail and allows the
+ * 'exp' update to occur non-atomically (i.e. a torn read may occur).
+ * Implementations may ignore this argument and only implement the strong
+ * variant.
+ * @param success
+ * If successful, the operation's memory behavior conforms to this (or a
+ * stronger) model.
+ * @param failure
+ * If unsuccessful, the operation's memory behavior conforms to this (or a
+ * stronger) model. This argument cannot be __ATOMIC_RELEASE,
+ * __ATOMIC_ACQ_REL, or a stronger model than success.
+ * @return
+ * Non-zero on success; 0 on failure.
+ */
+static inline int __rte_experimental
+rte_atomic128_cmp_exchange(rte_int128_t *dst,
+ rte_int128_t *exp,
+ const rte_int128_t *src,
+ unsigned int weak,
+ int success,
+ int failure);
+
+#endif /* __DOXYGEN__ */
+
#endif /* _RTE_ATOMIC_H_ */