Message ID | 20201012145148.290451-1-bruce.richardson@intel.com (mailing list archive) |
---|---|
State | Accepted, archived |
Delegated to: | Thomas Monjalon |
Series | build: fix memcpy behaviour regression |

Context | Check | Description |
---|---|---|
ci/Intel-compilation | success | Compilation OK |
ci/iol-mellanox-Performance | success | Performance Testing PASS |
ci/travis-robot | success | Travis build: passed |
ci/iol-intel-Performance | success | Performance Testing PASS |
ci/iol-testing | success | Testing PASS |
ci/iol-broadcom-Functional | success | Functional Testing PASS |
ci/iol-broadcom-Performance | success | Performance Testing PASS |
ci/checkpatch | warning | coding style issues |
Tested-by: Han, Yingya <yingyax.han@intel.com>

Best Regards,
Yingya

-----Original Message-----
From: Richardson, Bruce <bruce.richardson@intel.com>
Sent: Monday, October 12, 2020 10:52 PM
To: dev@dpdk.org
Cc: Han, YingyaX <yingyax.han@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tu, Lijuan <lijuan.tu@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
Subject: [PATCH] build: fix memcpy behaviour regression

When testing on some x86 platforms, code compiled with meson was observed running at a different power-license level to that compiled with make. This is due to the fact that meson auto-detects the instruction sets available on the system and enabled AVX512 rte_memcpy when AVX512 was available, while on make, a build time AVX-512 flag needed to be explicitly set to enable that AVX512 rte_memcpy code path.

In the absence of runtime path selection for rte_memcpy - which is complicated by it being a static inline function in a header file - we can fix this behaviour regression by similarly having a build-time option which must be set to enable the AVX-512 memcpy path.
> From: Richardson, Bruce <bruce.richardson@intel.com>
>
> When testing on some x86 platforms, code compiled with meson was observed running at a different power-license level to that compiled with make. This is due to the fact that meson auto-detects the instruction sets available on the system and enabled AVX512 rte_memcpy when AVX512 was available, while on make, a build time AVX-512 flag needed to be explicitly set to enable that AVX512 rte_memcpy code path.
>
> In the absence of runtime path selection for rte_memcpy - which is complicated by it being a static inline function in a header file - we can fix this behaviour regression by similarly having a build-time option which must be set to enable the AVX-512 memcpy path.
>
> Tested-by: Han, Yingya <yingyax.han@intel.com>

Applied, thanks
diff --git a/lib/librte_eal/include/generic/rte_memcpy.h b/lib/librte_eal/include/generic/rte_memcpy.h
index 701e550c3..e7f0f8eaa 100644
--- a/lib/librte_eal/include/generic/rte_memcpy.h
+++ b/lib/librte_eal/include/generic/rte_memcpy.h
@@ -95,6 +95,10 @@ rte_mov256(uint8_t *dst, const uint8_t *src);
  * @note This is implemented as a macro, so it's address should not be taken
  * and care is needed as parameter expressions may be evaluated multiple times.
  *
+ * @note For x86 platforms to enable the AVX-512 memcpy implementation, set
+ * -DRTE_MEMCPY_AVX512 macro in CFLAGS, or define the RTE_MEMCPY_AVX512 macro
+ * explicitly in the source file before including the rte_memcpy header file.
+ *
  * @param dst
  *   Pointer to the destination of the data.
  * @param src
diff --git a/lib/librte_eal/x86/include/rte_memcpy.h b/lib/librte_eal/x86/include/rte_memcpy.h
index 008a3de67..79f381dd9 100644
--- a/lib/librte_eal/x86/include/rte_memcpy.h
+++ b/lib/librte_eal/x86/include/rte_memcpy.h
@@ -45,7 +45,7 @@ extern "C" {
 static __rte_always_inline void *
 rte_memcpy(void *dst, const void *src, size_t n);
-#ifdef __AVX512F__
+#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512
 #define ALIGNMENT_MASK 0x3F
When testing on some x86 platforms, code compiled with meson was observed running at a different power-license level to that compiled with make. This is due to the fact that meson auto-detects the instruction sets available on the system and enabled AVX512 rte_memcpy when AVX512 was available, while on make, a build time AVX-512 flag needed to be explicitly set to enable that AVX512 rte_memcpy code path.

In the absence of runtime path selection for rte_memcpy - which is complicated by it being a static inline function in a header file - we can fix this behaviour regression by similarly having a build-time option which must be set to enable the AVX-512 memcpy path.

Fixes: a25a650be5f0 ("build: add infrastructure for meson and ninja builds")
Fixes: 3e1bb55fd6ef ("build/x86: add SSE flags")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

---
NOTE: This patch is not suitable for backporting, as it will break the build support for make builds without additional makefile changes.

---
 lib/librte_eal/include/generic/rte_memcpy.h | 4 ++++
 lib/librte_eal/x86/include/rte_memcpy.h | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)
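
For readers who want to see the effect of the new guard in isolation, the following is a minimal standalone sketch, not DPDK code. It reuses the preprocessor condition from the patched x86 rte_memcpy.h; the `DEMO_*` macro and the `demo_*` helpers are hypothetical names introduced only for illustration. It shows that compiler AVX-512 support (`__AVX512F__`) alone no longer selects the AVX-512 path unless `RTE_MEMCPY_AVX512` is also defined.

```c
#include <assert.h>
#include <stdio.h>

/*
 * To opt in, build with -DRTE_MEMCPY_AVX512 in CFLAGS, or uncomment the
 * define below so that it appears before the guard (in real code: before
 * including the rte_memcpy header).
 */
/* #define RTE_MEMCPY_AVX512 */

/* Same condition as the guard in the patched x86 rte_memcpy.h. */
#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512
#define DEMO_MEMCPY_AVX512_ENABLED 1
#else
#define DEMO_MEMCPY_AVX512_ENABLED 0
#endif

/* Hypothetical helper: reports which path the guard selected at build time. */
static int demo_avx512_memcpy_enabled(void)
{
	return DEMO_MEMCPY_AVX512_ENABLED;
}

/* Hypothetical helper: prints the selected path. */
static void demo_report(void)
{
	printf("memcpy path: %s\n",
	       demo_avx512_memcpy_enabled() ? "avx512" : "default");
}

/* Without the opt-in macro, the default path must be selected. */
static void demo_self_check(void)
{
#ifndef RTE_MEMCPY_AVX512
	assert(!demo_avx512_memcpy_enabled());
#endif
}
```

Calling `demo_report()` from a program built without `-DRTE_MEMCPY_AVX512` reports the default path even when the compiler targets an AVX-512 capable CPU, which mirrors the opt-in behaviour the commit message describes.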