From patchwork Thu Sep 24 08:18:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Radu Nicolau X-Patchwork-Id: 78653 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7C985A04B1; Thu, 24 Sep 2020 10:19:27 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 57C4C1DD69; Thu, 24 Sep 2020 10:19:26 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 4B5BA1DB32 for ; Thu, 24 Sep 2020 10:19:19 +0200 (CEST) IronPort-SDR: eIo0UbBfMAwG0pOYaurlwRXWwwx7l/geG5l4NoBpAFjALdNaEHc8a7Bd3La9rlrcy4PSDFwxzH QhT/qxneLa7g== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="148790605" X-IronPort-AV: E=Sophos;i="5.77,296,1596524400"; d="scan'208";a="148790605" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Sep 2020 01:19:18 -0700 IronPort-SDR: 6lc3jLSMzN3yUY+J1YO+wrbOYg/xCsHmWH349FCvpGCPBkH9tUe17ZpzUGn5UlFRyOcOguSSEC EsNfwGxFwyCw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,296,1596524400"; d="scan'208";a="455254972" Received: from silpixa00399477.ir.intel.com ([10.237.214.232]) by orsmga004.jf.intel.com with ESMTP; 24 Sep 2020 01:19:16 -0700 From: Radu Nicolau To: dev@dpdk.org Cc: thomas@monjalon.net, david.marchand@redhat.com, viktorin@rehivetech.com, ruifeng.wang@arm.com, jerinj@marvell.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, Radu Nicolau , Sean Morrissey Date: Thu, 24 Sep 2020 08:18:29 +0000 Message-Id: <20200924081832.21581-2-radu.nicolau@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200924081832.21581-1-radu.nicolau@intel.com> References: <20200902104343.31774-2-radu.nicolau@intel.com> <20200924081832.21581-1-radu.nicolau@intel.com> Subject: [dpdk-dev] [PATCH v3 1/4] x86: change cpuflag macros to compiler macros X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Replace use of RTE_MACHINE_CPUFLAG macros with regular compiler macros, which are more complete than those provided by DPDK, and as such it allows new instruction sets to be leveraged without having to do extra work to set them up in DPDK. Signed-off-by: Sean Morrissey Signed-off-by: Radu Nicolau --- app/test/test_memcpy_perf.c | 8 ++++---- config/x86/meson.build | 2 -- drivers/net/enic/meson.build | 2 +- drivers/net/i40e/meson.build | 2 +- drivers/net/iavf/meson.build | 2 +- drivers/net/ice/meson.build | 2 +- examples/l3fwd/l3fwd_em.c | 2 +- lib/librte_acl/meson.build | 2 +- lib/librte_eal/common/rte_random.c | 4 ++-- lib/librte_eal/x86/include/rte_memcpy.h | 8 ++++---- lib/librte_efd/rte_efd_x86.h | 2 +- lib/librte_hash/rte_cuckoo_hash.c | 2 +- lib/librte_member/rte_member_ht.c | 10 +++++----- lib/librte_member/rte_member_x86.h | 2 +- lib/librte_net/rte_net_crc.c | 2 +- 15 files changed, 25 insertions(+), 27 deletions(-) diff --git a/app/test/test_memcpy_perf.c b/app/test/test_memcpy_perf.c index 00a2092b4..c711e36ba 100644 --- a/app/test/test_memcpy_perf.c +++ b/app/test/test_memcpy_perf.c @@ -51,13 +51,13 @@ static size_t buf_sizes[TEST_VALUE_RANGE]; #define TEST_BATCH_SIZE 100 /* Data is aligned on this many bytes (power of 2) */ -#ifdef RTE_MACHINE_CPUFLAG_AVX512F +#ifdef __AVX512F__ #define ALIGNMENT_UNIT 64 -#elif defined RTE_MACHINE_CPUFLAG_AVX2 +#elif defined __AVX2__ #define ALIGNMENT_UNIT 32 -#else /* RTE_MACHINE_CPUFLAG */ +#else #define ALIGNMENT_UNIT 16 -#endif /* RTE_MACHINE_CPUFLAG */ +#endif /* * Pointers used in performance tests. The two large buffers are for uncached diff --git a/config/x86/meson.build b/config/x86/meson.build index 6ec020ef6..fea4d5403 100644 --- a/config/x86/meson.build +++ b/config/x86/meson.build @@ -18,7 +18,6 @@ endif base_flags = ['SSE', 'SSE2', 'SSE3','SSSE3', 'SSE4_1', 'SSE4_2'] foreach f:base_flags - dpdk_conf.set('RTE_MACHINE_CPUFLAG_' + f, 1) compile_time_cpuflags += ['RTE_CPUFLAG_' + f] endforeach @@ -32,7 +31,6 @@ foreach f:optional_flags elif f == 'RDRND' f = 'RDRAND' endif - dpdk_conf.set('RTE_MACHINE_CPUFLAG_' + f, 1) compile_time_cpuflags += ['RTE_CPUFLAG_' + f] endif endforeach diff --git a/drivers/net/enic/meson.build b/drivers/net/enic/meson.build index 7f4836d0f..86ef2a8a2 100644 --- a/drivers/net/enic/meson.build +++ b/drivers/net/enic/meson.build @@ -20,7 +20,7 @@ deps += ['hash'] includes += include_directories('base') # The current implementation assumes 64-bit pointers -if dpdk_conf.has('RTE_MACHINE_CPUFLAG_AVX2') and dpdk_conf.get('RTE_ARCH_64') +if cc.get_define('__AVX2__', args: machine_args) != '' and dpdk_conf.get('RTE_ARCH_64') sources += files('enic_rxtx_vec_avx2.c') # Build the avx2 handler if the compiler supports it, even though 'machine' # does not. This is to support users who build for the min supported machine diff --git a/drivers/net/i40e/meson.build b/drivers/net/i40e/meson.build index 211d45d88..68f9895cd 100644 --- a/drivers/net/i40e/meson.build +++ b/drivers/net/i40e/meson.build @@ -31,7 +31,7 @@ if arch_subdir == 'x86' # compile AVX2 version if either: # a. we have AVX supported in minimum instruction set baseline # b. it's not minimum instruction set, but supported by compiler - if dpdk_conf.has('RTE_MACHINE_CPUFLAG_AVX2') + if cc.get_define('__AVX2__', args: machine_args) != '' cflags += ['-DCC_AVX2_SUPPORT'] sources += files('i40e_rxtx_vec_avx2.c') elif cc.has_argument('-mavx2') diff --git a/drivers/net/iavf/meson.build b/drivers/net/iavf/meson.build index a3fad363d..33407c503 100644 --- a/drivers/net/iavf/meson.build +++ b/drivers/net/iavf/meson.build @@ -21,7 +21,7 @@ if arch_subdir == 'x86' # compile AVX2 version if either: # a. we have AVX supported in minimum instruction set baseline # b. it's not minimum instruction set, but supported by compiler - if dpdk_conf.has('RTE_MACHINE_CPUFLAG_AVX2') + if cc.get_define('__AVX2__', args: machine_args) != '' cflags += ['-DCC_AVX2_SUPPORT'] sources += files('iavf_rxtx_vec_avx2.c') elif cc.has_argument('-mavx2') diff --git a/drivers/net/ice/meson.build b/drivers/net/ice/meson.build index e6fe74487..99e1b773a 100644 --- a/drivers/net/ice/meson.build +++ b/drivers/net/ice/meson.build @@ -22,7 +22,7 @@ if arch_subdir == 'x86' # compile AVX2 version if either: # a. we have AVX supported in minimum instruction set baseline # b. it's not minimum instruction set, but supported by compiler - if dpdk_conf.has('RTE_MACHINE_CPUFLAG_AVX2') + if cc.get_define('__AVX2__', args: machine_args) != '' sources += files('ice_rxtx_vec_avx2.c') elif cc.has_argument('-mavx2') ice_avx2_lib = static_library('ice_avx2_lib', diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c index fdbee70b4..df0c8dd16 100644 --- a/examples/l3fwd/l3fwd_em.c +++ b/examples/l3fwd/l3fwd_em.c @@ -215,7 +215,7 @@ static rte_xmm_t mask0; static rte_xmm_t mask1; static rte_xmm_t mask2; -#if defined(RTE_MACHINE_CPUFLAG_SSE2) +#if defined(__SSE2__) static inline xmm_t em_mask_key(void *key, xmm_t mask) { diff --git a/lib/librte_acl/meson.build b/lib/librte_acl/meson.build index d1e2c184c..b31a3f798 100644 --- a/lib/librte_acl/meson.build +++ b/lib/librte_acl/meson.build @@ -15,7 +15,7 @@ if dpdk_conf.has('RTE_ARCH_X86') # in former case, just add avx2 C file to files list # in latter case, compile c file to static lib, using correct compiler # flags, and then have the .o file from static lib linked into main lib. - if dpdk_conf.has('RTE_MACHINE_CPUFLAG_AVX2') + if cc.get_define('__AVX2__', args: machine_args) != '' sources += files('acl_run_avx2.c') cflags += '-DCC_AVX2_SUPPORT' elif cc.has_argument('-mavx2') diff --git a/lib/librte_eal/common/rte_random.c b/lib/librte_eal/common/rte_random.c index b7a089ac4..b2c5416b3 100644 --- a/lib/librte_eal/common/rte_random.c +++ b/lib/librte_eal/common/rte_random.c @@ -2,7 +2,7 @@ * Copyright(c) 2019 Ericsson AB */ -#ifdef RTE_MACHINE_CPUFLAG_RDSEED +#ifdef __RDSEED__ #include #endif #include @@ -188,7 +188,7 @@ __rte_random_initial_seed(void) if (ge_rc == 0) return ge_seed; #endif -#ifdef RTE_MACHINE_CPUFLAG_RDSEED +#ifdef __RDSEED__ unsigned int rdseed_low; unsigned int rdseed_high; diff --git a/lib/librte_eal/x86/include/rte_memcpy.h b/lib/librte_eal/x86/include/rte_memcpy.h index 9c67232df..008a3de67 100644 --- a/lib/librte_eal/x86/include/rte_memcpy.h +++ b/lib/librte_eal/x86/include/rte_memcpy.h @@ -45,7 +45,7 @@ extern "C" { static __rte_always_inline void * rte_memcpy(void *dst, const void *src, size_t n); -#ifdef RTE_MACHINE_CPUFLAG_AVX512F +#ifdef __AVX512F__ #define ALIGNMENT_MASK 0x3F @@ -286,7 +286,7 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) goto COPY_BLOCK_128_BACK63; } -#elif defined RTE_MACHINE_CPUFLAG_AVX2 +#elif defined __AVX2__ #define ALIGNMENT_MASK 0x1F @@ -479,7 +479,7 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) goto COPY_BLOCK_128_BACK31; } -#else /* RTE_MACHINE_CPUFLAG */ +#else /* __AVX512F__ */ #define ALIGNMENT_MASK 0x0F @@ -803,7 +803,7 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) goto COPY_BLOCK_64_BACK15; } -#endif /* RTE_MACHINE_CPUFLAG */ +#endif /* __AVX512F__ */ static __rte_always_inline void * rte_memcpy_aligned(void *dst, const void *src, size_t n) diff --git a/lib/librte_efd/rte_efd_x86.h b/lib/librte_efd/rte_efd_x86.h index 6c207e87d..e2f9dcca8 100644 --- a/lib/librte_efd/rte_efd_x86.h +++ b/lib/librte_efd/rte_efd_x86.h @@ -19,7 +19,7 @@ efd_lookup_internal_avx2(const efd_hashfunc_t *group_hash_idx, const efd_lookuptbl_t *group_lookup_table, const uint32_t hash_val_a, const uint32_t hash_val_b) { -#ifdef RTE_MACHINE_CPUFLAG_AVX2 +#ifdef __AVX2__ efd_value_t value = 0; uint32_t i = 0; __m256i vhash_val_a = _mm256_set1_epi32(hash_val_a); diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c index 0a6d47471..7c7ab84af 100644 --- a/lib/librte_hash/rte_cuckoo_hash.c +++ b/lib/librte_hash/rte_cuckoo_hash.c @@ -1691,7 +1691,7 @@ compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches, /* For match mask the first bit of every two bits indicates the match */ switch (sig_cmp_fn) { -#if defined(RTE_MACHINE_CPUFLAG_SSE2) +#if defined(__SSE2__) case RTE_HASH_COMPARE_SSE: /* Compare all signatures in the bucket */ *prim_hash_matches = _mm_movemask_epi8(_mm_cmpeq_epi16( diff --git a/lib/librte_member/rte_member_ht.c b/lib/librte_member/rte_member_ht.c index cbcd0d440..3ea293a09 100644 --- a/lib/librte_member/rte_member_ht.c +++ b/lib/librte_member/rte_member_ht.c @@ -176,7 +176,7 @@ rte_member_lookup_ht(const struct rte_member_setsum *ss, get_buckets_index(ss, key, &prim_bucket, &sec_bucket, &tmp_sig); switch (ss->sig_cmp_fn) { -#if defined(RTE_ARCH_X86) && defined(RTE_MACHINE_CPUFLAG_AVX2) +#if defined(RTE_ARCH_X86) && defined(__AVX2__) case RTE_MEMBER_COMPARE_AVX2: if (search_bucket_single_avx(prim_bucket, tmp_sig, buckets, set_id) || @@ -216,7 +216,7 @@ rte_member_lookup_bulk_ht(const struct rte_member_setsum *ss, for (i = 0; i < num_keys; i++) { switch (ss->sig_cmp_fn) { -#if defined(RTE_ARCH_X86) && defined(RTE_MACHINE_CPUFLAG_AVX2) +#if defined(RTE_ARCH_X86) && defined(__AVX2__) case RTE_MEMBER_COMPARE_AVX2: if (search_bucket_single_avx(prim_buckets[i], tmp_sig[i], buckets, &set_id[i]) || @@ -253,7 +253,7 @@ rte_member_lookup_multi_ht(const struct rte_member_setsum *ss, get_buckets_index(ss, key, &prim_bucket, &sec_bucket, &tmp_sig); switch (ss->sig_cmp_fn) { -#if defined(RTE_ARCH_X86) && defined(RTE_MACHINE_CPUFLAG_AVX2) +#if defined(RTE_ARCH_X86) && defined(__AVX2__) case RTE_MEMBER_COMPARE_AVX2: search_bucket_multi_avx(prim_bucket, tmp_sig, buckets, &num_matches, match_per_key, set_id); @@ -296,7 +296,7 @@ rte_member_lookup_multi_bulk_ht(const struct rte_member_setsum *ss, match_cnt_tmp = 0; switch (ss->sig_cmp_fn) { -#if defined(RTE_ARCH_X86) && defined(RTE_MACHINE_CPUFLAG_AVX2) +#if defined(RTE_ARCH_X86) && defined(__AVX2__) case RTE_MEMBER_COMPARE_AVX2: search_bucket_multi_avx(prim_buckets[i], tmp_sig[i], buckets, &match_cnt_tmp, match_per_key, @@ -357,7 +357,7 @@ try_update(struct member_ht_bucket *buckets, uint32_t prim, uint32_t sec, enum rte_member_sig_compare_function cmp_fn) { switch (cmp_fn) { -#if defined(RTE_ARCH_X86) && defined(RTE_MACHINE_CPUFLAG_AVX2) +#if defined(RTE_ARCH_X86) && defined(__AVX2__) case RTE_MEMBER_COMPARE_AVX2: if (update_entry_search_avx(prim, sig, buckets, set_id) || update_entry_search_avx(sec, sig, buckets, diff --git a/lib/librte_member/rte_member_x86.h b/lib/librte_member/rte_member_x86.h index 21a498ef0..74c8e3885 100644 --- a/lib/librte_member/rte_member_x86.h +++ b/lib/librte_member/rte_member_x86.h @@ -11,7 +11,7 @@ extern "C" { #include -#if defined(RTE_MACHINE_CPUFLAG_AVX2) +#if defined(__AVX2__) static inline int update_entry_search_avx(uint32_t bucket_id, member_sig_t tmp_sig, diff --git a/lib/librte_net/rte_net_crc.c b/lib/librte_net/rte_net_crc.c index 9fd4794a9..56a0ed129 100644 --- a/lib/librte_net/rte_net_crc.c +++ b/lib/librte_net/rte_net_crc.c @@ -10,7 +10,7 @@ #include #include -#if defined(RTE_ARCH_X86_64) && defined(RTE_MACHINE_CPUFLAG_PCLMULQDQ) +#if defined(RTE_ARCH_X86_64) && defined(__PCLMUL__) #define X86_64_SSE42_PCLMULQDQ 1 #elif defined(RTE_ARCH_ARM64) && defined(RTE_MACHINE_CPUFLAG_PMULL) #define ARM64_NEON_PMULL 1