From patchwork Thu Aug 17 21:24:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Harjot Singh
X-Patchwork-Id: 130476
X-Patchwork-Delegate: david.marchand@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 6A55243094;
 Thu, 17 Aug 2023 23:24:43 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id 6082E410FD;
 Thu, 17 Aug 2023 23:24:38 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by mails.dpdk.org (Postfix) with ESMTP id 9DC44406A2
 for ; Thu, 17 Aug 2023 23:24:35 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DB8C11063;
 Thu, 17 Aug 2023 14:25:15 -0700 (PDT)
Received: from 2u-thunderx2.usa.Arm.com (unknown [10.118.12.78])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D4AF63F6C4;
 Thu, 17 Aug 2023 14:24:34 -0700 (PDT)
From: Harjot Singh
To: Thomas Monjalon,
 Yipeng Wang,
 Sameh Gobriel,
 Bruce Richardson,
 Vladimir Medvedkin
Cc: dev@dpdk.org, nd@arm.com,
 Harjot Singh,
 Nathan Brown,
 Feifei Wang,
 Jieqiang Wang,
 Honnappa Nagarahalli
Subject: [PATCH 1/1] hash: add SVE support for bulk key lookup
Date: Thu, 17 Aug 2023 21:24:17 +0000
Message-Id: <20230817212417.3637080-2-Harjot.Singh@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230817212417.3637080-1-Harjot.Singh@arm.com>
References: <20230817212417.3637080-1-Harjot.Singh@arm.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

From: Harjot Singh

- Implemented Vector Length Agnostic SVE code for comparing
  signatures in bulk lookup.
- Added defines in code for SVE support.
- New optimised SVE code is 1-2 CPU cycles slower than NEON on the
  N2 processor.

Performance numbers from hash_perf_autotest:

Elements in Primary or Secondary Location

Results (in CPU cycles/operation)
-----------------------------------
Operations without data

Without pre-computed hash values

Keysize         Add/Lookup/Lookup_bulk
            Neon            SVE
4           93/71/26        93/71/27
8           93/70/26        93/70/27
9           94/74/27        94/74/28
13          100/80/31       100/79/32
16          100/78/30       100/78/31
32          109/110/38      108/110/39

With pre-computed hash values

Keysize         Add/Lookup/Lookup_bulk
            Neon            SVE
4           83/58/27        83/58/29
8           83/57/27        83/57/28
9           83/60/28        83/60/29
13          84/60/28        83/60/29
16          83/58/27        83/58/29
32          84/68/31        84/68/32

Signed-off-by: Harjot Singh
Reviewed-by: Nathan Brown
Reviewed-by: Feifei Wang
Reviewed-by: Jieqiang Wang
Reviewed-by: Honnappa Nagarahalli
---
 .mailmap                   |  1 +
 lib/hash/rte_cuckoo_hash.c | 37 ++++++++++++++++++++++++++++++++++++-
 lib/hash/rte_cuckoo_hash.h |  1 +
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index 864d33ee46..2cce48c900 100644
--- a/.mailmap
+++ b/.mailmap
@@ -481,6 +481,7 @@ Hari Kumar Vemula
 Harini Ramakrishnan
 Hariprasad Govindharajan
 Harish Patil
+Harjot Singh
 Harman Kalra
 Harneet Singh
 Harold Huang
diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index d92a903bb3..fdb06eb33e 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -435,8 +435,11 @@ rte_hash_create(const struct rte_hash_parameters *params)
 		h->sig_cmp_fn = RTE_HASH_COMPARE_SSE;
 	else
 #elif defined(RTE_ARCH_ARM64)
-	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON))
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON)) {
 		h->sig_cmp_fn = RTE_HASH_COMPARE_NEON;
+		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SVE))
+			h->sig_cmp_fn = RTE_HASH_COMPARE_SVE;
+	}
 	else
 #endif
 		h->sig_cmp_fn = RTE_HASH_COMPARE_SCALAR;
@@ -1892,6 +1895,38 @@ compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,
 		*sec_hash_matches = (uint32_t)(vaddvq_u16(x));
 		}
 		break;
+#if defined(RTE_HAS_SVE_ACLE)
+	case RTE_HASH_COMPARE_SVE: {
+		svuint16_t vsign, shift, sv_prim_matches, sv_sec_matches;
+		svbool_t pred, p_match, s_match;
+		int i = 0;
+		uint64_t vl = svcnth();
+
+		vsign = svdup_u16(sig);
+		shift = svindex_u16(0, 2);
+		do {
+			pred = svwhilelt_b16(i, RTE_HASH_BUCKET_ENTRIES);
+			/* Compare all signatures in the primary bucket */
+			p_match = svcmpeq_u16(pred, vsign, svld1_u16(pred,
+				&prim_bkt->sig_current[i]));
+			if (svptest_any(svptrue_b16(), p_match)) {
+				sv_prim_matches = svdup_u16_z(p_match, 1);
+				sv_prim_matches = svlsl_u16_z(pred, sv_prim_matches, shift);
+				*prim_hash_matches |= svorv_u16(pred, sv_prim_matches);
+			}
+			/* Compare all signatures in the secondary bucket */
+			s_match = svcmpeq_u16(pred, vsign, svld1_u16(pred,
+				&sec_bkt->sig_current[i]));
+			if (svptest_any(svptrue_b16(), s_match)) {
+				sv_sec_matches = svdup_u16_z(s_match, 1);
+				sv_sec_matches = svlsl_u16_z(pred, sv_sec_matches, shift);
+				*sec_hash_matches |= svorv_u16(pred, sv_sec_matches);
+			}
+			i += vl;
+		} while (i < RTE_HASH_BUCKET_ENTRIES);
+	}
+	break;
+#endif
 #endif
 	default:
 		for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
diff --git a/lib/hash/rte_cuckoo_hash.h b/lib/hash/rte_cuckoo_hash.h
index eb2644f74b..356ec2a69e 100644
--- a/lib/hash/rte_cuckoo_hash.h
+++ b/lib/hash/rte_cuckoo_hash.h
@@ -148,6 +148,7 @@ enum rte_hash_sig_compare_function {
 	RTE_HASH_COMPARE_SCALAR = 0,
 	RTE_HASH_COMPARE_SSE,
 	RTE_HASH_COMPARE_NEON,
+	RTE_HASH_COMPARE_SVE,
 	RTE_HASH_COMPARE_NUM
 };
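
Reviewer note: the SVE case above builds each hit mask by placing a 1 in
every matching lane, shifting lane i left by 2 * i, and OR-reducing the
lanes. For checking that layout without SVE hardware, a portable scalar
model can be sketched as below. This is an illustration only, not code
from the patch: `SKETCH_BUCKET_ENTRIES`, `struct sketch_bucket` and
`sketch_match_mask` are hypothetical names, and the value 8 is an assumed
stand-in for RTE_HASH_BUCKET_ENTRIES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8 entries per bucket, standing in for RTE_HASH_BUCKET_ENTRIES. */
#define SKETCH_BUCKET_ENTRIES 8

/* Hypothetical stand-in for the signature array of one hash bucket. */
struct sketch_bucket {
	uint16_t sig_current[SKETCH_BUCKET_ENTRIES];
};

/*
 * Scalar model of one bucket comparison: every entry i whose signature
 * equals sig sets bit 2 * i of the returned mask; all other bits stay 0.
 */
static uint32_t
sketch_match_mask(const struct sketch_bucket *bkt, uint16_t sig)
{
	uint32_t mask = 0;
	size_t i;

	assert(bkt != NULL);

	for (i = 0; i < SKETCH_BUCKET_ENTRIES; i++) {
		if (bkt->sig_current[i] == sig)
			mask |= UINT32_C(1) << (2 * i);
	}
	return mask;
}
```

With signatures {7, 1, 7, 3, 4, 5, 6, 7} and sig = 7, entries 0, 2 and 7
match, so the model returns 0x4011 (bits 0, 4 and 14). Comparing this
value against *prim_hash_matches for the same bucket contents is one
possible way to cross-check the vector path.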