Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/patches/137325/?format=api
{ "id": 137325, "url": "http://patches.dpdk.org/api/patches/137325/?format=api", "web_url": "http://patches.dpdk.org/project/dpdk/patch/20240226170203.2881280-2-yoan.picchi@arm.com/", "project": { "id": 1, "url": "http://patches.dpdk.org/api/projects/1/?format=api", "name": "DPDK", "link_name": "dpdk", "list_id": "dev.dpdk.org", "list_email": "dev@dpdk.org", "web_url": "http://core.dpdk.org", "scm_url": "git://dpdk.org/dpdk", "webscm_url": "http://git.dpdk.org/dpdk", "list_archive_url": "https://inbox.dpdk.org/dev", "list_archive_url_format": "https://inbox.dpdk.org/dev/{}", "commit_url_format": "" }, "msgid": "<20240226170203.2881280-2-yoan.picchi@arm.com>", "list_archive_url": "https://inbox.dpdk.org/dev/20240226170203.2881280-2-yoan.picchi@arm.com", "date": "2024-02-26T17:02:00", "name": "[v4,1/4] hash: pack the hitmask for hash in bulk lookup", "commit_ref": null, "pull_url": null, "state": "superseded", "archived": true, "hash": "ea0a4feeb938beb191392a9c69bc7846664a3ae2", "submitter": { "id": 3196, "url": "http://patches.dpdk.org/api/people/3196/?format=api", "name": "Yoan Picchi", "email": "yoan.picchi@arm.com" }, "delegate": { "id": 1, "url": "http://patches.dpdk.org/api/users/1/?format=api", "username": "tmonjalo", "first_name": "Thomas", "last_name": "Monjalon", "email": "thomas@monjalon.net" }, "mbox": "http://patches.dpdk.org/project/dpdk/patch/20240226170203.2881280-2-yoan.picchi@arm.com/mbox/", "series": [ { "id": 31236, "url": "http://patches.dpdk.org/api/series/31236/?format=api", "web_url": "http://patches.dpdk.org/project/dpdk/list/?series=31236", "date": "2024-02-26T17:02:00", "name": "[v4,1/4] hash: pack the hitmask for hash in bulk lookup", "version": 4, "mbox": "http://patches.dpdk.org/series/31236/mbox/" } ], "comments": "http://patches.dpdk.org/api/patches/137325/comments/", "check": "success", "checks": "http://patches.dpdk.org/api/patches/137325/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To":
"patchwork@inbox.dpdk.org", "Delivered-To": "patchwork@inbox.dpdk.org", "Received": [ "from mails.dpdk.org (mails.dpdk.org [217.70.189.124])\n\tby inbox.dpdk.org (Postfix) with ESMTP id D446D43C03;\n\tTue, 27 Feb 2024 07:03:26 +0100 (CET)", "from mails.dpdk.org (localhost [127.0.0.1])\n\tby mails.dpdk.org (Postfix) with ESMTP id 90ECD42EBD;\n\tTue, 27 Feb 2024 07:02:40 +0100 (CET)", "from foss.arm.com (foss.arm.com [217.140.110.172])\n by mails.dpdk.org (Postfix) with ESMTP id CBDBD402B2\n for <dev@dpdk.org>; Mon, 26 Feb 2024 18:02:23 +0100 (CET)", "from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])\n by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C05C4FEC;\n Mon, 26 Feb 2024 09:03:01 -0800 (PST)", "from octeon10-1.usa.Arm.com (unknown [10.118.91.161])\n by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CFCEE3F73F;\n Mon, 26 Feb 2024 09:02:22 -0800 (PST)" ], "From": "Yoan Picchi <yoan.picchi@arm.com>", "To": "Thomas Monjalon <thomas@monjalon.net>,\n Yipeng Wang <yipeng1.wang@intel.com>,\n Sameh Gobriel <sameh.gobriel@intel.com>,\n Bruce Richardson <bruce.richardson@intel.com>,\n Vladimir Medvedkin <vladimir.medvedkin@intel.com>", "Cc": "dev@dpdk.org, nd@arm.com, Yoan Picchi <yoan.picchi@arm.com>,\n Ruifeng Wang <ruifeng.wang@arm.com>, Nathan Brown <nathan.brown@arm.com>", "Subject": "[PATCH v4 1/4] hash: pack the hitmask for hash in bulk lookup", "Date": "Mon, 26 Feb 2024 17:02:00 +0000", "Message-Id": "<20240226170203.2881280-2-yoan.picchi@arm.com>", "X-Mailer": "git-send-email 2.25.1", "In-Reply-To": "<20240226170203.2881280-1-yoan.picchi@arm.com>", "References": "<20231107121845.2758454-1-yoan.picchi@arm.com>\n <20240226170203.2881280-1-yoan.picchi@arm.com>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "8bit", "X-Mailman-Approved-At": "Tue, 27 Feb 2024 07:02:24 +0100", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.29", "Precedence": "list", "List-Id": "DPDK patches and discussions <dev.dpdk.org>", 
"List-Unsubscribe": "<https://mails.dpdk.org/options/dev>,\n <mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://mails.dpdk.org/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<https://mails.dpdk.org/listinfo/dev>,\n <mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org" }, "content": "Current hitmask includes padding due to Intel's SIMD\nimplementation detail. This patch allows non Intel SIMD\nimplementations to benefit from a dense hitmask.\n\nSigned-off-by: Yoan Picchi <yoan.picchi@arm.com>\nReviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>\nReviewed-by: Nathan Brown <nathan.brown@arm.com>\n---\n .mailmap | 2 +\n lib/hash/rte_cuckoo_hash.c | 118 ++++++++++++++++++++++++++-----------\n 2 files changed, 86 insertions(+), 34 deletions(-)", "diff": "diff --git a/.mailmap b/.mailmap\nindex 12d2875641..60500bbe36 100644\n--- a/.mailmap\n+++ b/.mailmap\n@@ -492,6 +492,7 @@ Hari Kumar Vemula <hari.kumarx.vemula@intel.com>\n Harini Ramakrishnan <harini.ramakrishnan@microsoft.com>\n Hariprasad Govindharajan <hariprasad.govindharajan@intel.com>\n Harish Patil <harish.patil@cavium.com> <harish.patil@qlogic.com>\n+Harjot Singh <harjot.singh@arm.com>\n Harman Kalra <hkalra@marvell.com>\n Harneet Singh <harneet.singh@intel.com>\n Harold Huang <baymaxhuang@gmail.com>\n@@ -1625,6 +1626,7 @@ Yixue Wang <yixue.wang@intel.com>\n Yi Yang <yangyi01@inspur.com> <yi.y.yang@intel.com>\n Yi Zhang <zhang.yi75@zte.com.cn>\n Yoann Desmouceaux <ydesmouc@cisco.com>\n+Yoan Picchi <yoan.picchi@arm.com>\n Yogesh Jangra <yogesh.jangra@intel.com>\n Yogev Chaimovich <yogev@cgstowernetworks.com>\n Yongjie Gu <yongjiex.gu@intel.com>\ndiff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c\nindex 9cf94645f6..0550165584 100644\n--- a/lib/hash/rte_cuckoo_hash.c\n+++ b/lib/hash/rte_cuckoo_hash.c\n@@ -1857,8 +1857,50 @@ 
rte_hash_free_key_with_position(const struct rte_hash *h,\n \n }\n \n+#if defined(__ARM_NEON)\n+\n+static inline void\n+compare_signatures_dense(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,\n+\t\t\tconst struct rte_hash_bucket *prim_bkt,\n+\t\t\tconst struct rte_hash_bucket *sec_bkt,\n+\t\t\tuint16_t sig,\n+\t\t\tenum rte_hash_sig_compare_function sig_cmp_fn)\n+{\n+\tunsigned int i;\n+\n+\t/* For match mask every bits indicates the match */\n+\tswitch (sig_cmp_fn) {\n+\tcase RTE_HASH_COMPARE_NEON: {\n+\t\tuint16x8_t vmat, vsig, x;\n+\t\tint16x8_t shift = {0, 1, 2, 3, 4, 5, 6, 7};\n+\n+\t\tvsig = vld1q_dup_u16((uint16_t const *)&sig);\n+\t\t/* Compare all signatures in the primary bucket */\n+\t\tvmat = vceqq_u16(vsig,\n+\t\t\tvld1q_u16((uint16_t const *)prim_bkt->sig_current));\n+\t\tx = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x0001)), shift);\n+\t\t*prim_hash_matches = (uint32_t)(vaddvq_u16(x));\n+\t\t/* Compare all signatures in the secondary bucket */\n+\t\tvmat = vceqq_u16(vsig,\n+\t\t\tvld1q_u16((uint16_t const *)sec_bkt->sig_current));\n+\t\tx = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x0001)), shift);\n+\t\t*sec_hash_matches = (uint32_t)(vaddvq_u16(x));\n+\t\t}\n+\t\tbreak;\n+\tdefault:\n+\t\tfor (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {\n+\t\t\t*prim_hash_matches |=\n+\t\t\t\t((sig == prim_bkt->sig_current[i]) << i);\n+\t\t\t*sec_hash_matches |=\n+\t\t\t\t((sig == sec_bkt->sig_current[i]) << i);\n+\t\t}\n+\t}\n+}\n+\n+#else\n+\n static inline void\n-compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,\n+compare_signatures_sparse(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,\n \t\t\tconst struct rte_hash_bucket *prim_bkt,\n \t\t\tconst struct rte_hash_bucket *sec_bkt,\n \t\t\tuint16_t sig,\n@@ -1885,25 +1927,7 @@ compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,\n \t\t/* Extract the even-index bits only */\n \t\t*sec_hash_matches &= 0x5555;\n \t\tbreak;\n-#elif 
defined(__ARM_NEON)\n-\tcase RTE_HASH_COMPARE_NEON: {\n-\t\tuint16x8_t vmat, vsig, x;\n-\t\tint16x8_t shift = {-15, -13, -11, -9, -7, -5, -3, -1};\n-\n-\t\tvsig = vld1q_dup_u16((uint16_t const *)&sig);\n-\t\t/* Compare all signatures in the primary bucket */\n-\t\tvmat = vceqq_u16(vsig,\n-\t\t\tvld1q_u16((uint16_t const *)prim_bkt->sig_current));\n-\t\tx = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x8000)), shift);\n-\t\t*prim_hash_matches = (uint32_t)(vaddvq_u16(x));\n-\t\t/* Compare all signatures in the secondary bucket */\n-\t\tvmat = vceqq_u16(vsig,\n-\t\t\tvld1q_u16((uint16_t const *)sec_bkt->sig_current));\n-\t\tx = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x8000)), shift);\n-\t\t*sec_hash_matches = (uint32_t)(vaddvq_u16(x));\n-\t\t}\n-\t\tbreak;\n-#endif\n+#endif /* defined(__SSE2__) */\n \tdefault:\n \t\tfor (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {\n \t\t\t*prim_hash_matches |=\n@@ -1914,6 +1938,8 @@ compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,\n \t}\n }\n \n+#endif /* defined(__ARM_NEON) */\n+\n static inline void\n __bulk_lookup_l(const struct rte_hash *h, const void **keys,\n \t\tconst struct rte_hash_bucket **primary_bkt,\n@@ -1928,18 +1954,30 @@ __bulk_lookup_l(const struct rte_hash *h, const void **keys,\n \tuint32_t sec_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0};\n \tstruct rte_hash_bucket *cur_bkt, *next_bkt;\n \n+#if defined(__ARM_NEON)\n+\tconst int hitmask_padding = 0;\n+#else\n+\tconst int hitmask_padding = 1;\n+#endif\n+\n \t__hash_rw_reader_lock(h);\n \n \t/* Compare signatures and prefetch key slot of first hit */\n \tfor (i = 0; i < num_keys; i++) {\n-\t\tcompare_signatures(&prim_hitmask[i], &sec_hitmask[i],\n+#if defined(__ARM_NEON)\n+\t\tcompare_signatures_dense(&prim_hitmask[i], &sec_hitmask[i],\n+\t\t\tprimary_bkt[i], secondary_bkt[i],\n+\t\t\tsig[i], h->sig_cmp_fn);\n+#else\n+\t\tcompare_signatures_sparse(&prim_hitmask[i], &sec_hitmask[i],\n \t\t\tprimary_bkt[i], secondary_bkt[i],\n \t\t\tsig[i], 
h->sig_cmp_fn);\n+#endif\n \n \t\tif (prim_hitmask[i]) {\n \t\t\tuint32_t first_hit =\n \t\t\t\t\trte_ctz32(prim_hitmask[i])\n-\t\t\t\t\t>> 1;\n+\t\t\t\t\t>> hitmask_padding;\n \t\t\tuint32_t key_idx =\n \t\t\t\tprimary_bkt[i]->key_idx[first_hit];\n \t\t\tconst struct rte_hash_key *key_slot =\n@@ -1953,7 +1991,7 @@ __bulk_lookup_l(const struct rte_hash *h, const void **keys,\n \t\tif (sec_hitmask[i]) {\n \t\t\tuint32_t first_hit =\n \t\t\t\t\trte_ctz32(sec_hitmask[i])\n-\t\t\t\t\t>> 1;\n+\t\t\t\t\t>> hitmask_padding;\n \t\t\tuint32_t key_idx =\n \t\t\t\tsecondary_bkt[i]->key_idx[first_hit];\n \t\t\tconst struct rte_hash_key *key_slot =\n@@ -1970,7 +2008,7 @@ __bulk_lookup_l(const struct rte_hash *h, const void **keys,\n \t\twhile (prim_hitmask[i]) {\n \t\t\tuint32_t hit_index =\n \t\t\t\t\trte_ctz32(prim_hitmask[i])\n-\t\t\t\t\t>> 1;\n+\t\t\t\t\t>> hitmask_padding;\n \t\t\tuint32_t key_idx =\n \t\t\t\tprimary_bkt[i]->key_idx[hit_index];\n \t\t\tconst struct rte_hash_key *key_slot =\n@@ -1992,13 +2030,13 @@ __bulk_lookup_l(const struct rte_hash *h, const void **keys,\n \t\t\t\tpositions[i] = key_idx - 1;\n \t\t\t\tgoto next_key;\n \t\t\t}\n-\t\t\tprim_hitmask[i] &= ~(3ULL << (hit_index << 1));\n+\t\t\tprim_hitmask[i] &= ~(1 << (hit_index << hitmask_padding));\n \t\t}\n \n \t\twhile (sec_hitmask[i]) {\n \t\t\tuint32_t hit_index =\n \t\t\t\t\trte_ctz32(sec_hitmask[i])\n-\t\t\t\t\t>> 1;\n+\t\t\t\t\t>> hitmask_padding;\n \t\t\tuint32_t key_idx =\n \t\t\t\tsecondary_bkt[i]->key_idx[hit_index];\n \t\t\tconst struct rte_hash_key *key_slot =\n@@ -2021,7 +2059,7 @@ __bulk_lookup_l(const struct rte_hash *h, const void **keys,\n \t\t\t\tpositions[i] = key_idx - 1;\n \t\t\t\tgoto next_key;\n \t\t\t}\n-\t\t\tsec_hitmask[i] &= ~(3ULL << (hit_index << 1));\n+\t\t\tsec_hitmask[i] &= ~(1 << (hit_index << hitmask_padding));\n \t\t}\n next_key:\n \t\tcontinue;\n@@ -2076,6 +2114,12 @@ __bulk_lookup_lf(const struct rte_hash *h, const void **keys,\n \tstruct rte_hash_bucket *cur_bkt, 
*next_bkt;\n \tuint32_t cnt_b, cnt_a;\n \n+#if defined(__ARM_NEON)\n+\tconst int hitmask_padding = 0;\n+#else\n+\tconst int hitmask_padding = 1;\n+#endif\n+\n \tfor (i = 0; i < num_keys; i++)\n \t\tpositions[i] = -ENOENT;\n \n@@ -2089,14 +2133,20 @@ __bulk_lookup_lf(const struct rte_hash *h, const void **keys,\n \n \t\t/* Compare signatures and prefetch key slot of first hit */\n \t\tfor (i = 0; i < num_keys; i++) {\n-\t\t\tcompare_signatures(&prim_hitmask[i], &sec_hitmask[i],\n+#if defined(__ARM_NEON)\n+\t\t\tcompare_signatures_dense(&prim_hitmask[i], &sec_hitmask[i],\n \t\t\t\tprimary_bkt[i], secondary_bkt[i],\n \t\t\t\tsig[i], h->sig_cmp_fn);\n+#else\n+\t\t\tcompare_signatures_sparse(&prim_hitmask[i], &sec_hitmask[i],\n+\t\t\t\tprimary_bkt[i], secondary_bkt[i],\n+\t\t\t\tsig[i], h->sig_cmp_fn);\n+#endif\n \n \t\t\tif (prim_hitmask[i]) {\n \t\t\t\tuint32_t first_hit =\n \t\t\t\t\t\trte_ctz32(prim_hitmask[i])\n-\t\t\t\t\t\t>> 1;\n+\t\t\t\t\t\t>> hitmask_padding;\n \t\t\t\tuint32_t key_idx =\n \t\t\t\t\tprimary_bkt[i]->key_idx[first_hit];\n \t\t\t\tconst struct rte_hash_key *key_slot =\n@@ -2110,7 +2160,7 @@ __bulk_lookup_lf(const struct rte_hash *h, const void **keys,\n \t\t\tif (sec_hitmask[i]) {\n \t\t\t\tuint32_t first_hit =\n \t\t\t\t\t\trte_ctz32(sec_hitmask[i])\n-\t\t\t\t\t\t>> 1;\n+\t\t\t\t\t\t>> hitmask_padding;\n \t\t\t\tuint32_t key_idx =\n \t\t\t\t\tsecondary_bkt[i]->key_idx[first_hit];\n \t\t\t\tconst struct rte_hash_key *key_slot =\n@@ -2126,7 +2176,7 @@ __bulk_lookup_lf(const struct rte_hash *h, const void **keys,\n \t\t\twhile (prim_hitmask[i]) {\n \t\t\t\tuint32_t hit_index =\n \t\t\t\t\t\trte_ctz32(prim_hitmask[i])\n-\t\t\t\t\t\t>> 1;\n+\t\t\t\t\t\t>> hitmask_padding;\n \t\t\t\tuint32_t key_idx =\n \t\t\t\trte_atomic_load_explicit(\n \t\t\t\t\t&primary_bkt[i]->key_idx[hit_index],\n@@ -2152,13 +2202,13 @@ __bulk_lookup_lf(const struct rte_hash *h, const void **keys,\n \t\t\t\t\tpositions[i] = key_idx - 1;\n \t\t\t\t\tgoto next_key;\n 
\t\t\t\t}\n-\t\t\t\tprim_hitmask[i] &= ~(3ULL << (hit_index << 1));\n+\t\t\t\tprim_hitmask[i] &= ~(1 << (hit_index << hitmask_padding));\n \t\t\t}\n \n \t\t\twhile (sec_hitmask[i]) {\n \t\t\t\tuint32_t hit_index =\n \t\t\t\t\t\trte_ctz32(sec_hitmask[i])\n-\t\t\t\t\t\t>> 1;\n+\t\t\t\t\t\t>> hitmask_padding;\n \t\t\t\tuint32_t key_idx =\n \t\t\t\trte_atomic_load_explicit(\n \t\t\t\t\t&secondary_bkt[i]->key_idx[hit_index],\n@@ -2185,7 +2235,7 @@ __bulk_lookup_lf(const struct rte_hash *h, const void **keys,\n \t\t\t\t\tpositions[i] = key_idx - 1;\n \t\t\t\t\tgoto next_key;\n \t\t\t\t}\n-\t\t\t\tsec_hitmask[i] &= ~(3ULL << (hit_index << 1));\n+\t\t\t\tsec_hitmask[i] &= ~(1 << (hit_index << hitmask_padding));\n \t\t\t}\n next_key:\n \t\t\tcontinue;\n", "prefixes": [ "v4", "1/4" ] }
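
The commit message in the "content" field contrasts a padded (sparse) hitmask with a dense one. The following is a minimal scalar sketch of that difference, not the code from the patch. The names BUCKET_ENTRIES, ctz32, hitmask_dense, hitmask_sparse and collect_hits are hypothetical, and an 8-entry bucket is assumed. The sketch clears the lowest set bit with mask &= mask - 1, a generic idiom that works for both layouts and is used here in place of the explicit shifts in the diff.

```c
#include <stdint.h>

/* Assumption: 8 signatures per bucket, in the role of RTE_HASH_BUCKET_ENTRIES. */
#define BUCKET_ENTRIES 8

/* Portable count-trailing-zeros; the caller must pass a non-zero value. */
static unsigned ctz32(uint32_t v)
{
	unsigned n = 0;

	while (!(v & 1u)) {
		v >>= 1;
		n++;
	}
	return n;
}

/* Dense layout: bit i is set when entry i matches the signature. */
static uint32_t hitmask_dense(const uint16_t *sigs, uint16_t sig)
{
	uint32_t mask = 0;

	for (unsigned i = 0; i < BUCKET_ENTRIES; i++)
		mask |= (uint32_t)(sig == sigs[i]) << i;
	return mask;
}

/* Sparse layout: bit 2*i is set on a match; odd bits are padding. */
static uint32_t hitmask_sparse(const uint16_t *sigs, uint16_t sig)
{
	uint32_t mask = 0;

	for (unsigned i = 0; i < BUCKET_ENTRIES; i++)
		mask |= (uint32_t)(sig == sigs[i]) << (2 * i);
	return mask;
}

/*
 * Walk the set bits and write the matching entry indexes to out.
 * padding is 0 for the dense layout and 1 for the sparse one, so the
 * bit position is shifted right by padding to recover the entry index.
 * Returns the number of hits written.
 */
static unsigned collect_hits(uint32_t mask, unsigned padding, unsigned *out)
{
	unsigned n = 0;

	while (mask) {
		out[n++] = ctz32(mask) >> padding;
		mask &= mask - 1; /* clear the lowest set bit */
	}
	return n;
}
```

With signatures {7, 3, 7, 0, 0, 7, 1, 2} and a lookup signature of 7, the dense mask is 0x25 and the sparse mask is 0x411. Both decode to entry indexes 0, 2 and 5 when collect_hits is called with padding 0 and 1 respectively.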