[v3,0/4] hash: add SVE support for bulk key lookup

Message ID 20231107121845.2758454-1-yoan.picchi@arm.com (mailing list archive)

Message

Yoan Picchi Nov. 7, 2023, 12:18 p.m. UTC
This patchset adds SVE support for the signature comparison in the cuckoo
hash lookup and improves the existing NEON implementation. These
optimizations required changing the data format and the signatures of the
relevant functions so that they support dense hitmasks (no padding) and
interleave the primary and secondary hitmasks instead of keeping each in
its own array.
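
To illustrate what "dense" and "interleaved" mean here, below is a minimal
scalar sketch, not the code from the patches. It assumes 8 entries per
bucket and one 16-bit word per key, with the primary hits in the low byte
and the secondary hits in the high byte. The names ENTRIES_PER_BUCKET,
dense_hits and packed_hitmask are hypothetical and only serve the example.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES_PER_BUCKET 8 /* assumed bucket width for this sketch */

/* Hypothetical helper: one bit per bucket entry, with no padding bits. */
static uint8_t
dense_hits(const uint16_t sigs[ENTRIES_PER_BUCKET], uint16_t sig)
{
	uint8_t mask = 0;
	unsigned int i;

	for (i = 0; i < ENTRIES_PER_BUCKET; i++)
		mask |= (uint8_t)((sigs[i] == sig) << i);
	return mask;
}

/*
 * Hypothetical helper: both hitmasks of one key share a single word
 * (assumed layout: primary in bits 0-7, secondary in bits 8-15), so a
 * bulk lookup can fill one array instead of two.
 */
static uint16_t
packed_hitmask(const uint16_t prim[ENTRIES_PER_BUCKET],
	       const uint16_t sec[ENTRIES_PER_BUCKET], uint16_t sig)
{
	return (uint16_t)(dense_hits(prim, sig) | (dense_hits(sec, sig) << 8));
}

/* Usage: signature 42 sits in primary entry 3 and secondary entry 1. */
void
example_lookup(void)
{
	const uint16_t prim[ENTRIES_PER_BUCKET] = {7, 0, 0, 42, 0, 0, 0, 0};
	const uint16_t sec[ENTRIES_PER_BUCKET] = {0, 42, 0, 0, 0, 0, 0, 9};

	assert(packed_hitmask(prim, sec, 42) == 0x0208);
}
```

With a padded mask, each entry would take more than one bit and the caller
would have to account for that when turning a set bit into an entry index.
In the dense form the bit position is the entry index directly.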

Benchmarking with the cuckoo hash perf test, I observed the following
effect on speed:
  There are no significant changes on Intel (run on Sapphire Rapids)
  NEON is up to 7-10% faster (run on Ampere Altra)
  128b SVE is about 3-5% slower than the optimized NEON (run on a Graviton
    3 cloud instance)
  256b SVE is about 0-3% slower than the optimized NEON (run on a Graviton
    3 cloud instance)

V2->V3:
  Remove a redundant if in the test
  Change a couple of ints to uint16_t in compare_signatures_dense
  Several coding-style fixes

Yoan Picchi (4):
  hash: pack the hitmask for hash in bulk lookup
  hash: optimize compare signature for NEON
  test/hash: check bulk lookup of keys after collision
  hash: add SVE support for bulk key lookup

 .mailmap                   |   2 +
 app/test/test_hash.c       |  99 ++++++++++----
 lib/hash/rte_cuckoo_hash.c | 264 +++++++++++++++++++++++++++++--------
 lib/hash/rte_cuckoo_hash.h |   1 +
 4 files changed, 287 insertions(+), 79 deletions(-)
  

Comments

Yoan Picchi Feb. 26, 2024, 5:01 p.m. UTC | #1
This patchset adds SVE support for the signature comparison in the cuckoo
hash lookup and improves the existing NEON implementation. These
optimizations required changing the data format and the signatures of the
relevant functions so that they support dense hitmasks (no padding) and
interleave the primary and secondary hitmasks instead of keeping each in
its own array.

Benchmarking with the cuckoo hash perf test, I observed the following
effect on speed:
  There are no significant changes on Intel (run on Sapphire Rapids)
  NEON is up to 7-10% faster (run on Ampere Altra)
  128b SVE is about 3-5% slower than the optimized NEON (run on a Graviton
    3 cloud instance)
  256b SVE is about 0-3% slower than the optimized NEON (run on a Graviton
    3 cloud instance)

V2->V3:
  Remove a redundant if in the test
  Change a couple of ints to uint16_t in compare_signatures_dense
  Several coding-style fixes

V3->V4:
  Rebase

Yoan Picchi (4):
  hash: pack the hitmask for hash in bulk lookup
  hash: optimize compare signature for NEON
  test/hash: check bulk lookup of keys after collision
  hash: add SVE support for bulk key lookup

 .mailmap                   |   2 +
 app/test/test_hash.c       |  99 ++++++++++----
 lib/hash/rte_cuckoo_hash.c | 264 +++++++++++++++++++++++++++++--------
 lib/hash/rte_cuckoo_hash.h |   1 +
 4 files changed, 287 insertions(+), 79 deletions(-)