mbox series

[V2,0/6] pipeline: make the hash function configurable per table

Message ID 20220819195225.1483020-1-cristian.dumitrescu@intel.com (mailing list archive)
Headers
Series pipeline: make the hash function configurable per table |

Message

Cristian Dumitrescu Aug. 19, 2022, 7:52 p.m. UTC
  The exact match and learner tables use a hash function for the lookup
operation. This patch set makes the hash function configurable and
removes some limitations.

The hash function previously used by these table types had the
following limitations:
a) Not configurable: An internally hardcoded version was used;
b) Mask-based: This prevents using most of the available
   hash functions, as they are not mask-based;
c) Key size limited to 64 bytes or less.

The new hash function is:
a) Configurable;
b) Not mask-based;
c) Not limited to key sizes to less than or equal to 64 bytes.

Also, since this flexibility has some performance cost, this patch set
also introduces key comparison functions specialized for each key size
value. Since the key size is fixed for each table, the key comparison
function can be selected at initialization as opposed to using a
generic function that can handle any key size. This strategy result in
a performance improvement for the table lookup operation of around 5%.

Depends-on: series-24117 ("pipeline: pipeline configuration and build improvements")

Change log:

V2:
-Added check for table match fields to be contiguous for exact match tables.
-Fixed bug in the specification file parsing related to hash function configuration.

Cristian Dumitrescu (6):
  table: add hash function prototype
  table: add key comparison functions
  table: configure the hash function for regular tables
  pipeline: configure the hash function for regular tables
  table: configure the hash function for learner tables
  pipeline: configure the hash function for learner tables

 lib/pipeline/rte_swx_ctl.c               |   1 +
 lib/pipeline/rte_swx_ctl.h               |   3 +
 lib/pipeline/rte_swx_pipeline.c          | 134 ++++++++++--
 lib/pipeline/rte_swx_pipeline.h          |  30 ++-
 lib/pipeline/rte_swx_pipeline_internal.h |   2 +
 lib/pipeline/rte_swx_pipeline_spec.c     |  83 ++++++-
 lib/pipeline/rte_swx_pipeline_spec.h     |   2 +
 lib/table/meson.build                    |   2 +
 lib/table/rte_swx_hash_func.h            |  39 ++++
 lib/table/rte_swx_keycmp.c               | 166 ++++++++++++++
 lib/table/rte_swx_keycmp.h               |  49 +++++
 lib/table/rte_swx_table.h                |   8 +
 lib/table/rte_swx_table_em.c             | 266 ++++-------------------
 lib/table/rte_swx_table_learner.c        | 220 +++----------------
 lib/table/rte_swx_table_learner.h        |   6 +
 15 files changed, 555 insertions(+), 456 deletions(-)
 create mode 100644 lib/table/rte_swx_hash_func.h
 create mode 100644 lib/table/rte_swx_keycmp.c
 create mode 100644 lib/table/rte_swx_keycmp.h
  

Comments

Stephen Hemminger Aug. 31, 2022, 4:23 p.m. UTC | #1
On Fri, 19 Aug 2022 19:52:19 +0000
Cristian Dumitrescu <cristian.dumitrescu@intel.com> wrote:

> Also, since this flexibility has some performance cost, this patch set
> also introduces key comparison functions specialized for each key size
> value. Since the key size is fixed for each table, the key comparison
> function can be selected at initialization as opposed to using a
> generic function that can handle any key size. This strategy result in
> a performance improvement for the table lookup operation of around 5%.

I wonder if DPDK should start to adopt the Linux kernel optimizations
around indirect calls. For most all cases, the function pointer will be
a certain value and the cpu can do direct rather than indirect call.

As in:

     if (likely(hash_func == crc32_hash))
            crc32_hash(x, y)
     else
            (*hash_func)(x, y)

This was done in Linux kernel because of the overhead of the Spectre/Meltdown
mitigation's, but could apply more generally in DPDK.
  
Morten Brørup Sept. 1, 2022, 8:11 a.m. UTC | #2
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 31 August 2022 18.23
> Subject: Re: [PATCH V2 0/6] pipeline: make the hash function
> configurable per table
> 
> On Fri, 19 Aug 2022 19:52:19 +0000
> Cristian Dumitrescu <cristian.dumitrescu@intel.com> wrote:
> 
> > Also, since this flexibility has some performance cost, this patch
> set
> > also introduces key comparison functions specialized for each key
> size
> > value. Since the key size is fixed for each table, the key comparison
> > function can be selected at initialization as opposed to using a
> > generic function that can handle any key size. This strategy result
> in
> > a performance improvement for the table lookup operation of around
> 5%.
> 
> I wonder if DPDK should start to adopt the Linux kernel optimizations
> around indirect calls. For most all cases, the function pointer will be
> a certain value and the cpu can do direct rather than indirect call.
> 
> As in:
> 
>      if (likely(hash_func == crc32_hash))
>             crc32_hash(x, y)
>      else
>             (*hash_func)(x, y)
> 
> This was done in Linux kernel because of the overhead of the
> Spectre/Meltdown
> mitigation's, but could apply more generally in DPDK.

+1 to that!

Along the very same lines, I remember reading on LWN about the Linux kernel using some magic to avoid function pointers, or to install optimized functions: At locations in the code where multiple variants of a function could be used, the address of the correct/optimized function is written directly into those locations in the code at startup. I didn't read the article in depth back then, and I can't find it now. Perhaps you know what I'm referring to, Stephen? I wonder if that also might be relevant for DPDK.
  
Thomas Monjalon Sept. 23, 2022, 3:58 p.m. UTC | #3
19/08/2022 21:52, Cristian Dumitrescu:
> The exact match and learner tables use a hash function for the lookup
> operation. This patch set makes the hash function configurable and
> removes some limitations.
> 
> The hash function previously used by these table types had the
> following limitations:
> a) Not configurable: An internally hardcoded version was used;
> b) Mask-based: This prevents using most of the available
>    hash functions, as they are not mask-based;
> c) Key size limited to 64 bytes or less.
> 
> The new hash function is:
> a) Configurable;
> b) Not mask-based;
> c) Not limited to key sizes to less than or equal to 64 bytes.
> 
> Also, since this flexibility has some performance cost, this patch set
> also introduces key comparison functions specialized for each key size
> value. Since the key size is fixed for each table, the key comparison
> function can be selected at initialization as opposed to using a
> generic function that can handle any key size. This strategy result in
> a performance improvement for the table lookup operation of around 5%.
> 
> Depends-on: series-24117 ("pipeline: pipeline configuration and build improvements")
> 
> Change log:
> 
> V2:
> -Added check for table match fields to be contiguous for exact match tables.
> -Fixed bug in the specification file parsing related to hash function configuration.
> 
> Cristian Dumitrescu (6):
>   table: add hash function prototype
>   table: add key comparison functions
>   table: configure the hash function for regular tables
>   pipeline: configure the hash function for regular tables
>   table: configure the hash function for learner tables
>   pipeline: configure the hash function for learner tables

Applied, thanks.