[3/4] hash: fix rw concurrency while moving keys

Message ID 1536253938-192391-4-git-send-email-honnappa.nagarahalli@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series Address reader-writer concurrency in rte_hash |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Honnappa Nagarahalli Sept. 6, 2018, 5:12 p.m. UTC
  Reader-writer concurrency issue, caused by moving the keys
to their alternative locations during key insert, is solved
by introducing a global counter(tbl_chng_cnt) indicating a
change in table.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
---
 lib/librte_hash/rte_cuckoo_hash.c | 307 +++++++++++++++++++++++++-------------
 lib/librte_hash/rte_cuckoo_hash.h |   2 +
 lib/librte_hash/rte_hash.h        |   8 +-
 3 files changed, 206 insertions(+), 111 deletions(-)
  

Comments

Wang, Yipeng1 Sept. 28, 2018, 1 a.m. UTC | #1
Reply inlined:

>-----Original Message-----
>From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Honnappa Nagarahalli
>Sent: Thursday, September 6, 2018 10:12 AM
>To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; honnappa.nagarahalli@dpdk.org; gavin.hu@arm.com; steve.capper@arm.com; ola.liljedahl@arm.com;
>nd@arm.com; Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>Subject: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys
>
>Reader-writer concurrency issue, caused by moving the keys
>to their alternative locations during key insert, is solved
>by introducing a global counter(tbl_chng_cnt) indicating a
>change in table.
>
>@@ -662,6 +679,20 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
> 		curr_bkt = curr_node->bkt;
> 	}
>
>+	/* Inform the previous move. The current move need
>+	 * not be informed now as the current bucket entry
>+	 * is present in both primary and secondary.
>+	 * Since there is one writer, load acquires on
>+	 * tbl_chng_cnt are not required.
>+	 */
>+	__atomic_store_n(&h->tbl_chng_cnt,
>+			 h->tbl_chng_cnt + 1,
>+			 __ATOMIC_RELEASE);
>+	/* The stores to sig_alt and sig_current should not
>+	 * move above the store to tbl_chng_cnt.
>+	 */
>+	__atomic_thread_fence(__ATOMIC_RELEASE);
>+
[Wang, Yipeng] I believe for X86 this fence should not be compiled to any code, otherwise
we need macros for the compile time check.

>@@ -926,30 +957,56 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> 	uint32_t bucket_idx;
> 	hash_sig_t alt_hash;
> 	struct rte_hash_bucket *bkt;
>+	uint32_t cnt_b, cnt_a;
> 	int ret;
>
>-	bucket_idx = sig & h->bucket_bitmask;
>-	bkt = &h->buckets[bucket_idx];
>-
> 	__hash_rw_reader_lock(h);
>
>-	/* Check if key is in primary location */
>-	ret = search_one_bucket(h, key, sig, data, bkt);
>-	if (ret != -1) {
>-		__hash_rw_reader_unlock(h);
>-		return ret;
>-	}
>-	/* Calculate secondary hash */
>-	alt_hash = rte_hash_secondary_hash(sig);
>-	bucket_idx = alt_hash & h->bucket_bitmask;
>-	bkt = &h->buckets[bucket_idx];
>+	do {
[Wang, Yipeng] As far as I know, the MemC3 paper "MemC3: Compact and Concurrent
MemCache with Dumber Caching and Smarter Hashing"
as well as OvS cmap uses similar version counter to implement read-write concurrency for hash table,
but one difference is reader checks even/odd of the version counter to make sure there is no
concurrent writer. Could you just double check and confirm that this is not needed for your implementation?

>--- a/lib/librte_hash/rte_hash.h
>+++ b/lib/librte_hash/rte_hash.h
>@@ -156,7 +156,7 @@ rte_hash_count(const struct rte_hash *h);
>  *   - -ENOSPC if there is no space in the hash for this key.
>  */
> int
>-rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data);
>+rte_hash_add_key_data(struct rte_hash *h, const void *key, void *data);
>
> /**
>  * Add a key-value pair with a pre-computed hash value
>@@ -180,7 +180,7 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data);
>  *   - -ENOSPC if there is no space in the hash for this key.
>  */
> int32_t
>-rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key,
>+rte_hash_add_key_with_hash_data(struct rte_hash *h, const void *key,
> 						hash_sig_t sig, void *data);
>
> /**
>@@ -200,7 +200,7 @@ rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key,
>  *     array of user data. This value is unique for this key.
>  */
> int32_t
>-rte_hash_add_key(const struct rte_hash *h, const void *key);
>+rte_hash_add_key(struct rte_hash *h, const void *key);
>
> /**
>  * Add a key to an existing hash table.
>@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h, const void *key);
>  *     array of user data. This value is unique for this key.
>  */
> int32_t
>-rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig);
>+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key, hash_sig_t sig);
>
> /

I think the above changes will break ABI by changing the parameter type? Other people may know better on this.
  
Bruce Richardson Sept. 28, 2018, 8:26 a.m. UTC | #2
On Fri, Sep 28, 2018 at 02:00:00AM +0100, Wang, Yipeng1 wrote:
> Reply inlined:
> 
> >-----Original Message-----
> >From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Honnappa Nagarahalli
> >Sent: Thursday, September 6, 2018 10:12 AM
> >To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> >Cc: dev@dpdk.org; honnappa.nagarahalli@dpdk.org; gavin.hu@arm.com; steve.capper@arm.com; ola.liljedahl@arm.com;
> >nd@arm.com; Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> >Subject: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys
> >
> >Reader-writer concurrency issue, caused by moving the keys
> >to their alternative locations during key insert, is solved
> >by introducing a global counter(tbl_chng_cnt) indicating a
> >change in table.
> >
> >@@ -662,6 +679,20 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
> > 		curr_bkt = curr_node->bkt;
> > 	}
> >
> >+	/* Inform the previous move. The current move need
> >+	 * not be informed now as the current bucket entry
> >+	 * is present in both primary and secondary.
> >+	 * Since there is one writer, load acquires on
> >+	 * tbl_chng_cnt are not required.
> >+	 */
> >+	__atomic_store_n(&h->tbl_chng_cnt,
> >+			 h->tbl_chng_cnt + 1,
> >+			 __ATOMIC_RELEASE);
> >+	/* The stores to sig_alt and sig_current should not
> >+	 * move above the store to tbl_chng_cnt.
> >+	 */
> >+	__atomic_thread_fence(__ATOMIC_RELEASE);
> >+
> [Wang, Yipeng] I believe for X86 this fence should not be compiled to any code, otherwise
> we need macros for the compile time check.
> 
> >@@ -926,30 +957,56 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> > 	uint32_t bucket_idx;
> > 	hash_sig_t alt_hash;
> > 	struct rte_hash_bucket *bkt;
> >+	uint32_t cnt_b, cnt_a;
> > 	int ret;
> >
> >-	bucket_idx = sig & h->bucket_bitmask;
> >-	bkt = &h->buckets[bucket_idx];
> >-
> > 	__hash_rw_reader_lock(h);
> >
> >-	/* Check if key is in primary location */
> >-	ret = search_one_bucket(h, key, sig, data, bkt);
> >-	if (ret != -1) {
> >-		__hash_rw_reader_unlock(h);
> >-		return ret;
> >-	}
> >-	/* Calculate secondary hash */
> >-	alt_hash = rte_hash_secondary_hash(sig);
> >-	bucket_idx = alt_hash & h->bucket_bitmask;
> >-	bkt = &h->buckets[bucket_idx];
> >+	do {
> [Wang, Yipeng] As far as I know, the MemC3 paper "MemC3: Compact and Concurrent
> MemCache with Dumber Caching and Smarter Hashing"
> as well as OvS cmap uses similar version counter to implement read-write concurrency for hash table,
> but one difference is reader checks even/odd of the version counter to make sure there is no
> concurrent writer. Could you just double check and confirm that this is not needed for your implementation?
> 
> >--- a/lib/librte_hash/rte_hash.h
> >+++ b/lib/librte_hash/rte_hash.h
> >@@ -156,7 +156,7 @@ rte_hash_count(const struct rte_hash *h);
> >  *   - -ENOSPC if there is no space in the hash for this key.
> >  */
> > int
> >-rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data);
> >+rte_hash_add_key_data(struct rte_hash *h, const void *key, void *data);
> >
> > /**
> >  * Add a key-value pair with a pre-computed hash value
> >@@ -180,7 +180,7 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data);
> >  *   - -ENOSPC if there is no space in the hash for this key.
> >  */
> > int32_t
> >-rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key,
> >+rte_hash_add_key_with_hash_data(struct rte_hash *h, const void *key,
> > 						hash_sig_t sig, void *data);
> >
> > /**
> >@@ -200,7 +200,7 @@ rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key,
> >  *     array of user data. This value is unique for this key.
> >  */
> > int32_t
> >-rte_hash_add_key(const struct rte_hash *h, const void *key);
> >+rte_hash_add_key(struct rte_hash *h, const void *key);
> >
> > /**
> >  * Add a key to an existing hash table.
> >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h, const void *key);
> >  *     array of user data. This value is unique for this key.
> >  */
> > int32_t
> >-rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig);
> >+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key, hash_sig_t sig);
> >
> > /
> 
> I think the above changes will break ABI by changing the parameter type? Other people may know better on this.

Just removing a const should not change the ABI, I believe, since the const
is just advisory hint to the compiler. Actual parameter size and count
remains unchanged so I don't believe there is an issue. 
[ABI experts, please correct me if I'm wrong on this]

/Bruce
  
Van Haaren, Harry Sept. 28, 2018, 8:55 a.m. UTC | #3
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Friday, September 28, 2018 9:26 AM
> To: Wang, Yipeng1 <yipeng1.wang@intel.com>
> Cc: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>; De Lara Guarch,
> Pablo <pablo.de.lara.guarch@intel.com>; dev@dpdk.org; gavin.hu@arm.com;
> steve.capper@arm.com; ola.liljedahl@arm.com; nd@arm.com
> Subject: Re: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving
> keys
> 
> On Fri, Sep 28, 2018 at 02:00:00AM +0100, Wang, Yipeng1 wrote:
> > Reply inlined:
> >
> > >-----Original Message-----
> > >From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Honnappa Nagarahalli
> > >Sent: Thursday, September 6, 2018 10:12 AM
> > >To: Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>
> > >Cc: dev@dpdk.org; honnappa.nagarahalli@dpdk.org; gavin.hu@arm.com;
> steve.capper@arm.com; ola.liljedahl@arm.com;
> > >nd@arm.com; Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > >Subject: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving
> keys
> > >
> > >Reader-writer concurrency issue, caused by moving the keys
> > >to their alternative locations during key insert, is solved
> > >by introducing a global counter(tbl_chng_cnt) indicating a
> > >change in table.

<snip>

> > > /**
> > >@@ -200,7 +200,7 @@ rte_hash_add_key_with_hash_data(const struct rte_hash
> *h, const void *key,
> > >  *     array of user data. This value is unique for this key.
> > >  */
> > > int32_t
> > >-rte_hash_add_key(const struct rte_hash *h, const void *key);
> > >+rte_hash_add_key(struct rte_hash *h, const void *key);
> > >
> > > /**
> > >  * Add a key to an existing hash table.
> > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h, const void
> *key);
> > >  *     array of user data. This value is unique for this key.
> > >  */
> > > int32_t
> > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> hash_sig_t sig);
> > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key,
> hash_sig_t sig);
> > >
> > > /
> >
> > I think the above changes will break ABI by changing the parameter type?
> Other people may know better on this.
> 
> Just removing a const should not change the ABI, I believe, since the const
> is just advisory hint to the compiler. Actual parameter size and count
> remains unchanged so I don't believe there is an issue.
> [ABI experts, please correct me if I'm wrong on this]


[Certainly no ABI expert, but...]

I think this is an API break, not ABI break.

Given application code as follows, it will fail to compile - even though
running the new code as a .so wouldn't cause any issues (AFAIK).

void do_hash_stuff(const struct rte_hash *h, ...)
{
    /* parameter passed in is const, but updated function prototype is non-const */
    rte_hash_add_key_with_hash(h, ...);
}

This means that we can't recompile apps against latest patch without application
code changes, if the app was passing a const rte_hash struct as the first parameter.


-Harry
  
Honnappa Nagarahalli Sept. 30, 2018, 10:33 p.m. UTC | #4
> > > >
> > > >Reader-writer concurrency issue, caused by moving the keys to their
> > > >alternative locations during key insert, is solved by introducing a
> > > >global counter(tbl_chng_cnt) indicating a change in table.
> 
> <snip>
> 
> > > > /**
> > > >@@ -200,7 +200,7 @@ rte_hash_add_key_with_hash_data(const struct
> > > >rte_hash
> > *h, const void *key,
> > > >  *     array of user data. This value is unique for this key.
> > > >  */
> > > > int32_t
> > > >-rte_hash_add_key(const struct rte_hash *h, const void *key);
> > > >+rte_hash_add_key(struct rte_hash *h, const void *key);
> > > >
> > > > /**
> > > >  * Add a key to an existing hash table.
> > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h,
> > > >const void
> > *key);
> > > >  *     array of user data. This value is unique for this key.
> > > >  */
> > > > int32_t
> > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const void
> > > >*key,
> > hash_sig_t sig);
> > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key,
> > hash_sig_t sig);
> > > >
> > > > /
> > >
> > > I think the above changes will break ABI by changing the parameter type?
> > Other people may know better on this.
> >
> > Just removing a const should not change the ABI, I believe, since the
> > const is just advisory hint to the compiler. Actual parameter size and
> > count remains unchanged so I don't believe there is an issue.
> > [ABI experts, please correct me if I'm wrong on this]
> 
> 
> [Certainly no ABI expert, but...]
> 
> I think this is an API break, not ABI break.
> 
> Given application code as follows, it will fail to compile - even though running
> the new code as a .so wouldn't cause any issues (AFAIK).
> 
> void do_hash_stuff(const struct rte_hash *h, ...) {
>     /* parameter passed in is const, but updated function prototype is non-
> const */
>     rte_hash_add_key_with_hash(h, ...);
> }
> 
> This means that we can't recompile apps against latest patch without
> application code changes, if the app was passing a const rte_hash struct as
> the first parameter.
> 
Agree. Do we need to do anything for this?

> 
> -Harry
  
Honnappa Nagarahalli Sept. 30, 2018, 11:05 p.m. UTC | #5
> >
> >Reader-writer concurrency issue, caused by moving the keys to their
> >alternative locations during key insert, is solved by introducing a
> >global counter(tbl_chng_cnt) indicating a change in table.
> >
> >@@ -662,6 +679,20 @@ rte_hash_cuckoo_move_insert_mw(const struct
> rte_hash *h,
> > 		curr_bkt = curr_node->bkt;
> > 	}
> >
> >+	/* Inform the previous move. The current move need
> >+	 * not be informed now as the current bucket entry
> >+	 * is present in both primary and secondary.
> >+	 * Since there is one writer, load acquires on
> >+	 * tbl_chng_cnt are not required.
> >+	 */
> >+	__atomic_store_n(&h->tbl_chng_cnt,
> >+			 h->tbl_chng_cnt + 1,
> >+			 __ATOMIC_RELEASE);
> >+	/* The stores to sig_alt and sig_current should not
> >+	 * move above the store to tbl_chng_cnt.
> >+	 */
> >+	__atomic_thread_fence(__ATOMIC_RELEASE);
> >+
> [Wang, Yipeng] I believe for X86 this fence should not be compiled to any
> code, otherwise we need macros for the compile time check.
'__atomic_thread_fence(__ATOMIC_RELEASE)' provides load-load and load-store fence [1]. Hence, it should not add any barriers for x86.

[1] https://preshing.com/20130922/acquire-and-release-fences/

> 
> >@@ -926,30 +957,56 @@ __rte_hash_lookup_with_hash(const struct
> rte_hash *h, const void *key,
> > 	uint32_t bucket_idx;
> > 	hash_sig_t alt_hash;
> > 	struct rte_hash_bucket *bkt;
> >+	uint32_t cnt_b, cnt_a;
> > 	int ret;
> >
> >-	bucket_idx = sig & h->bucket_bitmask;
> >-	bkt = &h->buckets[bucket_idx];
> >-
> > 	__hash_rw_reader_lock(h);
> >
> >-	/* Check if key is in primary location */
> >-	ret = search_one_bucket(h, key, sig, data, bkt);
> >-	if (ret != -1) {
> >-		__hash_rw_reader_unlock(h);
> >-		return ret;
> >-	}
> >-	/* Calculate secondary hash */
> >-	alt_hash = rte_hash_secondary_hash(sig);
> >-	bucket_idx = alt_hash & h->bucket_bitmask;
> >-	bkt = &h->buckets[bucket_idx];
> >+	do {
> [Wang, Yipeng] As far as I know, the MemC3 paper "MemC3: Compact and
> Concurrent MemCache with Dumber Caching and Smarter Hashing"
> as well as OvS cmap uses similar version counter to implement read-write
> concurrency for hash table, but one difference is reader checks even/odd of
> the version counter to make sure there is no concurrent writer. Could you just
> double check and confirm that this is not needed for your implementation?
> 
I relooked at this paper. My patch makes use of the fact that during the process of shifting the key will be present in both primary and secondary buckets. The check for odd version counter is not required as the full key comparison would have identified any false signature matches.

> >--- a/lib/librte_hash/rte_hash.h
> >+++ b/lib/librte_hash/rte_hash.h
> >@@ -156,7 +156,7 @@ rte_hash_count(const struct rte_hash *h);
> >  *   - -ENOSPC if there is no space in the hash for this key.
> >  */
> > int
> >-rte_hash_add_key_data(const struct rte_hash *h, const void *key, void
> >*data);
> >+rte_hash_add_key_data(struct rte_hash *h, const void *key, void
> >+*data);
> >
> > /**
> >  * Add a key-value pair with a pre-computed hash value @@ -180,7
> >+180,7 @@ rte_hash_add_key_data(const struct rte_hash *h, const void
> *key, void *data);
> >  *   - -ENOSPC if there is no space in the hash for this key.
> >  */
> > int32_t
> >-rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void
> >*key,
> >+rte_hash_add_key_with_hash_data(struct rte_hash *h, const void *key,
> > 						hash_sig_t sig, void *data);
> >
> > /**
> >@@ -200,7 +200,7 @@ rte_hash_add_key_with_hash_data(const struct
> rte_hash *h, const void *key,
> >  *     array of user data. This value is unique for this key.
> >  */
> > int32_t
> >-rte_hash_add_key(const struct rte_hash *h, const void *key);
> >+rte_hash_add_key(struct rte_hash *h, const void *key);
> >
> > /**
> >  * Add a key to an existing hash table.
> >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h, const
> void *key);
> >  *     array of user data. This value is unique for this key.
> >  */
> > int32_t
> >-rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> >hash_sig_t sig);
> >+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key,
> >+hash_sig_t sig);
> >
> > /
> 
> I think the above changes will break ABI by changing the parameter type?
> Other people may know better on this.
  
Wang, Yipeng1 Oct. 1, 2018, 10:56 p.m. UTC | #6
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>Sent: Sunday, September 30, 2018 4:06 PM
>To: Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo
><pablo.de.lara.guarch@intel.com>
>Cc: dev@dpdk.org; Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; Steve Capper <Steve.Capper@arm.com>; Ola Liljedahl
><Ola.Liljedahl@arm.com>; nd <nd@arm.com>
>Subject: RE: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys
>
>> >+	__atomic_store_n(&h->tbl_chng_cnt,
>> >+			 h->tbl_chng_cnt + 1,
>> >+			 __ATOMIC_RELEASE);
>> >+	/* The stores to sig_alt and sig_current should not
>> >+	 * move above the store to tbl_chng_cnt.
>> >+	 */
>> >+	__atomic_thread_fence(__ATOMIC_RELEASE);
>> >+
>> [Wang, Yipeng] I believe for X86 this fence should not be compiled to any
>> code, otherwise we need macros for the compile time check.
>'__atomic_thread_fence(__ATOMIC_RELEASE)' provides load-load and load-store fence [1]. Hence, it should not add any barriers for
>x86.
>
>[1] https://preshing.com/20130922/acquire-and-release-fences/
>
[Wang, Yipeng] Thanks for the link, it is very informative!
>>
>> >@@ -926,30 +957,56 @@ __rte_hash_lookup_with_hash(const struct
>> rte_hash *h, const void *key,
>> > 	uint32_t bucket_idx;
>> > 	hash_sig_t alt_hash;
>> > 	struct rte_hash_bucket *bkt;
>> >+	uint32_t cnt_b, cnt_a;
>> > 	int ret;
>> >
>> >-	bucket_idx = sig & h->bucket_bitmask;
>> >-	bkt = &h->buckets[bucket_idx];
>> >-
>> > 	__hash_rw_reader_lock(h);
>> >
>> >-	/* Check if key is in primary location */
>> >-	ret = search_one_bucket(h, key, sig, data, bkt);
>> >-	if (ret != -1) {
>> >-		__hash_rw_reader_unlock(h);
>> >-		return ret;
>> >-	}
>> >-	/* Calculate secondary hash */
>> >-	alt_hash = rte_hash_secondary_hash(sig);
>> >-	bucket_idx = alt_hash & h->bucket_bitmask;
>> >-	bkt = &h->buckets[bucket_idx];
>> >+	do {
>> [Wang, Yipeng] As far as I know, the MemC3 paper "MemC3: Compact and
>> Concurrent MemCache with Dumber Caching and Smarter Hashing"
>> as well as OvS cmap uses similar version counter to implement read-write
>> concurrency for hash table, but one difference is reader checks even/odd of
>> the version counter to make sure there is no concurrent writer. Could you just
>> double check and confirm that this is not needed for your implementation?
>>
>I relooked at this paper. My patch makes use of the fact that during the process of shifting the key will be present in both primary and
>secondary buckets. The check for odd version counter is not required as the full key comparison would have identified any false
>signature matches.
[Wang, Yipeng] I was thinking about another corner case and wondering if the version counter needs to be done on key deletion as well.
For example, a reader reads out the index and falsely matches a signature, before it reads out the data and the key, the key-data pair got
deleted, removed, and recycled by another writer thread.  This writer partially overwrote the key (because key store is too large to be atomic).
Now, the reader begins to compare the key, and accidentally matches the key (because the key is partially written and accidentally matches), will
The reader read the wrong data out (which should have been a lookup miss)?
  
Van Haaren, Harry Oct. 2, 2018, 1:17 p.m. UTC | #7
> -----Original Message-----
> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Sunday, September 30, 2018 11:33 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; Wang, Yipeng1 <yipeng1.wang@intel.com>
> Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; dev@dpdk.org;
> Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>; nd
> <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving
> keys
> 
> 
> > > > >
> > > > >Reader-writer concurrency issue, caused by moving the keys to their
> > > > >alternative locations during key insert, is solved by introducing a
> > > > >global counter(tbl_chng_cnt) indicating a change in table.
> >
> > <snip>
> >
> > > > > /**
> > > > >@@ -200,7 +200,7 @@ rte_hash_add_key_with_hash_data(const struct
> > > > >rte_hash
> > > *h, const void *key,
> > > > >  *     array of user data. This value is unique for this key.
> > > > >  */
> > > > > int32_t
> > > > >-rte_hash_add_key(const struct rte_hash *h, const void *key);
> > > > >+rte_hash_add_key(struct rte_hash *h, const void *key);
> > > > >
> > > > > /**
> > > > >  * Add a key to an existing hash table.
> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h,
> > > > >const void
> > > *key);
> > > > >  *     array of user data. This value is unique for this key.
> > > > >  */
> > > > > int32_t
> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const void
> > > > >*key,
> > > hash_sig_t sig);
> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key,
> > > hash_sig_t sig);
> > > > >
> > > > > /
> > > >
> > > > I think the above changes will break ABI by changing the parameter
> type?
> > > Other people may know better on this.
> > >
> > > Just removing a const should not change the ABI, I believe, since the
> > > const is just advisory hint to the compiler. Actual parameter size and
> > > count remains unchanged so I don't believe there is an issue.
> > > [ABI experts, please correct me if I'm wrong on this]
> >
> >
> > [Certainly no ABI expert, but...]
> >
> > I think this is an API break, not ABI break.
> >
> > Given application code as follows, it will fail to compile - even though
> running
> > the new code as a .so wouldn't cause any issues (AFAIK).
> >
> > void do_hash_stuff(const struct rte_hash *h, ...) {
> >     /* parameter passed in is const, but updated function prototype is
> non-
> > const */
> >     rte_hash_add_key_with_hash(h, ...);
> > }
> >
> > This means that we can't recompile apps against latest patch without
> > application code changes, if the app was passing a const rte_hash struct
> as
> > the first parameter.
> >
> Agree. Do we need to do anything for this?

I think we should try to avoid breaking API wherever possible.
If we must, then I suppose we could follow the ABI process of
a deprecation notice.

From my reading of the versioning docs, it doesn't document this case:
https://doc.dpdk.org/guides/contributing/versioning.html

I don't recall a similar situation in DPDK previously - so I suggest
you ask Tech board for input here.

Hope that helps! -Harry
  
Wang, Yipeng1 Oct. 2, 2018, 11:58 p.m. UTC | #8
>-----Original Message-----
>From: Van Haaren, Harry
>> > > > > /**
>> > > > >  * Add a key to an existing hash table.
>> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h,
>> > > > >const void
>> > > *key);
>> > > > >  *     array of user data. This value is unique for this key.
>> > > > >  */
>> > > > > int32_t
>> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const void
>> > > > >*key,
>> > > hash_sig_t sig);
>> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key,
>> > > hash_sig_t sig);
>> > > > >
>> > > > > /
>> > > >
>> > > > I think the above changes will break ABI by changing the parameter
>> type?
>> > > Other people may know better on this.
>> > >
>> > > Just removing a const should not change the ABI, I believe, since the
>> > > const is just advisory hint to the compiler. Actual parameter size and
>> > > count remains unchanged so I don't believe there is an issue.
>> > > [ABI experts, please correct me if I'm wrong on this]
>> >
>> >
>> > [Certainly no ABI expert, but...]
>> >
>> > I think this is an API break, not ABI break.
>> >
>> > Given application code as follows, it will fail to compile - even though
>> running
>> > the new code as a .so wouldn't cause any issues (AFAIK).
>> >
>> > void do_hash_stuff(const struct rte_hash *h, ...) {
>> >     /* parameter passed in is const, but updated function prototype is
>> non-
>> > const */
>> >     rte_hash_add_key_with_hash(h, ...);
>> > }
>> >
>> > This means that we can't recompile apps against latest patch without
>> > application code changes, if the app was passing a const rte_hash struct
>> as
>> > the first parameter.
>> >
>> Agree. Do we need to do anything for this?
>
>I think we should try to avoid breaking API wherever possible.
>If we must, then I suppose we could follow the ABI process of
>a deprecation notice.
>
>From my reading of the versioning docs, it doesn't document this case:
>https://doc.dpdk.org/guides/contributing/versioning.html
>
>I don't recall a similar situation in DPDK previously - so I suggest
>you ask Tech board for input here.
>
>Hope that helps! -Harry
[Wang, Yipeng] 
Honnappa, how about use a pointer to the counter in the rte_hash struct instead of the counter? Will this avoid
API change?
  
Wang, Yipeng1 Oct. 3, 2018, 12:16 a.m. UTC | #9
>-----Original Message-----
>[Wang, Yipeng] I was thinking about another corner case and wondering if the version counter needs to be done on key deletion as
>well.
>For example, a reader reads out the index and falsely matches a signature, before it reads out the data and the key, the key-data pair
>got
>deleted, removed, and recycled by another writer thread.  This writer partially overwrote the key (because key store is too large to be
>atomic).
>Now, the reader begins to compare the key, and accidentally matches the key (because the key is partially written and accidentally
>matches), will
>The reader read the wrong data out (which should have been a lookup miss)?
>
[Wang, Yipeng] As a second thought after reading your slides, I think the scenario I described above should be handled by the upper
level application using RCU-like mechanisms to recycle key-data only after the grace period. So I think it is OK. Please ignore my last comment.

But if convenient, please add more comment in the API doc and in user guide as well just in case users might not understand the restriction.

Thanks
Yipeng
  
Honnappa Nagarahalli Oct. 3, 2018, 5:32 p.m. UTC | #10
> >-----Original Message-----
> >From: Van Haaren, Harry
> >> > > > > /**
> >> > > > >  * Add a key to an existing hash table.
> >> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h,
> >> > > > >const void
> >> > > *key);
> >> > > > >  *     array of user data. This value is unique for this key.
> >> > > > >  */
> >> > > > > int32_t
> >> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const
> >> > > > >void *key,
> >> > > hash_sig_t sig);
> >> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void
> >> > > > >+*key,
> >> > > hash_sig_t sig);
> >> > > > >
> >> > > > > /
> >> > > >
> >> > > > I think the above changes will break ABI by changing the
> >> > > > parameter
> >> type?
> >> > > Other people may know better on this.
> >> > >
> >> > > Just removing a const should not change the ABI, I believe, since
> >> > > the const is just advisory hint to the compiler. Actual parameter
> >> > > size and count remains unchanged so I don't believe there is an issue.
> >> > > [ABI experts, please correct me if I'm wrong on this]
> >> >
> >> >
> >> > [Certainly no ABI expert, but...]
> >> >
> >> > I think this is an API break, not ABI break.
> >> >
> >> > Given application code as follows, it will fail to compile - even
> >> > though
> >> running
> >> > the new code as a .so wouldn't cause any issues (AFAIK).
> >> >
> >> > void do_hash_stuff(const struct rte_hash *h, ...) {
> >> >     /* parameter passed in is const, but updated function prototype
> >> > is
> >> non-
> >> > const */
> >> >     rte_hash_add_key_with_hash(h, ...); }
> >> >
> >> > This means that we can't recompile apps against latest patch
> >> > without application code changes, if the app was passing a const
> >> > rte_hash struct
> >> as
> >> > the first parameter.
> >> >
> >> Agree. Do we need to do anything for this?
> >
> >I think we should try to avoid breaking API wherever possible.
> >If we must, then I suppose we could follow the ABI process of a
> >deprecation notice.
> >
> >From my reading of the versioning docs, it doesn't document this case:
> >https://doc.dpdk.org/guides/contributing/versioning.html
> >
> >I don't recall a similar situation in DPDK previously - so I suggest
> >you ask Tech board for input here.
> >
> >Hope that helps! -Harry
> [Wang, Yipeng]
> Honnappa, how about use a pointer to the counter in the rte_hash struct
> instead of the counter? Will this avoid API change?
I think it defeats the purpose of 'const' parameter to the API and provides incorrect information to the user.
IMO, DPDK should have guidelines on how to handle the API compatibility breaks. I will send an email to tech board on this.
We can also solve this by having counters on the bucket. I was planning to do this little bit later. I will look at the effort involved and may be do it now.
  
Honnappa Nagarahalli Oct. 3, 2018, 5:39 p.m. UTC | #11
> >-----Original Message-----
> >[Wang, Yipeng] I was thinking about another corner case and wondering
> >if the version counter needs to be done on key deletion as well.
> >For example, a reader reads out the index and falsely matches a
> >signature, before it reads out the data and the key, the key-data pair
> >got deleted, removed, and recycled by another writer thread.  This
> >writer partially overwrote the key (because key store is too large to be
> atomic).
> >Now, the reader begins to compare the key, and accidentally matches the
> >key (because the key is partially written and accidentally matches),
> >will The reader read the wrong data out (which should have been a lookup
> miss)?
> >
> [Wang, Yipeng] As a second thought after reading your slides, I think the
> scenario I described above should be handled by the upper level application
> using RCU-like mechanisms to recycle key-data only after the grace period.
> So I think it is OK. Please ignore my last comment.
> 
> But if convenient, please add more comment in the API doc and in user
> guide as well just in case users might not understand the restriction.
> 
Agree, it needs careful documentation. I plan to change the documentation after the patch is completed.

> Thanks
> Yipeng
  
Wang, Yipeng1 Oct. 3, 2018, 5:56 p.m. UTC | #12
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>Sent: Wednesday, October 3, 2018 10:33 AM
>To: Wang, Yipeng1 <yipeng1.wang@intel.com>; Van Haaren, Harry <harry.van.haaren@intel.com>; Richardson, Bruce
><bruce.richardson@intel.com>
>Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; dev@dpdk.org; Gavin Hu (Arm Technology China)
><Gavin.Hu@arm.com>; Steve Capper <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>; nd <nd@arm.com>; Gobriel,
>Sameh <sameh.gobriel@intel.com>
>Subject: RE: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys
>
>> >-----Original Message-----
>> >From: Van Haaren, Harry
>> >> > > > > /**
>> >> > > > >  * Add a key to an existing hash table.
>> >> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h,
>> >> > > > >const void
>> >> > > *key);
>> >> > > > >  *     array of user data. This value is unique for this key.
>> >> > > > >  */
>> >> > > > > int32_t
>> >> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const
>> >> > > > >void *key,
>> >> > > hash_sig_t sig);
>> >> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void
>> >> > > > >+*key,
>> >> > > hash_sig_t sig);
>> >> > > > >
>> >> > > > > /
>> >> > > >
>> >> > > > I think the above changes will break ABI by changing the
>> >> > > > parameter
>> >> type?
>> >> > > Other people may know better on this.
>> >> > >
>> >> > > Just removing a const should not change the ABI, I believe, since
>> >> > > the const is just advisory hint to the compiler. Actual parameter
>> >> > > size and count remains unchanged so I don't believe there is an issue.
>> >> > > [ABI experts, please correct me if I'm wrong on this]
>> >> >
>> >> >
>> >> > [Certainly no ABI expert, but...]
>> >> >
>> >> > I think this is an API break, not ABI break.
>> >> >
>> >> > Given application code as follows, it will fail to compile - even
>> >> > though
>> >> running
>> >> > the new code as a .so wouldn't cause any issues (AFAIK).
>> >> >
>> >> > void do_hash_stuff(const struct rte_hash *h, ...) {
>> >> >     /* parameter passed in is const, but updated function prototype
>> >> > is
>> >> non-
>> >> > const */
>> >> >     rte_hash_add_key_with_hash(h, ...); }
>> >> >
>> >> > This means that we can't recompile apps against latest patch
>> >> > without application code changes, if the app was passing a const
>> >> > rte_hash struct
>> >> as
>> >> > the first parameter.
>> >> >
>> >> Agree. Do we need to do anything for this?
>> >
>> >I think we should try to avoid breaking API wherever possible.
>> >If we must, then I suppose we could follow the ABI process of a
>> >deprecation notice.
>> >
>> >From my reading of the versioning docs, it doesn't document this case:
>> >https://doc.dpdk.org/guides/contributing/versioning.html
>> >
>> >I don't recall a similar situation in DPDK previously - so I suggest
>> >you ask Tech board for input here.
>> >
>> >Hope that helps! -Harry
>> [Wang, Yipeng]
>> Honnappa, how about use a pointer to the counter in the rte_hash struct
>> instead of the counter? Will this avoid API change?
>I think it defeats the purpose of 'const' parameter to the API and provides incorrect information to the user.
>IMO, DPDK should have guidelines on how to handle the API compatibility breaks. I will send an email to tech board on this.
>We can also solve this by having counters on the bucket. I was planning to do this little bit later. I will look at the effort involved and
>may be do it now.
[Wang, Yipeng] 
I think with ABI/API change, you might need to announce it one release cycle ahead.

In the cuckoo switch paper: Scalable, High Performance Ethernet Forwarding with
CUCKOOSWITCH
it separates the version counter array and the hash table. You can strike a balance
between granularity of the version counter and the cache/memory requirement.
Is it a better way?

Another consideration is current bucket is 64-byte exactly with the partial-key-hashing.
To add another counter, we need to think about changing certain variables to still align
cache line.
  
Ola Liljedahl Oct. 3, 2018, 11:05 p.m. UTC | #13
On 03/10/2018, 20:00, "Wang, Yipeng1" <yipeng1.wang@intel.com> wrote:

    
    
    >-----Original Message-----
    >From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
    >Sent: Wednesday, October 3, 2018 10:33 AM
    >To: Wang, Yipeng1 <yipeng1.wang@intel.com>; Van Haaren, Harry <harry.van.haaren@intel.com>; Richardson, Bruce
    ><bruce.richardson@intel.com>
    >Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; dev@dpdk.org; Gavin Hu (Arm Technology China)
    ><Gavin.Hu@arm.com>; Steve Capper <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>; nd <nd@arm.com>; Gobriel,
    >Sameh <sameh.gobriel@intel.com>
    >Subject: RE: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys
    >
    >> >-----Original Message-----
    >> >From: Van Haaren, Harry
    >> >> > > > > /**
    >> >> > > > >  * Add a key to an existing hash table.
    >> >> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash *h,
    >> >> > > > >const void
    >> >> > > *key);
    >> >> > > > >  *     array of user data. This value is unique for this key.
    >> >> > > > >  */
    >> >> > > > > int32_t
    >> >> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const
    >> >> > > > >void *key,
    >> >> > > hash_sig_t sig);
    >> >> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void
    >> >> > > > >+*key,
    >> >> > > hash_sig_t sig);
    >> >> > > > >
    >> >> > > > > /
    >> >> > > >
    >> >> > > > I think the above changes will break ABI by changing the
    >> >> > > > parameter
    >> >> type?
    >> >> > > Other people may know better on this.
    >> >> > >
    >> >> > > Just removing a const should not change the ABI, I believe, since
    >> >> > > the const is just advisory hint to the compiler. Actual parameter
    >> >> > > size and count remains unchanged so I don't believe there is an issue.
    >> >> > > [ABI experts, please correct me if I'm wrong on this]
    >> >> >
    >> >> >
    >> >> > [Certainly no ABI expert, but...]
    >> >> >
    >> >> > I think this is an API break, not ABI break.
    >> >> >
    >> >> > Given application code as follows, it will fail to compile - even
    >> >> > though
    >> >> running
    >> >> > the new code as a .so wouldn't cause any issues (AFAIK).
    >> >> >
    >> >> > void do_hash_stuff(const struct rte_hash *h, ...) {
    >> >> >     /* parameter passed in is const, but updated function prototype
    >> >> > is
    >> >> non-
    >> >> > const */
    >> >> >     rte_hash_add_key_with_hash(h, ...); }
    >> >> >
    >> >> > This means that we can't recompile apps against latest patch
    >> >> > without application code changes, if the app was passing a const
    >> >> > rte_hash struct
    >> >> as
    >> >> > the first parameter.
    >> >> >
    >> >> Agree. Do we need to do anything for this?
    >> >
    >> >I think we should try to avoid breaking API wherever possible.
    >> >If we must, then I suppose we could follow the ABI process of a
    >> >deprecation notice.
    >> >
    >> >From my reading of the versioning docs, it doesn't document this case:
    >> >https://doc.dpdk.org/guides/contributing/versioning.html
    >> >
    >> >I don't recall a similar situation in DPDK previously - so I suggest
    >> >you ask Tech board for input here.
    >> >
    >> >Hope that helps! -Harry
    >> [Wang, Yipeng]
    >> Honnappa, how about use a pointer to the counter in the rte_hash struct
    >> instead of the counter? Will this avoid API change?
    >I think it defeats the purpose of 'const' parameter to the API and provides incorrect information to the user.
    >IMO, DPDK should have guidelines on how to handle the API compatibility breaks. I will send an email to tech board on this.
    >We can also solve this by having counters on the bucket. I was planning to do this little bit later. I will look at the effort involved and
    >may be do it now.
    [Wang, Yipeng] 
    I think with ABI/API change, you might need to announce it one release cycle ahead.
    
    In the cuckoo switch paper: Scalable, High Performance Ethernet Forwarding with
    CUCKOOSWITCH
    it separates the version counter array and the hash table. You can strike a balance
    between granularity of the version counter and the cache/memory requirement.
    Is it a better way?
[Ola] Having only a single generation counter can easily become a scalability bottleneck due to write contention to the cache line.
Ideally you want each gen counter to be located in its own cache line (multiple gen counters in the same cache line will experience the same write contention). But that seems a bit wasteful of memory.
Ideally (in order to avoid accessing more cache lines), the gen counter should be located in the hash bucket. But as you write below, this would create problems for the layout of the hash bucket, there isn't any room for another field.
So I propose an array of gen. counters, indexed by the hash (of somekind) of primary and alternate hashes (signatures) of the element (modulo the array size). So another hash table.
    
    Another consideration is current bucket is 64-byte exactly with the partial-key-hashing.
    To add another counter, we need to think about changing certain variables to still align
    cache line.
  
Honnappa Nagarahalli Oct. 4, 2018, 3:32 a.m. UTC | #14
> >
> >> >-----Original Message-----
> >> >From: Van Haaren, Harry
> >> >> > > > > /**
> >> >> > > > >  * Add a key to an existing hash table.
> >> >> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash
> >> >> > > > >*h, const void
> >> >> > > *key);
> >> >> > > > >  *     array of user data. This value is unique for this key.
> >> >> > > > >  */
> >> >> > > > > int32_t
> >> >> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const
> >> >> > > > >void *key,
> >> >> > > hash_sig_t sig);
> >> >> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void
> >> >> > > > >+*key,
> >> >> > > hash_sig_t sig);
> >> >> > > > >
> >> >> > > > > /
> >> >> > > >
> >> >> > > > I think the above changes will break ABI by changing the
> >> >> > > > parameter
> >> >> type?
> >> >> > > Other people may know better on this.
> >> >> > >
> >> >> > > Just removing a const should not change the ABI, I believe,
> >> >> > > since the const is just advisory hint to the compiler. Actual
> >> >> > > parameter size and count remains unchanged so I don't believe
> there is an issue.
> >> >> > > [ABI experts, please correct me if I'm wrong on this]
> >> >> >
> >> >> >
> >> >> > [Certainly no ABI expert, but...]
> >> >> >
> >> >> > I think this is an API break, not ABI break.
> >> >> >
> >> >> > Given application code as follows, it will fail to compile -
> >> >> > even though
> >> >> running
> >> >> > the new code as a .so wouldn't cause any issues (AFAIK).
> >> >> >
> >> >> > void do_hash_stuff(const struct rte_hash *h, ...) {
> >> >> >     /* parameter passed in is const, but updated function
> >> >> > prototype is
> >> >> non-
> >> >> > const */
> >> >> >     rte_hash_add_key_with_hash(h, ...); }
> >> >> >
> >> >> > This means that we can't recompile apps against latest patch
> >> >> > without application code changes, if the app was passing a const
> >> >> > rte_hash struct
> >> >> as
> >> >> > the first parameter.
> >> >> >
> >> >> Agree. Do we need to do anything for this?
> >> >
> >> >I think we should try to avoid breaking API wherever possible.
> >> >If we must, then I suppose we could follow the ABI process of a
> >> >deprecation notice.
> >> >
> >> >From my reading of the versioning docs, it doesn't document this case:
> >> >https://doc.dpdk.org/guides/contributing/versioning.html
> >> >
> >> >I don't recall a similar situation in DPDK previously - so I suggest
> >> >you ask Tech board for input here.
> >> >
> >> >Hope that helps! -Harry
> >> [Wang, Yipeng]
> >> Honnappa, how about use a pointer to the counter in the rte_hash
> >> struct instead of the counter? Will this avoid API change?
> >I think it defeats the purpose of 'const' parameter to the API and provides
> incorrect information to the user.
> >IMO, DPDK should have guidelines on how to handle the API compatibility
> breaks. I will send an email to tech board on this.
> >We can also solve this by having counters on the bucket. I was planning
> >to do this little bit later. I will look at the effort involved and may be do it
> now.
> [Wang, Yipeng]
> I think with ABI/API change, you might need to announce it one release cycle
> ahead.
> 
> In the cuckoo switch paper: Scalable, High Performance Ethernet Forwarding
> with CUCKOOSWITCH it separates the version counter array and the hash
> table. You can strike a balance between granularity of the version counter and
> the cache/memory requirement.
> Is it a better way?
This will introduce another cache line access. It would be good to stay within the single cacheline.

> 
> Another consideration is current bucket is 64-byte exactly with the partial-
> key-hashing.
> To add another counter, we need to think about changing certain variables to
> still align cache line.
The 'flags' structure member is not being used. I plan to remove that. That will give us 8B, I will use 4B out of it for the counter.
  
Honnappa Nagarahalli Oct. 4, 2018, 3:54 a.m. UTC | #15
> 
> > >-----Original Message-----
> > >From: Van Haaren, Harry
> > >> > > > > /**
> > >> > > > >  * Add a key to an existing hash table.
> > >> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash
> > >> > > > >*h, const void
> > >> > > *key);
> > >> > > > >  *     array of user data. This value is unique for this key.
> > >> > > > >  */
> > >> > > > > int32_t
> > >> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const
> > >> > > > >void *key,
> > >> > > hash_sig_t sig);
> > >> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void
> > >> > > > >+*key,
> > >> > > hash_sig_t sig);
> > >> > > > >
> > >> > > > > /
> > >> > > >
> > >> > > > I think the above changes will break ABI by changing the
> > >> > > > parameter
> > >> type?
> > >> > > Other people may know better on this.
> > >> > >
> > >> > > Just removing a const should not change the ABI, I believe,
> > >> > > since the const is just advisory hint to the compiler. Actual
> > >> > > parameter size and count remains unchanged so I don't believe there
> is an issue.
> > >> > > [ABI experts, please correct me if I'm wrong on this]
> > >> >
> > >> >
> > >> > [Certainly no ABI expert, but...]
> > >> >
> > >> > I think this is an API break, not ABI break.
> > >> >
> > >> > Given application code as follows, it will fail to compile - even
> > >> > though
> > >> running
> > >> > the new code as a .so wouldn't cause any issues (AFAIK).
> > >> >
> > >> > void do_hash_stuff(const struct rte_hash *h, ...) {
> > >> >     /* parameter passed in is const, but updated function
> > >> > prototype is
> > >> non-
> > >> > const */
> > >> >     rte_hash_add_key_with_hash(h, ...); }
> > >> >
> > >> > This means that we can't recompile apps against latest patch
> > >> > without application code changes, if the app was passing a const
> > >> > rte_hash struct
> > >> as
> > >> > the first parameter.
> > >> >
> > >> Agree. Do we need to do anything for this?
> > >
> > >I think we should try to avoid breaking API wherever possible.
> > >If we must, then I suppose we could follow the ABI process of a
> > >deprecation notice.
> > >
> > >From my reading of the versioning docs, it doesn't document this case:
> > >https://doc.dpdk.org/guides/contributing/versioning.html
> > >
> > >I don't recall a similar situation in DPDK previously - so I suggest
> > >you ask Tech board for input here.
> > >
> > >Hope that helps! -Harry
> > [Wang, Yipeng]
> > Honnappa, how about use a pointer to the counter in the rte_hash
> > struct instead of the counter? Will this avoid API change?
> I think it defeats the purpose of 'const' parameter to the API and provides
> incorrect information to the user.
Yipeng, I think I have misunderstood your comment. I believe you meant; we could allocate memory to the counter and store the pointer in the structure. Please correct me if I am wrong.
This could be a solution, though it will be another cache line access. It might be ok given that it is a single cache line for entire hash table.

> IMO, DPDK should have guidelines on how to handle the API compatibility
> breaks. I will send an email to tech board on this.
> We can also solve this by having counters on the bucket. I was planning to do
> this little bit later. I will look at the effort involved and may be do it now.
  
Wang, Yipeng1 Oct. 4, 2018, 7:16 p.m. UTC | #16
>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>Sent: Wednesday, October 3, 2018 8:54 PM
>To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Wang, Yipeng1 <yipeng1.wang@intel.com>; Van Haaren, Harry
><harry.van.haaren@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
>Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; dev@dpdk.org; Gavin Hu (Arm Technology China)
><Gavin.Hu@arm.com>; Steve Capper <Steve.Capper@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>; nd <nd@arm.com>; Gobriel,
>Sameh <sameh.gobriel@intel.com>
>Subject: RE: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys
>
>>
>> > >-----Original Message-----
>> > >From: Van Haaren, Harry
>> > >> > > > > /**
>> > >> > > > >  * Add a key to an existing hash table.
>> > >> > > > >@@ -222,7 +222,7 @@ rte_hash_add_key(const struct rte_hash
>> > >> > > > >*h, const void
>> > >> > > *key);
>> > >> > > > >  *     array of user data. This value is unique for this key.
>> > >> > > > >  */
>> > >> > > > > int32_t
>> > >> > > > >-rte_hash_add_key_with_hash(const struct rte_hash *h, const
>> > >> > > > >void *key,
>> > >> > > hash_sig_t sig);
>> > >> > > > >+rte_hash_add_key_with_hash(struct rte_hash *h, const void
>> > >> > > > >+*key,
>> > >> > > hash_sig_t sig);
>> > >> > > > >
>> > >> > > > > /
>> > >> > > >
>> > >> > > > I think the above changes will break ABI by changing the
>> > >> > > > parameter
>> > >> type?
>> > >> > > Other people may know better on this.
>> > >> > >
>> > >> > > Just removing a const should not change the ABI, I believe,
>> > >> > > since the const is just advisory hint to the compiler. Actual
>> > >> > > parameter size and count remains unchanged so I don't believe there
>> is an issue.
>> > >> > > [ABI experts, please correct me if I'm wrong on this]
>> > >> >
>> > >> >
>> > >> > [Certainly no ABI expert, but...]
>> > >> >
>> > >> > I think this is an API break, not ABI break.
>> > >> >
>> > >> > Given application code as follows, it will fail to compile - even
>> > >> > though
>> > >> running
>> > >> > the new code as a .so wouldn't cause any issues (AFAIK).
>> > >> >
>> > >> > void do_hash_stuff(const struct rte_hash *h, ...) {
>> > >> >     /* parameter passed in is const, but updated function
>> > >> > prototype is
>> > >> non-
>> > >> > const */
>> > >> >     rte_hash_add_key_with_hash(h, ...); }
>> > >> >
>> > >> > This means that we can't recompile apps against latest patch
>> > >> > without application code changes, if the app was passing a const
>> > >> > rte_hash struct
>> > >> as
>> > >> > the first parameter.
>> > >> >
>> > >> Agree. Do we need to do anything for this?
>> > >
>> > >I think we should try to avoid breaking API wherever possible.
>> > >If we must, then I suppose we could follow the ABI process of a
>> > >deprecation notice.
>> > >
>> > >From my reading of the versioning docs, it doesn't document this case:
>> > >https://doc.dpdk.org/guides/contributing/versioning.html
>> > >
>> > >I don't recall a similar situation in DPDK previously - so I suggest
>> > >you ask Tech board for input here.
>> > >
>> > >Hope that helps! -Harry
>> > [Wang, Yipeng]
>> > Honnappa, how about use a pointer to the counter in the rte_hash
>> > struct instead of the counter? Will this avoid API change?
>> I think it defeats the purpose of 'const' parameter to the API and provides
>> incorrect information to the user.
>Yipeng, I think I have misunderstood your comment. I believe you meant; we could allocate memory to the counter and store the
>pointer in the structure. Please correct me if I am wrong.
>This could be a solution, though it will be another cache line access. It might be ok given that it is a single cache line for entire hash
>table.
[Wang, Yipeng] Yeah that is what I meant. It is an additional memory access but probably it will be in local cache.
Since time is tight, it could be a simple workaround for this version and in future you can extend this pointed counter to a counter array as Ola suggested and the
Cuckooo switch paper did for scaling issue. 

>
>> IMO, DPDK should have guidelines on how to handle the API compatibility
>> breaks. I will send an email to tech board on this.
>> We can also solve this by having counters on the bucket. I was planning to do
>> this little bit later. I will look at the effort involved and may be do it now.
  

Patch

diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 2d89158..1e4a8d4 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -256,6 +256,7 @@  rte_hash_create(const struct rte_hash_parameters *params)
 #endif
 	/* Setup hash context */
 	snprintf(h->name, sizeof(h->name), "%s", params->name);
+	h->tbl_chng_cnt = 0;
 	h->entries = params->entries;
 	h->key_len = params->key_len;
 	h->key_entry_size = key_entry_size;
@@ -588,7 +589,7 @@  rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
  * return 0 if succeeds.
  */
 static inline int
-rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
+rte_hash_cuckoo_move_insert_mw(struct rte_hash *h,
 			struct rte_hash_bucket *bkt,
 			struct rte_hash_bucket *alt_bkt,
 			const struct rte_hash_key *key, void *data,
@@ -639,11 +640,27 @@  rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
 		if (unlikely(&h->buckets[prev_alt_bkt_idx]
 				!= curr_bkt)) {
 			/* revert it to empty, otherwise duplicated keys */
-			curr_bkt->key_idx[curr_slot] = EMPTY_SLOT;
+			__atomic_store_n(&curr_bkt->key_idx[curr_slot],
+				EMPTY_SLOT,
+				__ATOMIC_RELEASE);
 			__hash_rw_writer_unlock(h);
 			return -1;
 		}
 
+		/* Inform the previous move. The current move need
+		 * not be informed now as the current bucket entry
+		 * is present in both primary and secondary.
+		 * Since there is one writer, load acquires on
+		 * tbl_chng_cnt are not required.
+		 */
+		__atomic_store_n(&h->tbl_chng_cnt,
+				 h->tbl_chng_cnt + 1,
+				 __ATOMIC_RELEASE);
+		/* The stores to sig_alt and sig_current should not
+		 * move above the store to tbl_chng_cnt.
+		 */
+		__atomic_thread_fence(__ATOMIC_RELEASE);
+
 		/* Need to swap current/alt sig to allow later
 		 * Cuckoo insert to move elements back to its
 		 * primary bucket if available
@@ -662,6 +679,20 @@  rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
 		curr_bkt = curr_node->bkt;
 	}
 
+	/* Inform the previous move. The current move need
+	 * not be informed now as the current bucket entry
+	 * is present in both primary and secondary.
+	 * Since there is one writer, load acquires on
+	 * tbl_chng_cnt are not required.
+	 */
+	__atomic_store_n(&h->tbl_chng_cnt,
+			 h->tbl_chng_cnt + 1,
+			 __ATOMIC_RELEASE);
+	/* The stores to sig_alt and sig_current should not
+	 * move above the store to tbl_chng_cnt.
+	 */
+	__atomic_thread_fence(__ATOMIC_RELEASE);
+
 	curr_bkt->sig_current[curr_slot] = sig;
 	curr_bkt->sig_alt[curr_slot] = alt_hash;
 	/* Release the new bucket entry */
@@ -680,7 +711,7 @@  rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h,
  * Cuckoo
  */
 static inline int
-rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
+rte_hash_cuckoo_make_space_mw(struct rte_hash *h,
 			struct rte_hash_bucket *bkt,
 			struct rte_hash_bucket *sec_bkt,
 			const struct rte_hash_key *key, void *data,
@@ -728,7 +759,7 @@  rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
 }
 
 static inline int32_t
-__rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
+__rte_hash_add_key_with_hash(struct rte_hash *h, const void *key,
 						hash_sig_t sig, void *data)
 {
 	hash_sig_t alt_hash;
@@ -844,7 +875,7 @@  __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
 }
 
 int32_t
-rte_hash_add_key_with_hash(const struct rte_hash *h,
+rte_hash_add_key_with_hash(struct rte_hash *h,
 			const void *key, hash_sig_t sig)
 {
 	RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
@@ -852,14 +883,14 @@  rte_hash_add_key_with_hash(const struct rte_hash *h,
 }
 
 int32_t
-rte_hash_add_key(const struct rte_hash *h, const void *key)
+rte_hash_add_key(struct rte_hash *h, const void *key)
 {
 	RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
 	return __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), 0);
 }
 
 int
-rte_hash_add_key_with_hash_data(const struct rte_hash *h,
+rte_hash_add_key_with_hash_data(struct rte_hash *h,
 			const void *key, hash_sig_t sig, void *data)
 {
 	int ret;
@@ -873,7 +904,7 @@  rte_hash_add_key_with_hash_data(const struct rte_hash *h,
 }
 
 int
-rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data)
+rte_hash_add_key_data(struct rte_hash *h, const void *key, void *data)
 {
 	int ret;
 
@@ -926,30 +957,56 @@  __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
 	uint32_t bucket_idx;
 	hash_sig_t alt_hash;
 	struct rte_hash_bucket *bkt;
+	uint32_t cnt_b, cnt_a;
 	int ret;
 
-	bucket_idx = sig & h->bucket_bitmask;
-	bkt = &h->buckets[bucket_idx];
-
 	__hash_rw_reader_lock(h);
 
-	/* Check if key is in primary location */
-	ret = search_one_bucket(h, key, sig, data, bkt);
-	if (ret != -1) {
-		__hash_rw_reader_unlock(h);
-		return ret;
-	}
-	/* Calculate secondary hash */
-	alt_hash = rte_hash_secondary_hash(sig);
-	bucket_idx = alt_hash & h->bucket_bitmask;
-	bkt = &h->buckets[bucket_idx];
+	do {
+		/* Load the table change counter before the lookup
+		 * starts. Acquire semantics will make sure that
+		 * loads in search_one_bucket are not hoisted.
+		 */
+		cnt_b = __atomic_load_n(&h->tbl_chng_cnt,
+				__ATOMIC_ACQUIRE);
+
+		bucket_idx = sig & h->bucket_bitmask;
+		bkt = &h->buckets[bucket_idx];
+
+		/* Check if key is in primary location */
+		ret = search_one_bucket(h, key, sig, data, bkt);
+		if (ret != -1) {
+			__hash_rw_reader_unlock(h);
+			return ret;
+		}
+		/* Calculate secondary hash */
+		alt_hash = rte_hash_secondary_hash(sig);
+		bucket_idx = alt_hash & h->bucket_bitmask;
+		bkt = &h->buckets[bucket_idx];
+
+		/* Check if key is in secondary location */
+		ret = search_one_bucket(h, key, alt_hash, data, bkt);
+		if (ret != -1) {
+			__hash_rw_reader_unlock(h);
+			return ret;
+		}
+
+		/* The loads of sig_current in search_one_bucket
+		 * should not move below the load from tbl_chng_cnt.
+		 */
+		__atomic_thread_fence(__ATOMIC_ACQUIRE);
+		/* Re-read the table change counter to check if the
+		 * table has changed during search. If yes, re-do
+		 * the search.
+		 * This load should not get hoisted. The load
+		 * acquires on cnt_b, key index in primary bucket
+		 * and key index in secondary bucket will make sure
+		 * that it does not get hoisted.
+		 */
+		cnt_a = __atomic_load_n(&h->tbl_chng_cnt,
+					__ATOMIC_ACQUIRE);
+	} while (cnt_b != cnt_a);
 
-	/* Check if key is in secondary location */
-	ret = search_one_bucket(h, key, alt_hash, data, bkt);
-	if (ret != -1) {
-		__hash_rw_reader_unlock(h);
-		return ret;
-	}
 	__hash_rw_reader_unlock(h);
 	return -ENOENT;
 }
@@ -1190,6 +1247,7 @@  __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
 	uint32_t prim_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0};
 	uint32_t sec_hitmask[RTE_HASH_LOOKUP_BULK_MAX] = {0};
 	void *pdata[RTE_HASH_LOOKUP_BULK_MAX];
+	uint32_t cnt_b, cnt_a;
 
 	/* Prefetch first keys */
 	for (i = 0; i < PREFETCH_OFFSET && i < num_keys; i++)
@@ -1225,102 +1283,137 @@  __rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
 	}
 
 	__hash_rw_reader_lock(h);
-	/* Compare signatures and prefetch key slot of first hit */
-	for (i = 0; i < num_keys; i++) {
-		compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
+	do {
+		/* Load the table change counter before the lookup
+		 * starts. Acquire semantics will make sure that
+		 * loads in compare_signatures are not hoisted.
+		 */
+		cnt_b = __atomic_load_n(&h->tbl_chng_cnt,
+					__ATOMIC_ACQUIRE);
+
+		/* Compare signatures and prefetch key slot of first hit */
+		for (i = 0; i < num_keys; i++) {
+			compare_signatures(&prim_hitmask[i], &sec_hitmask[i],
 				primary_bkt[i], secondary_bkt[i],
 				prim_hash[i], sec_hash[i], h->sig_cmp_fn);
 
-		if (prim_hitmask[i]) {
-			uint32_t first_hit = __builtin_ctzl(prim_hitmask[i]);
-			uint32_t key_idx = primary_bkt[i]->key_idx[first_hit];
-			const struct rte_hash_key *key_slot =
-				(const struct rte_hash_key *)(
-				(const char *)h->key_store +
-				key_idx * h->key_entry_size);
-			rte_prefetch0(key_slot);
-			continue;
-		}
+			if (prim_hitmask[i]) {
+				uint32_t first_hit =
+						__builtin_ctzl(prim_hitmask[i]);
+				uint32_t key_idx =
+					primary_bkt[i]->key_idx[first_hit];
+				const struct rte_hash_key *key_slot =
+					(const struct rte_hash_key *)(
+					(const char *)h->key_store +
+					key_idx * h->key_entry_size);
+				rte_prefetch0(key_slot);
+				continue;
+			}
 
-		if (sec_hitmask[i]) {
-			uint32_t first_hit = __builtin_ctzl(sec_hitmask[i]);
-			uint32_t key_idx = secondary_bkt[i]->key_idx[first_hit];
-			const struct rte_hash_key *key_slot =
-				(const struct rte_hash_key *)(
-				(const char *)h->key_store +
-				key_idx * h->key_entry_size);
-			rte_prefetch0(key_slot);
+			if (sec_hitmask[i]) {
+				uint32_t first_hit =
+						__builtin_ctzl(sec_hitmask[i]);
+				uint32_t key_idx =
+					secondary_bkt[i]->key_idx[first_hit];
+				const struct rte_hash_key *key_slot =
+					(const struct rte_hash_key *)(
+					(const char *)h->key_store +
+					key_idx * h->key_entry_size);
+				rte_prefetch0(key_slot);
+			}
 		}
-	}
 
-	/* Compare keys, first hits in primary first */
-	for (i = 0; i < num_keys; i++) {
-		positions[i] = -ENOENT;
-		while (prim_hitmask[i]) {
-			uint32_t hit_index = __builtin_ctzl(prim_hitmask[i]);
+		/* Compare keys, first hits in primary first */
+		for (i = 0; i < num_keys; i++) {
+			positions[i] = -ENOENT;
+			while (prim_hitmask[i]) {
+				uint32_t hit_index =
+						__builtin_ctzl(prim_hitmask[i]);
 
-			uint32_t key_idx =
-			__atomic_load_n(
-				&primary_bkt[i]->key_idx[hit_index],
-				__ATOMIC_ACQUIRE);
-			const struct rte_hash_key *key_slot =
-				(const struct rte_hash_key *)(
-				(const char *)h->key_store +
-				key_idx * h->key_entry_size);
-
-			if (key_idx != EMPTY_SLOT)
-				pdata[i] = __atomic_load_n(&key_slot->pdata,
-						__ATOMIC_ACQUIRE);
-			/*
-			 * If key index is 0, do not compare key,
-			 * as it is checking the dummy slot
-			 */
-			if (!!key_idx & !rte_hash_cmp_eq(key_slot->key, keys[i], h)) {
-				if (data != NULL)
-					data[i] = pdata[i];
+				uint32_t key_idx =
+				__atomic_load_n(
+					&primary_bkt[i]->key_idx[hit_index],
+					__ATOMIC_ACQUIRE);
+				const struct rte_hash_key *key_slot =
+					(const struct rte_hash_key *)(
+					(const char *)h->key_store +
+					key_idx * h->key_entry_size);
 
-				hits |= 1ULL << i;
-				positions[i] = key_idx - 1;
-				goto next_key;
+				if (key_idx != EMPTY_SLOT)
+					pdata[i] = __atomic_load_n(
+							&key_slot->pdata,
+							__ATOMIC_ACQUIRE);
+				/*
+				 * If key index is 0, do not compare key,
+				 * as it is checking the dummy slot
+				 */
+				if (!!key_idx &
+					!rte_hash_cmp_eq(
+						key_slot->key, keys[i], h)) {
+					if (data != NULL)
+						data[i] = pdata[i];
+
+					hits |= 1ULL << i;
+					positions[i] = key_idx - 1;
+					goto next_key;
+				}
+				prim_hitmask[i] &= ~(1 << (hit_index));
 			}
-			prim_hitmask[i] &= ~(1 << (hit_index));
-		}
 
-		while (sec_hitmask[i]) {
-			uint32_t hit_index = __builtin_ctzl(sec_hitmask[i]);
+			while (sec_hitmask[i]) {
+				uint32_t hit_index =
+						__builtin_ctzl(sec_hitmask[i]);
 
-			uint32_t key_idx =
-			__atomic_load_n(
-				&secondary_bkt[i]->key_idx[hit_index],
-				__ATOMIC_ACQUIRE);
-			const struct rte_hash_key *key_slot =
-				(const struct rte_hash_key *)(
-				(const char *)h->key_store +
-				key_idx * h->key_entry_size);
-
-			if (key_idx != EMPTY_SLOT)
-				pdata[i] = __atomic_load_n(&key_slot->pdata,
-						__ATOMIC_ACQUIRE);
-
-			/*
-			 * If key index is 0, do not compare key,
-			 * as it is checking the dummy slot
-			 */
+				uint32_t key_idx =
+				__atomic_load_n(
+					&secondary_bkt[i]->key_idx[hit_index],
+					__ATOMIC_ACQUIRE);
+				const struct rte_hash_key *key_slot =
+					(const struct rte_hash_key *)(
+					(const char *)h->key_store +
+					key_idx * h->key_entry_size);
 
-			if (!!key_idx & !rte_hash_cmp_eq(key_slot->key, keys[i], h)) {
-				if (data != NULL)
-					data[i] = pdata[i];
+				if (key_idx != EMPTY_SLOT)
+					pdata[i] = __atomic_load_n(
+							&key_slot->pdata,
+							__ATOMIC_ACQUIRE);
+				/*
+				 * If key index is 0, do not compare key,
+				 * as it is checking the dummy slot
+				 */
 
-				hits |= 1ULL << i;
-				positions[i] = key_idx - 1;
-				goto next_key;
+				if (!!key_idx &
+					!rte_hash_cmp_eq(
+						key_slot->key, keys[i], h)) {
+					if (data != NULL)
+						data[i] = pdata[i];
+
+					hits |= 1ULL << i;
+					positions[i] = key_idx - 1;
+					goto next_key;
+				}
+				sec_hitmask[i] &= ~(1 << (hit_index));
 			}
-			sec_hitmask[i] &= ~(1 << (hit_index));
-		}
 
 next_key:
-		continue;
-	}
+			continue;
+		}
+
+		/* The loads of sig_current in compare_signatures
+		 * should not move below the load from tbl_chng_cnt.
+		 */
+		__atomic_thread_fence(__ATOMIC_ACQUIRE);
+		/* Re-read the table change counter to check if the
+		 * table has changed during search. If yes, re-do
+		 * the search.
+		 * This load should not get hoisted. The load
+		 * acquires on cnt_b, primary key index and secondary
+		 * key index will make sure that it does not get
+		 * hoisted.
+		 */
+		cnt_a = __atomic_load_n(&h->tbl_chng_cnt,
+					__ATOMIC_ACQUIRE);
+	} while (cnt_b != cnt_a);
 
 	__hash_rw_reader_unlock(h);
 
diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
index b0c7ef9..5ce864c 100644
--- a/lib/librte_hash/rte_cuckoo_hash.h
+++ b/lib/librte_hash/rte_cuckoo_hash.h
@@ -186,6 +186,8 @@  struct rte_hash {
 	 * to the key table.
 	 */
 	rte_rwlock_t *readwrite_lock; /**< Read-write lock thread-safety. */
+	uint32_t tbl_chng_cnt;
+	/**< Indicates if the hash table changed from last read. */
 } __rte_cache_aligned;
 
 struct queue_node {
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 9e7d931..05e024b 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -156,7 +156,7 @@  rte_hash_count(const struct rte_hash *h);
  *   - -ENOSPC if there is no space in the hash for this key.
  */
 int
-rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data);
+rte_hash_add_key_data(struct rte_hash *h, const void *key, void *data);
 
 /**
  * Add a key-value pair with a pre-computed hash value
@@ -180,7 +180,7 @@  rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data);
  *   - -ENOSPC if there is no space in the hash for this key.
  */
 int32_t
-rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key,
+rte_hash_add_key_with_hash_data(struct rte_hash *h, const void *key,
 						hash_sig_t sig, void *data);
 
 /**
@@ -200,7 +200,7 @@  rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key,
  *     array of user data. This value is unique for this key.
  */
 int32_t
-rte_hash_add_key(const struct rte_hash *h, const void *key);
+rte_hash_add_key(struct rte_hash *h, const void *key);
 
 /**
  * Add a key to an existing hash table.
@@ -222,7 +222,7 @@  rte_hash_add_key(const struct rte_hash *h, const void *key);
  *     array of user data. This value is unique for this key.
  */
 int32_t
-rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig);
+rte_hash_add_key_with_hash(struct rte_hash *h, const void *key, hash_sig_t sig);
 
 /**
  * Remove a key from an existing hash table.