[dpdk-dev,v2,08/11] hash: add new functionality to store data in hash table

Message ID 20150626094955.4a999f79@urahara (mailing list archive)
State Not Applicable, archived
Headers

Commit Message

Stephen Hemminger June 26, 2015, 4:49 p.m. UTC
  We did same thing with a slightly different method.

Subject: rte_hash: split key and bucket size

It is useful to store more data in the has bucket than just the key size.
For example, storing an addresss and additional data.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
  

Comments

De Lara Guarch, Pablo June 28, 2015, 10:23 p.m. UTC | #1
Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Friday, June 26, 2015 5:50 PM
> To: De Lara Guarch, Pablo
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 08/11] hash: add new functionality to
> store data in hash table
> 
> We did same thing with a slightly different method.
> 
> Subject: rte_hash: split key and bucket size
> 
> It is useful to store more data in the has bucket than just the key size.
> For example, storing an addresss and additional data.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Did you send this patch? I did not see it in the mailing list...
Anyway, I like the idea of allowing the user to store variable size data (I was storing 8-byte data).
So I have sent a v3 with a more similar method, although I still use the new functions,
and use the parameter data_len for knowing the amount of bytes of data, and I keep constant the input key,
and return the data in another parameter.

Thanks for it!
Pablo
  
Stephen Hemminger June 30, 2015, 7:36 a.m. UTC | #2
On Sun, 28 Jun 2015 22:23:28 +0000
"De Lara Guarch, Pablo" <pablo.de.lara.guarch@intel.com> wrote:

> Hi Stephen,
> 
> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Friday, June 26, 2015 5:50 PM
> > To: De Lara Guarch, Pablo
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 08/11] hash: add new functionality to
> > store data in hash table
> > 
> > We did same thing with a slightly different method.
> > 
> > Subject: rte_hash: split key and bucket size
> > 
> > It is useful to store more data in the has bucket than just the key size.
> > For example, storing an addresss and additional data.
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Did you send this patch? I did not see it in the mailing list...
> Anyway, I like the idea of allowing the user to store variable size data (I was storing 8-byte data).
> So I have sent a v3 with a more similar method, although I still use the new functions,
> and use the parameter data_len for knowing the amount of bytes of data, and I keep constant the input key,
> and return the data in another parameter.
> 
> Thanks for it!
> Pablo

I did not send the patch upstream because it would break the ABI and didn't
want to bother fighting that.
  
Bruce Richardson July 10, 2015, 10:39 a.m. UTC | #3
On Sun, Jun 28, 2015 at 10:23:28PM +0000, De Lara Guarch, Pablo wrote:
> Hi Stephen,
> 
> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Friday, June 26, 2015 5:50 PM
> > To: De Lara Guarch, Pablo
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 08/11] hash: add new functionality to
> > store data in hash table
> > 
> > We did same thing with a slightly different method.
> > 
> > Subject: rte_hash: split key and bucket size
> > 
> > It is useful to store more data in the has bucket than just the key size.
> > For example, storing an addresss and additional data.
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Did you send this patch? I did not see it in the mailing list...
> Anyway, I like the idea of allowing the user to store variable size data (I was storing 8-byte data).
> So I have sent a v3 with a more similar method, although I still use the new functions,
> and use the parameter data_len for knowing the amount of bytes of data, and I keep constant the input key,
> and return the data in another parameter.
> 
> Thanks for it!
> Pablo

Hi Pablo, Stephen,

having looked at the V3, having an arbitrary length of data stored in the hash
table doesn't seem right to me. Since the original versions of this we have support
for storing an 8-byte pointer in the hash table to be returned to the user on
lookup, and I believe this provides the best option from an implementation and
usability perspective.

For cases where the data to be stored is fairly large, for example, 16/32 bytes
or more, storing that in the hash table is going to waste a lot of space, and having
the data stored outside the table is probably a better option, with a table lookup
returning a pointer. We can even have the table lookup prefetch the pointer. It
also allows multiple hash table lookups to point to the same external piece of
data.
For storing integer values, e.g. indexes into an external table, those can be
stored as uintptr_t types in place of the pointer itself.

Regards,
/Bruce
  

Patch

--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -78,7 +78,8 @@  struct rte_hash_parameters {
 	const char *name;		/**< Name of the hash. */
 	uint32_t entries;		/**< Total hash table entries. */
 	uint32_t bucket_entries;	/**< Bucket entries. */
-	uint32_t key_len;		/**< Length of hash key. */
+	uint16_t key_len;		/**< Length of hash key. */
+	uint16_t cmp_len;		/**< Length of hash key compare (bytes). */
 	rte_hash_function hash_func;	/**< Function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
 	int socket_id;			/**< NUMA Socket ID for memory. */
@@ -89,7 +90,8 @@  struct rte_hash {
 	char name[RTE_HASH_NAMESIZE];	/**< Name of the hash. */
 	uint32_t entries;		/**< Total table entries. */
 	uint32_t bucket_entries;	/**< Bucket entries. */
-	uint32_t key_len;		/**< Length of hash key. */
+	uint16_t key_len;		/**< Length of hash key. */
+	uint16_t cmp_len;		/**< Length of hash key compare (bytes) */
 	rte_hash_function hash_func;	/**< Function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
 	uint32_t num_buckets;		/**< Number of buckets in table. */
@@ -240,7 +242,7 @@  rte_hash_del_key_with_hash(const struct
  *     value that was returned when the key was added.
  */
 int32_t
-rte_hash_lookup(const struct rte_hash *h, const void *key);
+rte_hash_lookup(const struct rte_hash *h, void *key);
 
 /**
  * Find a key in the hash table. This operation is multi-thread safe.
@@ -260,7 +262,7 @@  rte_hash_lookup(const struct rte_hash *h
  */
 int32_t
 rte_hash_lookup_with_hash(const struct rte_hash *h,
-				const void *key, hash_sig_t sig);
+			  void *key, hash_sig_t sig);
 
 
 /**
@@ -277,7 +279,7 @@  static inline hash_sig_t
 rte_hash_hash(const struct rte_hash *h, const void *key)
 {
 	/* calc hash result by key */
-	return h->hash_func(key, h->key_len, h->hash_func_init_val);
+	return h->hash_func(key, h->cmp_len, h->hash_func_init_val);
 }
 
 #define rte_hash_lookup_multi rte_hash_lookup_bulk
--- a/lib/librte_hash/rte_hash.c
+++ b/lib/librte_hash/rte_hash.c
@@ -185,7 +185,8 @@  rte_hash_create(const struct rte_hash_pa
 			!rte_is_power_of_2(params->entries) ||
 			!rte_is_power_of_2(params->bucket_entries) ||
 			(params->key_len == 0) ||
-			(params->key_len > RTE_HASH_KEY_LENGTH_MAX)) {
+			(params->key_len > RTE_HASH_KEY_LENGTH_MAX)||
+			(params->cmp_len > params->key_len)) {
 		rte_errno = EINVAL;
 		RTE_LOG(ERR, HASH, "rte_hash_create has invalid parameters\n");
 		return NULL;
@@ -238,6 +239,7 @@  rte_hash_create(const struct rte_hash_pa
 	h->entries = params->entries;
 	h->bucket_entries = params->bucket_entries;
 	h->key_len = params->key_len;
+	h->cmp_len = params->cmp_len ? params->cmp_len : h->key_len;
 	h->hash_func_init_val = params->hash_func_init_val;
 	h->num_buckets = num_buckets;
 	h->bucket_bitmask = h->num_buckets - 1;
@@ -310,7 +312,7 @@  __rte_hash_add_key_with_hash(const struc
 	for (i = 0; i < h->bucket_entries; i++) {
 		if ((sig == sig_bucket[i]) &&
 		    likely(memcmp(key, get_key_from_bucket(h, key_bucket, i),
-				  h->key_len) == 0)) {
+				  h->cmp_len) == 0)) {
 			return bucket_index * h->bucket_entries + i;
 		}
 	}
@@ -360,7 +362,7 @@  __rte_hash_del_key_with_hash(const struc
 	for (i = 0; i < h->bucket_entries; i++) {
 		if ((sig == sig_bucket[i]) &&
 		    likely(memcmp(key, get_key_from_bucket(h, key_bucket, i),
-				  h->key_len) == 0)) {
+				  h->cmp_len) == 0)) {
 			sig_bucket[i] = NULL_SIGNATURE;
 			return bucket_index * h->bucket_entries + i;
 		}
@@ -386,7 +388,7 @@  rte_hash_del_key(const struct rte_hash *
 
 static inline int32_t
 __rte_hash_lookup_with_hash(const struct rte_hash *h,
-			const void *key, hash_sig_t sig)
+			    void *key, hash_sig_t sig)
 {
 	hash_sig_t *sig_bucket;
 	uint8_t *key_bucket;
@@ -402,7 +404,9 @@  __rte_hash_lookup_with_hash(const struct
 	for (i = 0; i < h->bucket_entries; i++) {
 		if ((sig == sig_bucket[i]) &&
 		    likely(memcmp(key, get_key_from_bucket(h, key_bucket, i),
-				  h->key_len) == 0)) {
+				  h->cmp_len) == 0)) {
+		        rte_memcpy(key, get_key_from_bucket(h, key_bucket, i),
+				   h->key_len);
 			return bucket_index * h->bucket_entries + i;
 		}
 	}
@@ -412,14 +416,14 @@  __rte_hash_lookup_with_hash(const struct
 
 int32_t
 rte_hash_lookup_with_hash(const struct rte_hash *h,
-			const void *key, hash_sig_t sig)
+			  void *key, hash_sig_t sig)
 {
 	RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
 	return __rte_hash_lookup_with_hash(h, key, sig);
 }
 
 int32_t
-rte_hash_lookup(const struct rte_hash *h, const void *key)
+rte_hash_lookup(const struct rte_hash *h, void *key)
 {
 	RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
 	return __rte_hash_lookup_with_hash(h, key, rte_hash_hash(h, key));
@@ -438,7 +442,7 @@  rte_hash_lookup_bulk(const struct rte_ha
 
 	/* Get the hash signature and bucket index */
 	for (i = 0; i < num_keys; i++) {
-		sigs[i] = h->hash_func(keys[i], h->key_len,
+		sigs[i] = h->hash_func(keys[i], h->cmp_len,
 				h->hash_func_init_val) | h->sig_msb;
 		bucket_index = sigs[i] & h->bucket_bitmask;
 
@@ -459,7 +463,7 @@  rte_hash_lookup_bulk(const struct rte_ha
 			if ((sigs[i] == sig_bucket[j]) &&
 			    likely(memcmp(keys[i],
 					  get_key_from_bucket(h, key_bucket, j),
-					  h->key_len) == 0)) {
+					  h->cmp_len) == 0)) {
 				positions[i] = bucket_index *
 					h->bucket_entries + j;
 				break;