[dpdk-dev] Fix for LRU corrupted returns

Message ID A4A7819F3AE089479B8A88B97202C75D37BE86F9@ex10-mbx-9001.ant.amazon.com (mailing list archive)
State Not Applicable, archived

Commit Message

Saha, Avik (AWS) Sept. 25, 2014, 7:46 a.m. UTC
This is a patch for a problem I ran into (described in the thread); it works for me.

1)      data_size_shl was getting its value from key_size, so the table data entries were corrupted whenever the shift was computed from a key_size that differs from entry_size (according to the document, key_size and entry_size are independently configurable). With this fix, we take the MSB that is set in entry_size. This also removes the constraint that entry_size be a power of 2; I am not entirely sure whether this was the reason the constraint was kept, though.
2)      The document does not say that entry_size needs to be a power of 2, and the check was failing silently when I was trying to bring my application up.
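
To make point 1 concrete, here is a minimal sketch of how a shift taken from the wrong size misplaces entries. The helper names entry_offset and shl_from_pow2 are hypothetical and not part of librte_table, and the sizes in the note below are example values rather than the ones from my application.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of entry `index` when entries are located by shifting. */
static inline uint32_t entry_offset(uint32_t index, uint32_t shl)
{
	return index << shl;
}

/* Shift for a size that is assumed to be a non-zero power of two. */
static inline uint32_t shl_from_pow2(uint32_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (uint32_t)__builtin_ctz(size);
}
```

With a 16-byte key and 32-byte entries, a shift derived from the key size places entry 1 at offset 16, which is still inside the 32 bytes that belong to entry 0. A shift derived from the entry size places it at offset 32.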


-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Saha, Avik (AWS)
Sent: Wednesday, September 24, 2014 4:12 PM
To: dev@dpdk.org
Subject: [dpdk-dev] Strange behaviour with LRU table

1)      All the calls to add entries succeed.

2)      The key look up works as expected.

3)      The value (entry_data) that is returned is incorrect for every other entry: the 1st entry_data on .f_action_hit is wrong, the 2nd entry_data on .f_action_hit is correct, and so on.

I have initialized my LRU as follows:

    struct rte_pipeline_table_params table_params = {
            .ops = &rte_table_hash_lru_dosig_ops,
            .arg_create = &rule_tbl_params,
            .f_action_hit = rw_pipeline_stage_2_cache_hit,
            .f_action_miss = rw_pipeline_stage_2_cache_miss,
            .arg_ah = (void *)lcore_params,
            .action_data_size = 16,
    };


Is there something obvious I am missing? At first look it seems to be a problem with cache lines, but I really am not sure.

Avik
  

Comments

Neil Horman Sept. 25, 2014, 10:21 a.m. UTC | #1
On Thu, Sep 25, 2014 at 07:46:16AM +0000, Saha, Avik (AWS) wrote:
> This is a patch for a problem I ran into (described in the thread); it works for me.
> 
> 1)      data_size_shl was getting its value from key_size, so the table data entries were corrupted whenever the shift was computed from a key_size that differs from entry_size (according to the document, key_size and entry_size are independently configurable). With this fix, we take the MSB that is set in entry_size. This also removes the constraint that entry_size be a power of 2; I am not entirely sure whether this was the reason the constraint was kept, though.
> 2)      The document does not say that entry_size needs to be a power of 2, and the check was failing silently when I was trying to bring my application up.
> 
> diff --git a/DPDK/lib/librte_table/rte_table_hash_lru.c b/DPDK/lib/librte_table/rte_table_hash_lru.c
> index d1a4984..4ec9aa4 100644
> --- a/DPDK/lib/librte_table/rte_table_hash_lru.c
> +++ b/DPDK/lib/librte_table/rte_table_hash_lru.c
> @@ -153,8 +153,10 @@ rte_table_hash_lru_create(void *params, int socket_id, uint32_t entry_size)
>         uint32_t i;
> 
>         /* Check input parameters */
> -       if ((check_params_create(p) != 0) ||
> -               (!rte_is_power_of_2(entry_size)) ||
> +       // Commenting out the power of 2 check on the entry_size since the
> +       // Programmers Guide does not call this out and we are going to handle
> +       // the data_size_shl of the table later on (Line 197)
Please remove the reference to Line 197 here.  That's not going to remain
accurate for very long.

> +       if ((check_params_create(p) != 0) ||
>                 ((sizeof(struct rte_table_hash) % CACHE_LINE_SIZE) != 0) ||
>                 (sizeof(struct bucket) != (CACHE_LINE_SIZE / 2))) {
>                 return NULL;
> @@ -192,7 +194,7 @@ rte_table_hash_lru_create(void *params, int socket_id, uint32_t entry_size)
>         /* Internal */
>         t->bucket_mask = t->n_buckets - 1;
>         t->key_size_shl = __builtin_ctzl(p->key_size);
> -       t->data_size_shl = __builtin_ctzl(p->key_size);
> +       t->data_size_shl = 32 - (__builtin_clz(entry_size));
I presume the 32 value here is a cache line size?  That should be replaced with
CACHE_LINE_SIZE... Though looking at it, that doesn't seem sufficient.  Seems
like we need an EAL abstraction to dynamically tell us what the cache line size
is (we can read it from /proc/cpuinfo in Linux, not sure about BSD).

Neil
  
Saha, Avik (AWS) Sept. 30, 2014, 6:26 a.m. UTC | #2
Sorry about the delay. The number 32 is not really a CACHE_LINE_SIZE. Since __builtin_clz returns the number of leading 0s before the most significant set bit in a 32-bit number (entry_size is uint32_t), I subtract that number from 32 to get the number of bits up to and including the most significant set bit. This will be the separation in my data_mem regions.
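
As a worked check of that arithmetic, the sketch below compares 32 - __builtin_clz(x) with a ceiling log2. Both helper names are hypothetical, and the code assumes unsigned int is 32 bits wide. One thing the numbers show: for an exact power of two, the expression in the patch yields one more than log2, so the stride (1 << shift) is twice the entry size.

```c
#include <assert.h>
#include <stdint.h>

/* Expression used in the patch: bits up to and including the MSB. */
static inline uint32_t bit_length32(uint32_t x)
{
	assert(x != 0); /* __builtin_clz(0) is undefined */
	return 32u - (uint32_t)__builtin_clz(x);
}

/* Possible alternative: smallest s such that (1 << s) >= x. */
static inline uint32_t ceil_log2_32(uint32_t x)
{
	assert(x != 0);
	return (x == 1) ? 0 : 32u - (uint32_t)__builtin_clz(x - 1);
}
```

For x = 24 both helpers give 5 (a 32-byte stride). For x = 16, bit_length32 gives 5 while ceil_log2_32 gives 4, and for x = 32 they give 6 and 5. Whether the table's data memory is sized to match the larger stride is not something this sketch checks.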

  
Neil Horman Sept. 30, 2014, 12:51 p.m. UTC | #3
On Tue, Sep 30, 2014 at 06:26:23AM +0000, Saha, Avik (AWS) wrote:
> Sorry about the delay. The number 32 is not really a CACHE_LINE_SIZE. Since __builtin_clz returns the number of leading 0s before the most significant set bit in a 32-bit number (entry_size is uint32_t), I subtract that number from 32 to get the number of bits up to and including the most significant set bit. This will be the separation in my data_mem regions.
> 
Ah, ok, then change that 32 to sizeof(t->data_size_shl) to protect you
against type changes and to avoid having magic values running around in your
code.  Also, you might want to do some sanity checking of entry_size, as it seems
like there's a soft assumption that entry size is non-zero and a power of two.
While the latter is checked higher in the function, the former isn't, and
__builtin_clz has undefined behavior if it's passed a zero value.
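
A minimal sketch of those two suggestions follows, with a hypothetical helper name. It makes two assumptions that go beyond the text above: sizeof yields bytes, so the width is multiplied by CHAR_BIT to get bits, and the width is taken from unsigned int (the argument type of __builtin_clz) rather than from the field that stores the result.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stores the bit length of entry_size in *out; returns -1 for zero input. */
static inline int checked_bit_length(uint32_t entry_size, uint32_t *out)
{
	/* Width in bits of the type that __builtin_clz operates on. */
	const uint32_t width = (uint32_t)(sizeof(unsigned int) * CHAR_BIT);

	assert(out != NULL);
	if (entry_size == 0)
		return -1; /* avoid the undefined __builtin_clz(0) */
	*out = width - (uint32_t)__builtin_clz((unsigned int)entry_size);
	return 0;
}
```

A caller in the create path could return NULL when this helper reports -1, the same way the other parameter checks do.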

Neil

  
Saha, Avik (AWS) Sept. 30, 2014, 6:14 p.m. UTC | #4
I have to point out that I am commenting out the power_of_2 check on entry_size. I am not sure if this is the right way, but I don't know why this soft assumption is important (I cannot find the power of 2 constraint in the documentation). I agree with the 0 check; the only reason I did not put it in is that entry size would be at least sizeof(struct rte_pipeline_table_entry) = 8 bytes (to which the action_data_size is added).
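
For reference, the sizes in my setup work out to 8 + 16 = 24 bytes, which is not a power of 2. If the check is kept, one possible workaround is to pad action_data_size so that the total becomes a power of 2. This is only a sketch: the helper names are hypothetical, and the 8-byte header size is passed in as a parameter rather than taken from the DPDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= x, for 1 <= x <= 2^31. */
static inline uint32_t next_pow2(uint32_t x)
{
	assert(x != 0 && x <= 0x80000000u);
	return (x == 1) ? 1u : 1u << (32u - (uint32_t)__builtin_clz(x - 1));
}

/* Action data size that makes header_size + data size a power of two. */
static inline uint32_t padded_action_data_size(uint32_t header_size,
					       uint32_t requested)
{
	return next_pow2(header_size + requested) - header_size;
}
```

With an 8-byte header, a requested action_data_size of 16 becomes 24, so the entry size is 32.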

Avik

  
Neil Horman Sept. 30, 2014, 6:33 p.m. UTC | #5
On Tue, Sep 30, 2014 at 06:14:46PM +0000, Saha, Avik (AWS) wrote:
> I have to point out that I am commenting out the power_of_2 check on entry_size. I am not sure if this is the right way, but I don't know why this soft assumption is important (I cannot find the power of 2 constraint in the documentation). I agree with the 0 check; the only reason I did not put it in is that entry size would be at least sizeof(struct rte_pipeline_table_entry) = 8 bytes (to which the action_data_size is added).
> 
> Avik
> 
I would imagine the power of two check is in place specifically because of the
zero bit searches immediately below it.  I.e. you can't really create bit masks
for multi-field values, when those fields aren't contiguous.
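
To illustrate why a trailing-zero-count shift needs a power-of-two size, here is a small sketch with hypothetical helper names. It compares locating an entry by multiplication, which works for any size, with locating it by a shift taken from __builtin_ctz, which matches the multiplication only when the size has a single bit set.

```c
#include <assert.h>
#include <stdint.h>

static inline int is_pow2(uint32_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Offset by multiplication: valid for any entry size. */
static inline uint32_t offset_mul(uint32_t index, uint32_t entry_size)
{
	return index * entry_size;
}

/* Offset by shift: only meaningful for power-of-two entry sizes. */
static inline uint32_t offset_shl(uint32_t index, uint32_t entry_size)
{
	assert(is_pow2(entry_size));
	return index << __builtin_ctz(entry_size);
}
```

For entry_size = 32 both helpers agree. For entry_size = 24, __builtin_ctz returns 3, so a shift would place entry 3 at offset 24 instead of 72.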

Neil

  

Patch

diff --git a/DPDK/lib/librte_table/rte_table_hash_lru.c b/DPDK/lib/librte_table/rte_table_hash_lru.c
index d1a4984..4ec9aa4 100644
--- a/DPDK/lib/librte_table/rte_table_hash_lru.c
+++ b/DPDK/lib/librte_table/rte_table_hash_lru.c
@@ -153,8 +153,10 @@  rte_table_hash_lru_create(void *params, int socket_id, uint32_t entry_size)
        uint32_t i;

        /* Check input parameters */
-       if ((check_params_create(p) != 0) ||
-               (!rte_is_power_of_2(entry_size)) ||
+       // Commenting out the power of 2 check on the entry_size since the
+       // Programmers Guide does not call this out and we are going to handle
+       // the data_size_shl of the table later on (Line 197)
+       if ((check_params_create(p) != 0) ||
                ((sizeof(struct rte_table_hash) % CACHE_LINE_SIZE) != 0) ||
                (sizeof(struct bucket) != (CACHE_LINE_SIZE / 2))) {
                return NULL;
@@ -192,7 +194,7 @@  rte_table_hash_lru_create(void *params, int socket_id, uint32_t entry_size)
        /* Internal */
        t->bucket_mask = t->n_buckets - 1;
        t->key_size_shl = __builtin_ctzl(p->key_size);
-       t->data_size_shl = __builtin_ctzl(p->key_size);
+       t->data_size_shl = 32 - (__builtin_clz(entry_size));

        /* Tables */
        table_meta_offset = 0;