[dpdk-dev,v2] ring: use aligned memzone allocation

Message ID 20170602201213.51143-1-daniel.verkamp@intel.com (mailing list archive)
State Accepted, archived
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Daniel Verkamp June 2, 2017, 8:12 p.m. UTC
  rte_memzone_reserve() provides cache line alignment, but
struct rte_ring may require more than cache line alignment: on x86-64,
it needs 128-byte alignment due to PROD_ALIGN and CONS_ALIGN, which are
128 bytes, but cache line size is 64 bytes.

Fixes runtime warnings with UBSan enabled.

Fixes: d9f0d3a1ffd4 ("ring: remove split cacheline build setting")

Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
---

v2: fixed checkpatch warnings

 lib/librte_ring/rte_ring.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Ananyev, Konstantin June 2, 2017, 8:51 p.m. UTC | #1
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Daniel Verkamp
> Sent: Friday, June 2, 2017 9:12 PM
> To: dev@dpdk.org
> Cc: Verkamp, Daniel <daniel.verkamp@intel.com>
> Subject: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> rte_memzone_reserve() provides cache line alignment, but
> struct rte_ring may require more than cache line alignment: on x86-64,
> it needs 128-byte alignment due to PROD_ALIGN and CONS_ALIGN, which are
> 128 bytes, but cache line size is 64 bytes.

Hmm but what for?
I understand we need our rte_ring cche-line aligned,
but why do you want it 2 cache-line aligned?
Konstantin 

> 
> Fixes runtime warnings with UBSan enabled.
> 
> Fixes: d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> 
> Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
> ---
> 
> v2: fixed checkpatch warnings
> 
>  lib/librte_ring/rte_ring.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> index 5f98c33..6f58faf 100644
> --- a/lib/librte_ring/rte_ring.c
> +++ b/lib/librte_ring/rte_ring.c
> @@ -189,7 +189,8 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
>  	/* reserve a memory zone for this ring. If we can't get rte_config or
>  	 * we are secondary process, the memzone_reserve function will set
>  	 * rte_errno for us appropriately - hence no check in this this function */
> -	mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> +	mz = rte_memzone_reserve_aligned(mz_name, ring_size, socket_id,
> +					 mz_flags, __alignof__(*r));
>  	if (mz != NULL) {
>  		r = mz->addr;
>  		/* no need to check return value here, we already checked the
> --
> 2.9.4
  
Daniel Verkamp June 2, 2017, 10:24 p.m. UTC | #2
The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members of struct rte_ring are 128 byte aligned, and therefore the whole struct needs 128-byte alignment according to the ABI so that the 128-byte alignment of the fields can be guaranteed.

If the allocation is only 64-byte aligned, the beginning of the prod and cons fields may not actually be 128-byte aligned (but we've told the compiler that they are using the __rte_aligned macro).  Accessing these fields when they are misaligned will work in practice on x86 (as long as the compiler doesn't use e.g. aligned SSE instructions), but it is undefined behavior according to the C standard, and UBSan (-fsanitize=undefined) checks for this.

Thanks,
-- Daniel Verkamp

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Friday, June 2, 2017 1:52 PM
> To: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Cc: Verkamp, Daniel <daniel.verkamp@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Daniel Verkamp
> > Sent: Friday, June 2, 2017 9:12 PM
> > To: dev@dpdk.org
> > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>
> > Subject: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> >
> > rte_memzone_reserve() provides cache line alignment, but
> > struct rte_ring may require more than cache line alignment: on x86-64,
> > it needs 128-byte alignment due to PROD_ALIGN and CONS_ALIGN, which are
> > 128 bytes, but cache line size is 64 bytes.
> 
> Hmm but what for?
> I understand we need our rte_ring cche-line aligned,
> but why do you want it 2 cache-line aligned?
> Konstantin
> 
> >
> > Fixes runtime warnings with UBSan enabled.
> >
> > Fixes: d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> >
> > Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
> > ---
> >
> > v2: fixed checkpatch warnings
> >
> >  lib/librte_ring/rte_ring.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > index 5f98c33..6f58faf 100644
> > --- a/lib/librte_ring/rte_ring.c
> > +++ b/lib/librte_ring/rte_ring.c
> > @@ -189,7 +189,8 @@ rte_ring_create(const char *name, unsigned count, int
> socket_id,
> >  	/* reserve a memory zone for this ring. If we can't get rte_config or
> >  	 * we are secondary process, the memzone_reserve function will set
> >  	 * rte_errno for us appropriately - hence no check in this this function */
> > -	mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > +	mz = rte_memzone_reserve_aligned(mz_name, ring_size, socket_id,
> > +					 mz_flags, __alignof__(*r));
> >  	if (mz != NULL) {
> >  		r = mz->addr;
> >  		/* no need to check return value here, we already checked the
> > --
> > 2.9.4
  
Ananyev, Konstantin June 3, 2017, 10 a.m. UTC | #3
> 
> The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members of struct rte_ring are 128 byte aligned, 
>and therefore the whole struct needs 128-byte alignment according to the ABI so that the 128-byte alignment of the fields can be guaranteed.

Ah ok, missed the fact that rte_ring is 128B aligned these days.
BTW, I probably missed the initial discussion, but what was the reason for that?
Konstantin

> 
> If the allocation is only 64-byte aligned, the beginning of the prod and cons fields may not actually be 128-byte aligned (but we've told the
> compiler that they are using the __rte_aligned macro).  Accessing these fields when they are misaligned will work in practice on x86 (as long
> as the compiler doesn't use e.g. aligned SSE instructions), but it is undefined behavior according to the C standard, and UBSan (-
> fsanitize=undefined) checks for this.
> 
> Thanks,
> -- Daniel Verkamp
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Friday, June 2, 2017 1:52 PM
> > To: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> >
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Daniel Verkamp
> > > Sent: Friday, June 2, 2017 9:12 PM
> > > To: dev@dpdk.org
> > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>
> > > Subject: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > > rte_memzone_reserve() provides cache line alignment, but
> > > struct rte_ring may require more than cache line alignment: on x86-64,
> > > it needs 128-byte alignment due to PROD_ALIGN and CONS_ALIGN, which are
> > > 128 bytes, but cache line size is 64 bytes.
> >
> > Hmm but what for?
> > I understand we need our rte_ring cche-line aligned,
> > but why do you want it 2 cache-line aligned?
> > Konstantin
> >
> > >
> > > Fixes runtime warnings with UBSan enabled.
> > >
> > > Fixes: d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> > >
> > > Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
> > > ---
> > >
> > > v2: fixed checkpatch warnings
> > >
> > >  lib/librte_ring/rte_ring.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > > index 5f98c33..6f58faf 100644
> > > --- a/lib/librte_ring/rte_ring.c
> > > +++ b/lib/librte_ring/rte_ring.c
> > > @@ -189,7 +189,8 @@ rte_ring_create(const char *name, unsigned count, int
> > socket_id,
> > >  	/* reserve a memory zone for this ring. If we can't get rte_config or
> > >  	 * we are secondary process, the memzone_reserve function will set
> > >  	 * rte_errno for us appropriately - hence no check in this this function */
> > > -	mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > > +	mz = rte_memzone_reserve_aligned(mz_name, ring_size, socket_id,
> > > +					 mz_flags, __alignof__(*r));
> > >  	if (mz != NULL) {
> > >  		r = mz->addr;
> > >  		/* no need to check return value here, we already checked the
> > > --
> > > 2.9.4
  
Daniel Verkamp June 5, 2017, 4:21 p.m. UTC | #4
> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Saturday, June 3, 2017 3:00 AM
> To: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> >
> > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> of struct rte_ring are 128 byte aligned,
> >and therefore the whole struct needs 128-byte alignment according to the ABI
> so that the 128-byte alignment of the fields can be guaranteed.
> 
> Ah ok, missed the fact that rte_ring is 128B aligned these days.
> BTW, I probably missed the initial discussion, but what was the reason for that?
> Konstantin

I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An alternate fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).  Maybe there is some deeper reason for the >= 128-byte alignment logic in rte_ring.h?

Thanks,
-- Daniel
  
Ananyev, Konstantin June 6, 2017, 9:59 a.m. UTC | #5
> >
> >
> >
> > >
> > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > of struct rte_ring are 128 byte aligned,
> > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > so that the 128-byte alignment of the fields can be guaranteed.
> >
> > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > BTW, I probably missed the initial discussion, but what was the reason for that?
> > Konstantin
> 
> I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An alternate
> fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned). 

Yes, had the same thought.

> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?

Might be, would be good to hear opinion the author of that change. 
Thanks
Konstantin
  
Bruce Richardson June 6, 2017, 12:42 p.m. UTC | #6
On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> 
> > >
> > >
> > >
> > > >
> > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > of struct rte_ring are 128 byte aligned,
> > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > so that the 128-byte alignment of the fields can be guaranteed.
> > >
> > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > Konstantin
> > 
> > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An alternate
> > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned). 
> 
> Yes, had the same thought.
> 
> > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> 
> Might be, would be good to hear opinion the author of that change. 

It gives improved performance for core-2-core transfer.

/Bruce
  
Ananyev, Konstantin June 6, 2017, 1:19 p.m. UTC | #7
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Tuesday, June 6, 2017 1:42 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> >
> > > >
> > > >
> > > >
> > > > >
> > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > > of struct rte_ring are 128 byte aligned,
> > > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > >
> > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > Konstantin
> > >
> > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An
> alternate
> > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).
> >
> > Yes, had the same thought.
> >
> > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> >
> > Might be, would be good to hear opinion the author of that change.
> 
> It gives improved performance for core-2-core transfer.

You mean empty cache-line(s) after prod/cons, correct?
That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
Something like that:
struct rte_ring {
   ...
   struct rte_ring_headtail prod __rte_cache_aligned;
   EMPTY_CACHE_LINE   __rte_cache_aligned;
   struct rte_ring_headtail cons __rte_cache_aligned;
   EMPTY_CACHE_LINE   __rte_cache_aligned;
};

Konstantin
  
Bruce Richardson June 6, 2017, 2:56 p.m. UTC | #8
On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Tuesday, June 6, 2017 1:42 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> > >
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > > > of struct rte_ring are 128 byte aligned,
> > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > >
> > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > Konstantin
> > > >
> > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An
> > alternate
> > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).
> > >
> > > Yes, had the same thought.
> > >
> > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > >
> > > Might be, would be good to hear opinion the author of that change.
> > 
> > It gives improved performance for core-2-core transfer.
> 
> You mean empty cache-line(s) after prod/cons, correct?
> That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> Something like that:
> struct rte_ring {
>    ...
>    struct rte_ring_headtail prod __rte_cache_aligned;
>    EMPTY_CACHE_LINE   __rte_cache_aligned;
>    struct rte_ring_headtail cons __rte_cache_aligned;
>    EMPTY_CACHE_LINE   __rte_cache_aligned;
> };
> 
> Konstantin

Sure. That should probably work too. 

/Bruce
  
Olivier Matz June 8, 2017, 12:45 p.m. UTC | #9
On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> > 
> >   
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > 
> > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:  
> > > >  
> > > > > >
> > > > > >
> > > > > >  
> > > > > > >
> > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members  
> > > > > > of struct rte_ring are 128 byte aligned,  
> > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI  
> > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > >
> > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > Konstantin  
> > > > >
> > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An  
> > > alternate  
> > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).  
> > > >
> > > > Yes, had the same thought.
> > > >  
> > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?  
> > > >
> > > > Might be, would be good to hear opinion the author of that change.  
> > > 
> > > It gives improved performance for core-2-core transfer.  
> > 
> > You mean empty cache-line(s) after prod/cons, correct?
> > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > Something like that:
> > struct rte_ring {
> >    ...
> >    struct rte_ring_headtail prod __rte_cache_aligned;
> >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> >    struct rte_ring_headtail cons __rte_cache_aligned;
> >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > };
> > 
> > Konstantin  
> 
> Sure. That should probably work too. 
> 
> /Bruce

I also agree with Konstantin's proposal. One question though: since it
changes the alignment constraint of the rte_ring structure, I think it is
an ABI breakage: a structure including the rte_ring structure inherits
from this constraint.

How could we handle that, knowing this is probably a rare case?
  
Bruce Richardson June 8, 2017, 1:20 p.m. UTC | #10
On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> > > 
> > >   
> > > > -----Original Message-----
> > > > From: Richardson, Bruce
> > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > 
> > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:  
> > > > >  
> > > > > > >
> > > > > > >
> > > > > > >  
> > > > > > > >
> > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members  
> > > > > > > of struct rte_ring are 128 byte aligned,  
> > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI  
> > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > >
> > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > Konstantin  
> > > > > >
> > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An  
> > > > alternate  
> > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).  
> > > > >
> > > > > Yes, had the same thought.
> > > > >  
> > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?  
> > > > >
> > > > > Might be, would be good to hear opinion the author of that change.  
> > > > 
> > > > It gives improved performance for core-2-core transfer.  
> > > 
> > > You mean empty cache-line(s) after prod/cons, correct?
> > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > Something like that:
> > > struct rte_ring {
> > >    ...
> > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > };
> > > 
> > > Konstantin  
> > 
> > Sure. That should probably work too. 
> > 
> > /Bruce
> 
> I also agree with Konstantin's proposal. One question though: since it
> changes the alignment constraint of the rte_ring structure, I think it is
> an ABI breakage: a structure including the rte_ring structure inherits
> from this constraint.
> 
> How could we handle that, knowing this is probably a rare case?
> 
>
Is it an ABI break so long as we keep the resulting size and field
placement of the structures the same? The alignment being reduced should
not be a problem, as 128byte alignment is also valid as 64byte
alignment, after all.

/Bruce
  
Olivier Matz June 8, 2017, 2:05 p.m. UTC | #11
On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:  
> > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:  
> > > > 
> > > >     
> > > > > -----Original Message-----
> > > > > From: Richardson, Bruce
> > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > 
> > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:    
> > > > > >    
> > > > > > > >
> > > > > > > >
> > > > > > > >    
> > > > > > > > >
> > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members    
> > > > > > > > of struct rte_ring are 128 byte aligned,    
> > > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI    
> > > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > > >
> > > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > > Konstantin    
> > > > > > >
> > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An    
> > > > > alternate    
> > > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).    
> > > > > >
> > > > > > Yes, had the same thought.
> > > > > >    
> > > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?    
> > > > > >
> > > > > > Might be, would be good to hear opinion the author of that change.    
> > > > > 
> > > > > It gives improved performance for core-2-core transfer.    
> > > > 
> > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > Something like that:
> > > > struct rte_ring {
> > > >    ...
> > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > };
> > > > 
> > > > Konstantin    
> > > 
> > > Sure. That should probably work too. 
> > > 
> > > /Bruce  
> > 
> > I also agree with Konstantin's proposal. One question though: since it
> > changes the alignment constraint of the rte_ring structure, I think it is
> > an ABI breakage: a structure including the rte_ring structure inherits
> > from this constraint.
> > 
> > How could we handle that, knowing this is probably a rare case?
> > 
> >  
> Is it an ABI break so long as we keep the resulting size and field
> placement of the structures the same? The alignment being reduced should
> not be a problem, as 128byte alignment is also valid as 64byte
> alignment, after all.

I'd say yes. Consider the following example:

---8<---
#include <stdio.h>
#include <stdlib.h>

#define ALIGN 64
/* #define ALIGN 128 */

/* dummy rte_ring struct */
struct rte_ring {
	char x[128];
} __attribute__((aligned(ALIGN)));

struct foo {
	struct rte_ring r;
	unsigned bar;
};

int main(void)
{
	struct foo array[2];

	printf("sizeof(ring)=%zu diff=%u\n",
		sizeof(struct rte_ring),
		(unsigned int)((char *)&array[1].r - (char *)array));

	return 0;
}
---8<---

The size of rte_ring is always 128.
diff is 192 or 256, depending on the value of ALIGN.



Olivier
  
Bruce Richardson June 8, 2017, 2:11 p.m. UTC | #12
On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:  
> > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:  
> > > > > 
> > > > >     
> > > > > > -----Original Message-----
> > > > > > From: Richardson, Bruce
> > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > 
> > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:    
> > > > > > >    
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >    
> > > > > > > > > >
> > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members    
> > > > > > > > > of struct rte_ring are 128 byte aligned,    
> > > > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI    
> > > > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > > > >
> > > > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > > > Konstantin    
> > > > > > > >
> > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.  An    
> > > > > > alternate    
> > > > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).    
> > > > > > >
> > > > > > > Yes, had the same thought.
> > > > > > >    
> > > > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?    
> > > > > > >
> > > > > > > Might be, would be good to hear opinion the author of that change.    
> > > > > > 
> > > > > > It gives improved performance for core-2-core transfer.    
> > > > > 
> > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > Something like that:
> > > > > struct rte_ring {
> > > > >    ...
> > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > };
> > > > > 
> > > > > Konstantin    
> > > > 
> > > > Sure. That should probably work too. 
> > > > 
> > > > /Bruce  
> > > 
> > > I also agree with Konstantin's proposal. One question though: since it
> > > changes the alignment constraint of the rte_ring structure, I think it is
> > > an ABI breakage: a structure including the rte_ring structure inherits
> > > from this constraint.
> > > 
> > > How could we handle that, knowing this is probably a rare case?
> > > 
> > >  
> > Is it an ABI break so long as we keep the resulting size and field
> > placement of the structures the same? The alignment being reduced should
> > not be a problem, as 128byte alignment is also valid as 64byte
> > alignment, after all.
> 
> I'd say yes. Consider the following example:
> 
> ---8<---
> #include <stdio.h>
> #include <stdlib.h>
> 
> #define ALIGN 64
> /* #define ALIGN 128 */
> 
> /* dummy rte_ring struct */
> struct rte_ring {
> 	char x[128];
> } __attribute__((aligned(ALIGN)));
> 
> struct foo {
> 	struct rte_ring r;
> 	unsigned bar;
> };
> 
> int main(void)
> {
> 	struct foo array[2];
> 
> 	printf("sizeof(ring)=%zu diff=%u\n",
> 		sizeof(struct rte_ring),
> 		(unsigned int)((char *)&array[1].r - (char *)array));
> 
> 	return 0;
> }
> ---8<---
> 
> The size of rte_ring is always 128.
> diff is 192 or 256, depending on the value of ALIGN.
> 
> 
> 
> Olivier

Yes, the diff will change, but that is after a recompile. If we have
rte_ring_create function always return a 128-byte aligned structure,
will any already-compiled apps fail to work if we also change the alignment
of the rte_ring struct in the header?

/Bruce
  
Ananyev, Konstantin June 8, 2017, 2:50 p.m. UTC | #13
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, June 8, 2017 3:12 PM
> To: Olivier Matz <olivier.matz@6wind.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Richardson, Bruce
> > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > >
> > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > > > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > > > > >
> > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > > > > Konstantin
> > > > > > > > >
> > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.
> An
> > > > > > > alternate
> > > > > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).
> > > > > > > >
> > > > > > > > Yes, had the same thought.
> > > > > > > >
> > > > > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > >
> > > > > > > > Might be, would be good to hear opinion the author of that change.
> > > > > > >
> > > > > > > It gives improved performance for core-2-core transfer.
> > > > > >
> > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > Something like that:
> > > > > > struct rte_ring {
> > > > > >    ...
> > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > };
> > > > > >
> > > > > > Konstantin
> > > > >
> > > > > Sure. That should probably work too.
> > > > >
> > > > > /Bruce
> > > >
> > > > I also agree with Konstantin's proposal. One question though: since it
> > > > changes the alignment constraint of the rte_ring structure, I think it is
> > > > an ABI breakage: a structure including the rte_ring structure inherits
> > > > from this constraint.
> > > >
> > > > How could we handle that, knowing this is probably a rare case?
> > > >
> > > >
> > > Is it an ABI break so long as we keep the resulting size and field
> > > placement of the structures the same? The alignment being reduced should
> > > not be a problem, as 128byte alignment is also valid as 64byte
> > > alignment, after all.
> >
> > I'd say yes. Consider the following example:
> >
> > ---8<---
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > #define ALIGN 64
> > /* #define ALIGN 128 */
> >
> > /* dummy rte_ring struct */
> > struct rte_ring {
> > 	char x[128];
> > } __attribute__((aligned(ALIGN)));
> >
> > struct foo {
> > 	struct rte_ring r;
> > 	unsigned bar;
> > };
> >
> > int main(void)
> > {
> > 	struct foo array[2];
> >
> > 	printf("sizeof(ring)=%zu diff=%u\n",
> > 		sizeof(struct rte_ring),
> > 		(unsigned int)((char *)&array[1].r - (char *)array));
> >
> > 	return 0;
> > }
> > ---8<---
> >
> > The size of rte_ring is always 128.
> > diff is 192 or 256, depending on the value of ALIGN.
> >
> >
> >
> > Olivier

About would it be an ABI breakage to 17.05 - I think would...
Though for me the actual breakage happens in 17.05 when rte_ring
alignment was increased from 64B 128B.
Now we just restoring it.

> 
> Yes, the diff will change, but that is after a recompile. If we have
> rte_ring_create function always return a 128-byte aligned structure,
> will any already-compiled apps fail to work if we also change the alignment
> of the rte_ring struct in the header?

Why 128B?
I thought we are discussing making rte_ring 64B aligned again?

Konstantin
  
Bruce Richardson June 8, 2017, 3:24 p.m. UTC | #14
On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Thursday, June 8, 2017 3:12 PM
> > To: Olivier Matz <olivier.matz@6wind.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Richardson, Bruce
> > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > >
> > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > > > > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > > > > > >
> > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64 bytes.
> > An
> > > > > > > > alternate
> > > > > > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).
> > > > > > > > >
> > > > > > > > > Yes, had the same thought.
> > > > > > > > >
> > > > > > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > >
> > > > > > > > > Might be, would be good to hear opinion the author of that change.
> > > > > > > >
> > > > > > > > It gives improved performance for core-2-core transfer.
> > > > > > >
> > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > Something like that:
> > > > > > > struct rte_ring {
> > > > > > >    ...
> > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > };
> > > > > > >
> > > > > > > Konstantin
> > > > > >
> > > > > > Sure. That should probably work too.
> > > > > >
> > > > > > /Bruce
> > > > >
> > > > > I also agree with Konstantin's proposal. One question though: since it
> > > > > changes the alignment constraint of the rte_ring structure, I think it is
> > > > > an ABI breakage: a structure including the rte_ring structure inherits
> > > > > from this constraint.
> > > > >
> > > > > How could we handle that, knowing this is probably a rare case?
> > > > >
> > > > >
> > > > Is it an ABI break so long as we keep the resulting size and field
> > > > placement of the structures the same? The alignment being reduced should
> > > > not be a problem, as 128byte alignment is also valid as 64byte
> > > > alignment, after all.
> > >
> > > I'd say yes. Consider the following example:
> > >
> > > ---8<---
> > > #include <stdio.h>
> > > #include <stdlib.h>
> > >
> > > #define ALIGN 64
> > > /* #define ALIGN 128 */
> > >
> > > /* dummy rte_ring struct */
> > > struct rte_ring {
> > > 	char x[128];
> > > } __attribute__((aligned(ALIGN)));
> > >
> > > struct foo {
> > > 	struct rte_ring r;
> > > 	unsigned bar;
> > > };
> > >
> > > int main(void)
> > > {
> > > 	struct foo array[2];
> > >
> > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > 		sizeof(struct rte_ring),
> > > 		(unsigned int)((char *)&array[1].r - (char *)array));
> > >
> > > 	return 0;
> > > }
> > > ---8<---
> > >
> > > The size of rte_ring is always 128.
> > > diff is 192 or 256, depending on the value of ALIGN.
> > >
> > >
> > >
> > > Olivier
> 
> About would it be an ABI breakage to 17.05 - I think would...
> Though for me the actual breakage happens in 17.05 when rte_ring
> alignment was increased from 64B 128B.
> Now we just restoring it.
> 
Yes, ABI change was announced in advance and explicitly broken in 17.05.
There was no announcement of ABI break in 17.08 for rte_ring.

> > 
> > Yes, the diff will change, but that is after a recompile. If we have
> > rte_ring_create function always return a 128-byte aligned structure,
> > will any already-compiled apps fail to work if we also change the alignment
> > of the rte_ring struct in the header?
> 
> Why 128B?
> I thought we are discussing making rte_ring 64B aligned again?
> 
> Konstantin

To avoid possibly breaking apps compiled against 17.05 when run against
shared libs for 17.08. Having the extra alignment won't affect 17.08
apps, since they only require 64-byte alignment, but returning only
64-byte aligned memory for apps which expect 128byte aligned memory may
cause issues.

Therefore, we should reduce the required alignment to 64B, which should
only affect any apps that do a recompile, and have memory allocation for
rings return 128B aligned addresses to work with both 64B aligned and
128B aligned ring structures.

/Bruce
  
Ananyev, Konstantin June 8, 2017, 3:35 p.m. UTC | #15
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, June 8, 2017 4:25 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Thursday, June 8, 2017 3:12 PM
> > > To: Olivier Matz <olivier.matz@6wind.com>
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Richardson, Bruce
> > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > >
> > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > > > > > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > > > > > > >
> > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > > > > > > Konstantin
> > > > > > > > > > >
> > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64
> bytes.
> > > An
> > > > > > > > > alternate
> > > > > > > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).
> > > > > > > > > >
> > > > > > > > > > Yes, had the same thought.
> > > > > > > > > >
> > > > > > > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > >
> > > > > > > > > > Might be, would be good to hear opinion the author of that change.
> > > > > > > > >
> > > > > > > > > It gives improved performance for core-2-core transfer.
> > > > > > > >
> > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > Something like that:
> > > > > > > > struct rte_ring {
> > > > > > > >    ...
> > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > };
> > > > > > > >
> > > > > > > > Konstantin
> > > > > > >
> > > > > > > Sure. That should probably work too.
> > > > > > >
> > > > > > > /Bruce
> > > > > >
> > > > > > I also agree with Konstantin's proposal. One question though: since it
> > > > > > changes the alignment constraint of the rte_ring structure, I think it is
> > > > > > an ABI breakage: a structure including the rte_ring structure inherits
> > > > > > from this constraint.
> > > > > >
> > > > > > How could we handle that, knowing this is probably a rare case?
> > > > > >
> > > > > >
> > > > > Is it an ABI break so long as we keep the resulting size and field
> > > > > placement of the structures the same? The alignment being reduced should
> > > > > not be a problem, as 128byte alignment is also valid as 64byte
> > > > > alignment, after all.
> > > >
> > > > I'd say yes. Consider the following example:
> > > >
> > > > ---8<---
> > > > #include <stdio.h>
> > > > #include <stdlib.h>
> > > >
> > > > #define ALIGN 64
> > > > /* #define ALIGN 128 */
> > > >
> > > > /* dummy rte_ring struct */
> > > > struct rte_ring {
> > > > 	char x[128];
> > > > } __attribute__((aligned(ALIGN)));
> > > >
> > > > struct foo {
> > > > 	struct rte_ring r;
> > > > 	unsigned bar;
> > > > };
> > > >
> > > > int main(void)
> > > > {
> > > > 	struct foo array[2];
> > > >
> > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > 		sizeof(struct rte_ring),
> > > > 		(unsigned int)((char *)&array[1].r - (char *)array));
> > > >
> > > > 	return 0;
> > > > }
> > > > ---8<---
> > > >
> > > > The size of rte_ring is always 128.
> > > > diff is 192 or 256, depending on the value of ALIGN.
> > > >
> > > >
> > > >
> > > > Olivier
> >
> > About would it be an ABI breakage to 17.05 - I think would...
> > Though for me the actual breakage happens in 17.05 when rte_ring
> > alignment was increased from 64B 128B.
> > Now we just restoring it.
> >
> Yes, ABI change was announced in advance and explicitly broken in 17.05.
> There was no announcement of ABI break in 17.08 for rte_ring.
> 
> > >
> > > Yes, the diff will change, but that is after a recompile. If we have
> > > rte_ring_create function always return a 128-byte aligned structure,
> > > will any already-compiled apps fail to work if we also change the alignment
> > > of the rte_ring struct in the header?
> >
> > Why 128B?
> > I thought we are discussing making rte_ring 64B aligned again?
> >
> > Konstantin
> 
> To avoid possibly breaking apps compiled against 17.05 when run against
> shared libs for 17.08. Having the extra alignment won't affect 17.08
> apps, since they only require 64-byte alignment, but returning only
> 64-byte aligned memory for apps which expect 128byte aligned memory may
> cause issues.
> 
> Therefore, we should reduce the required alignment to 64B, which should
> only affect any apps that do a recompile, and have memory allocation for
> rings return 128B aligned addresses to work with both 64B aligned and
> 128B aligned ring structures.

Ah, I see - you are talking just about rte_ring_create().
BTW, are you sure that right now it allocates rings 128B aligned?
As I can see it does just:
mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
which means cache line alignment.

Konstantin
  
Bruce Richardson June 8, 2017, 4:03 p.m. UTC | #16
On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Thursday, June 8, 2017 4:25 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Richardson, Bruce
> > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > > >
> > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > > > > > > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > > > > > > > Konstantin
> > > > > > > > > > > >
> > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64
> > bytes.
> > > > An
> > > > > > > > > > alternate
> > > > > > > > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).
> > > > > > > > > > >
> > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > >
> > > > > > > > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > >
> > > > > > > > > > > Might be, would be good to hear opinion the author of that change.
> > > > > > > > > >
> > > > > > > > > > It gives improved performance for core-2-core transfer.
> > > > > > > > >
> > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > Something like that:
> > > > > > > > > struct rte_ring {
> > > > > > > > >    ...
> > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > };
> > > > > > > > >
> > > > > > > > > Konstantin
> > > > > > > >
> > > > > > > > Sure. That should probably work too.
> > > > > > > >
> > > > > > > > /Bruce
> > > > > > >
> > > > > > > I also agree with Konstantin's proposal. One question though: since it
> > > > > > > changes the alignment constraint of the rte_ring structure, I think it is
> > > > > > > an ABI breakage: a structure including the rte_ring structure inherits
> > > > > > > from this constraint.
> > > > > > >
> > > > > > > How could we handle that, knowing this is probably a rare case?
> > > > > > >
> > > > > > >
> > > > > > Is it an ABI break so long as we keep the resulting size and field
> > > > > > placement of the structures the same? The alignment being reduced should
> > > > > > not be a problem, as 128byte alignment is also valid as 64byte
> > > > > > alignment, after all.
> > > > >
> > > > > I'd say yes. Consider the following example:
> > > > >
> > > > > ---8<---
> > > > > #include <stdio.h>
> > > > > #include <stdlib.h>
> > > > >
> > > > > #define ALIGN 64
> > > > > /* #define ALIGN 128 */
> > > > >
> > > > > /* dummy rte_ring struct */
> > > > > struct rte_ring {
> > > > > 	char x[128];
> > > > > } __attribute__((aligned(ALIGN)));
> > > > >
> > > > > struct foo {
> > > > > 	struct rte_ring r;
> > > > > 	unsigned bar;
> > > > > };
> > > > >
> > > > > int main(void)
> > > > > {
> > > > > 	struct foo array[2];
> > > > >
> > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > 		sizeof(struct rte_ring),
> > > > > 		(unsigned int)((char *)&array[1].r - (char *)array));
> > > > >
> > > > > 	return 0;
> > > > > }
> > > > > ---8<---
> > > > >
> > > > > The size of rte_ring is always 128.
> > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > >
> > > > >
> > > > >
> > > > > Olivier
> > >
> > > About would it be an ABI breakage to 17.05 - I think would...
> > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > alignment was increased from 64B 128B.
> > > Now we just restoring it.
> > >
> > Yes, ABI change was announced in advance and explicitly broken in 17.05.
> > There was no announcement of ABI break in 17.08 for rte_ring.
> > 
> > > >
> > > > Yes, the diff will change, but that is after a recompile. If we have
> > > > rte_ring_create function always return a 128-byte aligned structure,
> > > > will any already-compiled apps fail to work if we also change the alignment
> > > > of the rte_ring struct in the header?
> > >
> > > Why 128B?
> > > I thought we are discussing making rte_ring 64B aligned again?
> > >
> > > Konstantin
> > 
> > To avoid possibly breaking apps compiled against 17.05 when run against
> > shared libs for 17.08. Having the extra alignment won't affect 17.08
> > apps, since they only require 64-byte alignment, but returning only
> > 64-byte aligned memory for apps which expect 128byte aligned memory may
> > cause issues.
> > 
> > Therefore, we should reduce the required alignment to 64B, which should
> > only affect any apps that do a recompile, and have memory allocation for
> > rings return 128B aligned addresses to work with both 64B aligned and
> > 128B aligned ring structures.
> 
> Ah, I see - you are talking just about rte_ring_create().
> BTW, are you sure that right now it allocates rings 128B aligned?
> As I can see it does just:
> mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> which means cache line alignment.
> 
It doesn't currently allocate with that alignment, which is something we
need to fix - and what this patch was originally submitted to do. So I
think this patch should be applied, along with a further patch to reduce
the alignment going forward to avoid any other problems.

/Bruce
  
Ananyev, Konstantin June 8, 2017, 4:12 p.m. UTC | #17
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, June 8, 2017 5:04 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Thursday, June 8, 2017 4:25 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Richardson, Bruce
> > > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > >
> > > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz wrote:
> > > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev, Konstantin wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev, Konstantin wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are set to 2 cache lines, so members
> > > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > > > > > >and therefore the whole struct needs 128-byte alignment according to the ABI
> > > > > > > > > > > > > > so that the 128-byte alignment of the fields can be guaranteed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B aligned these days.
> > > > > > > > > > > > > > BTW, I probably missed the initial discussion, but what was the reason for that?
> > > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > >
> > > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128 byte alignment; it seems unnecessary if the cache line is only 64
> > > bytes.
> > > > > An
> > > > > > > > > > > alternate
> > > > > > > > > > > > > fix would be to just use cache line alignment for these fields (since memzones are already cache line aligned).
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > > >
> > > > > > > > > > > > > Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > > >
> > > > > > > > > > > > Might be, would be good to hear opinion the author of that change.
> > > > > > > > > > >
> > > > > > > > > > > It gives improved performance for core-2-core transfer.
> > > > > > > > > >
> > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > > Something like that:
> > > > > > > > > > struct rte_ring {
> > > > > > > > > >    ...
> > > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > };
> > > > > > > > > >
> > > > > > > > > > Konstantin
> > > > > > > > >
> > > > > > > > > Sure. That should probably work too.
> > > > > > > > >
> > > > > > > > > /Bruce
> > > > > > > >
> > > > > > > > I also agree with Konstantin's proposal. One question though: since it
> > > > > > > > changes the alignment constraint of the rte_ring structure, I think it is
> > > > > > > > an ABI breakage: a structure including the rte_ring structure inherits
> > > > > > > > from this constraint.
> > > > > > > >
> > > > > > > > How could we handle that, knowing this is probably a rare case?
> > > > > > > >
> > > > > > > >
> > > > > > > Is it an ABI break so long as we keep the resulting size and field
> > > > > > > placement of the structures the same? The alignment being reduced should
> > > > > > > not be a problem, as 128byte alignment is also valid as 64byte
> > > > > > > alignment, after all.
> > > > > >
> > > > > > I'd say yes. Consider the following example:
> > > > > >
> > > > > > ---8<---
> > > > > > #include <stdio.h>
> > > > > > #include <stdlib.h>
> > > > > >
> > > > > > #define ALIGN 64
> > > > > > /* #define ALIGN 128 */
> > > > > >
> > > > > > /* dummy rte_ring struct */
> > > > > > struct rte_ring {
> > > > > > 	char x[128];
> > > > > > } __attribute__((aligned(ALIGN)));
> > > > > >
> > > > > > struct foo {
> > > > > > 	struct rte_ring r;
> > > > > > 	unsigned bar;
> > > > > > };
> > > > > >
> > > > > > int main(void)
> > > > > > {
> > > > > > 	struct foo array[2];
> > > > > >
> > > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > > 		sizeof(struct rte_ring),
> > > > > > 		(unsigned int)((char *)&array[1].r - (char *)array));
> > > > > >
> > > > > > 	return 0;
> > > > > > }
> > > > > > ---8<---
> > > > > >
> > > > > > The size of rte_ring is always 128.
> > > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Olivier
> > > >
> > > > About would it be an ABI breakage to 17.05 - I think would...
> > > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > > alignment was increased from 64B 128B.
> > > > Now we just restoring it.
> > > >
> > > Yes, ABI change was announced in advance and explicitly broken in 17.05.
> > > There was no announcement of ABI break in 17.08 for rte_ring.
> > >
> > > > >
> > > > > Yes, the diff will change, but that is after a recompile. If we have
> > > > > rte_ring_create function always return a 128-byte aligned structure,
> > > > > will any already-compiled apps fail to work if we also change the alignment
> > > > > of the rte_ring struct in the header?
> > > >
> > > > Why 128B?
> > > > I thought we are discussing making rte_ring 64B aligned again?
> > > >
> > > > Konstantin
> > >
> > > To avoid possibly breaking apps compiled against 17.05 when run against
> > > shared libs for 17.08. Having the extra alignment won't affect 17.08
> > > apps, since they only require 64-byte alignment, but returning only
> > > 64-byte aligned memory for apps which expect 128byte aligned memory may
> > > cause issues.
> > >
> > > Therefore, we should reduce the required alignment to 64B, which should
> > > only affect any apps that do a recompile, and have memory allocation for
> > > rings return 128B aligned addresses to work with both 64B aligned and
> > > 128B aligned ring structures.
> >
> > Ah, I see - you are talking just about rte_ring_create().
> > BTW, are you sure that right now it allocates rings 128B aligned?
> > As I can see it does just:
> > mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > which means cache line alignment.
> >
> It doesn't currently allocate with that alignment, which is something we
> need to fix - and what this patch was originally submitted to do. So I
> think this patch should be applied, along with a further patch to reduce
> the alignment going forward to avoid any other problems.

But if we going to reduce alignment anyway (patch #2) why do we need
patch #1 at all?
  
Bruce Richardson June 8, 2017, 4:20 p.m. UTC | #18
> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Thursday, June 8, 2017 5:13 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Thursday, June 8, 2017 5:04 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > allocation
> >
> > On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Richardson, Bruce
> > > > Sent: Thursday, June 8, 2017 4:25 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > allocation
> > > >
> > > > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Richardson, Bruce
> > > > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > allocation
> > > > > >
> > > > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > > > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> > > > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz
> wrote:
> > > > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> > > > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev,
> Konstantin wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > > > To: Ananyev, Konstantin
> > > > > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>;
> > > > > > > > > > > > dev@dpdk.org
> > > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use
> > > > > > > > > > > > aligned memzone allocation
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev,
> Konstantin wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are
> > > > > > > > > > > > > > > > set to 2 cache lines, so members
> > > > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > > > > > > >and therefore the whole struct needs
> > > > > > > > > > > > > > > >128-byte alignment according to the ABI
> > > > > > > > > > > > > > > so that the 128-byte alignment of the fields
> can be guaranteed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B
> aligned these days.
> > > > > > > > > > > > > > > BTW, I probably missed the initial discussion,
> but what was the reason for that?
> > > > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128
> > > > > > > > > > > > > > byte alignment; it seems unnecessary if the
> > > > > > > > > > > > > > cache line is only 64
> > > > bytes.
> > > > > > An
> > > > > > > > > > > > alternate
> > > > > > > > > > > > > > fix would be to just use cache line alignment
> for these fields (since memzones are already cache line aligned).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Maybe there is some deeper  reason for the >=
> 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Might be, would be good to hear opinion the author
> of that change.
> > > > > > > > > > > >
> > > > > > > > > > > > It gives improved performance for core-2-core
> transfer.
> > > > > > > > > > >
> > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > That's ok but why we can't keep them and whole
> rte_ring aligned on cache-line boundaries?
> > > > > > > > > > > Something like that:
> > > > > > > > > > > struct rte_ring {
> > > > > > > > > > >    ...
> > > > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > };
> > > > > > > > > > >
> > > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > > > > > > Sure. That should probably work too.
> > > > > > > > > >
> > > > > > > > > > /Bruce
> > > > > > > > >
> > > > > > > > > I also agree with Konstantin's proposal. One question
> > > > > > > > > though: since it changes the alignment constraint of the
> > > > > > > > > rte_ring structure, I think it is an ABI breakage: a
> > > > > > > > > structure including the rte_ring structure inherits from
> this constraint.
> > > > > > > > >
> > > > > > > > > How could we handle that, knowing this is probably a rare
> case?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > Is it an ABI break so long as we keep the resulting size
> > > > > > > > and field placement of the structures the same? The
> > > > > > > > alignment being reduced should not be a problem, as
> > > > > > > > 128byte alignment is also valid as 64byte alignment, after
> all.
> > > > > > >
> > > > > > > I'd say yes. Consider the following example:
> > > > > > >
> > > > > > > ---8<---
> > > > > > > #include <stdio.h>
> > > > > > > #include <stdlib.h>
> > > > > > >
> > > > > > > #define ALIGN 64
> > > > > > > /* #define ALIGN 128 */
> > > > > > >
> > > > > > > /* dummy rte_ring struct */
> > > > > > > struct rte_ring {
> > > > > > > 	char x[128];
> > > > > > > } __attribute__((aligned(ALIGN)));
> > > > > > >
> > > > > > > struct foo {
> > > > > > > 	struct rte_ring r;
> > > > > > > 	unsigned bar;
> > > > > > > };
> > > > > > >
> > > > > > > int main(void)
> > > > > > > {
> > > > > > > 	struct foo array[2];
> > > > > > >
> > > > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > > > 		sizeof(struct rte_ring),
> > > > > > > 		(unsigned int)((char *)&array[1].r - (char
> *)array));
> > > > > > >
> > > > > > > 	return 0;
> > > > > > > }
> > > > > > > ---8<---
> > > > > > >
> > > > > > > The size of rte_ring is always 128.
> > > > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Olivier
> > > > >
> > > > > About would it be an ABI breakage to 17.05 - I think would...
> > > > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > > > alignment was increased from 64B 128B.
> > > > > Now we just restoring it.
> > > > >
> > > > Yes, ABI change was announced in advance and explicitly broken in
> 17.05.
> > > > There was no announcement of ABI break in 17.08 for rte_ring.
> > > >
> > > > > >
> > > > > > Yes, the diff will change, but that is after a recompile. If
> > > > > > we have rte_ring_create function always return a 128-byte
> > > > > > aligned structure, will any already-compiled apps fail to work
> > > > > > if we also change the alignment of the rte_ring struct in the
> header?
> > > > >
> > > > > Why 128B?
> > > > > I thought we are discussing making rte_ring 64B aligned again?
> > > > >
> > > > > Konstantin
> > > >
> > > > To avoid possibly breaking apps compiled against 17.05 when run
> > > > against shared libs for 17.08. Having the extra alignment won't
> > > > affect 17.08 apps, since they only require 64-byte alignment, but
> > > > returning only 64-byte aligned memory for apps which expect
> > > > 128byte aligned memory may cause issues.
> > > >
> > > > Therefore, we should reduce the required alignment to 64B, which
> > > > should only affect any apps that do a recompile, and have memory
> > > > allocation for rings return 128B aligned addresses to work with
> > > > both 64B aligned and 128B aligned ring structures.
> > >
> > > Ah, I see - you are talking just about rte_ring_create().
> > > BTW, are you sure that right now it allocates rings 128B aligned?
> > > As I can see it does just:
> > > mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > > which means cache line alignment.
> > >
> > It doesn't currently allocate with that alignment, which is something
> > we need to fix - and what this patch was originally submitted to do.
> > So I think this patch should be applied, along with a further patch to
> > reduce the alignment going forward to avoid any other problems.
> 
> But if we going to reduce alignment anyway (patch #2) why do we need patch
> #1 at all?

Because any app compiled against 17.05 will use the old alignment value. Therefore patch 1 should be applied to 17.08 for backward compatibility, and backported to 17.05.1.

/Bruce
  
Ananyev, Konstantin June 8, 2017, 4:42 p.m. UTC | #19
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, June 8, 2017 5:21 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Thursday, June 8, 2017 5:13 PM
> > To: Richardson, Bruce <bruce.richardson@intel.com>
> > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> >
> >
> >
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Thursday, June 8, 2017 5:04 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > allocation
> > >
> > > On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Richardson, Bruce
> > > > > Sent: Thursday, June 8, 2017 4:25 PM
> > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > allocation
> > > > >
> > > > > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Richardson, Bruce
> > > > > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > allocation
> > > > > > >
> > > > > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > > > > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > > > > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz
> > wrote:
> > > > > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > > > > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev,
> > Konstantin wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > > > > To: Ananyev, Konstantin
> > > > > > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>;
> > > > > > > > > > > > > dev@dpdk.org
> > > > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use
> > > > > > > > > > > > > aligned memzone allocation
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev,
> > Konstantin wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are
> > > > > > > > > > > > > > > > > set to 2 cache lines, so members
> > > > > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > > > > > > > >and therefore the whole struct needs
> > > > > > > > > > > > > > > > >128-byte alignment according to the ABI
> > > > > > > > > > > > > > > > so that the 128-byte alignment of the fields
> > can be guaranteed.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B
> > aligned these days.
> > > > > > > > > > > > > > > > BTW, I probably missed the initial discussion,
> > but what was the reason for that?
> > > > > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128
> > > > > > > > > > > > > > > byte alignment; it seems unnecessary if the
> > > > > > > > > > > > > > > cache line is only 64
> > > > > bytes.
> > > > > > > An
> > > > > > > > > > > > > alternate
> > > > > > > > > > > > > > > fix would be to just use cache line alignment
> > for these fields (since memzones are already cache line aligned).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Maybe there is some deeper  reason for the >=
> > 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Might be, would be good to hear opinion the author
> > of that change.
> > > > > > > > > > > > >
> > > > > > > > > > > > > It gives improved performance for core-2-core
> > transfer.
> > > > > > > > > > > >
> > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > That's ok but why we can't keep them and whole
> > rte_ring aligned on cache-line boundaries?
> > > > > > > > > > > > Something like that:
> > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > >    ...
> > > > > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > };
> > > > > > > > > > > >
> > > > > > > > > > > > Konstantin
> > > > > > > > > > >
> > > > > > > > > > > Sure. That should probably work too.
> > > > > > > > > > >
> > > > > > > > > > > /Bruce
> > > > > > > > > >
> > > > > > > > > > I also agree with Konstantin's proposal. One question
> > > > > > > > > > though: since it changes the alignment constraint of the
> > > > > > > > > > rte_ring structure, I think it is an ABI breakage: a
> > > > > > > > > > structure including the rte_ring structure inherits from
> > this constraint.
> > > > > > > > > >
> > > > > > > > > > How could we handle that, knowing this is probably a rare
> > case?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > Is it an ABI break so long as we keep the resulting size
> > > > > > > > > and field placement of the structures the same? The
> > > > > > > > > alignment being reduced should not be a problem, as
> > > > > > > > > 128byte alignment is also valid as 64byte alignment, after
> > all.
> > > > > > > >
> > > > > > > > I'd say yes. Consider the following example:
> > > > > > > >
> > > > > > > > ---8<---
> > > > > > > > #include <stdio.h>
> > > > > > > > #include <stdlib.h>
> > > > > > > >
> > > > > > > > #define ALIGN 64
> > > > > > > > /* #define ALIGN 128 */
> > > > > > > >
> > > > > > > > /* dummy rte_ring struct */
> > > > > > > > struct rte_ring {
> > > > > > > > 	char x[128];
> > > > > > > > } __attribute__((aligned(ALIGN)));
> > > > > > > >
> > > > > > > > struct foo {
> > > > > > > > 	struct rte_ring r;
> > > > > > > > 	unsigned bar;
> > > > > > > > };
> > > > > > > >
> > > > > > > > int main(void)
> > > > > > > > {
> > > > > > > > 	struct foo array[2];
> > > > > > > >
> > > > > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > > > > 		sizeof(struct rte_ring),
> > > > > > > > 		(unsigned int)((char *)&array[1].r - (char
> > *)array));
> > > > > > > >
> > > > > > > > 	return 0;
> > > > > > > > }
> > > > > > > > ---8<---
> > > > > > > >
> > > > > > > > The size of rte_ring is always 128.
> > > > > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Olivier
> > > > > >
> > > > > > About would it be an ABI breakage to 17.05 - I think would...
> > > > > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > > > > alignment was increased from 64B 128B.
> > > > > > Now we just restoring it.
> > > > > >
> > > > > Yes, ABI change was announced in advance and explicitly broken in
> > 17.05.
> > > > > There was no announcement of ABI break in 17.08 for rte_ring.
> > > > >
> > > > > > >
> > > > > > > Yes, the diff will change, but that is after a recompile. If
> > > > > > > we have rte_ring_create function always return a 128-byte
> > > > > > > aligned structure, will any already-compiled apps fail to work
> > > > > > > if we also change the alignment of the rte_ring struct in the
> > header?
> > > > > >
> > > > > > Why 128B?
> > > > > > I thought we are discussing making rte_ring 64B aligned again?
> > > > > >
> > > > > > Konstantin
> > > > >
> > > > > To avoid possibly breaking apps compiled against 17.05 when run
> > > > > against shared libs for 17.08. Having the extra alignment won't
> > > > > affect 17.08 apps, since they only require 64-byte alignment, but
> > > > > returning only 64-byte aligned memory for apps which expect
> > > > > 128byte aligned memory may cause issues.
> > > > >
> > > > > Therefore, we should reduce the required alignment to 64B, which
> > > > > should only affect any apps that do a recompile, and have memory
> > > > > allocation for rings return 128B aligned addresses to work with
> > > > > both 64B aligned and 128B aligned ring structures.
> > > >
> > > > Ah, I see - you are talking just about rte_ring_create().
> > > > BTW, are you sure that right now it allocates rings 128B aligned?
> > > > As I can see it does just:
> > > > mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > > > which means cache line alignment.
> > > >
> > > It doesn't currently allocate with that alignment, which is something
> > > we need to fix - and what this patch was originally submitted to do.
> > > So I think this patch should be applied, along with a further patch to
> > > reduce the alignment going forward to avoid any other problems.
> >
> > But if we going to reduce alignment anyway (patch #2) why do we need patch
> > #1 at all?
> 
> Because any app compiled against 17.05 will use the old alignment value. Therefore patch 1 should be applied to 17.08 for backward
> compatibility, and backported to 17.05.1.

Why then just no backport patch #2 to 17.05.1?
  
Bruce Richardson June 9, 2017, 9:02 a.m. UTC | #20
On Thu, Jun 08, 2017 at 05:42:00PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Thursday, June 8, 2017 5:21 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Thursday, June 8, 2017 5:13 PM
> > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Richardson, Bruce
> > > > Sent: Thursday, June 8, 2017 5:04 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > allocation
> > > >
> > > > On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Richardson, Bruce
> > > > > > Sent: Thursday, June 8, 2017 4:25 PM
> > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > allocation
> > > > > >
> > > > > > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Richardson, Bruce
> > > > > > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > > > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > > Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > > allocation
> > > > > > > >
> > > > > > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:
> > > > > > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson
> > > <bruce.richardson@intel.com> wrote:
> > > > > > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz
> > > wrote:
> > > > > > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson
> > > <bruce.richardson@intel.com> wrote:
> > > > > > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev,
> > > Konstantin wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > > > > > To: Ananyev, Konstantin
> > > > > > > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>;
> > > > > > > > > > > > > > dev@dpdk.org
> > > > > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use
> > > > > > > > > > > > > > aligned memzone allocation
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev,
> > > Konstantin wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are
> > > > > > > > > > > > > > > > > > set to 2 cache lines, so members
> > > > > > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,
> > > > > > > > > > > > > > > > > >and therefore the whole struct needs
> > > > > > > > > > > > > > > > > >128-byte alignment according to the ABI
> > > > > > > > > > > > > > > > > so that the 128-byte alignment of the fields
> > > can be guaranteed.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B
> > > aligned these days.
> > > > > > > > > > > > > > > > > BTW, I probably missed the initial discussion,
> > > but what was the reason for that?
> > > > > > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128
> > > > > > > > > > > > > > > > byte alignment; it seems unnecessary if the
> > > > > > > > > > > > > > > > cache line is only 64
> > > > > > bytes.
> > > > > > > > An
> > > > > > > > > > > > > > alternate
> > > > > > > > > > > > > > > > fix would be to just use cache line alignment
> > > for these fields (since memzones are already cache line aligned).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Maybe there is some deeper  reason for the >=
> > > 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Might be, would be good to hear opinion the author
> > > of that change.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It gives improved performance for core-2-core
> > > transfer.
> > > > > > > > > > > > >
> > > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > > That's ok but why we can't keep them and whole
> > > rte_ring aligned on cache-line boundaries?
> > > > > > > > > > > > > Something like that:
> > > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > > >    ...
> > > > > > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > };
> > > > > > > > > > > > >
> > > > > > > > > > > > > Konstantin
> > > > > > > > > > > >
> > > > > > > > > > > > Sure. That should probably work too.
> > > > > > > > > > > >
> > > > > > > > > > > > /Bruce
> > > > > > > > > > >
> > > > > > > > > > > I also agree with Konstantin's proposal. One question
> > > > > > > > > > > though: since it changes the alignment constraint of the
> > > > > > > > > > > rte_ring structure, I think it is an ABI breakage: a
> > > > > > > > > > > structure including the rte_ring structure inherits from
> > > this constraint.
> > > > > > > > > > >
> > > > > > > > > > > How could we handle that, knowing this is probably a rare
> > > case?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > Is it an ABI break so long as we keep the resulting size
> > > > > > > > > > and field placement of the structures the same? The
> > > > > > > > > > alignment being reduced should not be a problem, as
> > > > > > > > > > 128byte alignment is also valid as 64byte alignment, after
> > > all.
> > > > > > > > >
> > > > > > > > > I'd say yes. Consider the following example:
> > > > > > > > >
> > > > > > > > > ---8<---
> > > > > > > > > #include <stdio.h>
> > > > > > > > > #include <stdlib.h>
> > > > > > > > >
> > > > > > > > > #define ALIGN 64
> > > > > > > > > /* #define ALIGN 128 */
> > > > > > > > >
> > > > > > > > > /* dummy rte_ring struct */
> > > > > > > > > struct rte_ring {
> > > > > > > > > 	char x[128];
> > > > > > > > > } __attribute__((aligned(ALIGN)));
> > > > > > > > >
> > > > > > > > > struct foo {
> > > > > > > > > 	struct rte_ring r;
> > > > > > > > > 	unsigned bar;
> > > > > > > > > };
> > > > > > > > >
> > > > > > > > > int main(void)
> > > > > > > > > {
> > > > > > > > > 	struct foo array[2];
> > > > > > > > >
> > > > > > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > > > > > 		sizeof(struct rte_ring),
> > > > > > > > > 		(unsigned int)((char *)&array[1].r - (char
> > > *)array));
> > > > > > > > >
> > > > > > > > > 	return 0;
> > > > > > > > > }
> > > > > > > > > ---8<---
> > > > > > > > >
> > > > > > > > > The size of rte_ring is always 128.
> > > > > > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Olivier
> > > > > > >
> > > > > > > About would it be an ABI breakage to 17.05 - I think would...
> > > > > > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > > > > > alignment was increased from 64B 128B.
> > > > > > > Now we just restoring it.
> > > > > > >
> > > > > > Yes, ABI change was announced in advance and explicitly broken in
> > > 17.05.
> > > > > > There was no announcement of ABI break in 17.08 for rte_ring.
> > > > > >
> > > > > > > >
> > > > > > > > Yes, the diff will change, but that is after a recompile. If
> > > > > > > > we have rte_ring_create function always return a 128-byte
> > > > > > > > aligned structure, will any already-compiled apps fail to work
> > > > > > > > if we also change the alignment of the rte_ring struct in the
> > > header?
> > > > > > >
> > > > > > > Why 128B?
> > > > > > > I thought we are discussing making rte_ring 64B aligned again?
> > > > > > >
> > > > > > > Konstantin
> > > > > >
> > > > > > To avoid possibly breaking apps compiled against 17.05 when run
> > > > > > against shared libs for 17.08. Having the extra alignment won't
> > > > > > affect 17.08 apps, since they only require 64-byte alignment, but
> > > > > > returning only 64-byte aligned memory for apps which expect
> > > > > > 128byte aligned memory may cause issues.
> > > > > >
> > > > > > Therefore, we should reduce the required alignment to 64B, which
> > > > > > should only affect any apps that do a recompile, and have memory
> > > > > > allocation for rings return 128B aligned addresses to work with
> > > > > > both 64B aligned and 128B aligned ring structures.
> > > > >
> > > > > Ah, I see - you are talking just about rte_ring_create().
> > > > > BTW, are you sure that right now it allocates rings 128B aligned?
> > > > > As I can see it does just:
> > > > > mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > > > > which means cache line alignment.
> > > > >
> > > > It doesn't currently allocate with that alignment, which is something
> > > > we need to fix - and what this patch was originally submitted to do.
> > > > So I think this patch should be applied, along with a further patch to
> > > > reduce the alignment going forward to avoid any other problems.
> > >
> > > But if we going to reduce alignment anyway (patch #2) why do we need patch
> > > #1 at all?
> > 
> > Because any app compiled against 17.05 will use the old alignment value. Therefore patch 1 should be applied to 17.08 for backward
> > compatibility, and backported to 17.05.1.
> 
> Why then just no backport patch #2 to 17.05.1?
> 
Maybe so. I'm just a little wary about backporting changes like that to
an older release, even though I'm not aware of any specific issues it
might cause.

/Bruce
  
Yerden Zhumabekov June 9, 2017, 12:47 p.m. UTC | #21
On 06.06.2017 19:19, Ananyev, Konstantin wrote:
>
>>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
>>> Might be, would be good to hear opinion the author of that change.
>> It gives improved performance for core-2-core transfer.
> You mean empty cache-line(s) after prod/cons, correct?
> That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> Something like that:
> struct rte_ring {
>     ...
>     struct rte_ring_headtail prod __rte_cache_aligned;
>     EMPTY_CACHE_LINE   __rte_cache_aligned;
>     struct rte_ring_headtail cons __rte_cache_aligned;
>     EMPTY_CACHE_LINE   __rte_cache_aligned;
> };
>
> Konstantin
>

I'm curious, can anyone explain, how does it actually affect 
performance? Maybe we can utilize it application code?
  
Stephen Hemminger June 9, 2017, 5:16 p.m. UTC | #22
On Fri, 9 Jun 2017 18:47:43 +0600
Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:

> On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> >  
> >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?  
> >>> Might be, would be good to hear opinion the author of that change.  
> >> It gives improved performance for core-2-core transfer.  
> > You mean empty cache-line(s) after prod/cons, correct?
> > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > Something like that:
> > struct rte_ring {
> >     ...
> >     struct rte_ring_headtail prod __rte_cache_aligned;
> >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> >     struct rte_ring_headtail cons __rte_cache_aligned;
> >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > };
> >
> > Konstantin
> >  
> 
> I'm curious, can anyone explain, how does it actually affect 
> performance? Maybe we can utilize it application code?

I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
If these cache lines change, then it will create false sharing.
  
Jerin Jacob June 9, 2017, 5:28 p.m. UTC | #23
-----Original Message-----
> Date: Fri, 9 Jun 2017 10:16:25 -0700
> From: Stephen Hemminger <stephen@networkplumber.org>
> To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
>  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
>  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> On Fri, 9 Jun 2017 18:47:43 +0600
> Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> 
> > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > >  
> > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?  
> > >>> Might be, would be good to hear opinion the author of that change.  
> > >> It gives improved performance for core-2-core transfer.  
> > > You mean empty cache-line(s) after prod/cons, correct?
> > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > Something like that:
> > > struct rte_ring {
> > >     ...
> > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > };
> > >
> > > Konstantin
> > >  
> > 
> > I'm curious, can anyone explain, how does it actually affect 
> > performance? Maybe we can utilize it application code?
> 
> I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> If these cache lines change, then it will create false sharing.

I see. I think, In such cases it is better to abstract as conditional
compilation. The above logic has worst case cache memory
requirement if CPU is 128B CL and no speculative prefetch.
  
Ananyev, Konstantin June 10, 2017, 8:16 a.m. UTC | #24
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Friday, June 9, 2017 6:29 PM
> To: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> -----Original Message-----
> > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> >
> > On Fri, 9 Jun 2017 18:47:43 +0600
> > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> >
> > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > >
> > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > >>> Might be, would be good to hear opinion the author of that change.
> > > >> It gives improved performance for core-2-core transfer.
> > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > Something like that:
> > > > struct rte_ring {
> > > >     ...
> > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > };
> > > >
> > > > Konstantin
> > > >
> > >
> > > I'm curious, can anyone explain, how does it actually affect
> > > performance? Maybe we can utilize it application code?
> >
> > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > If these cache lines change, then it will create false sharing.
> 
> I see. I think, In such cases it is better to abstract as conditional
> compilation. The above logic has worst case cache memory
> requirement if CPU is 128B CL and no speculative prefetch.

I think this is already done for rte_ring.h:
http://dpdk.org/browse/dpdk/tree/lib/librte_ring/rte_ring.h#n119

Konstantin
  
Jerin Jacob June 12, 2017, 3:07 a.m. UTC | #25
-----Original Message-----
> Date: Sat, 10 Jun 2017 08:16:44 +0000
> From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
>  <stephen@networkplumber.org>
> CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
>  <bruce.richardson@intel.com>, "Verkamp, Daniel"
>  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Friday, June 9, 2017 6:29 PM
> > To: Stephen Hemminger <stephen@networkplumber.org>
> > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > -----Original Message-----
> > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > >
> > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > >
> > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > >> It gives improved performance for core-2-core transfer.
> > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > Something like that:
> > > > > struct rte_ring {
> > > > >     ...
> > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > };
> > > > >
> > > > > Konstantin
> > > > >
> > > >
> > > > I'm curious, can anyone explain, how does it actually affect
> > > > performance? Maybe we can utilize it application code?
> > >
> > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > If these cache lines change, then it will create false sharing.
> > 
> > I see. I think, In such cases it is better to abstract as conditional
> > compilation. The above logic has worst case cache memory
> > requirement if CPU is 128B CL and no speculative prefetch.
> 
> I think this is already done for rte_ring.h:
> http://dpdk.org/browse/dpdk/tree/lib/librte_ring/rte_ring.h#n119

Yes. The suggestion was in the context of when introducing the
EMPTY_CACHE_LINE scheme, it should be a function of ARCH has
speculative next cache-line prefetch or not?

> 
> Konstantin
> 
> 
>
  
Olivier Matz June 12, 2017, 9:02 a.m. UTC | #26
On Fri, 9 Jun 2017 10:02:55 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Thu, Jun 08, 2017 at 05:42:00PM +0100, Ananyev, Konstantin wrote:
> > 
> >   
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Thursday, June 8, 2017 5:21 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > 
> > > 
> > >   
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin
> > > > Sent: Thursday, June 8, 2017 5:13 PM
> > > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > >
> > > >  
> > > > > -----Original Message-----
> > > > > From: Richardson, Bruce
> > > > > Sent: Thursday, June 8, 2017 5:04 PM
> > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > allocation
> > > > >
> > > > > On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:  
> > > > > >
> > > > > >  
> > > > > > > -----Original Message-----
> > > > > > > From: Richardson, Bruce
> > > > > > > Sent: Thursday, June 8, 2017 4:25 PM
> > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > allocation
> > > > > > >
> > > > > > > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:  
> > > > > > > >
> > > > > > > >  
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Richardson, Bruce
> > > > > > > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > > > > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > > > Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > > > allocation
> > > > > > > > >
> > > > > > > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:  
> > > > > > > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson  
> > > > <bruce.richardson@intel.com> wrote:  
> > > > > > > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz  
> > > > wrote:  
> > > > > > > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson  
> > > > <bruce.richardson@intel.com> wrote:  
> > > > > > > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev,  
> > > > Konstantin wrote:  
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > > > > > > To: Ananyev, Konstantin
> > > > > > > > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>;
> > > > > > > > > > > > > > > dev@dpdk.org
> > > > > > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use
> > > > > > > > > > > > > > > aligned memzone allocation
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev,  
> > > > Konstantin wrote:  
> > > > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are
> > > > > > > > > > > > > > > > > > > set to 2 cache lines, so members  
> > > > > > > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,  
> > > > > > > > > > > > > > > > > > >and therefore the whole struct needs
> > > > > > > > > > > > > > > > > > >128-byte alignment according to the ABI  
> > > > > > > > > > > > > > > > > > so that the 128-byte alignment of the fields  
> > > > can be guaranteed.  
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B  
> > > > aligned these days.  
> > > > > > > > > > > > > > > > > > BTW, I probably missed the initial discussion,  
> > > > but what was the reason for that?  
> > > > > > > > > > > > > > > > > > Konstantin  
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128
> > > > > > > > > > > > > > > > > byte alignment; it seems unnecessary if the
> > > > > > > > > > > > > > > > > cache line is only 64  
> > > > > > > bytes.  
> > > > > > > > > An  
> > > > > > > > > > > > > > > alternate  
> > > > > > > > > > > > > > > > > fix would be to just use cache line alignment  
> > > > for these fields (since memzones are already cache line aligned).  
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > > > Maybe there is some deeper  reason for the >=  
> > > > 128-byte alignment logic in rte_ring.h?  
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Might be, would be good to hear opinion the author  
> > > > of that change.  
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It gives improved performance for core-2-core  
> > > > transfer.  
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > > > That's ok but why we can't keep them and whole  
> > > > rte_ring aligned on cache-line boundaries?  
> > > > > > > > > > > > > > Something like that:
> > > > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > > > >    ...
> > > > > > > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > > };
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Konstantin  
> > > > > > > > > > > > >
> > > > > > > > > > > > > Sure. That should probably work too.
> > > > > > > > > > > > >
> > > > > > > > > > > > > /Bruce  
> > > > > > > > > > > >
> > > > > > > > > > > > I also agree with Konstantin's proposal. One question
> > > > > > > > > > > > though: since it changes the alignment constraint of the
> > > > > > > > > > > > rte_ring structure, I think it is an ABI breakage: a
> > > > > > > > > > > > structure including the rte_ring structure inherits from  
> > > > this constraint.  
> > > > > > > > > > > >
> > > > > > > > > > > > How could we handle that, knowing this is probably a rare  
> > > > case?  
> > > > > > > > > > > >
> > > > > > > > > > > >  
> > > > > > > > > > > Is it an ABI break so long as we keep the resulting size
> > > > > > > > > > > and field placement of the structures the same? The
> > > > > > > > > > > alignment being reduced should not be a problem, as
> > > > > > > > > > > 128byte alignment is also valid as 64byte alignment, after  
> > > > all.  
> > > > > > > > > >
> > > > > > > > > > I'd say yes. Consider the following example:
> > > > > > > > > >
> > > > > > > > > > ---8<---
> > > > > > > > > > #include <stdio.h>
> > > > > > > > > > #include <stdlib.h>
> > > > > > > > > >
> > > > > > > > > > #define ALIGN 64
> > > > > > > > > > /* #define ALIGN 128 */
> > > > > > > > > >
> > > > > > > > > > /* dummy rte_ring struct */
> > > > > > > > > > struct rte_ring {
> > > > > > > > > > 	char x[128];
> > > > > > > > > > } __attribute__((aligned(ALIGN)));
> > > > > > > > > >
> > > > > > > > > > struct foo {
> > > > > > > > > > 	struct rte_ring r;
> > > > > > > > > > 	unsigned bar;
> > > > > > > > > > };
> > > > > > > > > >
> > > > > > > > > > int main(void)
> > > > > > > > > > {
> > > > > > > > > > 	struct foo array[2];
> > > > > > > > > >
> > > > > > > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > > > > > > 		sizeof(struct rte_ring),
> > > > > > > > > > 		(unsigned int)((char *)&array[1].r - (char  
> > > > *)array));  
> > > > > > > > > >
> > > > > > > > > > 	return 0;
> > > > > > > > > > }
> > > > > > > > > > ---8<---
> > > > > > > > > >
> > > > > > > > > > The size of rte_ring is always 128.
> > > > > > > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Olivier  
> > > > > > > >
> > > > > > > > About would it be an ABI breakage to 17.05 - I think would...
> > > > > > > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > > > > > > alignment was increased from 64B 128B.
> > > > > > > > Now we just restoring it.
> > > > > > > >  
> > > > > > > Yes, ABI change was announced in advance and explicitly broken in  
> > > > 17.05.  
> > > > > > > There was no announcement of ABI break in 17.08 for rte_ring.
> > > > > > >  
> > > > > > > > >
> > > > > > > > > Yes, the diff will change, but that is after a recompile. If
> > > > > > > > > we have rte_ring_create function always return a 128-byte
> > > > > > > > > aligned structure, will any already-compiled apps fail to work
> > > > > > > > > if we also change the alignment of the rte_ring struct in the  
> > > > header?  
> > > > > > > >
> > > > > > > > Why 128B?
> > > > > > > > I thought we are discussing making rte_ring 64B aligned again?
> > > > > > > >
> > > > > > > > Konstantin  
> > > > > > >
> > > > > > > To avoid possibly breaking apps compiled against 17.05 when run
> > > > > > > against shared libs for 17.08. Having the extra alignment won't
> > > > > > > affect 17.08 apps, since they only require 64-byte alignment, but
> > > > > > > returning only 64-byte aligned memory for apps which expect
> > > > > > > 128byte aligned memory may cause issues.
> > > > > > >
> > > > > > > Therefore, we should reduce the required alignment to 64B, which
> > > > > > > should only affect any apps that do a recompile, and have memory
> > > > > > > allocation for rings return 128B aligned addresses to work with
> > > > > > > both 64B aligned and 128B aligned ring structures.  
> > > > > >
> > > > > > Ah, I see - you are talking just about rte_ring_create().
> > > > > > BTW, are you sure that right now it allocates rings 128B aligned?
> > > > > > As I can see it does just:
> > > > > > mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > > > > > which means cache line alignment.
> > > > > >  
> > > > > It doesn't currently allocate with that alignment, which is something
> > > > > we need to fix - and what this patch was originally submitted to do.
> > > > > So I think this patch should be applied, along with a further patch to
> > > > > reduce the alignment going forward to avoid any other problems.  
> > > >
> > > > But if we going to reduce alignment anyway (patch #2) why do we need patch
> > > > #1 at all?  
> > > 
> > > Because any app compiled against 17.05 will use the old alignment value. Therefore patch 1 should be applied to 17.08 for backward
> > > compatibility, and backported to 17.05.1.  
> > 
> > Why then just no backport patch #2 to 17.05.1?
> >   
> Maybe so. I'm just a little wary about backporting changes like that to
> an older release, even though I'm not aware of any specific issues it
> might cause.


If we want to fully respect the API/ABI deprecation process, we should
have patch #1 in 17.05 and 17.08, a deprecation notice in 17.08, and patch
#2 starting from 17.11.

More pragmatically, it's quite difficult to foresee really big problems
due to the changes in patch #2. One I can see is:

- rte_ring.so: the dpdk ring library
- another_ring.so: a library based on dpdk ring. The struct another_ring
  is like the struct foo in my previous example.
- application: uses another_ring structure

After we apply patch #2 on dpdk, and recompile the another_ring library,
its ABI will change.


So I suggest to follow the deprecation process for that issue.

Olivier
  
Bruce Richardson June 12, 2017, 9:56 a.m. UTC | #27
On Mon, Jun 12, 2017 at 11:02:32AM +0200, Olivier Matz wrote:
> On Fri, 9 Jun 2017 10:02:55 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > On Thu, Jun 08, 2017 at 05:42:00PM +0100, Ananyev, Konstantin wrote:
> > > 
> > >   
> > > > -----Original Message-----
> > > > From: Richardson, Bruce
> > > > Sent: Thursday, June 8, 2017 5:21 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > 
> > > > 
> > > >   
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin
> > > > > Sent: Thursday, June 8, 2017 5:13 PM
> > > > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > >
> > > > >
> > > > >  
> > > > > > -----Original Message-----
> > > > > > From: Richardson, Bruce
> > > > > > Sent: Thursday, June 8, 2017 5:04 PM
> > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > allocation
> > > > > >
> > > > > > On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:  
> > > > > > >
> > > > > > >  
> > > > > > > > -----Original Message-----
> > > > > > > > From: Richardson, Bruce
> > > > > > > > Sent: Thursday, June 8, 2017 4:25 PM
> > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > > allocation
> > > > > > > >
> > > > > > > > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:  
> > > > > > > > >
> > > > > > > > >  
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > > > > > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > > > > Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > > > > allocation
> > > > > > > > > >
> > > > > > > > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:  
> > > > > > > > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson  
> > > > > <bruce.richardson@intel.com> wrote:  
> > > > > > > > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz  
> > > > > wrote:  
> > > > > > > > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson  
> > > > > <bruce.richardson@intel.com> wrote:  
> > > > > > > > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev,  
> > > > > Konstantin wrote:  
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > > > > > > > To: Ananyev, Konstantin
> > > > > > > > > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>;
> > > > > > > > > > > > > > > > dev@dpdk.org
> > > > > > > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use
> > > > > > > > > > > > > > > > aligned memzone allocation
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev,  
> > > > > Konstantin wrote:  
> > > > > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are
> > > > > > > > > > > > > > > > > > > > set to 2 cache lines, so members  
> > > > > > > > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,  
> > > > > > > > > > > > > > > > > > > >and therefore the whole struct needs
> > > > > > > > > > > > > > > > > > > >128-byte alignment according to the ABI  
> > > > > > > > > > > > > > > > > > > so that the 128-byte alignment of the fields  
> > > > > can be guaranteed.  
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B  
> > > > > aligned these days.  
> > > > > > > > > > > > > > > > > > > BTW, I probably missed the initial discussion,  
> > > > > but what was the reason for that?  
> > > > > > > > > > > > > > > > > > > Konstantin  
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128
> > > > > > > > > > > > > > > > > > byte alignment; it seems unnecessary if the
> > > > > > > > > > > > > > > > > > cache line is only 64  
> > > > > > > > bytes.  
> > > > > > > > > > An  
> > > > > > > > > > > > > > > > alternate  
> > > > > > > > > > > > > > > > > > fix would be to just use cache line alignment  
> > > > > for these fields (since memzones are already cache line aligned).  
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > > > > > > > >  
> > > > > > > > > > > > > > > > > > Maybe there is some deeper  reason for the >=  
> > > > > 128-byte alignment logic in rte_ring.h?  
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Might be, would be good to hear opinion the author  
> > > > > of that change.  
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > It gives improved performance for core-2-core  
> > > > > transfer.  
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > > > > That's ok but why we can't keep them and whole  
> > > > > rte_ring aligned on cache-line boundaries?  
> > > > > > > > > > > > > > > Something like that:
> > > > > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > > > > >    ...
> > > > > > > > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > > > };
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Konstantin  
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sure. That should probably work too.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > /Bruce  
> > > > > > > > > > > > >
> > > > > > > > > > > > > I also agree with Konstantin's proposal. One question
> > > > > > > > > > > > > though: since it changes the alignment constraint of the
> > > > > > > > > > > > > rte_ring structure, I think it is an ABI breakage: a
> > > > > > > > > > > > > structure including the rte_ring structure inherits from  
> > > > > this constraint.  
> > > > > > > > > > > > >
> > > > > > > > > > > > > How could we handle that, knowing this is probably a rare  
> > > > > case?  
> > > > > > > > > > > > >
> > > > > > > > > > > > >  
> > > > > > > > > > > > Is it an ABI break so long as we keep the resulting size
> > > > > > > > > > > > and field placement of the structures the same? The
> > > > > > > > > > > > alignment being reduced should not be a problem, as
> > > > > > > > > > > > 128byte alignment is also valid as 64byte alignment, after  
> > > > > all.  
> > > > > > > > > > >
> > > > > > > > > > > I'd say yes. Consider the following example:
> > > > > > > > > > >
> > > > > > > > > > > ---8<---
> > > > > > > > > > > #include <stdio.h>
> > > > > > > > > > > #include <stdlib.h>
> > > > > > > > > > >
> > > > > > > > > > > #define ALIGN 64
> > > > > > > > > > > /* #define ALIGN 128 */
> > > > > > > > > > >
> > > > > > > > > > > /* dummy rte_ring struct */
> > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > 	char x[128];
> > > > > > > > > > > } __attribute__((aligned(ALIGN)));
> > > > > > > > > > >
> > > > > > > > > > > struct foo {
> > > > > > > > > > > 	struct rte_ring r;
> > > > > > > > > > > 	unsigned bar;
> > > > > > > > > > > };
> > > > > > > > > > >
> > > > > > > > > > > int main(void)
> > > > > > > > > > > {
> > > > > > > > > > > 	struct foo array[2];
> > > > > > > > > > >
> > > > > > > > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > > > > > > > 		sizeof(struct rte_ring),
> > > > > > > > > > > 		(unsigned int)((char *)&array[1].r - (char  
> > > > > *)array));  
> > > > > > > > > > >
> > > > > > > > > > > 	return 0;
> > > > > > > > > > > }
> > > > > > > > > > > ---8<---
> > > > > > > > > > >
> > > > > > > > > > > The size of rte_ring is always 128.
> > > > > > > > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Olivier  
> > > > > > > > >
> > > > > > > > > About would it be an ABI breakage to 17.05 - I think would...
> > > > > > > > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > > > > > > > alignment was increased from 64B 128B.
> > > > > > > > > Now we just restoring it.
> > > > > > > > >  
> > > > > > > > Yes, ABI change was announced in advance and explicitly broken in  
> > > > > 17.05.  
> > > > > > > > There was no announcement of ABI break in 17.08 for rte_ring.
> > > > > > > >  
> > > > > > > > > >
> > > > > > > > > > Yes, the diff will change, but that is after a recompile. If
> > > > > > > > > > we have rte_ring_create function always return a 128-byte
> > > > > > > > > > aligned structure, will any already-compiled apps fail to work
> > > > > > > > > > if we also change the alignment of the rte_ring struct in the  
> > > > > header?  
> > > > > > > > >
> > > > > > > > > Why 128B?
> > > > > > > > > I thought we are discussing making rte_ring 64B aligned again?
> > > > > > > > >
> > > > > > > > > Konstantin  
> > > > > > > >
> > > > > > > > To avoid possibly breaking apps compiled against 17.05 when run
> > > > > > > > against shared libs for 17.08. Having the extra alignment won't
> > > > > > > > affect 17.08 apps, since they only require 64-byte alignment, but
> > > > > > > > returning only 64-byte aligned memory for apps which expect
> > > > > > > > 128byte aligned memory may cause issues.
> > > > > > > >
> > > > > > > > Therefore, we should reduce the required alignment to 64B, which
> > > > > > > > should only affect any apps that do a recompile, and have memory
> > > > > > > > allocation for rings return 128B aligned addresses to work with
> > > > > > > > both 64B aligned and 128B aligned ring structures.  
> > > > > > >
> > > > > > > Ah, I see - you are talking just about rte_ring_create().
> > > > > > > BTW, are you sure that right now it allocates rings 128B aligned?
> > > > > > > As I can see it does just:
> > > > > > > mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > > > > > > which means cache line alignment.
> > > > > > >  
> > > > > > It doesn't currently allocate with that alignment, which is something
> > > > > > we need to fix - and what this patch was originally submitted to do.
> > > > > > So I think this patch should be applied, along with a further patch to
> > > > > > reduce the alignment going forward to avoid any other problems.  
> > > > >
> > > > > But if we going to reduce alignment anyway (patch #2) why do we need patch
> > > > > #1 at all?  
> > > > 
> > > > Because any app compiled against 17.05 will use the old alignment value. Therefore patch 1 should be applied to 17.08 for backward
> > > > compatibility, and backported to 17.05.1.  
> > > 
> > > Why then just no backport patch #2 to 17.05.1?
> > >   
> > Maybe so. I'm just a little wary about backporting changes like that to
> > an older release, even though I'm not aware of any specific issues it
> > might cause.
> 
> 
> If we want to fully respect the API/ABI deprecation process, we should
> have patch #1 in 17.05 and 17.08, a deprecation notice in 17.08, and patch
> #2 starting from 17.11.
> 
> More pragmatically, it's quite difficult to foresee really big problems
> due to the changes in patch #2. One I can see is:
> 
> - rte_ring.so: the dpdk ring library
> - another_ring.so: a library based on dpdk ring. The struct another_ring
>   is like the struct foo in my previous example.
> - application: uses another_ring structure
> 
> After we apply patch #2 on dpdk, and recompile the another_ring library,
> its ABI will change.
> 
> 
> So I suggest to follow the deprecation process for that issue.
> 
While this theoretically can occur, I consider it fairly unlikely, so my
preference is to have patch #1 in 17.05 and .08, as you suggest, 
but put patch #2 into 17.08 as well.

/Bruce
  
Ananyev, Konstantin June 12, 2017, 10:18 a.m. UTC | #28
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Monday, June 12, 2017 4:08 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> -----Original Message-----
> > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> >  <stephen@networkplumber.org>
> > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> >
> >
> >
> > > -----Original Message-----
> > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > Sent: Friday, June 9, 2017 6:29 PM
> > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > > -----Original Message-----
> > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > >
> > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > >
> > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > Something like that:
> > > > > > struct rte_ring {
> > > > > >     ...
> > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > };
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > >
> > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > performance? Maybe we can utilize it application code?
> > > >
> > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > If these cache lines change, then it will create false sharing.
> > >
> > > I see. I think, In such cases it is better to abstract as conditional
> > > compilation. The above logic has worst case cache memory
> > > requirement if CPU is 128B CL and no speculative prefetch.

I suppose we can keep exactly the same logic as we have now:
archs with 64B cache-line would have an empty cache line,
for archs with 128B cacheline - no.
Is that what you are looking for?
Konstantin 

> >
> > I think this is already done for rte_ring.h:
> > http://dpdk.org/browse/dpdk/tree/lib/librte_ring/rte_ring.h#n119
> 
> Yes. The suggestion was in the context of when introducing the
> EMPTY_CACHE_LINE scheme, it should be a function of ARCH has
> speculative next cache-line prefetch or not?
> 
> >
> > Konstantin
> >
> >
> >
  
Jerin Jacob June 12, 2017, 10:34 a.m. UTC | #29
-----Original Message-----
> Date: Mon, 12 Jun 2017 10:18:39 +0000
> From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
>  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
>  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
>  <dev@dpdk.org>
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Monday, June 12, 2017 4:08 AM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > -----Original Message-----
> > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > >  <stephen@networkplumber.org>
> > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > > -----Original Message-----
> > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > >
> > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > >
> > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > >
> > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > Something like that:
> > > > > > > struct rte_ring {
> > > > > > >     ...
> > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > };
> > > > > > >
> > > > > > > Konstantin
> > > > > > >
> > > > > >
> > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > performance? Maybe we can utilize it application code?
> > > > >
> > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > If these cache lines change, then it will create false sharing.
> > > >
> > > > I see. I think, In such cases it is better to abstract as conditional
> > > > compilation. The above logic has worst case cache memory
> > > > requirement if CPU is 128B CL and no speculative prefetch.
> 
> I suppose we can keep exactly the same logic as we have now:
> archs with 64B cache-line would have an empty cache line,
> for archs with 128B cacheline - no.
> Is that what you are looking for?

Its valid to an arch with 128B cache-line and speculative cache prefetch.
(Cavium's recent SoCs comes with this property)
IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
that) to decide the empty line and I think, In future we can use
the same config for similar use cases.

Jerin

> Konstantin 
> 
> > >
> > > I think this is already done for rte_ring.h:
> > > http://dpdk.org/browse/dpdk/tree/lib/librte_ring/rte_ring.h#n119
> > 
> > Yes. The suggestion was in the context of when introducing the
> > EMPTY_CACHE_LINE scheme, it should be a function of ARCH has
> > speculative next cache-line prefetch or not?
> > 
> > >
> > > Konstantin
> > >
> > >
> > >
  
Bruce Richardson June 12, 2017, 11:09 a.m. UTC | #30
On Mon, Jun 12, 2017 at 04:04:11PM +0530, Jerin Jacob wrote:
> -----Original Message-----
> > Date: Mon, 12 Jun 2017 10:18:39 +0000
> > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> >  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
> >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> >  <dev@dpdk.org>
> > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > Sent: Monday, June 12, 2017 4:08 AM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > 
> > > -----Original Message-----
> > > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > > >  <stephen@networkplumber.org>
> > > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > >
> > > > > -----Original Message-----
> > > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > >
> > > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > > >
> > > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > > >
> > > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > Something like that:
> > > > > > > > struct rte_ring {
> > > > > > > >     ...
> > > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > };
> > > > > > > >
> > > > > > > > Konstantin
> > > > > > > >
> > > > > > >
> > > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > > performance? Maybe we can utilize it application code?
> > > > > >
> > > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > > If these cache lines change, then it will create false sharing.
> > > > >
> > > > > I see. I think, In such cases it is better to abstract as conditional
> > > > > compilation. The above logic has worst case cache memory
> > > > > requirement if CPU is 128B CL and no speculative prefetch.
> > 
> > I suppose we can keep exactly the same logic as we have now:
> > archs with 64B cache-line would have an empty cache line,
> > for archs with 128B cacheline - no.
> > Is that what you are looking for?
> 
> Its valid to an arch with 128B cache-line and speculative cache prefetch.
> (Cavium's recent SoCs comes with this property)
> IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
> compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
> that) to decide the empty line and I think, In future we can use
> the same config for similar use cases.
> 
> Jerin
> 
I'd rather not make it that complicated, and definitely don't like
adding in more build time config options. Initially, I had the extra
padding always-present, but it was felt that it made the resulting
structure too big. For those systems with 128B cachelines, is the extra
256 bytes of space per ring really a problem given modern systems have
ram in the 10's of Gigs?

/Bruce
  
Jerin Jacob June 12, 2017, 11:41 a.m. UTC | #31
-----Original Message-----
> Date: Mon, 12 Jun 2017 12:09:07 +0100
> From: Bruce Richardson <bruce.richardson@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> CC: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Stephen Hemminger
>  <stephen@networkplumber.org>, Yerden Zhumabekov <e_zhumabekov@sts.kz>,
>  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
>  <dev@dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> User-Agent: Mutt/1.8.1 (2017-04-11)
> 
> On Mon, Jun 12, 2017 at 04:04:11PM +0530, Jerin Jacob wrote:
> > -----Original Message-----
> > > Date: Mon, 12 Jun 2017 10:18:39 +0000
> > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > >  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
> > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > >  <dev@dpdk.org>
> > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > 
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > Sent: Monday, June 12, 2017 4:08 AM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > 
> > > > -----Original Message-----
> > > > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > > > >  <stephen@networkplumber.org>
> > > > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > > > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > >
> > > > > > -----Original Message-----
> > > > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > >
> > > > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > > > >
> > > > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > > > >
> > > > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > Something like that:
> > > > > > > > > struct rte_ring {
> > > > > > > > >     ...
> > > > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > };
> > > > > > > > >
> > > > > > > > > Konstantin
> > > > > > > > >
> > > > > > > >
> > > > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > > > performance? Maybe we can utilize it application code?
> > > > > > >
> > > > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > > > If these cache lines change, then it will create false sharing.
> > > > > >
> > > > > > I see. I think, In such cases it is better to abstract as conditional
> > > > > > compilation. The above logic has worst case cache memory
> > > > > > requirement if CPU is 128B CL and no speculative prefetch.
> > > 
> > > I suppose we can keep exactly the same logic as we have now:
> > > archs with 64B cache-line would have an empty cache line,
> > > for archs with 128B cacheline - no.
> > > Is that what you are looking for?
> > 
> > Its valid to an arch with 128B cache-line and speculative cache prefetch.
> > (Cavium's recent SoCs comes with this property)
> > IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
> > compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
> > that) to decide the empty line and I think, In future we can use
> > the same config for similar use cases.
> > 
> > Jerin
> > 
> I'd rather not make it that complicated, and definitely don't like
> adding in more build time config options. Initially, I had the extra
> padding always-present, but it was felt that it made the resulting
> structure too big. For those systems with 128B cachelines, is the extra
> 256 bytes of space per ring really a problem given modern systems have
> ram in the 10's of Gigs?

I think, RAM size does not matter here. I was referring more on L1 and L2
cache size(which is very limited).i.e if you fetch the unwanted
lines then CPU have to evict fast and it will have effect on accommodating
interested lines in worker loop..I am not a fan of build time options but I
don't really see any issue with _ARCH_ specific build time options.
I think, it is good. common_base can have default value if any platform/arch wants
to override it can override it.I don't see any harm in proving that support.


> 
> /Bruce
  
Ananyev, Konstantin June 12, 2017, 12:17 p.m. UTC | #32
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Monday, June 12, 2017 12:41 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> -----Original Message-----
> > Date: Mon, 12 Jun 2017 12:09:07 +0100
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > CC: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Stephen Hemminger
> >  <stephen@networkplumber.org>, Yerden Zhumabekov <e_zhumabekov@sts.kz>,
> >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> >  <dev@dpdk.org>
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > User-Agent: Mutt/1.8.1 (2017-04-11)
> >
> > On Mon, Jun 12, 2017 at 04:04:11PM +0530, Jerin Jacob wrote:
> > > -----Original Message-----
> > > > Date: Mon, 12 Jun 2017 10:18:39 +0000
> > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > > >  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
> > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > >  <dev@dpdk.org>
> > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > Sent: Monday, June 12, 2017 4:08 AM
> > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > >
> > > > > -----Original Message-----
> > > > > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > > > > >  <stephen@networkplumber.org>
> > > > > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > > > > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > >
> > > > > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > > > > >
> > > > > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > > > > >
> > > > > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > > Something like that:
> > > > > > > > > > struct rte_ring {
> > > > > > > > > >     ...
> > > > > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > };
> > > > > > > > > >
> > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > > > > performance? Maybe we can utilize it application code?
> > > > > > > >
> > > > > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > > > > If these cache lines change, then it will create false sharing.
> > > > > > >
> > > > > > > I see. I think, In such cases it is better to abstract as conditional
> > > > > > > compilation. The above logic has worst case cache memory
> > > > > > > requirement if CPU is 128B CL and no speculative prefetch.
> > > >
> > > > I suppose we can keep exactly the same logic as we have now:
> > > > archs with 64B cache-line would have an empty cache line,
> > > > for archs with 128B cacheline - no.
> > > > Is that what you are looking for?
> > >
> > > Its valid to an arch with 128B cache-line and speculative cache prefetch.
> > > (Cavium's recent SoCs comes with this property)
> > > IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
> > > compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
> > > that) to decide the empty line and I think, In future we can use
> > > the same config for similar use cases.
> > >
> > > Jerin
> > >
> > I'd rather not make it that complicated, and definitely don't like
> > adding in more build time config options. Initially, I had the extra
> > padding always-present, but it was felt that it made the resulting
> > structure too big. For those systems with 128B cachelines, is the extra
> > 256 bytes of space per ring really a problem given modern systems have
> > ram in the 10's of Gigs?
> 
> I think, RAM size does not matter here. I was referring more on L1 and L2
> cache size(which is very limited).i.e if you fetch the unwanted
> lines then CPU have to evict fast and it will have effect on accommodating
> interested lines in worker loop..

Not sure I understand you here - as I know, we can't control HW speculative fetch.
It would either happen, or no depending on the actual CPU.
The only thing we can control here - what exactly will be fetched:
either empty cache line not used by anyone or cache-line with some data.
Konstantin

>I am not a fan of build time options but I
> don't really see any issue with _ARCH_ specific build time options.
> I think, it is good. common_base can have default value if any platform/arch wants
> to override it can override it.I don't see any harm in proving that support.
> 
> 
> >
> > /Bruce
  
Jerin Jacob June 12, 2017, 12:42 p.m. UTC | #33
-----Original Message-----
> Date: Mon, 12 Jun 2017 12:17:48 +0000
> From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
>  <bruce.richardson@intel.com>
> CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
>  <e_zhumabekov@sts.kz>, "Verkamp, Daniel" <daniel.verkamp@intel.com>,
>  "dev@dpdk.org" <dev@dpdk.org>
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Monday, June 12, 2017 12:41 PM
> > To: Richardson, Bruce <bruce.richardson@intel.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> > <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > -----Original Message-----
> > > Date: Mon, 12 Jun 2017 12:09:07 +0100
> > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > CC: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Stephen Hemminger
> > >  <stephen@networkplumber.org>, Yerden Zhumabekov <e_zhumabekov@sts.kz>,
> > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > >  <dev@dpdk.org>
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > User-Agent: Mutt/1.8.1 (2017-04-11)
> > >
> > > On Mon, Jun 12, 2017 at 04:04:11PM +0530, Jerin Jacob wrote:
> > > > -----Original Message-----
> > > > > Date: Mon, 12 Jun 2017 10:18:39 +0000
> > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > > > >  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
> > > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > > >  <dev@dpdk.org>
> > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > Sent: Monday, June 12, 2017 4:08 AM
> > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > >
> > > > > > -----Original Message-----
> > > > > > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > > > > > >  <stephen@networkplumber.org>
> > > > > > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > > > > > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > > > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > > > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > >
> > > > > > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > > > > > >
> > > > > > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > > > > > >
> > > > > > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > > > Something like that:
> > > > > > > > > > > struct rte_ring {
> > > > > > > > > > >     ...
> > > > > > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > };
> > > > > > > > > > >
> > > > > > > > > > > Konstantin
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > > > > > performance? Maybe we can utilize it application code?
> > > > > > > > >
> > > > > > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > > > > > If these cache lines change, then it will create false sharing.
> > > > > > > >
> > > > > > > > I see. I think, In such cases it is better to abstract as conditional
> > > > > > > > compilation. The above logic has worst case cache memory
> > > > > > > > requirement if CPU is 128B CL and no speculative prefetch.
> > > > >
> > > > > I suppose we can keep exactly the same logic as we have now:
> > > > > archs with 64B cache-line would have an empty cache line,
> > > > > for archs with 128B cacheline - no.
> > > > > Is that what you are looking for?
> > > >
> > > > Its valid to an arch with 128B cache-line and speculative cache prefetch.
> > > > (Cavium's recent SoCs comes with this property)
> > > > IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
> > > > compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
> > > > that) to decide the empty line and I think, In future we can use
> > > > the same config for similar use cases.
> > > >
> > > > Jerin
> > > >
> > > I'd rather not make it that complicated, and definitely don't like
> > > adding in more build time config options. Initially, I had the extra
> > > padding always-present, but it was felt that it made the resulting
> > > structure too big. For those systems with 128B cachelines, is the extra
> > > 256 bytes of space per ring really a problem given modern systems have
> > > ram in the 10's of Gigs?
> > 
> > I think, RAM size does not matter here. I was referring more on L1 and L2
> > cache size(which is very limited).i.e if you fetch the unwanted
> > lines then CPU have to evict fast and it will have effect on accommodating
> > interested lines in worker loop..
> 
> Not sure I understand you here - as I know, we can't control HW speculative fetch.

Yes. But we can know in advance if a product family supports HW speculative fetch
or not. Typically a product family defines this feature in arm64 case and
we have different targets for each product family.

> It would either happen, or no depending on the actual CPU.
> The only thing we can control here - what exactly will be fetched:
> either empty cache line not used by anyone or cache-line with some data.

Yes. If we know a given CPU does not provide HW speculative fetch in
advance then we don't need to give empty line.


> Konstantin
> 
> >I am not a fan of build time options but I
> > don't really see any issue with _ARCH_ specific build time options.
> > I think, it is good. common_base can have default value if any platform/arch wants
> > to override it can override it.I don't see any harm in proving that support.
> > 
> > 
> > >
> > > /Bruce
  
Ananyev, Konstantin June 12, 2017, 12:51 p.m. UTC | #34
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Monday, June 12, 2017 1:42 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> -----Original Message-----
> > Date: Mon, 12 Jun 2017 12:17:48 +0000
> > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
> >  <bruce.richardson@intel.com>
> > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> >  <e_zhumabekov@sts.kz>, "Verkamp, Daniel" <daniel.verkamp@intel.com>,
> >  "dev@dpdk.org" <dev@dpdk.org>
> > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> >
> >
> >
> > > -----Original Message-----
> > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > Sent: Monday, June 12, 2017 12:41 PM
> > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> > > <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > > -----Original Message-----
> > > > Date: Mon, 12 Jun 2017 12:09:07 +0100
> > > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > CC: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Stephen Hemminger
> > > >  <stephen@networkplumber.org>, Yerden Zhumabekov <e_zhumabekov@sts.kz>,
> > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > >  <dev@dpdk.org>
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > User-Agent: Mutt/1.8.1 (2017-04-11)
> > > >
> > > > On Mon, Jun 12, 2017 at 04:04:11PM +0530, Jerin Jacob wrote:
> > > > > -----Original Message-----
> > > > > > Date: Mon, 12 Jun 2017 10:18:39 +0000
> > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > > > > >  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
> > > > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > > > >  <dev@dpdk.org>
> > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > Sent: Monday, June 12, 2017 4:08 AM
> > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > > > > > > >  <stephen@networkplumber.org>
> > > > > > > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > > > > > > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > > > > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > > > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > > > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > > > > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > > >
> > > > > > > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > > > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > > > > > > >
> > > > > > > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > > > > Something like that:
> > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > >     ...
> > > > > > > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > };
> > > > > > > > > > > >
> > > > > > > > > > > > Konstantin
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > > > > > > performance? Maybe we can utilize it application code?
> > > > > > > > > >
> > > > > > > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > > > > > > If these cache lines change, then it will create false sharing.
> > > > > > > > >
> > > > > > > > > I see. I think, In such cases it is better to abstract as conditional
> > > > > > > > > compilation. The above logic has worst case cache memory
> > > > > > > > > requirement if CPU is 128B CL and no speculative prefetch.
> > > > > >
> > > > > > I suppose we can keep exactly the same logic as we have now:
> > > > > > archs with 64B cache-line would have an empty cache line,
> > > > > > for archs with 128B cacheline - no.
> > > > > > Is that what you are looking for?
> > > > >
> > > > > Its valid to an arch with 128B cache-line and speculative cache prefetch.
> > > > > (Cavium's recent SoCs comes with this property)
> > > > > IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
> > > > > compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
> > > > > that) to decide the empty line and I think, In future we can use
> > > > > the same config for similar use cases.
> > > > >
> > > > > Jerin
> > > > >
> > > > I'd rather not make it that complicated, and definitely don't like
> > > > adding in more build time config options. Initially, I had the extra
> > > > padding always-present, but it was felt that it made the resulting
> > > > structure too big. For those systems with 128B cachelines, is the extra
> > > > 256 bytes of space per ring really a problem given modern systems have
> > > > ram in the 10's of Gigs?
> > >
> > > I think, RAM size does not matter here. I was referring more on L1 and L2
> > > cache size(which is very limited).i.e if you fetch the unwanted
> > > lines then CPU have to evict fast and it will have effect on accommodating
> > > interested lines in worker loop..
> >
> > Not sure I understand you here - as I know, we can't control HW speculative fetch.
> 
> Yes. But we can know in advance if a product family supports HW speculative fetch
> or not. Typically a product family defines this feature in arm64 case and
> we have different targets for each product family.

> 
> > It would either happen, or no depending on the actual CPU.
> > The only thing we can control here - what exactly will be fetched:
> > either empty cache line not used by anyone or cache-line with some data.
> 
> Yes. If we know a given CPU does not provide HW speculative fetch in
> advance then we don't need to give empty line.

I understand that part, what I don't understand how not providing empty cache line
(for specific HW) will improve L1/L2 cache usage?
Suppose you do have an empty line for all cases, and HW doesn't fetch next cache line.
Then it should never occur into the cache hierarchy anyway:
You never read/write to it manually, HW doesn't speculatively fetch it.
correct?
Konstantin
  
Bruce Richardson June 12, 2017, 1:06 p.m. UTC | #35
On Mon, Jun 12, 2017 at 01:51:26PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Monday, June 12, 2017 1:42 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> > <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > -----Original Message-----
> > > Date: Mon, 12 Jun 2017 12:17:48 +0000
> > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
> > >  <bruce.richardson@intel.com>
> > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > >  <e_zhumabekov@sts.kz>, "Verkamp, Daniel" <daniel.verkamp@intel.com>,
> > >  "dev@dpdk.org" <dev@dpdk.org>
> > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > Sent: Monday, June 12, 2017 12:41 PM
> > > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> > > > <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > > -----Original Message-----
> > > > > Date: Mon, 12 Jun 2017 12:09:07 +0100
> > > > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > > CC: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Stephen Hemminger
> > > > >  <stephen@networkplumber.org>, Yerden Zhumabekov <e_zhumabekov@sts.kz>,
> > > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > > >  <dev@dpdk.org>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > User-Agent: Mutt/1.8.1 (2017-04-11)
> > > > >
> > > > > On Mon, Jun 12, 2017 at 04:04:11PM +0530, Jerin Jacob wrote:
> > > > > > -----Original Message-----
> > > > > > > Date: Mon, 12 Jun 2017 10:18:39 +0000
> > > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > > > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > > > > > >  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
> > > > > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > > > > >  <dev@dpdk.org>
> > > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > > Sent: Monday, June 12, 2017 4:08 AM
> > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > > > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > > > > > > > >  <stephen@networkplumber.org>
> > > > > > > > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > > > > > > > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > > > > > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > > > > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > > > > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > > > > > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > > > >
> > > > > > > > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > > > > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > > > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > > > > > Something like that:
> > > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > > >     ...
> > > > > > > > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > };
> > > > > > > > > > > > >
> > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > > > > > > > performance? Maybe we can utilize it application code?
> > > > > > > > > > >
> > > > > > > > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > > > > > > > If these cache lines change, then it will create false sharing.
> > > > > > > > > >
> > > > > > > > > > I see. I think, In such cases it is better to abstract as conditional
> > > > > > > > > > compilation. The above logic has worst case cache memory
> > > > > > > > > > requirement if CPU is 128B CL and no speculative prefetch.
> > > > > > >
> > > > > > > I suppose we can keep exactly the same logic as we have now:
> > > > > > > archs with 64B cache-line would have an empty cache line,
> > > > > > > for archs with 128B cacheline - no.
> > > > > > > Is that what you are looking for?
> > > > > >
> > > > > > Its valid to an arch with 128B cache-line and speculative cache prefetch.
> > > > > > (Cavium's recent SoCs comes with this property)
> > > > > > IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
> > > > > > compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
> > > > > > that) to decide the empty line and I think, In future we can use
> > > > > > the same config for similar use cases.
> > > > > >
> > > > > > Jerin
> > > > > >
> > > > > I'd rather not make it that complicated, and definitely don't like
> > > > > adding in more build time config options. Initially, I had the extra
> > > > > padding always-present, but it was felt that it made the resulting
> > > > > structure too big. For those systems with 128B cachelines, is the extra
> > > > > 256 bytes of space per ring really a problem given modern systems have
> > > > > ram in the 10's of Gigs?
> > > >
> > > > I think, RAM size does not matter here. I was referring more on L1 and L2
> > > > cache size(which is very limited).i.e if you fetch the unwanted
> > > > lines then CPU have to evict fast and it will have effect on accommodating
> > > > interested lines in worker loop..
> > >
> > > Not sure I understand you here - as I know, we can't control HW speculative fetch.
> > 
> > Yes. But we can know in advance if a product family supports HW speculative fetch
> > or not. Typically a product family defines this feature in arm64 case and
> > we have different targets for each product family.
> 
> > 
> > > It would either happen, or no depending on the actual CPU.
> > > The only thing we can control here - what exactly will be fetched:
> > > either empty cache line not used by anyone or cache-line with some data.
> > 
> > Yes. If we know a given CPU does not provide HW speculative fetch in
> > advance then we don't need to give empty line.
> 
> I understand that part, what I don't understand how not providing empty cache line
> (for specific HW) will improve L1/L2 cache usage?
> Suppose you do have an empty line for all cases, and HW doesn't fetch next cache line.
> Then it should never occur into the cache hierarchy anyway:
> You never read/write to it manually, HW doesn't speculatively fetch it.
> correct?
> Konstantin
>
+1 
If there are HW prefetchers that will pull in the next line, you need
the extra padding. If there aren't prefetchers then having the padding
has no effect on cache usage, so there is now downside to having it
(from a cache point of view anyway)

/Bruce
  
Jerin Jacob June 12, 2017, 1:20 p.m. UTC | #36
-----Original Message-----
> Date: Mon, 12 Jun 2017 12:51:26 +0000
> From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> CC: "Richardson, Bruce" <bruce.richardson@intel.com>, Stephen Hemminger
>  <stephen@networkplumber.org>, Yerden Zhumabekov <e_zhumabekov@sts.kz>,
>  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
>  <dev@dpdk.org>
> Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Monday, June 12, 2017 1:42 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> > <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > 
> > -----Original Message-----
> > > Date: Mon, 12 Jun 2017 12:17:48 +0000
> > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Richardson, Bruce"
> > >  <bruce.richardson@intel.com>
> > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > >  <e_zhumabekov@sts.kz>, "Verkamp, Daniel" <daniel.verkamp@intel.com>,
> > >  "dev@dpdk.org" <dev@dpdk.org>
> > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > Sent: Monday, June 12, 2017 12:41 PM
> > > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov
> > > > <e_zhumabekov@sts.kz>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > >
> > > > -----Original Message-----
> > > > > Date: Mon, 12 Jun 2017 12:09:07 +0100
> > > > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > > CC: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Stephen Hemminger
> > > > >  <stephen@networkplumber.org>, Yerden Zhumabekov <e_zhumabekov@sts.kz>,
> > > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > > >  <dev@dpdk.org>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > User-Agent: Mutt/1.8.1 (2017-04-11)
> > > > >
> > > > > On Mon, Jun 12, 2017 at 04:04:11PM +0530, Jerin Jacob wrote:
> > > > > > -----Original Message-----
> > > > > > > Date: Mon, 12 Jun 2017 10:18:39 +0000
> > > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > > > > > > CC: Stephen Hemminger <stephen@networkplumber.org>, Yerden Zhumabekov
> > > > > > >  <e_zhumabekov@sts.kz>, "Richardson, Bruce" <bruce.richardson@intel.com>,
> > > > > > >  "Verkamp, Daniel" <daniel.verkamp@intel.com>, "dev@dpdk.org"
> > > > > > >  <dev@dpdk.org>
> > > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > > Sent: Monday, June 12, 2017 4:08 AM
> > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > Cc: Stephen Hemminger <stephen@networkplumber.org>; Yerden Zhumabekov <e_zhumabekov@sts.kz>; Richardson, Bruce
> > > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > > Date: Sat, 10 Jun 2017 08:16:44 +0000
> > > > > > > > > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > > > > > > > > To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Stephen Hemminger
> > > > > > > > >  <stephen@networkplumber.org>
> > > > > > > > > CC: Yerden Zhumabekov <e_zhumabekov@sts.kz>, "Richardson, Bruce"
> > > > > > > > >  <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > > > > > > > > Sent: Friday, June 9, 2017 6:29 PM
> > > > > > > > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > > > Cc: Yerden Zhumabekov <e_zhumabekov@sts.kz>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > > > > > > > > <bruce.richardson@intel.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > > Date: Fri, 9 Jun 2017 10:16:25 -0700
> > > > > > > > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > > > > > > > To: Yerden Zhumabekov <e_zhumabekov@sts.kz>
> > > > > > > > > > > Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Richardson,
> > > > > > > > > > >  Bruce" <bruce.richardson@intel.com>, "Verkamp, Daniel"
> > > > > > > > > > >  <daniel.verkamp@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
> > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > > > > > > >
> > > > > > > > > > > On Fri, 9 Jun 2017 18:47:43 +0600
> > > > > > > > > > > Yerden Zhumabekov <e_zhumabekov@sts.kz> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >>>> Maybe there is some deeper  reason for the >= 128-byte alignment logic in rte_ring.h?
> > > > > > > > > > > > >>> Might be, would be good to hear opinion the author of that change.
> > > > > > > > > > > > >> It gives improved performance for core-2-core transfer.
> > > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > > > > > > > > > > > Something like that:
> > > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > > >     ...
> > > > > > > > > > > > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > };
> > > > > > > > > > > > >
> > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I'm curious, can anyone explain, how does it actually affect
> > > > > > > > > > > > performance? Maybe we can utilize it application code?
> > > > > > > > > > >
> > > > > > > > > > > I think it is because on Intel CPU's the CPU will speculatively fetch adjacent cache lines.
> > > > > > > > > > > If these cache lines change, then it will create false sharing.
> > > > > > > > > >
> > > > > > > > > > I see. I think, In such cases it is better to abstract as conditional
> > > > > > > > > > compilation. The above logic has worst case cache memory
> > > > > > > > > > requirement if CPU is 128B CL and no speculative prefetch.
> > > > > > >
> > > > > > > I suppose we can keep exactly the same logic as we have now:
> > > > > > > archs with 64B cache-line would have an empty cache line,
> > > > > > > for archs with 128B cacheline - no.
> > > > > > > Is that what you are looking for?
> > > > > >
> > > > > > Its valid to an arch with 128B cache-line and speculative cache prefetch.
> > > > > > (Cavium's recent SoCs comes with this property)
> > > > > > IMHO, Instead of making 128B as NOOP. We can introduce a new conditional
> > > > > > compilation flag(CONFIG_RTE_ARCH_SPECULATIVE_PREFETCH or something like
> > > > > > that) to decide the empty line and I think, In future we can use
> > > > > > the same config for similar use cases.
> > > > > >
> > > > > > Jerin
> > > > > >
> > > > > I'd rather not make it that complicated, and definitely don't like
> > > > > adding in more build time config options. Initially, I had the extra
> > > > > padding always-present, but it was felt that it made the resulting
> > > > > structure too big. For those systems with 128B cachelines, is the extra
> > > > > 256 bytes of space per ring really a problem given modern systems have
> > > > > ram in the 10's of Gigs?
> > > >
> > > > I think, RAM size does not matter here. I was referring more on L1 and L2
> > > > cache size(which is very limited).i.e if you fetch the unwanted
> > > > lines then CPU have to evict fast and it will have effect on accommodating
> > > > interested lines in worker loop..
> > >
> > > Not sure I understand you here - as I know, we can't control HW speculative fetch.
> > 
> > Yes. But we can know in advance if a product family supports HW speculative fetch
> > or not. Typically a product family defines this feature in arm64 case and
> > we have different targets for each product family.
> 
> > 
> > > It would either happen, or no depending on the actual CPU.
> > > The only thing we can control here - what exactly will be fetched:
> > > either empty cache line not used by anyone or cache-line with some data.
> > 
> > Yes. If we know a given CPU does not provide HW speculative fetch in
> > advance then we don't need to give empty line.
> 
> I understand that part, what I don't understand how not providing empty cache line
> (for specific HW) will improve L1/L2 cache usage?
> Suppose you do have an empty line for all cases, and HW doesn't fetch next cache line.
> Then it should never occur into the cache hierarchy anyway:
> You never read/write to it manually, HW doesn't speculatively fetch it.
> correct?

Yes. You are right.

> Konstantin
>
  
Olivier Matz June 30, 2017, 11:35 a.m. UTC | #37
On Mon, 12 Jun 2017 10:56:09 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Mon, Jun 12, 2017 at 11:02:32AM +0200, Olivier Matz wrote:
> > On Fri, 9 Jun 2017 10:02:55 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote:  
> > > On Thu, Jun 08, 2017 at 05:42:00PM +0100, Ananyev, Konstantin wrote:  
> > > > 
> > > >     
> > > > > -----Original Message-----
> > > > > From: Richardson, Bruce
> > > > > Sent: Thursday, June 8, 2017 5:21 PM
> > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > 
> > > > > 
> > > > >     
> > > > > > -----Original Message-----
> > > > > > From: Ananyev, Konstantin
> > > > > > Sent: Thursday, June 8, 2017 5:13 PM
> > > > > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > Subject: RE: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> > > > > >
> > > > > >
> > > > > >    
> > > > > > > -----Original Message-----
> > > > > > > From: Richardson, Bruce
> > > > > > > Sent: Thursday, June 8, 2017 5:04 PM
> > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > allocation
> > > > > > >
> > > > > > > On Thu, Jun 08, 2017 at 04:35:20PM +0100, Ananyev, Konstantin wrote:    
> > > > > > > >
> > > > > > > >    
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Richardson, Bruce
> > > > > > > > > Sent: Thursday, June 8, 2017 4:25 PM
> > > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > > Cc: Olivier Matz <olivier.matz@6wind.com>; Verkamp, Daniel
> > > > > > > > > <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > > > allocation
> > > > > > > > >
> > > > > > > > > On Thu, Jun 08, 2017 at 03:50:34PM +0100, Ananyev, Konstantin wrote:    
> > > > > > > > > >
> > > > > > > > > >    
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > Sent: Thursday, June 8, 2017 3:12 PM
> > > > > > > > > > > To: Olivier Matz <olivier.matz@6wind.com>
> > > > > > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > > > > > Verkamp, Daniel <daniel.verkamp@intel.com>; dev@dpdk.org
> > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone
> > > > > > > > > > > allocation
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jun 08, 2017 at 04:05:26PM +0200, Olivier Matz wrote:    
> > > > > > > > > > > > On Thu, 8 Jun 2017 14:20:52 +0100, Bruce Richardson    
> > > > > > <bruce.richardson@intel.com> wrote:    
> > > > > > > > > > > > > On Thu, Jun 08, 2017 at 02:45:40PM +0200, Olivier Matz    
> > > > > > wrote:    
> > > > > > > > > > > > > > On Tue, 6 Jun 2017 15:56:28 +0100, Bruce Richardson    
> > > > > > <bruce.richardson@intel.com> wrote:    
> > > > > > > > > > > > > > > On Tue, Jun 06, 2017 at 02:19:21PM +0100, Ananyev,    
> > > > > > Konstantin wrote:    
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >    
> > > > > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > > > > From: Richardson, Bruce
> > > > > > > > > > > > > > > > > Sent: Tuesday, June 6, 2017 1:42 PM
> > > > > > > > > > > > > > > > > To: Ananyev, Konstantin
> > > > > > > > > > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > > > > > > > > > Cc: Verkamp, Daniel <daniel.verkamp@intel.com>;
> > > > > > > > > > > > > > > > > dev@dpdk.org
> > > > > > > > > > > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2] ring: use
> > > > > > > > > > > > > > > > > aligned memzone allocation
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Tue, Jun 06, 2017 at 10:59:59AM +0100, Ananyev,    
> > > > > > Konstantin wrote:    
> > > > > > > > > > > > > > > > > >    
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >    
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > The PROD/CONS_ALIGN values on x86-64 are
> > > > > > > > > > > > > > > > > > > > > set to 2 cache lines, so members    
> > > > > > > > > > > > > > > > > > > > of struct rte_ring are 128 byte aligned,    
> > > > > > > > > > > > > > > > > > > > >and therefore the whole struct needs
> > > > > > > > > > > > > > > > > > > > >128-byte alignment according to the ABI    
> > > > > > > > > > > > > > > > > > > > so that the 128-byte alignment of the fields    
> > > > > > can be guaranteed.    
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Ah ok, missed the fact that rte_ring is 128B    
> > > > > > aligned these days.    
> > > > > > > > > > > > > > > > > > > > BTW, I probably missed the initial discussion,    
> > > > > > but what was the reason for that?    
> > > > > > > > > > > > > > > > > > > > Konstantin    
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I don't know why PROD_ALIGN/CONS_ALIGN use 128
> > > > > > > > > > > > > > > > > > > byte alignment; it seems unnecessary if the
> > > > > > > > > > > > > > > > > > > cache line is only 64    
> > > > > > > > > bytes.    
> > > > > > > > > > > An    
> > > > > > > > > > > > > > > > > alternate    
> > > > > > > > > > > > > > > > > > > fix would be to just use cache line alignment    
> > > > > > for these fields (since memzones are already cache line aligned).    
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Yes, had the same thought.
> > > > > > > > > > > > > > > > > >    
> > > > > > > > > > > > > > > > > > > Maybe there is some deeper  reason for the >=    
> > > > > > 128-byte alignment logic in rte_ring.h?    
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Might be, would be good to hear opinion the author    
> > > > > > of that change.    
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It gives improved performance for core-2-core    
> > > > > > transfer.    
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > You mean empty cache-line(s) after prod/cons, correct?
> > > > > > > > > > > > > > > > That's ok but why we can't keep them and whole    
> > > > > > rte_ring aligned on cache-line boundaries?    
> > > > > > > > > > > > > > > > Something like that:
> > > > > > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > > > > > >    ...
> > > > > > > > > > > > > > > >    struct rte_ring_headtail prod __rte_cache_aligned;
> > > > > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > > > >    struct rte_ring_headtail cons __rte_cache_aligned;
> > > > > > > > > > > > > > > >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > > > > > > > > > > > > > > };
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Konstantin    
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Sure. That should probably work too.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > /Bruce    
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I also agree with Konstantin's proposal. One question
> > > > > > > > > > > > > > though: since it changes the alignment constraint of the
> > > > > > > > > > > > > > rte_ring structure, I think it is an ABI breakage: a
> > > > > > > > > > > > > > structure including the rte_ring structure inherits from    
> > > > > > this constraint.    
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > How could we handle that, knowing this is probably a rare    
> > > > > > case?    
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >    
> > > > > > > > > > > > > Is it an ABI break so long as we keep the resulting size
> > > > > > > > > > > > > and field placement of the structures the same? The
> > > > > > > > > > > > > alignment being reduced should not be a problem, as
> > > > > > > > > > > > > 128byte alignment is also valid as 64byte alignment, after    
> > > > > > all.    
> > > > > > > > > > > >
> > > > > > > > > > > > I'd say yes. Consider the following example:
> > > > > > > > > > > >
> > > > > > > > > > > > ---8<---
> > > > > > > > > > > > #include <stdio.h>
> > > > > > > > > > > > #include <stdlib.h>
> > > > > > > > > > > >
> > > > > > > > > > > > #define ALIGN 64
> > > > > > > > > > > > /* #define ALIGN 128 */
> > > > > > > > > > > >
> > > > > > > > > > > > /* dummy rte_ring struct */
> > > > > > > > > > > > struct rte_ring {
> > > > > > > > > > > > 	char x[128];
> > > > > > > > > > > > } __attribute__((aligned(ALIGN)));
> > > > > > > > > > > >
> > > > > > > > > > > > struct foo {
> > > > > > > > > > > > 	struct rte_ring r;
> > > > > > > > > > > > 	unsigned bar;
> > > > > > > > > > > > };
> > > > > > > > > > > >
> > > > > > > > > > > > int main(void)
> > > > > > > > > > > > {
> > > > > > > > > > > > 	struct foo array[2];
> > > > > > > > > > > >
> > > > > > > > > > > > 	printf("sizeof(ring)=%zu diff=%u\n",
> > > > > > > > > > > > 		sizeof(struct rte_ring),
> > > > > > > > > > > > 		(unsigned int)((char *)&array[1].r - (char    
> > > > > > *)array));    
> > > > > > > > > > > >
> > > > > > > > > > > > 	return 0;
> > > > > > > > > > > > }
> > > > > > > > > > > > ---8<---
> > > > > > > > > > > >
> > > > > > > > > > > > The size of rte_ring is always 128.
> > > > > > > > > > > > diff is 192 or 256, depending on the value of ALIGN.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Olivier    
> > > > > > > > > >
> > > > > > > > > > About would it be an ABI breakage to 17.05 - I think would...
> > > > > > > > > > Though for me the actual breakage happens in 17.05 when rte_ring
> > > > > > > > > > alignment was increased from 64B 128B.
> > > > > > > > > > Now we just restoring it.
> > > > > > > > > >    
> > > > > > > > > Yes, ABI change was announced in advance and explicitly broken in    
> > > > > > 17.05.    
> > > > > > > > > There was no announcement of ABI break in 17.08 for rte_ring.
> > > > > > > > >    
> > > > > > > > > > >
> > > > > > > > > > > Yes, the diff will change, but that is after a recompile. If
> > > > > > > > > > > we have rte_ring_create function always return a 128-byte
> > > > > > > > > > > aligned structure, will any already-compiled apps fail to work
> > > > > > > > > > > if we also change the alignment of the rte_ring struct in the    
> > > > > > header?    
> > > > > > > > > >
> > > > > > > > > > Why 128B?
> > > > > > > > > > I thought we are discussing making rte_ring 64B aligned again?
> > > > > > > > > >
> > > > > > > > > > Konstantin    
> > > > > > > > >
> > > > > > > > > To avoid possibly breaking apps compiled against 17.05 when run
> > > > > > > > > against shared libs for 17.08. Having the extra alignment won't
> > > > > > > > > affect 17.08 apps, since they only require 64-byte alignment, but
> > > > > > > > > returning only 64-byte aligned memory for apps which expect
> > > > > > > > > 128byte aligned memory may cause issues.
> > > > > > > > >
> > > > > > > > > Therefore, we should reduce the required alignment to 64B, which
> > > > > > > > > should only affect any apps that do a recompile, and have memory
> > > > > > > > > allocation for rings return 128B aligned addresses to work with
> > > > > > > > > both 64B aligned and 128B aligned ring structures.    
> > > > > > > >
> > > > > > > > Ah, I see - you are talking just about rte_ring_create().
> > > > > > > > BTW, are you sure that right now it allocates rings 128B aligned?
> > > > > > > > As I can see it does just:
> > > > > > > > mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
> > > > > > > > which means cache line alignment.
> > > > > > > >    
> > > > > > > It doesn't currently allocate with that alignment, which is something
> > > > > > > we need to fix - and what this patch was originally submitted to do.
> > > > > > > So I think this patch should be applied, along with a further patch to
> > > > > > > reduce the alignment going forward to avoid any other problems.    
> > > > > >
> > > > > > But if we going to reduce alignment anyway (patch #2) why do we need patch
> > > > > > #1 at all?    
> > > > > 
> > > > > Because any app compiled against 17.05 will use the old alignment value. Therefore patch 1 should be applied to 17.08 for backward
> > > > > compatibility, and backported to 17.05.1.    
> > > > 
> > > > Why then just no backport patch #2 to 17.05.1?
> > > >     
> > > Maybe so. I'm just a little wary about backporting changes like that to
> > > an older release, even though I'm not aware of any specific issues it
> > > might cause.  
> > 
> > 
> > If we want to fully respect the API/ABI deprecation process, we should
> > have patch #1 in 17.05 and 17.08, a deprecation notice in 17.08, and patch
> > #2 starting from 17.11.
> > 
> > More pragmatically, it's quite difficult to foresee really big problems
> > due to the changes in patch #2. One I can see is:
> > 
> > - rte_ring.so: the dpdk ring library
> > - another_ring.so: a library based on dpdk ring. The struct another_ring
> >   is like the struct foo in my previous example.
> > - application: uses another_ring structure
> > 
> > After we apply patch #2 on dpdk, and recompile the another_ring library,
> > its ABI will change.
> > 
> > 
> > So I suggest to follow the deprecation process for that issue.
> >   
> While this theoretically can occur, I consider it fairly unlikely, so my
> preference is to have patch #1 in 17.05 and .08, as you suggest, 
> but put patch #2 into 17.08 as well.

Ok, let's move forward. I'll ack Daniel's patch + CC stable.

Then I'll submit Konstantin's proposal on the ML.


Olivier
  
Olivier Matz June 30, 2017, 11:36 a.m. UTC | #38
On Fri,  2 Jun 2017 13:12:13 -0700, Daniel Verkamp <daniel.verkamp@intel.com> wrote:
> rte_memzone_reserve() provides cache line alignment, but
> struct rte_ring may require more than cache line alignment: on x86-64,
> it needs 128-byte alignment due to PROD_ALIGN and CONS_ALIGN, which are
> 128 bytes, but cache line size is 64 bytes.
> 
> Fixes runtime warnings with UBSan enabled.
> 
> Fixes: d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> 
> Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>
  
Thomas Monjalon July 1, 2017, 11:14 a.m. UTC | #39
30/06/2017 13:36, Olivier Matz:
> On Fri,  2 Jun 2017 13:12:13 -0700, Daniel Verkamp <daniel.verkamp@intel.com> wrote:
> > rte_memzone_reserve() provides cache line alignment, but
> > struct rte_ring may require more than cache line alignment: on x86-64,
> > it needs 128-byte alignment due to PROD_ALIGN and CONS_ALIGN, which are
> > 128 bytes, but cache line size is 64 bytes.
> > 
> > Fixes runtime warnings with UBSan enabled.
> > 
> > Fixes: d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> > 
> > Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Applied, thanks
  
Thomas Monjalon July 1, 2017, 11:25 a.m. UTC | #40
01/07/2017 13:14, Thomas Monjalon:
> 30/06/2017 13:36, Olivier Matz:
> > On Fri,  2 Jun 2017 13:12:13 -0700, Daniel Verkamp <daniel.verkamp@intel.com> wrote:
> > > rte_memzone_reserve() provides cache line alignment, but
> > > struct rte_ring may require more than cache line alignment: on x86-64,
> > > it needs 128-byte alignment due to PROD_ALIGN and CONS_ALIGN, which are
> > > 128 bytes, but cache line size is 64 bytes.
> > > 
> > > Fixes runtime warnings with UBSan enabled.
> > > 
> > > Fixes: d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> > > 
> > > Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
> > 
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> 
> Applied, thanks

and Cc stable@dpdk.org ;)
  

Patch

diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 5f98c33..6f58faf 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -189,7 +189,8 @@  rte_ring_create(const char *name, unsigned count, int socket_id,
 	/* reserve a memory zone for this ring. If we can't get rte_config or
 	 * we are secondary process, the memzone_reserve function will set
 	 * rte_errno for us appropriately - hence no check in this this function */
-	mz = rte_memzone_reserve(mz_name, ring_size, socket_id, mz_flags);
+	mz = rte_memzone_reserve_aligned(mz_name, ring_size, socket_id,
+					 mz_flags, __alignof__(*r));
 	if (mz != NULL) {
 		r = mz->addr;
 		/* no need to check return value here, we already checked the