mbuf: align rte_mbuf for Windows

Message ID 20200519184111.4504-1-talshn@mellanox.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Series mbuf: align rte_mbuf for Windows

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/Intel-compilation         fail     Compilation issues
ci/travis-robot              success  Travis build: passed
ci/iol-intel-Performance     success  Performance Testing PASS
ci/iol-nxp-Performance       success  Performance Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS

Commit Message

Tal Shnaiderman May 19, 2020, 6:41 p.m. UTC
  From: Tal Shnaiderman <talshn@mellanox.com>

Using uint32_t bit-fields on Windows pads the
'L2/L3/L4 and tunnel information' union with additional bits.

This padding misaligns rte_mbuf, and its total size
increases to 3 cache lines.

Changed the packet_type bit-field types from uint32_t to uint8_t
so that the structure is 2 cache lines on all platforms.

Added the __extension__ keyword before the modified struct to avoid
the warning:

type of bit-field ... is a GCC extension [-pedantic]

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
---
 lib/librte_mbuf/rte_mbuf_core.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
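
As a minimal sketch of the claim above (not part of the patch), the patched union can be copied into a stand-alone file and its size checked. The name `ptype_u8` and the helper functions are hypothetical, and the sketch assumes a compiler that accepts uint8_t bit-fields and C11 anonymous members, as GCC and Clang do. It does not reproduce the padding seen with uint32_t bit-fields on the Windows toolchains; it only checks that the uint8_t form overlays the 32-bit packet_type word exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone copy of the patched packet_type union. */
union ptype_u8 {
	uint32_t packet_type;
	__extension__
	struct {
		uint8_t l2_type:4;
		uint8_t l3_type:4;
		uint8_t l4_type:4;
		uint8_t tun_type:4;
		union {
			uint8_t inner_esp_next_proto;
			__extension__
			struct {
				uint8_t inner_l2_type:4;
				uint8_t inner_l3_type:4;
			};
		};
		uint8_t inner_l4_type:4;
	};
};

/* Size of the whole union in bytes. */
size_t ptype_u8_size(void)
{
	return sizeof(union ptype_u8);
}

/* Returns 1 if the bit-field view adds no bytes beyond the 32-bit word. */
int ptype_u8_fits_word(void)
{
	return ptype_u8_size() == sizeof(uint32_t);
}

/* Aborts at run time if the bit-field view is larger than packet_type. */
void ptype_u8_check(void)
{
	assert(ptype_u8_fits_word());
}
```

Calling ptype_u8_check() from a test program built with each target toolchain is one way to catch a layout regression early.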
  

Comments

Thomas Monjalon May 19, 2020, 6:49 p.m. UTC | #1
+Cc more maintainers

19/05/2020 20:41, talshn@mellanox.com:
> From: Tal Shnaiderman <talshn@mellanox.com>
> 
> Using uint32_t type bit-fields in Windows will pads the
> 'L2/L3/L4 and tunnel information' union with additional bits.
> 
> This padding causes rte_mbuf size misalignment and the total size
> increases to 3 cache-lines.
> 
> Changed packet_type bit-fields types from uint32_t to uint8_t
> to allow unified 2 cache-line structure size.
> 
> Added the __extension__ attribute over the modified struct to avoid
> the warning:
> 
> type of bit-field ... is a GCC extension [-pedantic]
> 
> Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
> ---
>  lib/librte_mbuf/rte_mbuf_core.h | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index b9a59c879..82441555e 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -521,11 +521,12 @@ struct rte_mbuf {
>  	RTE_STD_C11
>  	union {
>  		uint32_t packet_type; /**< L2/L3/L4 and tunnel information. */
> +		__extension__
>  		struct {
> -			uint32_t l2_type:4; /**< (Outer) L2 type. */
> -			uint32_t l3_type:4; /**< (Outer) L3 type. */
> -			uint32_t l4_type:4; /**< (Outer) L4 type. */
> -			uint32_t tun_type:4; /**< Tunnel type. */
> +			uint8_t l2_type:4; /**< (Outer) L2 type. */
> +			uint8_t l3_type:4; /**< (Outer) L3 type. */
> +			uint8_t l4_type:4; /**< (Outer) L4 type. */
> +			uint8_t tun_type:4; /**< Tunnel type. */
>  			RTE_STD_C11
>  			union {
>  				uint8_t inner_esp_next_proto;
> @@ -541,7 +542,7 @@ struct rte_mbuf {
>  					/**< Inner L3 type. */
>  				};
>  			};
> -			uint32_t inner_l4_type:4; /**< Inner L4 type. */
> +			uint8_t inner_l4_type:4; /**< Inner L4 type. */
>  		};
>  	};
  
Dmitry Kozlyuk May 19, 2020, 7:57 p.m. UTC | #2
On Tue, 19 May 2020 20:49:50 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:

> +Cc more maintainers
> 
> 19/05/2020 20:41, talshn@mellanox.com:
> > From: Tal Shnaiderman <talshn@mellanox.com>
> > 
> > Using uint32_t type bit-fields in Windows will pads the
> > 'L2/L3/L4 and tunnel information' union with additional bits.
> > 
> > This padding causes rte_mbuf size misalignment and the total size
> > increases to 3 cache-lines.
> > 
> > Changed packet_type bit-fields types from uint32_t to uint8_t
> > to allow unified 2 cache-line structure size.
> > 
> > Added the __extension__ attribute over the modified struct to avoid
> > the warning:
> > 
> > type of bit-field ... is a GCC extension [-pedantic]
> > 
> > Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf_core.h | 11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/librte_mbuf/rte_mbuf_core.h
> > b/lib/librte_mbuf/rte_mbuf_core.h index b9a59c879..82441555e 100644
> > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > @@ -521,11 +521,12 @@ struct rte_mbuf {
> >  	RTE_STD_C11
> >  	union {
> >  		uint32_t packet_type; /**< L2/L3/L4 and tunnel
> > information. */
> > +		__extension__
> >  		struct {
> > -			uint32_t l2_type:4; /**< (Outer) L2 type.
> > */
> > -			uint32_t l3_type:4; /**< (Outer) L3 type.
> > */
> > -			uint32_t l4_type:4; /**< (Outer) L4 type.
> > */
> > -			uint32_t tun_type:4; /**< Tunnel type. */
> > +			uint8_t l2_type:4; /**< (Outer) L2 type. */
> > +			uint8_t l3_type:4; /**< (Outer) L3 type. */
> > +			uint8_t l4_type:4; /**< (Outer) L4 type. */
> > +			uint8_t tun_type:4; /**< Tunnel type. */
> >  			RTE_STD_C11
> >  			union {
> >  				uint8_t inner_esp_next_proto;
> > @@ -541,7 +542,7 @@ struct rte_mbuf {
> >  					/**< Inner L3 type. */
> >  				};
> >  			};
> > -			uint32_t inner_l4_type:4; /**< Inner L4
> > type. */
> > +			uint8_t inner_l4_type:4; /**< Inner L4
> > type. */ };
> >  	};  
> 
> 
> 

Such a clean and simple solution to what seemed to require a compiler
workaround or fix! All offsets are equal on Windows and Linux for the
following toolchains (x86_64):

* cross-compilation with MinGW-w64 6.0.0 GCC 9.3.0
* Windows native MinGW-w64 6.0.0 GCC 8.1.0 and Clang 9.0.1

Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

--
Dmitry Kozlyuk
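
An offset comparison like the one reported above could be scripted roughly as follows. This is a sketch under stated assumptions, not the test that was actually run: `struct mini_mbuf` is a hypothetical reduced struct with one stand-in member on each side of the union (`before` and `after` are invented names), so the numbers it yields are not the real rte_mbuf offsets.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reduced struct: one neighbour on each side of the union. */
struct mini_mbuf {
	uint64_t before; /* stand-in for the member preceding the union */
	union {
		uint32_t packet_type;
		__extension__
		struct {
			uint8_t l2_type:4;
			uint8_t l3_type:4;
			uint8_t l4_type:4;
			uint8_t tun_type:4;
			uint8_t inner_esp_next_proto;
			uint8_t inner_l4_type:4;
		};
	};
	uint32_t after; /* stand-in for the member following the union */
};

/* Offset of the 32-bit view of the union. */
size_t off_packet_type(void)
{
	return offsetof(struct mini_mbuf, packet_type);
}

/* Offset of the first member after the union. */
size_t off_after(void)
{
	return offsetof(struct mini_mbuf, after);
}

/* Total size of the reduced struct. */
size_t mini_mbuf_size(void)
{
	return sizeof(struct mini_mbuf);
}

/* Print the values so that outputs from different toolchains can be diffed. */
void dump_offsets(void)
{
	printf("packet_type=%zu after=%zu size=%zu\n",
	       off_packet_type(), off_after(), mini_mbuf_size());
}
```

If the union grew because of bit-field padding, `off_after()` and `mini_mbuf_size()` would be the values that change between toolchains.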
  
Thomas Monjalon May 19, 2020, 8:18 p.m. UTC | #3
19/05/2020 21:57, Dmitry Kozlyuk:
> On Tue, 19 May 2020 20:49:50 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > +Cc more maintainers
> > 
> > 19/05/2020 20:41, talshn@mellanox.com:
> > > From: Tal Shnaiderman <talshn@mellanox.com>
> > > 
> > > Using uint32_t type bit-fields in Windows will pads the
> > > 'L2/L3/L4 and tunnel information' union with additional bits.
> > > 
> > > This padding causes rte_mbuf size misalignment and the total size
> > > increases to 3 cache-lines.
> > > 
> > > Changed packet_type bit-fields types from uint32_t to uint8_t
> > > to allow unified 2 cache-line structure size.
> > > 
> > > Added the __extension__ attribute over the modified struct to avoid
> > > the warning:
> > > 
> > > type of bit-field ... is a GCC extension [-pedantic]
> > > 
> > > Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
> > > ---
> > >  lib/librte_mbuf/rte_mbuf_core.h | 11 ++++++-----
> > >  1 file changed, 6 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h
> > > b/lib/librte_mbuf/rte_mbuf_core.h index b9a59c879..82441555e 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > @@ -521,11 +521,12 @@ struct rte_mbuf {
> > >  	RTE_STD_C11
> > >  	union {
> > >  		uint32_t packet_type; /**< L2/L3/L4 and tunnel
> > > information. */
> > > +		__extension__
> > >  		struct {
> > > -			uint32_t l2_type:4; /**< (Outer) L2 type.
> > > */
> > > -			uint32_t l3_type:4; /**< (Outer) L3 type.
> > > */
> > > -			uint32_t l4_type:4; /**< (Outer) L4 type.
> > > */
> > > -			uint32_t tun_type:4; /**< Tunnel type. */
> > > +			uint8_t l2_type:4; /**< (Outer) L2 type. */
> > > +			uint8_t l3_type:4; /**< (Outer) L3 type. */
> > > +			uint8_t l4_type:4; /**< (Outer) L4 type. */
> > > +			uint8_t tun_type:4; /**< Tunnel type. */
> > >  			RTE_STD_C11
> > >  			union {
> > >  				uint8_t inner_esp_next_proto;
> > > @@ -541,7 +542,7 @@ struct rte_mbuf {
> > >  					/**< Inner L3 type. */
> > >  				};
> > >  			};
> > > -			uint32_t inner_l4_type:4; /**< Inner L4
> > > type. */
> > > +			uint8_t inner_l4_type:4; /**< Inner L4
> > > type. */ };
> > >  	};  
> > 
> > 
> > 
> 
> Such a clean and simple solution to what seemed to require compiler
> workaround or fix! All offsets are equal on Windows and Linux for the
> following toolchains, x86_64:
> 
> * cross-compilation with MinGW-w64 6.0.0 GCC 9.3.0
> * Windows native MinGW-w64 6.0.0 GCC 8.1.0 and Clang 9.0.1
> 
> Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

Would be interesting to see an offset comparison in little and big endian.
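
One way to make such a comparison, sketched here with hypothetical names (`ptype_u8`, `ptype_byte`, `l2_in_low_nibble`), is to set two of the 4-bit fields and look at the raw bytes of the union. The order of bit-fields within a storage unit is implementation-defined, so the sketch only reports what the compiler in use does; it makes no claim about any particular big-endian target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-alone copy of the patched packet_type union. */
union ptype_u8 {
	uint32_t packet_type;
	__extension__
	struct {
		uint8_t l2_type:4;
		uint8_t l3_type:4;
		uint8_t l4_type:4;
		uint8_t tun_type:4;
		uint8_t inner_esp_next_proto;
		uint8_t inner_l4_type:4;
	};
};

/*
 * Set only l2_type and l3_type, then return byte idx (0..3) of the
 * union's storage. All other bits start out as zero.
 */
uint8_t ptype_byte(uint8_t l2, uint8_t l3, unsigned int idx)
{
	union ptype_u8 p;
	uint8_t raw[sizeof(p)];

	p.packet_type = 0;
	p.l2_type = l2 & 0xf;
	p.l3_type = l3 & 0xf;
	memcpy(raw, &p, sizeof(p));
	return raw[idx % sizeof(p)];
}

/* Returns 1 if the first-declared field lands in the low nibble of byte 0. */
int l2_in_low_nibble(void)
{
	return ptype_byte(0xA, 0x0, 0) == 0x0A;
}
```

Running `l2_in_low_nibble()` on each target shows whether `l2_type` and `l3_type` swap nibbles there, which is the part of the layout that the uint8_t change cannot make portable by itself.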
  
Menon, Ranjit May 19, 2020, 10:15 p.m. UTC | #4
On 5/19/2020 1:18 PM, Thomas Monjalon wrote:
> 19/05/2020 21:57, Dmitry Kozlyuk:
>> On Tue, 19 May 2020 20:49:50 +0200
>> Thomas Monjalon <thomas@monjalon.net> wrote:
>>
>>> +Cc more maintainers
>>>
>>> 19/05/2020 20:41, talshn@mellanox.com:
>>>> From: Tal Shnaiderman <talshn@mellanox.com>
>>>>
>>>> Using uint32_t type bit-fields in Windows will pads the
>>>> 'L2/L3/L4 and tunnel information' union with additional bits.
>>>>
>>>> This padding causes rte_mbuf size misalignment and the total size
>>>> increases to 3 cache-lines.
>>>>
>>>> Changed packet_type bit-fields types from uint32_t to uint8_t
>>>> to allow unified 2 cache-line structure size.
>>>>
>>>> Added the __extension__ attribute over the modified struct to avoid
>>>> the warning:
>>>>
>>>> type of bit-field ... is a GCC extension [-pedantic]
>>>>
>>>> Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
>>>> ---
>>>>   lib/librte_mbuf/rte_mbuf_core.h | 11 ++++++-----
>>>>   1 file changed, 6 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/lib/librte_mbuf/rte_mbuf_core.h
>>>> b/lib/librte_mbuf/rte_mbuf_core.h index b9a59c879..82441555e 100644
>>>> --- a/lib/librte_mbuf/rte_mbuf_core.h
>>>> +++ b/lib/librte_mbuf/rte_mbuf_core.h
>>>> @@ -521,11 +521,12 @@ struct rte_mbuf {
>>>>   	RTE_STD_C11
>>>>   	union {
>>>>   		uint32_t packet_type; /**< L2/L3/L4 and tunnel
>>>> information. */
>>>> +		__extension__
>>>>   		struct {
>>>> -			uint32_t l2_type:4; /**< (Outer) L2 type.
>>>> */
>>>> -			uint32_t l3_type:4; /**< (Outer) L3 type.
>>>> */
>>>> -			uint32_t l4_type:4; /**< (Outer) L4 type.
>>>> */
>>>> -			uint32_t tun_type:4; /**< Tunnel type. */
>>>> +			uint8_t l2_type:4; /**< (Outer) L2 type. */
>>>> +			uint8_t l3_type:4; /**< (Outer) L3 type. */
>>>> +			uint8_t l4_type:4; /**< (Outer) L4 type. */
>>>> +			uint8_t tun_type:4; /**< Tunnel type. */
>>>>   			RTE_STD_C11
>>>>   			union {
>>>>   				uint8_t inner_esp_next_proto;
>>>> @@ -541,7 +542,7 @@ struct rte_mbuf {
>>>>   					/**< Inner L3 type. */
>>>>   				};
>>>>   			};
>>>> -			uint32_t inner_l4_type:4; /**< Inner L4
>>>> type. */
>>>> +			uint8_t inner_l4_type:4; /**< Inner L4
>>>> type. */ };
>>>>   	};
>>>
>>>
>> Such a clean and simple solution to what seemed to require compiler
>> workaround or fix! All offsets are equal on Windows and Linux for the
>> following toolchains, x86_64:
>>
>> * cross-compilation with MinGW-w64 6.0.0 GCC 9.3.0
>> * Windows native MinGW-w64 6.0.0 GCC 8.1.0 and Clang 9.0.1
>>
>> Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> Would be interesting to see an offset comparison in little and big endian.
>
>
BTW, this is the exact fix we used for the alignment issue in the
Windows draft repo.

It completely slipped my mind when we were discussing this during the
community call.

See this code section in lib/librte_mbuf/rte_mbuf.h in the draft repo:

uint32_t packet_type; /**< L2/L3/L4 and tunnel information. */
struct {
#ifndef _WIN64
	uint32_t l2_type:4; /**< (Outer) L2 type. */
	uint32_t l3_type:4; /**< (Outer) L3 type. */
	uint32_t l4_type:4; /**< (Outer) L4 type. */
	uint32_t tun_type:4; /**< Tunnel type. */
#else
	uint8_t l2_type:4; /**< (Outer) L2 type. */
	uint8_t l3_type:4; /**< (Outer) L3 type. */
	uint8_t l4_type:4; /**< (Outer) L4 type. */
	uint8_t tun_type:4; /**< Tunnel type. */
#endif
	RTE_STD_C11
	union {
		uint8_t inner_esp_next_proto;
		/**< ESP next protocol type, valid if
		 * RTE_PTYPE_TUNNEL_ESP tunnel type is set
		 * on both Tx and Rx.
		 */
		__extension__
		struct {
			uint8_t inner_l2_type:4;
			/**< Inner L2 type. */
			uint8_t inner_l3_type:4;
			/**< Inner L3 type. */
		};
	};
#ifndef _WIN64
	uint32_t inner_l4_type:4; /**< Inner L4 type. */
#else
	uint8_t inner_l4_type:4; /**< Inner L4 type. */
#endif
};

We didn't have the bandwidth to test this on other OSes, so we used
#ifdef, but we know this allowed us to be as performant as Linux (using
L3Fwd).

Therefore:

Acked-by: Ranjit Menon <ranjit.menon@intel.com>
  
Olivier Matz June 11, 2020, 11:43 a.m. UTC | #5
On Tue, May 19, 2020 at 03:15:19PM -0700, Ranjit Menon wrote:
> On 5/19/2020 1:18 PM, Thomas Monjalon wrote:
> > 19/05/2020 21:57, Dmitry Kozlyuk:
> > > On Tue, 19 May 2020 20:49:50 +0200
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 
> > > > +Cc more maintainers
> > > > 
> > > > 19/05/2020 20:41, talshn@mellanox.com:
> > > > > From: Tal Shnaiderman <talshn@mellanox.com>
> > > > > 
> > > > > Using uint32_t type bit-fields in Windows will pads the
> > > > > 'L2/L3/L4 and tunnel information' union with additional bits.
> > > > > 
> > > > > This padding causes rte_mbuf size misalignment and the total size
> > > > > increases to 3 cache-lines.
> > > > > 
> > > > > Changed packet_type bit-fields types from uint32_t to uint8_t
> > > > > to allow unified 2 cache-line structure size.
> > > > > 
> > > > > Added the __extension__ attribute over the modified struct to avoid
> > > > > the warning:
> > > > > 
> > > > > type of bit-field ... is a GCC extension [-pedantic]
> > > > > 
> > > > > Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>
  
Thomas Monjalon June 11, 2020, 2:29 p.m. UTC | #6
11/06/2020 13:43, Olivier Matz:
> On Tue, May 19, 2020 at 03:15:19PM -0700, Ranjit Menon wrote:
> > On 5/19/2020 1:18 PM, Thomas Monjalon wrote:
> > > 19/05/2020 21:57, Dmitry Kozlyuk:
> > > > On Tue, 19 May 2020 20:49:50 +0200
> > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 
> > > > > +Cc more maintainers
> > > > > 
> > > > > 19/05/2020 20:41, talshn@mellanox.com:
> > > > > > From: Tal Shnaiderman <talshn@mellanox.com>
> > > > > > 
> > > > > > Using uint32_t type bit-fields in Windows will pads the
> > > > > > 'L2/L3/L4 and tunnel information' union with additional bits.
> > > > > > 
> > > > > > This padding causes rte_mbuf size misalignment and the total size
> > > > > > increases to 3 cache-lines.
> > > > > > 
> > > > > > Changed packet_type bit-fields types from uint32_t to uint8_t
> > > > > > to allow unified 2 cache-line structure size.
> > > > > > 
> > > > > > Added the __extension__ attribute over the modified struct to avoid
> > > > > > the warning:
> > > > > > 
> > > > > > type of bit-field ... is a GCC extension [-pedantic]
> > > > > > 
> > > > > > Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Applied, thanks
  

Patch

diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index b9a59c879..82441555e 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -521,11 +521,12 @@  struct rte_mbuf {
 	RTE_STD_C11
 	union {
 		uint32_t packet_type; /**< L2/L3/L4 and tunnel information. */
+		__extension__
 		struct {
-			uint32_t l2_type:4; /**< (Outer) L2 type. */
-			uint32_t l3_type:4; /**< (Outer) L3 type. */
-			uint32_t l4_type:4; /**< (Outer) L4 type. */
-			uint32_t tun_type:4; /**< Tunnel type. */
+			uint8_t l2_type:4; /**< (Outer) L2 type. */
+			uint8_t l3_type:4; /**< (Outer) L3 type. */
+			uint8_t l4_type:4; /**< (Outer) L4 type. */
+			uint8_t tun_type:4; /**< Tunnel type. */
 			RTE_STD_C11
 			union {
 				uint8_t inner_esp_next_proto;
@@ -541,7 +542,7 @@  struct rte_mbuf {
 					/**< Inner L3 type. */
 				};
 			};
-			uint32_t inner_l4_type:4; /**< Inner L4 type. */
+			uint8_t inner_l4_type:4; /**< Inner L4 type. */
 		};
 	};