[v2,1/1] net: fix aliasing issue in checksum computation

Message ID 20211017203718.801998-2-mail@gms.tf (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit
Series net: fix aliasing issue in checksum computation

Checks

Context                          Check    Description
ci/checkpatch                    warning  coding style issues
ci/github-robot: build           success  github build: passed
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS

Commit Message

Georg Sauthoff Oct. 17, 2021, 8:37 p.m. UTC
This removes a superfluous cast and eliminates aliasing through a uint8_t
pointer. NB: The C standard specifies that an unsigned char pointer may
alias any object, while it includes no such requirement for uint8_t
pointers.

Also simplify the loop, since a modern C compiler can speed it up (i.e.
auto-vectorize it) in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.

Signed-off-by: Georg Sauthoff <mail@gms.tf>
---
 lib/net/rte_ip.h | 27 ++++++++-------------------
 1 file changed, 8 insertions(+), 19 deletions(-)
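
To illustrate the NB in the commit message, here is a minimal sketch (not
part of the patch) of copying one trailing byte into a zeroed 16-bit word
through an unsigned char pointer. The helper name load_trailing_byte and
the assert() are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: store a single byte into
 * the first (lowest-address) byte of a zeroed uint16_t.
 *
 * The access goes through unsigned char, which the C standard allows to
 * alias any object. uint8_t is usually a typedef of unsigned char, but
 * the standard does not require that, so the same guarantee is not
 * spelled out for it.
 */
static inline uint16_t
load_trailing_byte(const void *p)
{
	uint16_t left = 0;

	assert(p != NULL); /* sanity check for this sketch only */
	*(unsigned char *)&left = *(const unsigned char *)p;
	return left;
}
```

Because the byte is written at the lowest address of the word rather than
shifted into place, the in-memory layout is the same on little-endian and
big-endian hosts.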
  

Comments

Morten Brørup Oct. 18, 2021, 7:35 a.m. UTC | #1
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
> Sent: Sunday, 17 October 2021 22.37

+Ferruh, as delegate to v1 in Patchwork.

> 
> That means a superfluous cast is removed and aliasing through a uint8_t
> pointer is eliminated. NB: The C standard specifies that a unsigned char
> pointer may alias while the C standard doesn't include such requirement
> for uint8_t pointers.
> 
> Also simplified the loop since a modern C compiler can speed up (i.e.
> auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
> for Haswell using AVX registers while halving the number of instructions
> in the generated code.
> 
> Signed-off-by: Georg Sauthoff <mail@gms.tf>
> ---
>  lib/net/rte_ip.h | 27 ++++++++-------------------
>  1 file changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index 05948b69b7..1b8c6519a9 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
>  static inline uint32_t
>  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>  {
> -	/* workaround gcc strict-aliasing warning */
> -	uintptr_t ptr = (uintptr_t)buf;
> +	/* extend strict-aliasing rules */
>  	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> -	const u16_p *u16_buf = (const u16_p *)ptr;
> -
> -	while (len >= (sizeof(*u16_buf) * 4)) {
> -		sum += u16_buf[0];
> -		sum += u16_buf[1];
> -		sum += u16_buf[2];
> -		sum += u16_buf[3];
> -		len -= sizeof(*u16_buf) * 4;
> -		u16_buf += 4;
> -	}
> -	while (len >= sizeof(*u16_buf)) {
> +	const u16_p *u16_buf = (const u16_p *)buf;
> +	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> +
> +	for (; u16_buf != end; ++u16_buf)
>  		sum += *u16_buf;
> -		len -= sizeof(*u16_buf);
> -		u16_buf += 1;
> -	}
> 
> -	/* if length is in odd bytes */
> -	if (len == 1) {
> +	/* if length is odd, keeping it byte order independent */
> +	if (unlikely(len % 2)) {
>  		uint16_t left = 0;
> -		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
> +		*(unsigned char*)&left = *(const unsigned char *)end;
>  		sum += left;
>  	}
> 
> --
> 2.31.1
> 

Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)

After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.

Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
  
Olivier Matz Oct. 18, 2021, 7:58 a.m. UTC | #2
On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
> > Sent: Sunday, 17 October 2021 22.37
> 
> +Ferruh, as delegate to v1 in Patchwork.
> 
> > 
> 
> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
> 
> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.

+1, thanks for the good documentation

> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>
  
Ferruh Yigit Oct. 18, 2021, 4:14 p.m. UTC | #3
On 10/18/2021 8:58 AM, Olivier Matz wrote:
> On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote:
>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
>>> Sent: Sunday, 17 October 2021 22.37
>>
>> +Ferruh, as delegate to v1 in Patchwork.
>>
>>>
>>
>> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
>>
>> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.
> 
> +1, thanks for the good documentation
> 
>> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> 

Updated patch title as: "net: fix aliasing in checksum computation"

Added fixes tags:
     Fixes: 6006818cfb26 ("net: new checksum functions")
     Fixes: e079655c41fb ("net: fix build with gcc 4.4.7 and strict aliasing")
     Cc: stable@dpdk.org

Following warning fixed in next-net:
   ERROR:POINTER_LOCATION: "(foo*)" should be "(foo *)"
   #65: FILE: lib/net/rte_ip.h:168:
   +               *(unsigned char*)&left = *(const unsigned char *)end;


Applied to dpdk-next-net/main, thanks.
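
For anyone who wants to compile the simplified loop outside DPDK and look
at the generated code, the following is a standalone sketch modeled on the
patched function. It is not the code as applied: the name raw_cksum_sketch
is hypothetical, the DPDK unlikely() hint is dropped, an assert() is added,
and the pointer cast uses the "(foo *)" spacing from the checkpatch note
above. It assumes GCC or Clang (for the may_alias attribute) and a buffer
that is suitably aligned for uint16_t, or a target that tolerates unaligned
16-bit loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone variant of the patched __rte_raw_cksum().
 * Adds the buffer to 'sum' as a sequence of 16-bit words in host byte
 * order and returns the unfolded 32-bit result.
 */
static inline uint32_t
raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	/* GCC/Clang extension: this uint16_t type may alias other types */
	typedef uint16_t __attribute__((__may_alias__)) u16_p;
	const u16_p *u16_buf = (const u16_p *)buf;
	const u16_p *end = u16_buf + len / sizeof(*u16_buf);

	assert(buf != NULL || len == 0); /* sanity check for this sketch only */

	/* plain loop over whole words; vectorization is left to the compiler */
	for (; u16_buf != end; ++u16_buf)
		sum += *u16_buf;

	/* odd length: the last byte becomes the first byte of a zeroed word */
	if (len % 2) {
		uint16_t left = 0;

		*(unsigned char *)&left = *(const unsigned char *)end;
		sum += left;
	}

	return sum;
}
```

Compiling this with optimization enabled for a Haswell target (for example
gcc -O3 -march=haswell -S) is one way to inspect whether the loop gets
vectorized; the result depends on the compiler version.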
  

Patch

diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
index 05948b69b7..1b8c6519a9 100644
--- a/lib/net/rte_ip.h
+++ b/lib/net/rte_ip.h
@@ -141,29 +141,18 @@  rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
 static inline uint32_t
 __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
 {
-	/* workaround gcc strict-aliasing warning */
-	uintptr_t ptr = (uintptr_t)buf;
+	/* extend strict-aliasing rules */
 	typedef uint16_t __attribute__((__may_alias__)) u16_p;
-	const u16_p *u16_buf = (const u16_p *)ptr;
-
-	while (len >= (sizeof(*u16_buf) * 4)) {
-		sum += u16_buf[0];
-		sum += u16_buf[1];
-		sum += u16_buf[2];
-		sum += u16_buf[3];
-		len -= sizeof(*u16_buf) * 4;
-		u16_buf += 4;
-	}
-	while (len >= sizeof(*u16_buf)) {
+	const u16_p *u16_buf = (const u16_p *)buf;
+	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
+
+	for (; u16_buf != end; ++u16_buf)
 		sum += *u16_buf;
-		len -= sizeof(*u16_buf);
-		u16_buf += 1;
-	}
 
-	/* if length is in odd bytes */
-	if (len == 1) {
+	/* if length is odd, keeping it byte order independent */
+	if (unlikely(len % 2)) {
 		uint16_t left = 0;
-		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
+		*(unsigned char*)&left = *(const unsigned char *)end;
 		sum += left;
 	}