app/testpmd: fix IP checksum calculation

Message ID 20201203135954.1127-1-prekageo@amazon.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series app/testpmd: fix IP checksum calculation |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/travis-robot success Travis build: passed
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS

Commit Message

George Prekas Dec. 3, 2020, 1:59 p.m. UTC
  Insert a compiler barrier to make sure that the IP checksum calculation
happens after setting all the fields of the IP header.

Signed-off-by: George Prekas <prekageo@amazon.com>
---
 app/test-pmd/flowgen.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Stephen Hemminger Dec. 3, 2020, 4:08 p.m. UTC | #1
On Thu, 3 Dec 2020 07:59:54 -0600
George Prekas <prekageo@amazon.com> wrote:

> Insert a compiler barrier to make sure that the IP checksum calculation
> happens after setting all the fields of the IP header.
> 
> Signed-off-by: George Prekas <prekageo@amazon.com>

I don't think this is necessary. All other OS's don't have to do this.
The CPU is going to maintain proper memory order.
  
George Prekas Dec. 3, 2020, 4:35 p.m. UTC | #2
On 12/3/2020 10:08 AM, Stephen Hemminger wrote:
> On Thu, 3 Dec 2020 07:59:54 -0600
> George Prekas <prekageo@amazon.com> wrote:
>
>> Insert a compiler barrier to make sure that the IP checksum calculation
>> happens after setting all the fields of the IP header.
>>
>> Signed-off-by: George Prekas <prekageo@amazon.com>
> I don't think this is necessary. All other OS's don't have to do this.
> The CPU is going to maintain proper memory order.

Hi Stephen,

This is not a CPU or OS issue. This is a compiler issue. The compiler is 
free to reorder statements if it does not detect dependencies between 
them. Without this compiler barrier, the calculated IP checksum is wrong.
  
Stephen Hemminger Dec. 3, 2020, 6:33 p.m. UTC | #3
On Thu, 3 Dec 2020 10:35:50 -0600
George Prekas <prekageo@amazon.com> wrote:

> On 12/3/2020 10:08 AM, Stephen Hemminger wrote:
> > On Thu, 3 Dec 2020 07:59:54 -0600
> > George Prekas <prekageo@amazon.com> wrote:
> >  
> >> Insert a compiler barrier to make sure that the IP checksum calculation
> >> happens after setting all the fields of the IP header.
> >>
> >> Signed-off-by: George Prekas <prekageo@amazon.com>  
> > I don't think this is necessary. All other OS's don't have to do this.
> > The CPU is going to maintain proper memory order.  
> 
> Hi Stephen,
> 
> This is not a CPU or OS issue. This is a compiler issue. The compiler is 
> free to reorder statements if it does not detect dependencies between 
> them. Without this compiler barrier, the calculated IP checksum is wrong.
> 
> 

But the compiler should be detecting any aliasing here, if no you have
a bad compiler.
  
Ferruh Yigit Dec. 4, 2020, 8:59 a.m. UTC | #4
On 12/3/2020 1:59 PM, George Prekas wrote:
> Insert a compiler barrier to make sure that the IP checksum calculation
> happens after setting all the fields of the IP header.
> 

Can you please provide the compiler details, and if there is any specific 
instruction on how to reproduce this failure?

> Signed-off-by: George Prekas <prekageo@amazon.com>
> ---
>   app/test-pmd/flowgen.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
> index acf3e2460..893b4b0b8 100644
> --- a/app/test-pmd/flowgen.c
> +++ b/app/test-pmd/flowgen.c
> @@ -150,6 +150,7 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
>   							   next_flow);
>   		ip_hdr->total_length	= RTE_CPU_TO_BE_16(pkt_size -
>   							   sizeof(*eth_hdr));
> +		rte_compiler_barrier();
>   		ip_hdr->hdr_checksum	= ip_sum((unaligned_uint16_t *)ip_hdr,
>   						 sizeof(*ip_hdr));
>   
>
  
George Prekas Dec. 5, 2020, 5:47 a.m. UTC | #5
On 12/4/2020 2:59 AM, Ferruh Yigit wrote:
> CAUTION: This email originated from outside of the organization. Do 
> not click links or open attachments unless you can confirm the sender 
> and know the content is safe.
>
>
>
> On 12/3/2020 1:59 PM, George Prekas wrote:
>> Insert a compiler barrier to make sure that the IP checksum calculation
>> happens after setting all the fields of the IP header.
>>
>
> Can you please provide the compiler details, and if there is any specific
> instruction on how to reproduce this failure?

This happens with GCC 9 and GCC 10. It works fine on GCC 8.

Stephen was right that a compiler barrier here is not the right 
solution. After spending some time on it, I realized that it is an 
aliasing problem when casting the IP header to uint16_t*. As far as I 
understand, this is not allowed by the C standard. As far as I know, 
there are 3 ways to fix this problem: Use a union, use memcpy, or set 
the compiler flag -fno-strict-aliasing. I assume that the last option is 
the least intrusive. I've submitted a second version of the patch with it.

Let me know of your opinion.

>
>> Signed-off-by: George Prekas <prekageo@amazon.com>
>> ---
>>   app/test-pmd/flowgen.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
>> index acf3e2460..893b4b0b8 100644
>> --- a/app/test-pmd/flowgen.c
>> +++ b/app/test-pmd/flowgen.c
>> @@ -150,6 +150,7 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
>> next_flow);
>>               ip_hdr->total_length    = RTE_CPU_TO_BE_16(pkt_size -
>> sizeof(*eth_hdr));
>> +             rte_compiler_barrier();
>>               ip_hdr->hdr_checksum    = ip_sum((unaligned_uint16_t 
>> *)ip_hdr,
>>                                                sizeof(*ip_hdr));
>>
>>
>
  

Patch

diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index acf3e2460..893b4b0b8 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -150,6 +150,7 @@  pkt_burst_flow_gen(struct fwd_stream *fs)
 							   next_flow);
 		ip_hdr->total_length	= RTE_CPU_TO_BE_16(pkt_size -
 							   sizeof(*eth_hdr));
+		rte_compiler_barrier();
 		ip_hdr->hdr_checksum	= ip_sum((unaligned_uint16_t *)ip_hdr,
 						 sizeof(*ip_hdr));