[dpdk-dev] mk: change TLS model for ARMv8 and DPAA machine

Message ID: 1528180425-27937-1-git-send-email-hemant.agrawal@nxp.com (mailing list archive)
State: Superseded, archived
Delegated to: Thomas Monjalon
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Hemant Agrawal June 5, 2018, 6:33 a.m. UTC
  From: Sachin Saxena <sachin.saxena@nxp.com>

Random corruptions were observed on ARM platforms when using
the DPDK library in shared mode with VPP software (as a plugin).

Using the traditional TLS scheme resolved the issue.

Tested with VPP with DPDK as a plugin.

Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
---
 mk/machine/armv8a/rte.vars.mk | 3 +++
 mk/machine/dpaa/rte.vars.mk   | 3 +++
 mk/machine/dpaa2/rte.vars.mk  | 3 +++
 3 files changed, 9 insertions(+)
  

Comments

Jerin Jacob June 10, 2018, 11:07 a.m. UTC | #1
-----Original Message-----
> Date: Tue,  5 Jun 2018 12:03:45 +0530
> From: Hemant Agrawal <hemant.agrawal@nxp.com>
> To: dev@dpdk.org
> CC: Sachin Saxena <sachin.saxena@nxp.com>
> Subject: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA machine
> X-Mailer: git-send-email 2.7.4
> 
> From: Sachin Saxena <sachin.saxena@nxp.com>
> 
> Random corruptions observed on ARM platforms with using
> the dpdk library in shared mode with VPP software (plugin).
> 
> Using traditional TLS scheme resolved the issue.
> 
> Tested with VPP with DPDK as a plugin.
> 
> Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
> ---
>  mk/machine/armv8a/rte.vars.mk | 3 +++
>  mk/machine/dpaa/rte.vars.mk   | 3 +++
>  mk/machine/dpaa2/rte.vars.mk  | 3 +++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
> index 8252efb..6897cd6 100644
> --- a/mk/machine/armv8a/rte.vars.mk
> +++ b/mk/machine/armv8a/rte.vars.mk
> @@ -29,3 +29,6 @@
>  # CPU_ASFLAGS =
>  
>  MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> +
> +# To avoid TLS corruption issue.
> +MACHINE_CFLAGS += -mtls-dialect=trad

This issue is not reproducible on Cavium ARMv8 platforms. Just wondering:
do we need to change the default ARMv8 config?

The GNU (descriptor) dialect for TLS has been the default on aarch64
for a while.

I think it is most likely a glibc issue with your SDK-based toolchain.
Are you able to reproduce this issue with a Linaro toolchain and a standard
OS distribution environment? If so, could you please share more
details?

My only concern is any performance impact of the traditional TLS
dialect compared with the descriptor dialect.

I think we have two options:
1) If you can identify that it is due to a specific glibc version, then we
could detect it at runtime.
2) In the worst case, it can be a conditional compilation option.
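
For option 1, a runtime check could start from the glibc version string. The following is a minimal sketch, assuming glibc's gnu_get_libc_version() is available. glibc_at_least() is a hypothetical helper, and no affected glibc version is known from this thread, so any numbers passed to it would be placeholders.

```c
#include <gnu/libc-version.h>
#include <stdio.h>

/* Hypothetical helper: return 1 if the running glibc is at least
 * want_major.want_minor, 0 if it is older, and -1 if the version
 * string cannot be parsed. */
int glibc_at_least(int want_major, int want_minor)
{
	int major = 0, minor = 0;

	/* gnu_get_libc_version() returns a string such as "2.23". */
	if (sscanf(gnu_get_libc_version(), "%d.%d", &major, &minor) != 2)
		return -1;
	if (major != want_major)
		return major > want_major;
	return minor >= want_minor;
}
```

Note that -mtls-dialect is a compile-time flag, so a check like this could only select between code paths or report a warning; it could not change how an already built library accesses its TLS variables.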

/Jerin
  
Sachin Saxena June 11, 2018, 4:05 a.m. UTC | #2
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Sunday, June 10, 2018 4:37 PM
> To: Hemant Agrawal <hemant.agrawal@nxp.com>
> Cc: dev@dpdk.org; Sachin Saxena <sachin.saxena@nxp.com>
> Subject: Re: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
> machine
> 
> -----Original Message-----
> > Date: Tue,  5 Jun 2018 12:03:45 +0530
> > From: Hemant Agrawal <hemant.agrawal@nxp.com>
> > To: dev@dpdk.org
> > CC: Sachin Saxena <sachin.saxena@nxp.com>
> > Subject: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
> > machine
> > X-Mailer: git-send-email 2.7.4
> >
> > From: Sachin Saxena <sachin.saxena@nxp.com>
> >
> > Random corruptions observed on ARM platforms with using the dpdk
> > library in shared mode with VPP software (plugin).
> >
> > Using traditional TLS scheme resolved the issue.
> >
> > Tested with VPP with DPDK as a plugin.
> >
> > Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
> > ---
> >  mk/machine/armv8a/rte.vars.mk | 3 +++
> >  mk/machine/dpaa/rte.vars.mk   | 3 +++
> >  mk/machine/dpaa2/rte.vars.mk  | 3 +++
> >  3 files changed, 9 insertions(+)
> >
> > diff --git a/mk/machine/armv8a/rte.vars.mk
> > b/mk/machine/armv8a/rte.vars.mk index 8252efb..6897cd6 100644
> > --- a/mk/machine/armv8a/rte.vars.mk
> > +++ b/mk/machine/armv8a/rte.vars.mk
> > @@ -29,3 +29,6 @@
> >  # CPU_ASFLAGS =
> >
> >  MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> > +
> > +# To avoid TLS corruption issue.
> > +MACHINE_CFLAGS += -mtls-dialect=trad
> 
> This issue is not reproducible on Cavium ARMv8 platforms. Just wondering,
> Do we need to change default ARMv8 config?
[Sachin Saxena]  The issue is currently visible only on NXP platforms with the VPP-DPDK solution. Similar behavior, such as random crashes or initialization failures, has been seen by the Cavium team on VPP, but they are still investigating whether those issues are related to TLS corruption.
Also, the issue does not occur with statically linked DPDK applications.

> 
> The GNU (descriptor) dialect for TLS is the default has been since for a while
> on aarch64.
[Sachin Saxena] I agree, but this model applies only to shared-mode compilation. As far as I know, the "initial-exec" model is the default for static compilation or when -fPIC is not used. For shared DPDK, or when -fPIC is used, the default is "global-dynamic" with tls-dialect=desc.
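
As a rough illustration of the models named above, the following C sketch declares one thread-local variable with the compiler's default model and one pinned to "initial-exec" through GCC's tls_model attribute. The variable and function names are hypothetical and do not come from DPDK or VPP. The model actually chosen for the first variable depends on whether the file is built with -fPIC.

```c
/* Hypothetical per-thread counter using the compiler's default TLS model.
 * Built with -fPIC, GCC defaults to global-dynamic, and on aarch64 the
 * access sequence then follows -mtls-dialect (desc unless overridden). */
__thread unsigned int pkt_count_default;

/* Same kind of variable pinned to the initial-exec model, which expects
 * its storage in the static TLS block set up by the dynamic loader. */
__thread unsigned int pkt_count_ie
	__attribute__((tls_model("initial-exec")));

/* Each function increments the calling thread's own copy. */
unsigned int bump_default(void)
{
	return ++pkt_count_default;
}

unsigned int bump_initial_exec(void)
{
	return ++pkt_count_ie;
}
```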

> 
> I think, it will be mostly a glibc issue with your SDK based toolchain.
> Are you able to reproduce this issue with Linaro toolchain + standard OS
> distribution environments? if so, could you please share more details.
[Sachin Saxena] Yes, the issue occurs with both the SDK and the Linaro 7.2 toolchain.
> 
> I am only concerned about, any performance issue with traditional tls dialect
> model vs descriptor dialect.
[Sachin Saxena] No performance impact is expected for statically built DPDK. For shared mode, a minor impact is expected, but the performance analysis is yet to be done. The fix is suggested because shared DPDK is currently functionally broken with VPP.
> 
> I think, we have two options,
> 1) If you can identify if it is due a specific glibc version then we could detect
> at runtime
> 2) In a worst case, it can be a conditional compilation option.
> 
> /Jerin
  
Jerin Jacob June 11, 2018, 7:45 a.m. UTC | #3
-----Original Message-----
> Date: Mon, 11 Jun 2018 04:05:26 +0000
> From: Sachin Saxena <sachin.saxena@nxp.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, Hemant Agrawal
>  <hemant.agrawal@nxp.com>
> CC: "dev@dpdk.org" <dev@dpdk.org>
> Subject: RE: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
>  machine
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Sunday, June 10, 2018 4:37 PM
> > To: Hemant Agrawal <hemant.agrawal@nxp.com>
> > Cc: dev@dpdk.org; Sachin Saxena <sachin.saxena@nxp.com>
> > Subject: Re: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
> > machine
> > 
> > -----Original Message-----
> > > Date: Tue,  5 Jun 2018 12:03:45 +0530
> > > From: Hemant Agrawal <hemant.agrawal@nxp.com>
> > > To: dev@dpdk.org
> > > CC: Sachin Saxena <sachin.saxena@nxp.com>
> > > Subject: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
> > > machine
> > > X-Mailer: git-send-email 2.7.4
> > >
> > > From: Sachin Saxena <sachin.saxena@nxp.com>
> > >
> > > Random corruptions observed on ARM platforms with using the dpdk
> > > library in shared mode with VPP software (plugin).
> > >
> > > Using traditional TLS scheme resolved the issue.
> > >
> > > Tested with VPP with DPDK as a plugin.
> > >
> > > Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
> > > ---
> > >  mk/machine/armv8a/rte.vars.mk | 3 +++
> > >  mk/machine/dpaa/rte.vars.mk   | 3 +++
> > >  mk/machine/dpaa2/rte.vars.mk  | 3 +++
> > >  3 files changed, 9 insertions(+)
> > >
> > > diff --git a/mk/machine/armv8a/rte.vars.mk
> > > b/mk/machine/armv8a/rte.vars.mk index 8252efb..6897cd6 100644
> > > --- a/mk/machine/armv8a/rte.vars.mk
> > > +++ b/mk/machine/armv8a/rte.vars.mk
> > > @@ -29,3 +29,6 @@
> > >  # CPU_ASFLAGS =
> > >
> > >  MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> > > +
> > > +# To avoid TLS corruption issue.
> > > +MACHINE_CFLAGS += -mtls-dialect=trad
> > 
> > This issue is not reproducible on Cavium ARMv8 platforms. Just wondering,
> > Do we need to change default ARMv8 config?
> [Sachin Saxena]  The issue is currently visible On NXP platforms with VPP-dpdk solution only. Similar behavior like random crashes or initialization failures have been seen by Cavium guys on VPP but they are still investigating whether the issues are related to TLS corruption.

I checked with the Cavium VPP team. According to them, they are not facing
any issue related to TLS.

> Also, issue will not be there with statically linked dpdk applications
> 
> > 
> > The GNU (descriptor) dialect for TLS is the default has been since for a while
> > on aarch64.
> [Sachin Saxena] I agree but this model only applies to Shared mode compilation. As per my knowledge, the "initial-exec" model is default for static compilation or when -fPIC is not used. For shared dpdk or when -fPIC is used, the default is "global-dynamics" and tls-dialect=desc.

But shared-mode compilation is important too, right? We are concerned
about the performance and stability aspects of changing the default
without a proper root cause.


> 
> > 
> > I think, it will be mostly a glibc issue with your SDK based toolchain.
> > Are you able to reproduce this issue with Linaro toolchain + standard OS
> > distribution environments? if so, could you please share more details.
> [Sachin Saxena] Yes, issue is happening with both SDK & Linaro 7.2 toolchain.

We tested, and it works with gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9).
If it is a Linaro bug in the latest toolchain, let's report it and try
to base this fix on runtime attributes.

> > 
> > I am only concerned about, any performance issue with traditional tls dialect
> > model vs descriptor dialect.
> [Sachin Saxena] No performance impact is expected for statically build dpdk. For shared mode, minor impact is expected but performance analysis is yet to be done. The Fix is suggested because right now it is functionally broken with VPP.

Is it possible to check with "gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)", so that we can identify whether it is a toolchain issue or not?

> > 
> > I think, we have two options,
> > 1) If you can identify if it is due a specific glibc version then we could detect
> > at runtime
> > 2) In a worst case, it can be a conditional compilation option.
> > 
> > /Jerin
>
  
Sachin Saxena June 14, 2018, 6:42 a.m. UTC | #4
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Monday, June 11, 2018 1:15 PM
> To: Sachin Saxena <sachin.saxena@nxp.com>
> Cc: Hemant Agrawal <hemant.agrawal@nxp.com>; dev@dpdk.org;
> nitin.saxena@cavium.com; narayanaprasad.athreya@cavium.com
> Subject: Re: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
> machine
> 

[....]

> > > > Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
> > > > ---
> > > >  mk/machine/armv8a/rte.vars.mk | 3 +++
> > > >  mk/machine/dpaa/rte.vars.mk   | 3 +++
> > > >  mk/machine/dpaa2/rte.vars.mk  | 3 +++
> > > >  3 files changed, 9 insertions(+)
> > > >
> > > > diff --git a/mk/machine/armv8a/rte.vars.mk
> > > > b/mk/machine/armv8a/rte.vars.mk index 8252efb..6897cd6 100644
> > > > --- a/mk/machine/armv8a/rte.vars.mk
> > > > +++ b/mk/machine/armv8a/rte.vars.mk
> > > > @@ -29,3 +29,6 @@
> > > >  # CPU_ASFLAGS =
> > > >
> > > >  MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> > > > +
> > > > +# To avoid TLS corruption issue.
> > > > +MACHINE_CFLAGS += -mtls-dialect=trad
> > >
> > > This issue is not reproducible on Cavium ARMv8 platforms. Just
> > > wondering, Do we need to change default ARMv8 config?
> > [Sachin Saxena]  The issue is currently visible On NXP platforms with
> > VPP-dpdk solution only. Similar behavior like random crashes or
> > initialization failures have been seen by Cavium guys on VPP but they are
> > still investigating whether the issues are related to TLS corruption.
> 
> I checked with Cavium-VPP team. According to them, they are not facing any
> issue related to TLS
> 
[Sachin Saxena] Some more information: the issue appears on NXP ARM platforms because the DPDK drivers also use __thread TLS variables. If the total number of TLS variables (main application + DPDK shared library) grows beyond the static TLS size limit, one starts to see issues such as corruption of TLS variable values.

> > Also, issue will not be there with statically linked dpdk applications
> >
> > >
> > > The GNU (descriptor) dialect for TLS is the default has been since
> > > for a while on aarch64.
> > [Sachin Saxena] I agree but this model only applies to Shared mode
> > compilation. As per my knowledge, the "initial-exec" model is default for
> > static compilation or when -fPIC is not used. For shared dpdk or when
> > -fPIC is used, the default is "global-dynamics" and tls-dialect=desc.
> 
> But shared mode compilation is important too. Right? We are concerned
> about performance and stability aspects of "changing default" with out any
> proper root cause.
[Sachin Saxena] I understand that the changes in "armv8a/rte.vars.mk" are common to all ARMv8 platforms, but we need them for the virtualization scenario where DPDK runs in a VM.
So, may I request you to validate whether this patch has any performance impact on your platform? If there is any impact, we will try to omit the common changes.
One reason other ARM platforms are not facing this issue could be that their overall TLS usage is under the limit. This can be verified using the "initial-exec" model: using "initial-exec" with the shared DPDK plugin forces the TLS space to be static.
If the TLS usage then grows beyond the static TLS limit, the application gives an initialization error and exits. This is what happens in our case.
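
One way to see how much TLS a process declares is to add up the TLS segment sizes of the loaded objects. The sketch below is only an assumption about how such a check could look: it uses glibc's dl_iterate_phdr() to sum the PT_TLS p_memsz of every loaded object, ignores alignment padding, and does not query the static TLS limit itself. All names other than dl_iterate_phdr() and the ELF fields are hypothetical.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>

/* Hypothetical TLS variable so the executable itself has a PT_TLS segment. */
__thread int demo_tls_var = 1;

/* Callback for dl_iterate_phdr(): add the PT_TLS size of one loaded object. */
static int add_tls_size(struct dl_phdr_info *info, size_t size, void *data)
{
	size_t *total = data;

	(void)size;
	for (int i = 0; i < info->dlpi_phnum; i++) {
		if (info->dlpi_phdr[i].p_type == PT_TLS)
			*total += info->dlpi_phdr[i].p_memsz;
	}
	return 0; /* returning zero keeps the iteration going */
}

/* Total bytes of TLS declared by every object currently loaded. */
size_t total_tls_bytes(void)
{
	size_t total = 0;

	dl_iterate_phdr(add_tls_size, &total);
	return total;
}
```

Calling total_tls_bytes() before and after the DPDK plugin is loaded would show how much TLS the plugin adds on a given platform.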
> 
> >
> > >
> > > I think, it will be mostly a glibc issue with your SDK based toolchain.
> > > Are you able to reproduce this issue with Linaro toolchain +
> > > standard OS distribution environments? if so, could you please share
> > > more details.
> > [Sachin Saxena] Yes, issue is happening with both SDK & Linaro 7.2
> > toolchain.
> 
> We tested and it works with gcc version 5.4.0 20160609 (Ubuntu/Linaro
> 5.4.0-6ubuntu1~16.04.9).
>
> If it is bug from Linaro on the latest toolchain, lets report and try to have this
> fix based on runtime attributes.
> 
> > >
> > > I am only concerned about, any performance issue with traditional
> > > tls dialect model vs descriptor dialect.
> > [Sachin Saxena] No performance impact is expected for statically build
> > dpdk. For shared mode, minor impact is expected but performance analysis
> > is yet to be done. The Fix is suggested because right now it is
> > functionally broken with VPP.
> 
> Is it possible to check with "gcc version 5.4.0 20160609 (Ubuntu/Linaro
> 5.4.0-6ubuntu1~16.04.9)"? so that we can identity it is an toolchain issue or
> not?

[Sachin Saxena]  Yes, we are also using Ubuntu 16.04, and the issue is there even with native compilation on the board using the standard 5.4 toolchain.

> 
> > >
> > > I think, we have two options,
> > > 1) If you can identify if it is due a specific glibc version then we
> > > could detect at runtime
> > > 2) In a worst case, it can be a conditional compilation option.
> > >
> > > /Jerin
> >
  
Jerin Jacob June 24, 2018, 12:27 p.m. UTC | #5
-----Original Message-----
> Date: Thu, 14 Jun 2018 06:42:40 +0000
> From: Sachin Saxena <sachin.saxena@nxp.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> CC: Hemant Agrawal <hemant.agrawal@nxp.com>, "dev@dpdk.org" <dev@dpdk.org>,
>  "nitin.saxena@cavium.com" <nitin.saxena@cavium.com>,
>  "narayanaprasad.athreya@cavium.com" <narayanaprasad.athreya@cavium.com>
> Subject: RE: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
>  machine
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Monday, June 11, 2018 1:15 PM
> > To: Sachin Saxena <sachin.saxena@nxp.com>
> > Cc: Hemant Agrawal <hemant.agrawal@nxp.com>; dev@dpdk.org;
> > nitin.saxena@cavium.com; narayanaprasad.athreya@cavium.com
> > Subject: Re: [dpdk-dev] [PATCH] mk: change TLS model for ARMv8 and DPAA
> > machine
> >
> 
> [....]
> 
> > > > > Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
> > > > > ---
> > > > >  mk/machine/armv8a/rte.vars.mk | 3 +++
> > > > >  mk/machine/dpaa/rte.vars.mk   | 3 +++
> > > > >  mk/machine/dpaa2/rte.vars.mk  | 3 +++
> > > > >  3 files changed, 9 insertions(+)
> > > > >
> > > > > diff --git a/mk/machine/armv8a/rte.vars.mk
> > > > > b/mk/machine/armv8a/rte.vars.mk index 8252efb..6897cd6 100644
> > > > > --- a/mk/machine/armv8a/rte.vars.mk
> > > > > +++ b/mk/machine/armv8a/rte.vars.mk
> > > > > @@ -29,3 +29,6 @@
> > > > >  # CPU_ASFLAGS =
> > > > >
> > > > >  MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> > > > > +
> > > > > +# To avoid TLS corruption issue.
> > > > > +MACHINE_CFLAGS += -mtls-dialect=trad
> > > >
> > > > This issue is not reproducible on Cavium ARMv8 platforms. Just
> > > > wondering, Do we need to change default ARMv8 config?
> > > [Sachin Saxena]  The issue is currently visible On NXP platforms with
> > > VPP-dpdk solution only. Similar behavior like random crashes or
> > > initialization failures have been seen by Cavium guys on VPP but they
> > > are still investigating whether the issues are related to TLS corruption.
> >
> > I checked with Cavium-VPP team. According to them, they are not facing any
> > issue related to TLS
> >
> [Sachin Saxena] Some more information. - The issue is appearing on NXP ARM platforms as DPDK drivers are also using __thread TLS variables. If the total number of TLS variables (main application + dpdk shared Lib) increases beyond Static TLS Size limit, one will start facing issue like Corruption in TLS variable values.

OK. Then it is a generic problem. Is there any information on the limit on
the number of __thread variables? Is it possible to increase that limit with gcc command-line arguments?
You may not have answers for this, but could you ask on the Linaro/gcc mailing list?

If it is fixed in some specific gcc/glibc version, then applying this blindly
to all GCC versions is not good, IMO.
  

Patch

diff --git a/mk/machine/armv8a/rte.vars.mk b/mk/machine/armv8a/rte.vars.mk
index 8252efb..6897cd6 100644
--- a/mk/machine/armv8a/rte.vars.mk
+++ b/mk/machine/armv8a/rte.vars.mk
@@ -29,3 +29,6 @@ 
 # CPU_ASFLAGS =
 
 MACHINE_CFLAGS += -march=armv8-a+crc+crypto
+
+# To avoid TLS corruption issue.
+MACHINE_CFLAGS += -mtls-dialect=trad
diff --git a/mk/machine/dpaa/rte.vars.mk b/mk/machine/dpaa/rte.vars.mk
index bddcb80..75df626 100644
--- a/mk/machine/dpaa/rte.vars.mk
+++ b/mk/machine/dpaa/rte.vars.mk
@@ -32,3 +32,6 @@  MACHINE_CFLAGS += -march=armv8-a+crc
 ifdef CONFIG_RTE_ARCH_ARM_TUNE
 MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE:"%"=%)
 endif
+
+# To avoid TLS corruption issue.
+MACHINE_CFLAGS += -mtls-dialect=trad
diff --git a/mk/machine/dpaa2/rte.vars.mk b/mk/machine/dpaa2/rte.vars.mk
index 2fd2eac..aaa03c4 100644
--- a/mk/machine/dpaa2/rte.vars.mk
+++ b/mk/machine/dpaa2/rte.vars.mk
@@ -32,3 +32,6 @@  MACHINE_CFLAGS += -march=armv8-a+crc
 ifdef CONFIG_RTE_ARCH_ARM_TUNE
 MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE:"%"=%)
 endif
+
+# To avoid TLS corruption issue.
+MACHINE_CFLAGS += -mtls-dialect=trad