mbox series

[0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

Message ID 20191202153559.9709-1-david.marchand@redhat.com (mailing list archive)
Headers
Series Extend --lcores to run on cores > RTE_MAX_LCORE |

Message

David Marchand Dec. 2, 2019, 3:35 p.m. UTC
  We are currently stuck with no option but recompile a DPDK if the system
has more cores than RTE_MAX_LCORE.
A bit of a pity when you get a system with more than 200+ cores and your
testpmd has been built and packaged with RTE_MAX_LCORE == 128.

The --lcores does not need to care about the underlying cores, remove
this limitation.
  

Comments

David Marchand Jan. 14, 2020, 12:58 p.m. UTC | #1
On Mon, Dec 2, 2019 at 4:36 PM David Marchand <david.marchand@redhat.com> wrote:
>
> We are currently stuck with no option but recompile a DPDK if the system
> has more cores than RTE_MAX_LCORE.
> A bit of a pity when you get a system with more than 200+ cores and your
> testpmd has been built and packaged with RTE_MAX_LCORE == 128.
>
> The --lcores does not need to care about the underlying cores, remove
> this limitation.

Any comment?
Thanks.
  
Morten Brørup Jan. 14, 2020, 3:32 p.m. UTC | #2
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of David Marchand
> Sent: Tuesday, January 14, 2020 1:59 PM
> 
> On Mon, Dec 2, 2019 at 4:36 PM David Marchand
> <david.marchand@redhat.com> wrote:
> >
> > We are currently stuck with no option but recompile a DPDK if the
> system
> > has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and
> your
> > testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> >
> > The --lcores does not need to care about the underlying cores, remove
> > this limitation.
> 
> Any comment?
> Thanks.
> 
> 
> --
> David Marchand
> 

I haven't reviewed the patch, but recall that the Mempool library sets up caches per lcore using hardcoded loops going from 0 to RTE_MAX_LCORE, regardless how many lcores are present and assigned to DPDK.

Making rte_max_lcore dynamic (and possibly auto-detected by the EAL) instead of defined at compile time might also be an improvement for systems with fewer cores than RTE_MAX_LCORE. But there are probably a lot of considerations regarding multi-process applications.


Med venlig hilsen / kind regards
- Morten Brørup
  
Ferruh Yigit Jan. 20, 2020, 6:37 p.m. UTC | #3
On 12/2/2019 3:35 PM, David Marchand wrote:
> We are currently stuck with no option but recompile a DPDK if the system
> has more cores than RTE_MAX_LCORE.
> A bit of a pity when you get a system with more than 200+ cores and your
> testpmd has been built and packaged with RTE_MAX_LCORE == 128.

Just for clarification, in this case the DPDK application will work but will
only use RTE_MAX_LCORE cores, right? You need to compile to use all available cores.

Are cores more than RTE_MAX_LCORE usable after this patch?

> 
> The --lcores does not need to care about the underlying cores, remove
> this limitation.
> 

+1 to remove limit in --lcores, but I didn't make it work with this patchset, I
may be missing something, I tried by setting the "RTE_MAX_LCORE=32", in this
case should '--lcores "(30-40)@(10-20)"' work?
  
David Marchand Jan. 20, 2020, 7:35 p.m. UTC | #4
On Mon, Jan 20, 2020 at 7:37 PM Yigit, Ferruh
<ferruh.yigit@linux.intel.com> wrote:
>
> On 12/2/2019 3:35 PM, David Marchand wrote:
> > We are currently stuck with no option but recompile a DPDK if the system
> > has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and your
> > testpmd has been built and packaged with RTE_MAX_LCORE == 128.
>
> Just for clarification, in this case the DPDK application will work but will
> only use RTE_MAX_LCORE cores, right? You need to compile to use all available cores.

EAL starts up to RTE_MAX_LCORE lcores (as in EAL threads), that's true
before and after this patch.

By default (using the -c or -l options), a lcore X runs on the physical core X.
With the --lcores option, you can select on which physical cores a lcore runs.

But, before this series, the code limits the physical cores to the
same [0, RTE_MAX_LCORE[ range.

>
> Are cores more than RTE_MAX_LCORE usable after this patch?

Yes, see below.


>
> >
> > The --lcores does not need to care about the underlying cores, remove
> > this limitation.
> >
>
> +1 to remove limit in --lcores, but I didn't make it work with this patchset, I
> may be missing something, I tried by setting the "RTE_MAX_LCORE=32", in this
> case should '--lcores "(30-40)@(10-20)"' work?

Quoting the parser code:
 * The format pattern: --lcores='<lcores[@cpus]>[<,lcores[@cpus]>...]'
 * lcores, cpus could be a single digit/range or a group.
 * '(' and ')' are necessary if it's a group.
 * If not supply '@cpus', the value of cpus uses the same as lcores.
 * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means start 9 EAL thread as below
 *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
 *   lcore 1 runs on cpuset 0x2 (cpu 1)
 *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
 *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
 *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
 *   lcore 7 runs on cpuset 0x80 (cpu 7)
 *   lcore 8 runs on cpuset 0x100 (cpu 8)

With --lcores "(30-40)@(10-20)", you asked to start lcores 30 to 40 to
run on the cpuset 10 to 20.
So it will be refused, since the max lcore is 32.

If your intention was to start lcore 10 on physical core 30, lcore 11
on core 31 etc... then you can try --lcores 10@30,11@31,12@32...
  
Thomas Monjalon Jan. 21, 2020, 12:24 a.m. UTC | #5
02/12/2019 16:35, David Marchand:
> We are currently stuck with no option but recompile a DPDK if the system
> has more cores than RTE_MAX_LCORE.
> A bit of a pity when you get a system with more than 200+ cores and your
> testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> 
> The --lcores does not need to care about the underlying cores, remove
> this limitation.

> David Marchand (4):
>   eal/windows: fix cpuset macro name
>   eal: do not cache lcore detection state
>   eal: display all detected cores at startup
>   eal: remove limitation on cpuset with --lcores

The patches look good but it is very hard to review parsing code (last patch).
We will better experience corner cases after merging.

Applied for -rc1, thanks
  
Thomas Monjalon Feb. 21, 2020, 8:04 a.m. UTC | #6
Hi,

21/01/2020 01:24, Thomas Monjalon:
> 02/12/2019 16:35, David Marchand:
> > We are currently stuck with no option but recompile a DPDK if the system
> > has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and your
> > testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> > 
> > The --lcores does not need to care about the underlying cores, remove
> > this limitation.
> 
> > David Marchand (4):
> >   eal/windows: fix cpuset macro name
> >   eal: do not cache lcore detection state
> >   eal: display all detected cores at startup
> >   eal: remove limitation on cpuset with --lcores
> 
> The patches look good but it is very hard to review parsing code (last patch).
> We will better experience corner cases after merging.
> 
> Applied for -rc1, thanks

This patch was merged in 20.02.
We don't have any feedback about issues so it's probably working fine.

It is solving a problem for running DPDK on machines having a lot of cores.
Now the difficult question: is it a new feature or a fix?
Should we backport this patchset?
  
Song, Keesang Feb. 21, 2020, 8:19 a.m. UTC | #7
[AMD Official Use Only - Internal Distribution Only]

Thanks Thomas for bringing this up.
I consider this is not a new feature, but rather a fix to address the issue with statically assigned maximum lcore limit on high-density CPU platform such as AMD Epyc.
As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11, and they have 1~2 yrs of lifetime left, we like to backport this to LTS 18.11 & 19.11 at least.

-----Original Message-----
From: Thomas Monjalon <thomas@monjalon.net> 
Sent: Friday, February 21, 2020 12:04 AM
To: David Marchand <david.marchand@redhat.com>
Cc: dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; Song, Keesang <Keesang.Song@amd.com>; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

[CAUTION: External Email]

Hi,

21/01/2020 01:24, Thomas Monjalon:
> 02/12/2019 16:35, David Marchand:
> > We are currently stuck with no option but recompile a DPDK if the 
> > system has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and 
> > your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> >
> > The --lcores does not need to care about the underlying cores, 
> > remove this limitation.
>
> > David Marchand (4):
> >   eal/windows: fix cpuset macro name
> >   eal: do not cache lcore detection state
> >   eal: display all detected cores at startup
> >   eal: remove limitation on cpuset with --lcores
>
> The patches look good but it is very hard to review parsing code (last patch).
> We will better experience corner cases after merging.
>
> Applied for -rc1, thanks

This patch was merged in 20.02.
We don't have any feedback about issues so it's probably working fine.

It is solving a problem for running DPDK on machines having a lot of cores.
Now the difficult question: is it a new feature or a fix?
Should we backport this patchset?
  
David Marchand Feb. 21, 2020, 9:40 a.m. UTC | #8
On Fri, Feb 21, 2020 at 9:19 AM Song, Keesang <Keesang.Song@amd.com> wrote:
>
> [AMD Official Use Only - Internal Distribution Only]

Please, get this header removed.
This is a public mailing list.


> Thanks Thomas for bringing this up.
> I consider this is not a new feature, but rather a fix to address the issue with statically assigned maximum lcore limit on high-density CPU platform such as AMD Epyc.
> As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11, and they have 1~2 yrs of lifetime left, we like to backport this to LTS 18.11 & 19.11 at least.

It is not a fix.

The use of static arrays is a design choice that goes back to the
early days in dpdk.
The addition of --lcores came in after this, but it was introduced for
a different use case than placing lcores on any physical core.
And there was no claim that a core > RTE_MAX_LCORE would be usable.


When testing on a new hardware, it is normal to observe some limitations.
Running DPDK on those platforms should be possible: "should be"
because I do not have access to this hardware and saw neither tests
reports nor performance numbers.
Before this patch, the limitation is that on Epyc, cores >
RTE_MAX_LCORE are not usable.


Now, this change is quite constrained.
If we backport it, I don't expect issues in the main dpdk components
(based on code review and ovs tests with a RTE_MAX_LCORE set to 16 on
a 24 cores system).
There might be issues in some examples or not widely used library
which uses a physical core id instead of a lcore id.


This is the same recurring question "do we allow new features in a
stable branch?".


--
David Marchand
  
Aaron Conole Feb. 21, 2020, 2:48 p.m. UTC | #9
David Marchand <david.marchand@redhat.com> writes:

> On Fri, Feb 21, 2020 at 9:19 AM Song, Keesang <Keesang.Song@amd.com> wrote:
>>
>> [AMD Official Use Only - Internal Distribution Only]
>
> Please, get this header removed.
> This is a public mailing list.
>
>
>> Thanks Thomas for bringing this up.
>> I consider this is not a new feature, but rather a fix to address
>> the issue with statically assigned maximum lcore limit on
>> high-density CPU platform such as AMD Epyc.
>> As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11,
>> and they have 1~2 yrs of lifetime left, we like to backport this to
>> LTS 18.11 & 19.11 at least.
>
> It is not a fix.
>
> The use of static arrays is a design choice that goes back to the
> early days in dpdk.
> The addition of --lcores came in after this, but it was introduced for
> a different use case than placing lcores on any physical core.
> And there was no claim that a core > RTE_MAX_LCORE would be usable.
>
>
> When testing on a new hardware, it is normal to observe some limitations.
> Running DPDK on those platforms should be possible: "should be"
> because I do not have access to this hardware and saw neither tests
> reports nor performance numbers.
> Before this patch, the limitation is that on Epyc, cores >
> RTE_MAX_LCORE are not usable.
>
>
> Now, this change is quite constrained.
> If we backport it, I don't expect issues in the main dpdk components
> (based on code review and ovs tests with a RTE_MAX_LCORE set to 16 on
> a 24 cores system).
> There might be issues in some examples or not widely used library
> which uses a physical core id instead of a lcore id.
>
>
> This is the same recurring question "do we allow new features in a
> stable branch?".

Usually, the answer is 'no'.  But we do allow some "new" things to be
backported (pci ids, etc) that might be required to enable older
functionality.  Additionally, I'm sure if some feature were required to
mitigate a CVE, we'd rather favor backporting it.

I guess we could pose a litmus test:

  1. Is the problem this feature solves so widespread that it needs to
     be addressed ASAP?
  2. Is there a known workaround to the problem this is solving?
  3. How intrusive is the feature?
  4. Is it shown to be stable in the mainline (number of fixes, testing,
     etc)?
  5. Is it constrained enough that we know we can support it with even
     higher priority than other things?

Probably other questions that will need to be asked.

And even in that list of question, I'm not sure I'd be able to advocate
backporting this in the upstream branches - it hasn't had much testing.
It's unstable.  It's "difficult" to use.  It is not widespread that
people have so many cores.  The workaround is much simpler than
supporting this (recompile).

>
> --
> David Marchand
  
Stephen Hemminger Feb. 21, 2020, 4:38 p.m. UTC | #10
On Fri, 21 Feb 2020 09:48:58 -0500
Aaron Conole <aconole@redhat.com> wrote:

> David Marchand <david.marchand@redhat.com> writes:
> 
> > On Fri, Feb 21, 2020 at 9:19 AM Song, Keesang <Keesang.Song@amd.com> wrote:  
> >>
> >> [AMD Official Use Only - Internal Distribution Only]  
> >
> > Please, get this header removed.
> > This is a public mailing list.
> >
> >  
> >> Thanks Thomas for bringing this up.
> >> I consider this is not a new feature, but rather a fix to address
> >> the issue with statically assigned maximum lcore limit on
> >> high-density CPU platform such as AMD Epyc.
> >> As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11,
> >> and they have 1~2 yrs of lifetime left, we like to backport this to
> >> LTS 18.11 & 19.11 at least.  
> >
> > It is not a fix.
> >
> > The use of static arrays is a design choice that goes back to the
> > early days in dpdk.
> > The addition of --lcores came in after this, but it was introduced for
> > a different use case than placing lcores on any physical core.
> > And there was no claim that a core > RTE_MAX_LCORE would be usable.
> >
> >
> > When testing on a new hardware, it is normal to observe some limitations.
> > Running DPDK on those platforms should be possible: "should be"
> > because I do not have access to this hardware and saw neither tests
> > reports nor performance numbers.
> > Before this patch, the limitation is that on Epyc, cores >
> > RTE_MAX_LCORE are not usable.
> >
> >
> > Now, this change is quite constrained.
> > If we backport it, I don't expect issues in the main dpdk components
> > (based on code review and ovs tests with a RTE_MAX_LCORE set to 16 on
> > a 24 cores system).
> > There might be issues in some examples or not widely used library
> > which uses a physical core id instead of a lcore id.
> >
> >
> > This is the same recurring question "do we allow new features in a
> > stable branch?".  
> 
> Usually, the answer is 'no'.  But we do allow some "new" things to be
> backported (pci ids, etc) that might be required to enable older
> functionality.  Additionally, I'm sure if some feature were required to
> mitigate a CVE, we'd rather favor backporting it.
> 
> I guess we could pose a litmus test:
> 
>   1. Is the problem this feature solves so widespread that it needs to
>      be addressed ASAP?
>   2. Is there a known workaround to the problem this is solving?
>   3. How intrusive is the feature?
>   4. Is it shown to be stable in the mainline (number of fixes, testing,
>      etc)?
>   5. Is it constrained enough that we know we can support it with even
>      higher priority than other things?
> 
> Probably other questions that will need to be asked.
> 
> And even in that list of question, I'm not sure I'd be able to advocate
> backporting this in the upstream branches - it hasn't had much testing.
> It's unstable.  It's "difficult" to use.  It is not widespread that
> people have so many cores.  The workaround is much simpler than
> supporting this (recompile).
> 
> >
> > --
> > David Marchand  
> 

RTE_MAX_LCORES is exposed in API/ABI to application.
Many applications use that to size internal data structures.
Having rte_lcore_id() potentially return a larger value would cause
out of bounds access (and crash) in that application.
  
Song, Keesang May 29, 2020, 3:05 a.m. UTC | #11
[AMD Official Use Only - Internal Distribution Only]

Hi Thomas & David,

We haven't got the final status on this patch, and I don't see this change even from the latest LTS 20.04 repo.
So I'd like to confirm whether this patch has been safely submitted to the main upstream.
Can you check the status of that commit?

https://patchwork.dpdk.org/patch/63507/

Thanks for your support in advance,
Keesang

-----Original Message-----
From: Thomas Monjalon <thomas@monjalon.net> 
Sent: Friday, February 21, 2020 12:04 AM
To: David Marchand <david.marchand@redhat.com>
Cc: dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; Song, Keesang <Keesang.Song@amd.com>; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

[CAUTION: External Email]

Hi,

21/01/2020 01:24, Thomas Monjalon:
> 02/12/2019 16:35, David Marchand:
> > We are currently stuck with no option but recompile a DPDK if the 
> > system has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and 
> > your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> >
> > The --lcores does not need to care about the underlying cores, 
> > remove this limitation.
>
> > David Marchand (4):
> >   eal/windows: fix cpuset macro name
> >   eal: do not cache lcore detection state
> >   eal: display all detected cores at startup
> >   eal: remove limitation on cpuset with --lcores
>
> The patches look good but it is very hard to review parsing code (last patch).
> We will better experience corner cases after merging.
>
> Applied for -rc1, thanks

This patch was merged in 20.02.
We don't have any feedback about issues so it's probably working fine.

It is solving a problem for running DPDK on machines having a lot of cores.
Now the difficult question: is it a new feature or a fix?
Should we backport this patchset?
  
Song, Keesang May 29, 2020, 3:05 a.m. UTC | #12
[AMD Public Use]

Hi Thomas & David,

We haven't got the final status on this patch, and I don't see this change even from the latest LTS 20.04 repo.
So I'd like to confirm whether this patch has been safely submitted to the main upstream.
Can you check the status of that commit?

https://patchwork.dpdk.org/patch/63507/

Thanks for your support in advance,
Keesang

-----Original Message-----
From: Thomas Monjalon <thomas@monjalon.net>
Sent: Friday, February 21, 2020 12:04 AM
To: David Marchand <david.marchand@redhat.com>
Cc: dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; Song, Keesang <Keesang.Song@amd.com>; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

[CAUTION: External Email]

Hi,

21/01/2020 01:24, Thomas Monjalon:
> 02/12/2019 16:35, David Marchand:
> > We are currently stuck with no option but recompile a DPDK if the 
> > system has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and 
> > your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> >
> > The --lcores does not need to care about the underlying cores, 
> > remove this limitation.
>
> > David Marchand (4):
> >   eal/windows: fix cpuset macro name
> >   eal: do not cache lcore detection state
> >   eal: display all detected cores at startup
> >   eal: remove limitation on cpuset with --lcores
>
> The patches look good but it is very hard to review parsing code (last patch).
> We will better experience corner cases after merging.
>
> Applied for -rc1, thanks

This patch was merged in 20.02.
We don't have any feedback about issues so it's probably working fine.

It is solving a problem for running DPDK on machines having a lot of cores.
Now the difficult question: is it a new feature or a fix?
Should we backport this patchset?
  
Thomas Monjalon June 1, 2020, 9:22 p.m. UTC | #13
29/05/2020 05:05, Song, Keesang:
> Hi Thomas & David,
> 
> We haven't got the final status on this patch, and I don't see this change even from the latest LTS 20.04 repo.
> So I'd like to confirm whether this patch has been safely submitted to the main upstream.
> Can you check the status of that commit?
> 
> https://patchwork.dpdk.org/patch/63507/

As you can see below, there is a pending question:
	"is it a new feature or a fix?"

Kevin and Luca are the arbiters for the backports in 18.11 and 19.11.


> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, February 21, 2020 12:04 AM
> 
> Hi,
> 
> 21/01/2020 01:24, Thomas Monjalon:
> > 02/12/2019 16:35, David Marchand:
> > > We are currently stuck with no option but recompile a DPDK if the 
> > > system has more cores than RTE_MAX_LCORE.
> > > A bit of a pity when you get a system with more than 200+ cores and 
> > > your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> > >
> > > The --lcores does not need to care about the underlying cores, 
> > > remove this limitation.
> >
> > > David Marchand (4):
> > >   eal/windows: fix cpuset macro name
> > >   eal: do not cache lcore detection state
> > >   eal: display all detected cores at startup
> > >   eal: remove limitation on cpuset with --lcores
> >
> > The patches look good but it is very hard to review parsing code (last patch).
> > We will better experience corner cases after merging.
> >
> > Applied for -rc1, thanks
> 
> This patch was merged in 20.02.
> We don't have any feedback about issues so it's probably working fine.
> 
> It is solving a problem for running DPDK on machines having a lot of cores.
> Now the difficult question: is it a new feature or a fix?
> Should we backport this patchset?
  
Song, Keesang June 1, 2020, 10:54 p.m. UTC | #14
[AMD Public Use]

Thanks Thomas for the response.
For a correction, the patchwork has not been submitted for the current LTS release, 19.11.2, but was merged into 20.02 and onward. 
The reason I brought this again was to address LTS users and many other application based on the LTS releases(18.11 & 19.11).
Since I found many of our customers and users are still relying on the latest LTS version, I'm seeking an aid for adding this change into at least 19.11, LTS branch.
We could argue that this is actually not a bug, in a way, it's inconvenient for Openstack or cloud deployed DPDK application since it's often inapt to change the base config and recompile the max lcore limit.
If backporting is still not a preferred way(pushing this patchwork into 19.11), then can we instead consider changing only the default value of ' CONFIG_RTE_MAX_LCORE' to 256 in common_base file?

# Compile Environment Abstraction Layer
#
CONFIG_RTE_MAX_LCORE=128 --> 256

I'd appreciate if anyone can advise me know what we can do about this to move forward.

-----Original Message-----
From: Thomas Monjalon <thomas@monjalon.net> 
Sent: Monday, June 1, 2020 2:23 PM
To: Song, Keesang <Keesang.Song@amd.com>
Cc: David Marchand <david.marchand@redhat.com>; dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

[CAUTION: External Email]

29/05/2020 05:05, Song, Keesang:
> Hi Thomas & David,
>
> We haven't got the final status on this patch, and I don't see this change even from the latest LTS 20.04 repo.
> So I'd like to confirm whether this patch has been safely submitted to the main upstream.
> Can you check the status of that commit?
>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc
> hwork.dpdk.org%2Fpatch%2F63507%2F&amp;data=02%7C01%7CKeesang.Song%40am
> d.com%7Cd71ea9aca917447dfb3e08d80671f34c%7C3dd8961fe4884e608e11a82d994
> e183d%7C0%7C0%7C637266433776198364&amp;sdata=1EgxevCILVMMLgyQC%2FzaWYJ
> XJ%2BOijs0Rtym1TA0VS28%3D&amp;reserved=0

As you can see below, there is a pending question:
        "is it a new feature or a fix?"

Kevin and Luca are the arbiters for the backports in 18.11 and 19.11.


> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, February 21, 2020 12:04 AM
>
> Hi,
>
> 21/01/2020 01:24, Thomas Monjalon:
> > 02/12/2019 16:35, David Marchand:
> > > We are currently stuck with no option but recompile a DPDK if the 
> > > system has more cores than RTE_MAX_LCORE.
> > > A bit of a pity when you get a system with more than 200+ cores 
> > > and your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> > >
> > > The --lcores does not need to care about the underlying cores, 
> > > remove this limitation.
> >
> > > David Marchand (4):
> > >   eal/windows: fix cpuset macro name
> > >   eal: do not cache lcore detection state
> > >   eal: display all detected cores at startup
> > >   eal: remove limitation on cpuset with --lcores
> >
> > The patches look good but it is very hard to review parsing code (last patch).
> > We will better experience corner cases after merging.
> >
> > Applied for -rc1, thanks
>
> This patch was merged in 20.02.
> We don't have any feedback about issues so it's probably working fine.
>
> It is solving a problem for running DPDK on machines having a lot of cores.
> Now the difficult question: is it a new feature or a fix?
> Should we backport this patchset?
  
Song, Keesang June 9, 2020, 4:30 p.m. UTC | #15
[AMD Public Use]

Hi Kevin and Luca,

We are still waiting for the response.
Can you help on this for the backports in 18.11 and 19.11?
It would work for our customers even with changing the default value of ' CONFIG_RTE_MAX_LCORE' to 256 in common_base file in 18.11 and 19.11.

Thanks,
Keesang


-----Original Message-----
From: Song, Keesang 
Sent: Monday, June 1, 2020 3:54 PM
To: Thomas Monjalon <thomas@monjalon.net>
Cc: David Marchand <david.marchand@redhat.com>; dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org; Aman Kumar <aman.kumar@vvdntech.in>; Grimm, Jon <Jon.Grimm@amd.com>
Subject: RE: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

[AMD Public Use]

Thanks Thomas for the response.
For a correction, the patchwork has not been submitted for the current LTS release, 19.11.2, but was merged into 20.02 and onward. 
The reason I brought this again was to address LTS users and many other application based on the LTS releases(18.11 & 19.11).
Since I found many of our customers and users are still relying on the latest LTS version, I'm seeking an aid for adding this change into at least 19.11, LTS branch.
We could argue that this is actually not a bug, in a way, it's inconvenient for Openstack or cloud deployed DPDK application since it's often inapt to change the base config and recompile the max lcore limit.
If backporting is still not a preferred way(pushing this patchwork into 19.11), then can we instead consider changing only the default value of ' CONFIG_RTE_MAX_LCORE' to 256 in common_base file?

# Compile Environment Abstraction Layer
#
CONFIG_RTE_MAX_LCORE=128 --> 256

I'd appreciate if anyone can advise me know what we can do about this to move forward.

-----Original Message-----
From: Thomas Monjalon <thomas@monjalon.net>
Sent: Monday, June 1, 2020 2:23 PM
To: Song, Keesang <Keesang.Song@amd.com>
Cc: David Marchand <david.marchand@redhat.com>; dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

[CAUTION: External Email]

29/05/2020 05:05, Song, Keesang:
> Hi Thomas & David,
>
> We haven't got the final status on this patch, and I don't see this change even from the latest LTS 20.04 repo.
> So I'd like to confirm whether this patch has been safely submitted to the main upstream.
> Can you check the status of that commit?
>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc
> hwork.dpdk.org%2Fpatch%2F63507%2F&amp;data=02%7C01%7CKeesang.Song%40am
> d.com%7Cd71ea9aca917447dfb3e08d80671f34c%7C3dd8961fe4884e608e11a82d994
> e183d%7C0%7C0%7C637266433776198364&amp;sdata=1EgxevCILVMMLgyQC%2FzaWYJ
> XJ%2BOijs0Rtym1TA0VS28%3D&amp;reserved=0

As you can see below, there is a pending question:
        "is it a new feature or a fix?"

Kevin and Luca are the arbiters for the backports in 18.11 and 19.11.


> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, February 21, 2020 12:04 AM
>
> Hi,
>
> 21/01/2020 01:24, Thomas Monjalon:
> > 02/12/2019 16:35, David Marchand:
> > > We are currently stuck with no option but recompile a DPDK if the 
> > > system has more cores than RTE_MAX_LCORE.
> > > A bit of a pity when you get a system with more than 200+ cores 
> > > and your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> > >
> > > The --lcores does not need to care about the underlying cores, 
> > > remove this limitation.
> >
> > > David Marchand (4):
> > >   eal/windows: fix cpuset macro name
> > >   eal: do not cache lcore detection state
> > >   eal: display all detected cores at startup
> > >   eal: remove limitation on cpuset with --lcores
> >
> > The patches look good but it is very hard to review parsing code (last patch).
> > We will better experience corner cases after merging.
> >
> > Applied for -rc1, thanks
>
> This patch was merged in 20.02.
> We don't have any feedback about issues so it's probably working fine.
>
> It is solving a problem for running DPDK on machines having a lot of cores.
> Now the difficult question: is it a new feature or a fix?
> Should we backport this patchset?
  
Luca Boccassi June 9, 2020, 5:48 p.m. UTC | #16
On Tue, 2020-06-09 at 16:30 +0000, Song, Keesang wrote:
> [AMD Public Use]
> 
> Hi Kevin and Luca,
> 
> We are still waiting for the response.
> Can you help on this for the backports in 18.11 and 19.11?
> It would work for our customers even with changing the default value of ' CONFIG_RTE_MAX_LCORE' to 256 in common_base file in 18.11 and 19.11.
> 
> Thanks,
> Keesang

How to send patches for inclusion in LTS releases if not already
backported is documented here:

https://core.dpdk.org/contribute/#send

If you do the work to backport and test, on the surface it seems fine
to have those in 19.11.4

> -----Original Message-----
> From: Song, Keesang 
> Sent: Monday, June 1, 2020 3:54 PM
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: David Marchand <david.marchand@redhat.com>; dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org; Aman Kumar <aman.kumar@vvdntech.in>; Grimm, Jon <Jon.Grimm@amd.com>
> Subject: RE: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE
> 
> [AMD Public Use]
> 
> Thanks Thomas for the response.
> For a correction, the patchwork has not been submitted for the current LTS release, 19.11.2, but was merged into 20.02 and onward. 
> The reason I brought this again was to address LTS users and many other application based on the LTS releases(18.11 & 19.11).
> Since I found many of our customers and users are still relying on the latest LTS version, I'm seeking an aid for adding this change into at least 19.11, LTS branch.
> We could argue that this is actually not a bug, in a way, it's inconvenient for Openstack or cloud deployed DPDK application since it's often inapt to change the base config and recompile the max lcore limit.
> If backporting is still not a preferred way(pushing this patchwork into 19.11), then can we instead consider changing only the default value of ' CONFIG_RTE_MAX_LCORE' to 256 in common_base file?
> 
> # Compile Environment Abstraction Layer
> #
> CONFIG_RTE_MAX_LCORE=128 --> 256
> 
> I'd appreciate if anyone can advise me know what we can do about this to move forward.
> 
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, June 1, 2020 2:23 PM
> To: Song, Keesang <Keesang.Song@amd.com>
> Cc: David Marchand <david.marchand@redhat.com>; dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE
> 
> [CAUTION: External Email]
> 
> 29/05/2020 05:05, Song, Keesang:
> > Hi Thomas & David,
> > 
> > We haven't got the final status on this patch, and I don't see this change even from the latest LTS 20.04 repo.
> > So I'd like to confirm whether this patch has been safely submitted to the main upstream.
> > Can you check the status of that commit?
> > 
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc
> > hwork.dpdk.org%2Fpatch%2F63507%2F&amp;data=02%7C01%7CKeesang.Song%40am
> > d.com%7Cd71ea9aca917447dfb3e08d80671f34c%7C3dd8961fe4884e608e11a82d994
> > e183d%7C0%7C0%7C637266433776198364&amp;sdata=1EgxevCILVMMLgyQC%2FzaWYJ
> > XJ%2BOijs0Rtym1TA0VS28%3D&amp;reserved=0
> 
> As you can see below, there is a pending question:
>         "is it a new feature or a fix?"
> 
> Kevin and Luca are the arbiters for the backports in 18.11 and 19.11.
> 
> 
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Friday, February 21, 2020 12:04 AM
> > 
> > Hi,
> > 
> > 21/01/2020 01:24, Thomas Monjalon:
> > > 02/12/2019 16:35, David Marchand:
> > > > We are currently stuck with no option but recompile a DPDK if the 
> > > > system has more cores than RTE_MAX_LCORE.
> > > > A bit of a pity when you get a system with more than 200+ cores 
> > > > and your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> > > > 
> > > > The --lcores does not need to care about the underlying cores, 
> > > > remove this limitation.
> > > > David Marchand (4):
> > > >   eal/windows: fix cpuset macro name
> > > >   eal: do not cache lcore detection state
> > > >   eal: display all detected cores at startup
> > > >   eal: remove limitation on cpuset with --lcores
> > > 
> > > The patches look good but it is very hard to review parsing code (last patch).
> > > We will better experience corner cases after merging.
> > > 
> > > Applied for -rc1, thanks
> > 
> > This patch was merged in 20.02.
> > We don't have any feedback about issues so it's probably working fine.
> > 
> > It is solving a problem for running DPDK on machines having a lot of cores.
> > Now the difficult question: is it a new feature or a fix?
> > Should we backport this patchset?
  
Kevin Traynor June 9, 2020, 9:34 p.m. UTC | #17
On 09/06/2020 18:48, Luca Boccassi wrote:
> On Tue, 2020-06-09 at 16:30 +0000, Song, Keesang wrote:
>> [AMD Public Use]
>>
>> Hi Kevin and Luca,
>>

Hi Keesang

>> We are still waiting for the response.
>> Can you help on this for the backports in 18.11 and 19.11?
>> It would work for our customers even with changing the default value of ' CONFIG_RTE_MAX_LCORE' to 256 in common_base file in 18.11 and 19.11.
>>
>> Thanks,
>> Keesang
> 
> How to send patches for inclusion in LTS releases if not already
> backported is documented here:
> 
> https://core.dpdk.org/contribute/#send
> 
> If you do the work to backport and test, on the surface it seems fine
> to have those in 19.11.4
> 

Looks ok to me too. I can take this in 18.11, but it will need to be in
19.11 first. Please send a backport for 18.11 and test that it's working
as expected with the 18.11 branch.

thanks,
Kevin.

>> -----Original Message-----
>> From: Song, Keesang 
>> Sent: Monday, June 1, 2020 3:54 PM
>> To: Thomas Monjalon <thomas@monjalon.net>
>> Cc: David Marchand <david.marchand@redhat.com>; dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org; Aman Kumar <aman.kumar@vvdntech.in>; Grimm, Jon <Jon.Grimm@amd.com>
>> Subject: RE: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE
>>
>> [AMD Public Use]
>>
>> Thanks Thomas for the response.
>> For a correction, the patchwork has not been submitted for the current LTS release, 19.11.2, but was merged into 20.02 and onward. 
>> The reason I brought this again was to address LTS users and many other application based on the LTS releases(18.11 & 19.11).
>> Since I found many of our customers and users are still relying on the latest LTS version, I'm seeking an aid for adding this change into at least 19.11, LTS branch.
>> We could argue that this is actually not a bug, in a way, it's inconvenient for Openstack or cloud deployed DPDK application since it's often inapt to change the base config and recompile the max lcore limit.
>> If backporting is still not a preferred way(pushing this patchwork into 19.11), then can we instead consider changing only the default value of ' CONFIG_RTE_MAX_LCORE' to 256 in common_base file?
>>
>> # Compile Environment Abstraction Layer
>> #
>> CONFIG_RTE_MAX_LCORE=128 --> 256
>>
>> I'd appreciate if anyone can advise me know what we can do about this to move forward.
>>
>> -----Original Message-----
>> From: Thomas Monjalon <thomas@monjalon.net>
>> Sent: Monday, June 1, 2020 2:23 PM
>> To: Song, Keesang <Keesang.Song@amd.com>
>> Cc: David Marchand <david.marchand@redhat.com>; dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE
>>
>> [CAUTION: External Email]
>>
>> 29/05/2020 05:05, Song, Keesang:
>>> Hi Thomas & David,
>>>
>>> We haven't got the final status on this patch, and I don't see this change even from the latest LTS 20.04 repo.
>>> So I'd like to confirm whether this patch has been safely submitted to the main upstream.
>>> Can you check the status of that commit?
>>>
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc
>>> hwork.dpdk.org%2Fpatch%2F63507%2F&amp;data=02%7C01%7CKeesang.Song%40am
>>> d.com%7Cd71ea9aca917447dfb3e08d80671f34c%7C3dd8961fe4884e608e11a82d994
>>> e183d%7C0%7C0%7C637266433776198364&amp;sdata=1EgxevCILVMMLgyQC%2FzaWYJ
>>> XJ%2BOijs0Rtym1TA0VS28%3D&amp;reserved=0
>>
>> As you can see below, there is a pending question:
>>         "is it a new feature or a fix?"
>>
>> Kevin and Luca are the arbiters for the backports in 18.11 and 19.11.
>>
>>
>>> -----Original Message-----
>>> From: Thomas Monjalon <thomas@monjalon.net>
>>> Sent: Friday, February 21, 2020 12:04 AM
>>>
>>> Hi,
>>>
>>> 21/01/2020 01:24, Thomas Monjalon:
>>>> 02/12/2019 16:35, David Marchand:
>>>>> We are currently stuck with no option but recompile a DPDK if the 
>>>>> system has more cores than RTE_MAX_LCORE.
>>>>> A bit of a pity when you get a system with more than 200+ cores 
>>>>> and your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
>>>>>
>>>>> The --lcores does not need to care about the underlying cores, 
>>>>> remove this limitation.
>>>>> David Marchand (4):
>>>>>   eal/windows: fix cpuset macro name
>>>>>   eal: do not cache lcore detection state
>>>>>   eal: display all detected cores at startup
>>>>>   eal: remove limitation on cpuset with --lcores
>>>>
>>>> The patches look good but it is very hard to review parsing code (last patch).
>>>> We will better experience corner cases after merging.
>>>>
>>>> Applied for -rc1, thanks
>>>
>>> This patch was merged in 20.02.
>>> We don't have any feedback about issues so it's probably working fine.
>>>
>>> It is solving a problem for running DPDK on machines having a lot of cores.
>>> Now the difficult question: is it a new feature or a fix?
>>> Should we backport this patchset?
>