[0/4] Extend --lcores to run on cores > RTE_MAX_LCORE
mbox series

Message ID 20191202153559.9709-1-david.marchand@redhat.com
Headers show
Series
  • Extend --lcores to run on cores > RTE_MAX_LCORE
Related show

Message

David Marchand Dec. 2, 2019, 3:35 p.m. UTC
We are currently stuck with no option but recompile a DPDK if the system
has more cores than RTE_MAX_LCORE.
A bit of a pity when you get a system with more than 200+ cores and your
testpmd has been built and packaged with RTE_MAX_LCORE == 128.

The --lcores does not need to care about the underlying cores, remove
this limitation.

Comments

David Marchand Jan. 14, 2020, 12:58 p.m. UTC | #1
On Mon, Dec 2, 2019 at 4:36 PM David Marchand <david.marchand@redhat.com> wrote:
>
> We are currently stuck with no option but recompile a DPDK if the system
> has more cores than RTE_MAX_LCORE.
> A bit of a pity when you get a system with more than 200+ cores and your
> testpmd has been built and packaged with RTE_MAX_LCORE == 128.
>
> The --lcores does not need to care about the underlying cores, remove
> this limitation.

Any comment?
Thanks.
Morten Brørup Jan. 14, 2020, 3:32 p.m. UTC | #2
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of David Marchand
> Sent: Tuesday, January 14, 2020 1:59 PM
> 
> On Mon, Dec 2, 2019 at 4:36 PM David Marchand
> <david.marchand@redhat.com> wrote:
> >
> > We are currently stuck with no option but recompile a DPDK if the
> system
> > has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and
> your
> > testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> >
> > The --lcores does not need to care about the underlying cores, remove
> > this limitation.
> 
> Any comment?
> Thanks.
> 
> 
> --
> David Marchand
> 

I haven't reviewed the patch, but recall that the Mempool library sets up caches per lcore using hardcoded loops going from 0 to RTE_MAX_LCORE, regardless how many lcores are present and assigned to DPDK.

Making rte_max_lcore dynamic (and possibly auto-detected by the EAL) instead of defined at compile time might also be an improvement for systems with fewer cores than RTE_MAX_LCORE. But there are probably a lot of considerations regarding multi-process applications.


Med venlig hilsen / kind regards
- Morten Brørup
Yigit, Ferruh Jan. 20, 2020, 6:37 p.m. UTC | #3
On 12/2/2019 3:35 PM, David Marchand wrote:
> We are currently stuck with no option but recompile a DPDK if the system
> has more cores than RTE_MAX_LCORE.
> A bit of a pity when you get a system with more than 200+ cores and your
> testpmd has been built and packaged with RTE_MAX_LCORE == 128.

Just for clarification, in this case the DPDK application will work but will
only use RTE_MAX_LCORE cores, right? You need to compile to use all available cores.

Are cores more than RTE_MAX_LCORE usable after this patch?

> 
> The --lcores does not need to care about the underlying cores, remove
> this limitation.
> 

+1 to remove limit in --lcores, but I didn't make it work with this patchset, I
may be missing something, I tried by setting the "RTE_MAX_LCORE=32", in this
case should '--lcores "(30-40)@(10-20)"' work?
David Marchand Jan. 20, 2020, 7:35 p.m. UTC | #4
On Mon, Jan 20, 2020 at 7:37 PM Yigit, Ferruh
<ferruh.yigit@linux.intel.com> wrote:
>
> On 12/2/2019 3:35 PM, David Marchand wrote:
> > We are currently stuck with no option but recompile a DPDK if the system
> > has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and your
> > testpmd has been built and packaged with RTE_MAX_LCORE == 128.
>
> Just for clarification, in this case the DPDK application will work but will
> only use RTE_MAX_LCORE cores, right? You need to compile to use all available cores.

EAL starts up to RTE_MAX_LCORE lcores (as in EAL threads), that's true
before and after this patch.

By default (using the -c or -l options), a lcore X runs on the physical core X.
With the --lcores option, you can select on which physical cores a lcore runs.

But, before this series, the code limits the physical cores to the
same [0, RTE_MAX_LCORE[ range.

>
> Are cores more than RTE_MAX_LCORE usable after this patch?

Yes, see below.


>
> >
> > The --lcores does not need to care about the underlying cores, remove
> > this limitation.
> >
>
> +1 to remove limit in --lcores, but I didn't make it work with this patchset, I
> may be missing something, I tried by setting the "RTE_MAX_LCORE=32", in this
> case should '--lcores "(30-40)@(10-20)"' work?

Quoting the parser code:
 * The format pattern: --lcores='<lcores[@cpus]>[<,lcores[@cpus]>...]'
 * lcores, cpus could be a single digit/range or a group.
 * '(' and ')' are necessary if it's a group.
 * If not supply '@cpus', the value of cpus uses the same as lcores.
 * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means start 9 EAL thread as below
 *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
 *   lcore 1 runs on cpuset 0x2 (cpu 1)
 *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
 *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
 *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
 *   lcore 7 runs on cpuset 0x80 (cpu 7)
 *   lcore 8 runs on cpuset 0x100 (cpu 8)

With --lcores "(30-40)@(10-20)", you asked to start lcores 30 to 40 to
run on the cpuset 10 to 20.
So it will be refused, since the max lcore is 32.

If your intention was to start lcore 10 on physical core 30, lcore 11
on core 31 etc... then you can try --lcores 10@30,11@31,12@32...
Thomas Monjalon Jan. 21, 2020, 12:24 a.m. UTC | #5
02/12/2019 16:35, David Marchand:
> We are currently stuck with no option but recompile a DPDK if the system
> has more cores than RTE_MAX_LCORE.
> A bit of a pity when you get a system with more than 200+ cores and your
> testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> 
> The --lcores does not need to care about the underlying cores, remove
> this limitation.

> David Marchand (4):
>   eal/windows: fix cpuset macro name
>   eal: do not cache lcore detection state
>   eal: display all detected cores at startup
>   eal: remove limitation on cpuset with --lcores

The patches look good but it is very hard to review parsing code (last patch).
We will better experience corner cases after merging.

Applied for -rc1, thanks
Thomas Monjalon Feb. 21, 2020, 8:04 a.m. UTC | #6
Hi,

21/01/2020 01:24, Thomas Monjalon:
> 02/12/2019 16:35, David Marchand:
> > We are currently stuck with no option but recompile a DPDK if the system
> > has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and your
> > testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> > 
> > The --lcores does not need to care about the underlying cores, remove
> > this limitation.
> 
> > David Marchand (4):
> >   eal/windows: fix cpuset macro name
> >   eal: do not cache lcore detection state
> >   eal: display all detected cores at startup
> >   eal: remove limitation on cpuset with --lcores
> 
> The patches look good but it is very hard to review parsing code (last patch).
> We will better experience corner cases after merging.
> 
> Applied for -rc1, thanks

This patch was merged in 20.02.
We don't have any feedback about issues so it's probably working fine.

It is solving a problem for running DPDK on machines having a lot of cores.
Now the difficult question: is it a new feature or a fix?
Should we backport this patchset?
Song, Keesang Feb. 21, 2020, 8:19 a.m. UTC | #7
[AMD Official Use Only - Internal Distribution Only]

Thanks Thomas for bringing this up.
I consider this is not a new feature, but rather a fix to address the issue with statically assigned maximum lcore limit on high-density CPU platform such as AMD Epyc.
As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11, and they have 1~2 yrs of lifetime left, we like to backport this to LTS 18.11 & 19.11 at least.

-----Original Message-----
From: Thomas Monjalon <thomas@monjalon.net> 
Sent: Friday, February 21, 2020 12:04 AM
To: David Marchand <david.marchand@redhat.com>
Cc: dev@dpdk.org; aconole@redhat.com; ferruh.yigit@intel.com; Song, Keesang <Keesang.Song@amd.com>; bluca@debian.org; ktraynor@redhat.com; bruce.richardson@intel.com; honnappa.nagarahalli@arm.com; drc@linux.vnet.ibm.com; stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/4] Extend --lcores to run on cores > RTE_MAX_LCORE

[CAUTION: External Email]

Hi,

21/01/2020 01:24, Thomas Monjalon:
> 02/12/2019 16:35, David Marchand:
> > We are currently stuck with no option but recompile a DPDK if the 
> > system has more cores than RTE_MAX_LCORE.
> > A bit of a pity when you get a system with more than 200+ cores and 
> > your testpmd has been built and packaged with RTE_MAX_LCORE == 128.
> >
> > The --lcores does not need to care about the underlying cores, 
> > remove this limitation.
>
> > David Marchand (4):
> >   eal/windows: fix cpuset macro name
> >   eal: do not cache lcore detection state
> >   eal: display all detected cores at startup
> >   eal: remove limitation on cpuset with --lcores
>
> The patches look good but it is very hard to review parsing code (last patch).
> We will better experience corner cases after merging.
>
> Applied for -rc1, thanks

This patch was merged in 20.02.
We don't have any feedback about issues so it's probably working fine.

It is solving a problem for running DPDK on machines having a lot of cores.
Now the difficult question: is it a new feature or a fix?
Should we backport this patchset?
David Marchand Feb. 21, 2020, 9:40 a.m. UTC | #8
On Fri, Feb 21, 2020 at 9:19 AM Song, Keesang <Keesang.Song@amd.com> wrote:
>
> [AMD Official Use Only - Internal Distribution Only]

Please, get this header removed.
This is a public mailing list.


> Thanks Thomas for bringing this up.
> I consider this is not a new feature, but rather a fix to address the issue with statically assigned maximum lcore limit on high-density CPU platform such as AMD Epyc.
> As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11, and they have 1~2 yrs of lifetime left, we like to backport this to LTS 18.11 & 19.11 at least.

It is not a fix.

The use of static arrays is a design choice that goes back to the
early days in dpdk.
The addition of --lcores came in after this, but it was introduced for
a different use case than placing lcores on any physical core.
And there was no claim that a core > RTE_MAX_LCORE would be usable.


When testing on a new hardware, it is normal to observe some limitations.
Running DPDK on those platforms should be possible: "should be"
because I do not have access to this hardware and saw neither tests
reports nor performance numbers.
Before this patch, the limitation is that on Epyc, cores >
RTE_MAX_LCORE are not usable.


Now, this change is quite constrained.
If we backport it, I don't expect issues in the main dpdk components
(based on code review and ovs tests with a RTE_MAX_LCORE set to 16 on
a 24 cores system).
There might be issues in some examples or not widely used library
which uses a physical core id instead of a lcore id.


This is the same recurring question "do we allow new features in a
stable branch?".


--
David Marchand
Aaron Conole Feb. 21, 2020, 2:48 p.m. UTC | #9
David Marchand <david.marchand@redhat.com> writes:

> On Fri, Feb 21, 2020 at 9:19 AM Song, Keesang <Keesang.Song@amd.com> wrote:
>>
>> [AMD Official Use Only - Internal Distribution Only]
>
> Please, get this header removed.
> This is a public mailing list.
>
>
>> Thanks Thomas for bringing this up.
>> I consider this is not a new feature, but rather a fix to address
>> the issue with statically assigned maximum lcore limit on
>> high-density CPU platform such as AMD Epyc.
>> As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11,
>> and they have 1~2 yrs of lifetime left, we like to backport this to
>> LTS 18.11 & 19.11 at least.
>
> It is not a fix.
>
> The use of static arrays is a design choice that goes back to the
> early days in dpdk.
> The addition of --lcores came in after this, but it was introduced for
> a different use case than placing lcores on any physical core.
> And there was no claim that a core > RTE_MAX_LCORE would be usable.
>
>
> When testing on a new hardware, it is normal to observe some limitations.
> Running DPDK on those platforms should be possible: "should be"
> because I do not have access to this hardware and saw neither tests
> reports nor performance numbers.
> Before this patch, the limitation is that on Epyc, cores >
> RTE_MAX_LCORE are not usable.
>
>
> Now, this change is quite constrained.
> If we backport it, I don't expect issues in the main dpdk components
> (based on code review and ovs tests with a RTE_MAX_LCORE set to 16 on
> a 24 cores system).
> There might be issues in some examples or not widely used library
> which uses a physical core id instead of a lcore id.
>
>
> This is the same recurring question "do we allow new features in a
> stable branch?".

Usually, the answer is 'no'.  But we do allow some "new" things to be
backported (pci ids, etc) that might be required to enable older
functionality.  Additionally, I'm sure if some feature were required to
mitigate a CVE, we'd rather favor backporting it.

I guess we could pose a litmus test:

  1. Is the problem this feature solves so widespread that it needs to
     be addressed ASAP?
  2. Is there a known workaround to the problem this is solving?
  3. How intrusive is the feature?
  4. Is it shown to be stable in the mainline (number of fixes, testing,
     etc)?
  5. Is it constrained enough that we know we can support it with even
     higher priority than other things?

Probably other questions that will need to be asked.

And even in that list of question, I'm not sure I'd be able to advocate
backporting this in the upstream branches - it hasn't had much testing.
It's unstable.  It's "difficult" to use.  It is not widespread that
people have so many cores.  The workaround is much simpler than
supporting this (recompile).

>
> --
> David Marchand
Stephen Hemminger Feb. 21, 2020, 4:38 p.m. UTC | #10
On Fri, 21 Feb 2020 09:48:58 -0500
Aaron Conole <aconole@redhat.com> wrote:

> David Marchand <david.marchand@redhat.com> writes:
> 
> > On Fri, Feb 21, 2020 at 9:19 AM Song, Keesang <Keesang.Song@amd.com> wrote:  
> >>
> >> [AMD Official Use Only - Internal Distribution Only]  
> >
> > Please, get this header removed.
> > This is a public mailing list.
> >
> >  
> >> Thanks Thomas for bringing this up.
> >> I consider this is not a new feature, but rather a fix to address
> >> the issue with statically assigned maximum lcore limit on
> >> high-density CPU platform such as AMD Epyc.
> >> As I see a lot of DPDK adopters are still using LTS 18.11 & 19.11,
> >> and they have 1~2 yrs of lifetime left, we like to backport this to
> >> LTS 18.11 & 19.11 at least.  
> >
> > It is not a fix.
> >
> > The use of static arrays is a design choice that goes back to the
> > early days in dpdk.
> > The addition of --lcores came in after this, but it was introduced for
> > a different use case than placing lcores on any physical core.
> > And there was no claim that a core > RTE_MAX_LCORE would be usable.
> >
> >
> > When testing on a new hardware, it is normal to observe some limitations.
> > Running DPDK on those platforms should be possible: "should be"
> > because I do not have access to this hardware and saw neither tests
> > reports nor performance numbers.
> > Before this patch, the limitation is that on Epyc, cores >
> > RTE_MAX_LCORE are not usable.
> >
> >
> > Now, this change is quite constrained.
> > If we backport it, I don't expect issues in the main dpdk components
> > (based on code review and ovs tests with a RTE_MAX_LCORE set to 16 on
> > a 24 cores system).
> > There might be issues in some examples or not widely used library
> > which uses a physical core id instead of a lcore id.
> >
> >
> > This is the same recurring question "do we allow new features in a
> > stable branch?".  
> 
> Usually, the answer is 'no'.  But we do allow some "new" things to be
> backported (pci ids, etc) that might be required to enable older
> functionality.  Additionally, I'm sure if some feature were required to
> mitigate a CVE, we'd rather favor backporting it.
> 
> I guess we could pose a litmus test:
> 
>   1. Is the problem this feature solves so widespread that it needs to
>      be addressed ASAP?
>   2. Is there a known workaround to the problem this is solving?
>   3. How intrusive is the feature?
>   4. Is it shown to be stable in the mainline (number of fixes, testing,
>      etc)?
>   5. Is it constrained enough that we know we can support it with even
>      higher priority than other things?
> 
> Probably other questions that will need to be asked.
> 
> And even in that list of question, I'm not sure I'd be able to advocate
> backporting this in the upstream branches - it hasn't had much testing.
> It's unstable.  It's "difficult" to use.  It is not widespread that
> people have so many cores.  The workaround is much simpler than
> supporting this (recompile).
> 
> >
> > --
> > David Marchand  
> 

RTE_MAX_LCORES is exposed in API/ABI to application.
Many applications use that to size internal data structures.
Having rte_lcore_id() potentially return a larger value would cause
out of bounds access (and crash) in that application.