[v7] enhance NUMA affinity heuristic
Commit Message
When a DPDK application is started on only one NUMA node, memory is
allocated for only one socket. When interrupt threads use memory,
memory may not be found on the socket where the interrupt thread
is currently located, and memory has to be reallocated on the hugepage.
This operation leads to performance degradation.
Fixes: 705356f0811f ("eal: simplify control thread creation")
Fixes: 770d41bf3309 ("malloc: fix allocation with unknown socket ID")
Cc: stable@dpdk.org
Signed-off-by: Kaisen You <kaisenx.you@intel.com>
---
Changes since v6:
- New explanation for easy understanding,
Changes since v5:
- Add comments to the code,
Changes since v4:
- mod the patch title,
Changes since v3:
- add the assignment of socket_id in thread initialization,
Changes since v2:
- add uncommitted local change and fix compilation,
Changes since v1:
- accommodate configurations with main lcore running on multiple
physical cores belonging to different NUMA nodes,
---
lib/eal/common/eal_common_thread.c | 6 ++++++
lib/eal/common/malloc_heap.c | 9 +++++++++
2 files changed, 15 insertions(+)
Comments
On 5/23/2023 3:50 AM, Kaisen You wrote:
> When a DPDK application is started on only one numa node, memory is
> allocated for only one socket. When interrupt threads use memory,
> memory may not be found on the socket where the interrupt thread
> is currently located, and memory has to be reallocated on the hugepage,
> this operation will lead to performance degradation.
>
> Fixes: 705356f0811f ("eal: simplify control thread creation")
> Fixes: 770d41bf3309 ("malloc: fix allocation with unknown socket ID")
> Cc: stable@dpdk.org
>
> Signed-off-by: Kaisen You <kaisenx.you@intel.com>
Hi You,
I've suggested comment rewordings based on my understanding of the issue.
> ---
> Changes since v6:
> - New explanation for easy understanding,
>
> Changes since v5:
> - Add comments to the code,
>
> Changes since v4:
> - mod the patch title,
>
> Changes since v3:
> - add the assignment of socket_id in thread initialization,
>
> Changes since v2:
> - add uncommitted local change and fix compilation,
>
> Changes since v1:
> - accommodate configurations with main lcore running on multiple
> physical cores belonging to different NUMA nodes,
> ---
> lib/eal/common/eal_common_thread.c | 6 ++++++
> lib/eal/common/malloc_heap.c | 9 +++++++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
> index 079a385630..6479b66da1 100644
> --- a/lib/eal/common/eal_common_thread.c
> +++ b/lib/eal/common/eal_common_thread.c
> @@ -252,6 +252,12 @@ static int ctrl_thread_init(void *arg)
> struct rte_thread_ctrl_params *params = arg;
>
> __rte_thread_init(rte_lcore_id(), cpuset);
> + /* set the value of the per-core variable _socket_id to SOCKET_ID_ANY.
> + * Satisfy the judgment condition when threads find memory.
> + * If SOCKET_ID_ANY is not specified, the thread may go to a node with
> + * unallocated memory in a subsequent memory search.
I suggest a different comment wording:
Set control thread socket ID to SOCKET_ID_ANY as control threads may be
scheduled on any NUMA node.
> + */
> + RTE_PER_LCORE(_socket_id) = SOCKET_ID_ANY;
> params->ret = rte_thread_set_affinity_by_id(rte_thread_self(), cpuset);
> if (params->ret != 0) {
> __atomic_store_n(&params->ctrl_thread_status,
> diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
> index d25bdc98f9..6d37f8afee 100644
> --- a/lib/eal/common/malloc_heap.c
> +++ b/lib/eal/common/malloc_heap.c
> @@ -716,6 +716,15 @@ malloc_get_numa_socket(void)
> if (conf->socket_mem[socket_id] != 0)
> return socket_id;
> }
> + /* Trying to allocate memory on the main lcore numa node.
> + * especially when the DPDK application is started only on one numa node.
> + */
I suggest the following comment wording:
We couldn't quickly find a NUMA node where memory was available, so
fall back to using main lcore socket ID.
> + socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
> + /* When the socket_id obtained in the main lcore numa is SOCKET_ID_ANY,
> + * The probability of finding memory on rte_socket_id_by_idx(0) is higher.
> + */
I suggest the following comment wording:
Main lcore socket ID may be SOCKET_ID_ANY in cases when main lcore
thread is affinitized to multiple NUMA nodes.
> + if (socket_id != (unsigned int)SOCKET_ID_ANY)
> + return socket_id;
>
I suggest adding comment here:
Failed to find meaningful socket ID, so just use the first one available.
> return rte_socket_id_by_idx(0);
> }
I believe these comments offer better explanation as to why we are doing
the things we do here.
Whether or not you decide to take these corrections on board,
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
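To make the fallback order discussed in this review easier to follow, here is a minimal stand-alone sketch of the selection logic. It is not the DPDK implementation: `struct numa_model`, `model_get_numa_socket()` and `MAX_SOCKETS` are hypothetical names invented for illustration, and the first step (returning the caller's own socket ID when it is set) is an assumption about the part of `malloc_get_numa_socket()` that the hunk does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SOCKET_ID_ANY (-1) /* "no specific socket", as in DPDK */
#define MAX_SOCKETS 8      /* hypothetical limit for this sketch */

/* Hypothetical stand-in for the EAL state the real function consults. */
struct numa_model {
	uint64_t socket_mem[MAX_SOCKETS]; /* memory reserved per socket, 0 = none */
	int caller_socket;                /* socket ID of the calling thread */
	int main_lcore_socket;            /* socket ID of the main lcore */
	int first_socket;                 /* models rte_socket_id_by_idx(0) */
};

static int
model_get_numa_socket(const struct numa_model *m)
{
	int socket_id;

	assert(m != NULL);

	/* 1. Assumed first step: the caller has a specific socket, use it. */
	if (m->caller_socket != SOCKET_ID_ANY)
		return m->caller_socket;

	/* 2. Control threads: first socket where memory was reserved. */
	for (socket_id = 0; socket_id < MAX_SOCKETS; socket_id++) {
		if (m->socket_mem[socket_id] != 0)
			return socket_id;
	}

	/* 3. No memory found quickly: fall back to the main lcore socket,
	 * unless the main lcore itself spans several NUMA nodes.
	 */
	if (m->main_lcore_socket != SOCKET_ID_ANY)
		return m->main_lcore_socket;

	/* 4. No meaningful socket ID: use the first one available. */
	return m->first_socket;
}
```

With the patch applied, step 3 is the new behavior; steps 2 and 4 correspond to the unchanged context lines of the hunk.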
On 5/23/2023 3:50 AM, Kaisen You wrote:
> When a DPDK application is started on only one numa node, memory is
> allocated for only one socket. When interrupt threads use memory,
> memory may not be found on the socket where the interrupt thread
> is currently located, and memory has to be reallocated on the hugepage,
> this operation will lead to performance degradation.
>
> Fixes: 705356f0811f ("eal: simplify control thread creation")
> Fixes: 770d41bf3309 ("malloc: fix allocation with unknown socket ID")
> Cc: stable@dpdk.org
>
> Signed-off-by: Kaisen You <kaisenx.you@intel.com>
> ---
For the record, I still think that this is a solution for a problem that
should be fixed elsewhere, because a DPDK lcore (even main lcore!)
having a specific NUMA node affinity is one of the most fundamental
assumptions about DPDK, and I feel like we're inviting problems if we
allow lcores to have multiple NUMA node affinities.
For example, if I run DPDK test app with the following command-line:
--lcores "1@(1,29),2@(30)"
The malloc autotest will fail because main lcore now returns -1 when
we're calling `rte_socket_id()` from it. Correspondingly, any APIs that
use `rte_socket_id()` internally for various purposes (especially
indexing arrays!) will now have to account for the fact that
`rte_socket_id()` can just return -1 and it is not an exceptional situation.
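As an illustration of the array-indexing concern, here is a small stand-alone sketch of the kind of guard callers would need. The table `per_socket_allocs`, the helper `count_alloc()` and the limit `MAX_SOCKETS` are hypothetical and are not taken from DPDK; only the value of `SOCKET_ID_ANY` (-1) matches the discussion above.

```c
#include <assert.h>
#include <stddef.h>

#define SOCKET_ID_ANY (-1) /* "no specific socket", as in DPDK */
#define MAX_SOCKETS 8      /* hypothetical limit for this sketch */

/* Hypothetical per-socket counters, of the kind often indexed by socket ID. */
static unsigned long per_socket_allocs[MAX_SOCKETS];

/*
 * Record one allocation for the given socket.
 * Returns 0 on success, -1 if socket_id cannot be used as an index.
 */
static int
count_alloc(int socket_id)
{
	/* The range check also rejects SOCKET_ID_ANY, which is negative. */
	if (socket_id < 0 || socket_id >= MAX_SOCKETS)
		return -1;

	per_socket_allocs[socket_id]++;
	return 0;
}
```

Without such a check, a socket ID of -1 used directly as an index would access memory outside the array.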
IMO if we want to keep this behavior, EAL should at least warn the user
that a DPDK lcore was assigned SOCKET_ID_ANY on account of multiple NUMA
nodes being in its cpuset. So, as an unrelated change (so, I'm not
suggesting doing it in this specific patchset), I would suggest that
`thread_update_affinity()` should warn about DPDK lcore being assigned
socket ID like that.
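A rough sketch of what such a warning could look like is below. It is a stand-alone model, not EAL code: `model_cpu_to_node()` and `model_cpuset_socket_id()` are hypothetical helpers, and the topology (CPUs 0-27 on node 0, the rest on node 1) is an assumption chosen only so that CPUs 1 and 29 from the command line above land on different nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SOCKET_ID_ANY (-1) /* "no specific socket", as in DPDK */

/* Assumed topology for this sketch: CPUs 0-27 on node 0, others on node 1. */
static int
model_cpu_to_node(int cpu)
{
	return cpu < 28 ? 0 : 1;
}

/*
 * Derive a socket ID from the CPUs an lcore is affinitized to.
 * Returns the common NUMA node, or SOCKET_ID_ANY (with a warning)
 * when the CPUs belong to more than one node or the list is empty.
 */
static int
model_cpuset_socket_id(unsigned int lcore_id, const int *cpus, size_t n_cpus)
{
	int socket_id = SOCKET_ID_ANY;
	size_t i;

	assert(cpus != NULL || n_cpus == 0);

	for (i = 0; i < n_cpus; i++) {
		int node = model_cpu_to_node(cpus[i]);

		if (socket_id == SOCKET_ID_ANY) {
			socket_id = node; /* first CPU seen */
		} else if (socket_id != node) {
			fprintf(stderr,
				"WARNING: lcore %u spans multiple NUMA nodes, "
				"socket ID set to SOCKET_ID_ANY\n", lcore_id);
			return SOCKET_ID_ANY;
		}
	}
	return socket_id;
}
```

Under the assumed topology, lcore 1 with CPUs {1, 29} triggers the warning, while lcore 2 with CPU {30} resolves to node 1.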
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: May 23, 2023 18:45
> To: You, KaisenX <kaisenx.you@intel.com>; dev@dpdk.org
> Cc: Zhou, YidingX <yidingx.zhou@intel.com>; thomas@monjalon.net;
> david.marchand@redhat.com; Matz, Olivier <olivier.matz@6wind.com>;
> ferruh.yigit@amd.com; zhoumin@loongson.cn; stable@dpdk.org
> Subject: Re: [PATCH v7] enhance NUMA affinity heuristic
>
> Whether or not you decide to take these corrections on board,
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Thank you for your ack and suggestions. I will adopt your suggestions in v8.
>
> --
> Thanks,
> Anatoly
diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
index 079a385630..6479b66da1 100644
--- a/lib/eal/common/eal_common_thread.c
+++ b/lib/eal/common/eal_common_thread.c
@@ -252,6 +252,12 @@ static int ctrl_thread_init(void *arg)
struct rte_thread_ctrl_params *params = arg;
__rte_thread_init(rte_lcore_id(), cpuset);
+ /* set the value of the per-core variable _socket_id to SOCKET_ID_ANY.
+ * Satisfy the judgment condition when threads find memory.
+ * If SOCKET_ID_ANY is not specified, the thread may go to a node with
+ * unallocated memory in a subsequent memory search.
+ */
+ RTE_PER_LCORE(_socket_id) = SOCKET_ID_ANY;
params->ret = rte_thread_set_affinity_by_id(rte_thread_self(), cpuset);
if (params->ret != 0) {
__atomic_store_n(&params->ctrl_thread_status,
diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index d25bdc98f9..6d37f8afee 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -716,6 +716,15 @@ malloc_get_numa_socket(void)
if (conf->socket_mem[socket_id] != 0)
return socket_id;
}
+ /* Trying to allocate memory on the main lcore numa node.
+ * especially when the DPDK application is started only on one numa node.
+ */
+ socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
+ /* When the socket_id obtained in the main lcore numa is SOCKET_ID_ANY,
+ * The probability of finding memory on rte_socket_id_by_idx(0) is higher.
+ */
+ if (socket_id != (unsigned int)SOCKET_ID_ANY)
+ return socket_id;
return rte_socket_id_by_idx(0);
}