eal: fix undetected NUMA nodes

Message ID 20250305134720.907347-1-bruce.richardson@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series eal: fix undetected NUMA nodes |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed
ci/intel-Functional success Functional PASS
ci/iol-marvell-Functional success Functional Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Performance fail Performance Testing issues
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-unit-arm64-testing warning Testing issues
ci/iol-abi-testing success Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-compile-arm64-testing fail Testing issues
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS

Commit Message

Bruce Richardson March 5, 2025, 1:47 p.m. UTC
In cases where the number of cores on a given socket is greater than
RTE_MAX_LCORES, then EAL will be unaware of all the sockets/numa nodes
on a system. Fix this limitation by having the EAL probe the NUMA node
for cores it isn't going to use, and recording that for completeness.

This is necessary as memory is tracked per node, and with the --lcores
parameters our app lcores may be on different sockets than the lcore ids
may imply. For example, lcore 0 is on socket zero, but if app is run
with --lcores=0@64, then DPDK lcore 0 may be on socket one, so DPDK
needs to be aware of that socket.

Fixes: 952b20777255 ("eal: provide API for querying valid socket ids")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/eal/common/eal_common_lcore.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
  

Patch

diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c
index 2ff9252c52..c37f38f17a 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -144,7 +144,7 @@  rte_eal_cpu_init(void)
 	unsigned lcore_id;
 	unsigned count = 0;
 	unsigned int socket_id, prev_socket_id;
-	int lcore_to_socket_id[RTE_MAX_LCORE];
+	int lcore_to_socket_id[CPU_SETSIZE] = {0};
 
 	/*
 	 * Parse the maximum set of logical cores, detect the subset of running
@@ -183,9 +183,11 @@  rte_eal_cpu_init(void)
 	for (; lcore_id < CPU_SETSIZE; lcore_id++) {
 		if (eal_cpu_detected(lcore_id) == 0)
 			continue;
+		socket_id = eal_cpu_socket_id(lcore_id);
+		lcore_to_socket_id[lcore_id] = socket_id;
 		EAL_LOG(DEBUG, "Skipped lcore %u as core %u on socket %u",
 			lcore_id, eal_cpu_core_id(lcore_id),
-			eal_cpu_socket_id(lcore_id));
+			socket_id);
 	}
 
 	/* Set the count of enabled logical cores of the EAL configuration */
@@ -201,12 +203,13 @@  rte_eal_cpu_init(void)
 
 	prev_socket_id = -1;
 	config->numa_node_count = 0;
-	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+	for (lcore_id = 0; lcore_id < RTE_DIM(lcore_to_socket_id); lcore_id++) {
 		socket_id = lcore_to_socket_id[lcore_id];
 		if (socket_id != prev_socket_id)
-			config->numa_nodes[config->numa_node_count++] =
-					socket_id;
+			config->numa_nodes[config->numa_node_count++] =	socket_id;
 		prev_socket_id = socket_id;
+		if (config->numa_node_count >= RTE_MAX_NUMA_NODES)
+			break;
 	}
 	EAL_LOG(INFO, "Detected NUMA nodes: %u", config->numa_node_count);