[RFC,v5,6/6] eal: keep per-lcore power intrinsics state in lcore variable

Message ID 20240228100928.524277-7-mattias.ronnblom@ericsson.com (mailing list archive)
State Superseded
Delegated to: David Marchand
Headers
Series Lcore variables |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build fail github build: failed
ci/intel-Functional success Functional PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-unit-arm64-testing fail Testing issues
ci/iol-compile-arm64-testing fail Testing issues
ci/iol-compile-amd64-testing fail Testing issues
ci/iol-unit-amd64-testing fail Testing issues
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS

Commit Message

Mattias Rönnblom Feb. 28, 2024, 10:09 a.m. UTC
  Keep per-lcore power intrinsics state in a lcore variable to reduce
cache working set size and avoid any CPU next-line-prefetching causing
false sharing.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
  

Patch

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 532a2e646b..23d1761f0a 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -4,6 +4,7 @@ 
 
 #include <rte_common.h>
 #include <rte_lcore.h>
+#include <rte_lcore_var.h>
 #include <rte_rtm.h>
 #include <rte_spinlock.h>
 
@@ -12,10 +13,14 @@ 
 /*
  * Per-lcore structure holding current status of C0.2 sleeps.
  */
-static struct power_wait_status {
+struct power_wait_status {
 	rte_spinlock_t lock;
 	volatile void *monitor_addr; /**< NULL if not currently sleeping */
-} __rte_cache_aligned wait_status[RTE_MAX_LCORE];
+};
+
+RTE_LCORE_VAR_HANDLE(struct power_wait_status, wait_status);
+
+RTE_LCORE_VAR_INIT(wait_status);
 
 /*
  * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
@@ -170,7 +175,7 @@  rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (pmc->fn == NULL)
 		return -EINVAL;
 
-	s = &wait_status[lcore_id];
+	s = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, wait_status);
 
 	/* update sleep address */
 	rte_spinlock_lock(&s->lock);
@@ -262,7 +267,7 @@  rte_power_monitor_wakeup(const unsigned int lcore_id)
 	if (lcore_id >= RTE_MAX_LCORE)
 		return -EINVAL;
 
-	s = &wait_status[lcore_id];
+	s = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, wait_status);
 
 	/*
 	 * There is a race condition between sleep, wakeup and locking, but we
@@ -301,8 +306,8 @@  int
 rte_power_monitor_multi(const struct rte_power_monitor_cond pmc[],
 		const uint32_t num, const uint64_t tsc_timestamp)
 {
-	const unsigned int lcore_id = rte_lcore_id();
-	struct power_wait_status *s = &wait_status[lcore_id];
+	struct power_wait_status *s = RTE_LCORE_VAR_VALUE(wait_status);
+
 	uint32_t i, rc;
 
 	/* check if supported */