From patchwork Fri Oct 25 08:41:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mattias Rönnblom
X-Patchwork-Id: 147220
X-Patchwork-Delegate: thomas@monjalon.net
From: Mattias Rönnblom
CC: Morten Brørup, Stephen Hemminger, Konstantin Ananyev, David Marchand, Jerin Jacob, Luka Jankovic, Thomas Monjalon, Mattias Rönnblom, Konstantin Ananyev, Chengwen Feng
Subject: [PATCH v17 1/8] eal: add static per-lcore memory allocation facility
Date: Fri, 25 Oct 2024 10:41:42 +0200
Message-ID: <20241025084149.873037-2-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com> <20241025084149.873037-1-mattias.ronnblom@ericsson.com>

Introduce DPDK per-lcore id variables, or lcore variables for short.

An lcore variable has one value for every current and future lcore id-equipped thread.

The primary use case is statically allocating small, frequently-accessed data structures, for which one instance should exist for each lcore.

Lcore variables are similar to thread-local storage (TLS, e.g., C11 _Thread_local), but decouple the values' lifetime from that of the threads.

Lcore variables are also similar in functionality to the FreeBSD kernel's DPCPU_*() family of macros and the associated build-time machinery. DPCPU uses linker scripts, which effectively prevents the reuse of its otherwise seemingly viable approach.

The currently prevailing way to solve the same problem as lcore variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of lcore variables over this approach is that data related to the same lcore is now close (spatially, in memory), rather than data used by the same module. This in turn avoids excessive use of padding, which pollutes caches with unused data.

Signed-off-by: Mattias Rönnblom
Acked-by: Morten Brørup
Acked-by: Konstantin Ananyev
Acked-by: Chengwen Feng
Acked-by: Stephen Hemminger
---
PATCH v16:
 * Move implementation overview type information to the programmer's guide.

PATCH v15:
 * Add alignment-related compiler hint. (Stephen Hemminger)
 * Have size-related compiler hint point toward the right function argument.
   (Stephen Hemminger)

PATCH v14:
 * Add note in rte_lcore_var_alloc() that the memory cannot be freed. (Stephen Hemminger)
 * Hint the compiler rte_lcore_var_alloc() is a memory allocation facility. (Stephen Hemminger)

PATCH v13:
 * Remove _VALUE() suffix from value lookup and iterator macros. (Morten Brørup and Thomas Monjalon)
 * Remove the _ptr() suffix from the value lookup function.

PATCH v12:
 * Replace RTE_ASSERT() with RTE_VERIFY(), since performance is not a concern. (Morten Brørup)
 * Fix issue (introduced in v11) where aligned_malloc() was provided an object size which wasn't an integer multiple of the alignment. (Stephen Hemminger)

PATCH v11:
 * Add a note in the API docs on lcore variables and huge page memory. (Stephen Hemminger)
 * Free lcore var buffers at EAL cleanup. (Thomas Monjalon)
 * Tweak naming and include short lcore var buffer use overview in eal_common_lcore_var.c.

PATCH v10:
 * Improve documentation grammar and spelling. (Stephen Hemminger, Thomas Monjalon)
 * Add version.map DPDK version comment. (Thomas Monjalon)

PATCH v9:
 * Fixed merge conflicts in release notes.

PATCH v8:
 * Work around missing max_align_t definition in MSVC. (Morten Brørup)

PATCH v7:
 * Add () to the FOREACH lcore id macro parameter, to allow an arbitrary expression, not just a simple variable name, to be passed. (Konstantin Ananyev)

PATCH v6:
 * Have API user provide the loop variable in the FOREACH macro, to avoid subtle bugs where the loop variable name clashes with some other user-defined variable. (Konstantin Ananyev)

PATCH v5:
 * Update EAL programming guide.

PATCH v2:
 * Add Windows support. (Morten Brørup)
 * Fix lcore variables API index reference. (Morten Brørup)
 * Various improvements of the API documentation. (Morten Brørup)
 * Elimination of unused symbol in version.map. (Morten Brørup)

PATCH:
 * Update MAINTAINERS and release notes.
 * Stop covering included files in extern "C" {}.

RFC v6:
 * Include <stdlib.h> to get aligned_alloc().
 * Tweak documentation (grammar).
 * Provide API-level guarantees that lcore variable values take on an initial value of zero.
 * Fix misplaced __rte_cache_aligned in the API doc example.

RFC v5:
 * In Doxygen, consistently use @ (and not \).
 * The RTE_LCORE_VAR_GET() and SET() convenience access macros covered an uncommon use case, where the lcore value is of a primitive type, rather than a struct, and are thus eliminated from the API. (Morten Brørup)
 * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR() to RTE_LCORE_VAR_VALUE().
 * The underscores are removed from __rte_lcore_var_lcore_ptr() to signal that this function is a part of the public API.
 * Macro arguments are documented.

RFC v4:
 * Replace large static array with libc heap-allocated memory. One implication of this change is that there no longer exists a fixed upper bound for the total amount of memory used by lcore variables. RTE_MAX_LCORE_VAR has changed meaning, and now represents the maximum size of any individual lcore variable value.
 * Fix issues in example. (Morten Brørup)
 * Improve access macro type checking. (Morten Brørup)
 * Refer to the lcore variable handle as "handle" and not "name" in various macros.
 * Document lack of thread safety in rte_lcore_var_alloc().
 * Provide API-level assurance the lcore variable handle is always non-NULL, to allow applications to use NULL to mean "not yet allocated".
 * Note zero-sized allocations are not allowed.
 * Give API-level guarantee the lcore variable values are zeroed.

RFC v3:
 * Replace use of GCC-specific alignof() with alignof().
 * Update example to reflect FOREACH macro name change (in RFC v2).

RFC v2:
 * Use alignof to derive alignment requirements. (Morten Brørup)
 * Change name of FOREACH to make it distinct from <rte_lcore.h>'s *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
 * Allow user-specified alignment, but limit max to cache line size.
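
Illustration (not part of the patch): the addressing scheme described in the commit message can be sketched in plain C as below. This is only a rough model under stated assumptions, not the EAL implementation. The sketch_* functions and SKETCH_* constants are hypothetical stand-ins for rte_lcore_var_alloc(), rte_lcore_var_lcore(), RTE_MAX_LCORE and RTE_MAX_LCORE_VAR. Unlike the patch, the sketch uses a single calloc()ed block, returns NULL on exhaustion instead of chaining a new buffer, and only supports alignments up to that of max_align_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for RTE_MAX_LCORE and RTE_MAX_LCORE_VAR. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 256

/* One block holding SKETCH_MAX_LCORE consecutive per-lcore regions. */
static char *sketch_buffer;
/* Next free offset within each per-lcore region. */
static size_t sketch_offset;

/*
 * Reserve 'size' bytes at the same offset in every per-lcore region.
 * 'align' must be a power of 2 no larger than alignof(max_align_t).
 * The returned handle is the address of lcore 0's value, or NULL if
 * the region is exhausted (the patch allocates a new buffer instead).
 */
static void *
sketch_lcore_var_alloc(size_t size, size_t align)
{
	size_t start;

	if (sketch_buffer == NULL) {
		/* calloc() zeroes the memory, so all values start at zero. */
		sketch_buffer = calloc(SKETCH_MAX_LCORE, SKETCH_MAX_LCORE_VAR);
		if (sketch_buffer == NULL)
			return NULL;
	}

	/* Round the offset up to the requested alignment. */
	start = (sketch_offset + align - 1) & ~(align - 1);

	if (start + size > SKETCH_MAX_LCORE_VAR)
		return NULL;

	sketch_offset = start + size;

	return sketch_buffer + start;
}

/*
 * Translate a handle into the value belonging to 'lcore_id': the same
 * offset, one region stride further for each lcore id.
 */
static void *
sketch_lcore_var_lcore(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}
```

With this layout, two variables allocated back to back sit next to each other within a given lcore's region, while the values of one variable for different lcores are a full region stride apart. That is the spatial property the commit message contrasts with per-module RTE_MAX_LCORE-sized arrays.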
--- MAINTAINERS | 6 + config/rte_config.h | 1 + doc/api/doxy-api-index.md | 1 + .../prog_guide/env_abstraction_layer.rst | 43 +++- doc/guides/rel_notes/release_24_11.rst | 14 ++ lib/eal/common/eal_common_lcore_var.c | 112 ++++++++++ lib/eal/common/eal_lcore_var.h | 11 + lib/eal/common/meson.build | 1 + lib/eal/freebsd/eal.c | 2 + lib/eal/include/meson.build | 1 + lib/eal/include/rte_lcore_var.h | 207 ++++++++++++++++++ lib/eal/linux/eal.c | 2 + lib/eal/version.map | 1 + 13 files changed, 396 insertions(+), 6 deletions(-) create mode 100644 lib/eal/common/eal_common_lcore_var.c create mode 100644 lib/eal/common/eal_lcore_var.h create mode 100644 lib/eal/include/rte_lcore_var.h diff --git a/MAINTAINERS b/MAINTAINERS index cd78bc7db1..9a6b1073e9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -289,6 +289,12 @@ F: lib/eal/include/rte_random.h F: lib/eal/common/rte_random.c F: app/test/test_rand_perf.c +Lcore Variables +M: Mattias Rönnblom +F: lib/eal/include/rte_lcore_var.h +F: lib/eal/common/eal_common_lcore_var.c +F: app/test/test_lcore_var.c + ARM v7 M: Wathsala Vithanage F: config/arm/ diff --git a/config/rte_config.h b/config/rte_config.h index fd6f8a2f1a..498d509244 100644 --- a/config/rte_config.h +++ b/config/rte_config.h @@ -41,6 +41,7 @@ /* EAL defines */ #define RTE_CACHE_GUARD_LINES 1 #define RTE_MAX_HEAPS 32 +#define RTE_MAX_LCORE_VAR 1048576 #define RTE_MAX_MEMSEG_LISTS 128 #define RTE_MAX_MEMSEG_PER_LIST 8192 #define RTE_MAX_MEM_MB_PER_LIST 32768 diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index 266c8b90dc..1d472c6ceb 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -99,6 +99,7 @@ The public API headers are grouped by topics: [interrupts](@ref rte_interrupts.h), [launch](@ref rte_launch.h), [lcore](@ref rte_lcore.h), + [lcore variables](@ref rte_lcore_var.h), [per-lcore](@ref rte_per_lcore.h), [service cores](@ref rte_service.h), [keepalive](@ref rte_keepalive.h), diff --git 
a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst index b9fac1839d..b659a1d085 100644 --- a/doc/guides/prog_guide/env_abstraction_layer.rst +++ b/doc/guides/prog_guide/env_abstraction_layer.rst @@ -429,12 +429,43 @@ with them once they're registered. Per-lcore and Shared Variables ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. note:: - - lcore refers to a logical execution unit of the processor, sometimes called a hardware *thread*. - -Shared variables are the default behavior. -Per-lcore variables are implemented using *Thread Local Storage* (TLS) to provide per-thread local storage. +By default, static variables, memory blocks allocated on the DPDK +heap, and other types of memory are shared by all DPDK threads. + +An application, a DPDK library, or a PMD may opt to keep per-thread state. + +Per-thread data can be maintained using either *lcore variables* (see +``rte_lcore_var.h``), *thread-local storage (TLS)* (see +``rte_per_lcore.h``), or a static array of ``RTE_MAX_LCORE`` elements, +indexed by ``rte_lcore_id()``. These methods allow per-lcore data to be +largely internal to the module and not directly exposed in its +API. Another approach is to explicitly handle per-thread aspects in +the API (e.g., the ports in the Eventdev API). + +Lcore variables are suitable for small objects that are statically +allocated at the time of module or application initialization. An +lcore variable takes on one value for each lcore ID-equipped thread +(i.e., for both EAL threads and registered non-EAL threads, in total +``RTE_MAX_LCORE`` instances). The lifetime of lcore variables is +independent of the owning threads and can, therefore, be initialized +before the threads are created. + +Variables with thread-local storage are allocated when the thread is +created and exist until the thread terminates. These are applicable +for every thread in the process. 
Only very small objects should be +allocated in TLS, as large TLS objects can significantly slow down +thread creation and may unnecessarily increase the memory footprint of +applications that extensively use unregistered threads. + +A common but now largely obsolete DPDK pattern is to use a static +array sized according to the maximum number of lcore ID-equipped +threads (i.e., with ``RTE_MAX_LCORE`` elements). To avoid *false +sharing*, each element must be both cache-aligned and include an +``RTE_CACHE_GUARD``. This extensive use of padding causes internal +fragmentation (i.e., unused space) and reduces cache hit rates. + +For more discussions on per-lcore state, refer to the +``rte_lcore_var.h`` API documentation. Logs ~~~~ diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst index fa4822d928..18f2f37944 100644 --- a/doc/guides/rel_notes/release_24_11.rst +++ b/doc/guides/rel_notes/release_24_11.rst @@ -247,6 +247,20 @@ New Features Added ability for node to advertise and update multiple xstat counters, that can be retrieved using ``rte_graph_cluster_stats_get``. +* **Added EAL per-lcore static memory allocation facility.** + + Added EAL API for statically allocating small, + frequently-accessed data structures, for which one instance should + exist for each EAL thread and registered non-EAL thread. + + With lcore variables, data is organized spatially on a per-lcore id + basis, rather than per library or PMD, avoiding the need for cache + aligning (or RTE_CACHE_GUARDing) data structures, which in turn + reduces CPU cache internal fragmentation, improving performance. + + Lcore variables are similar to thread-local storage (TLS, e.g., + C11 _Thread_local), but decoupling the values' life time from that + of the threads. 
Removed Items ------------- diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c new file mode 100644 index 0000000000..3b0e0b89f7 --- /dev/null +++ b/lib/eal/common/eal_common_lcore_var.c @@ -0,0 +1,112 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Ericsson AB + */ + +#include +#include + +#ifdef RTE_EXEC_ENV_WINDOWS +#include +#endif + +#include +#include +#include + +#include + +#include "eal_private.h" +#include "eal_lcore_var.h" + +/* + * Refer to the programmer's guide for an overview of the lcore + * variables implementation. + */ + +struct lcore_var_buffer { + char data[RTE_MAX_LCORE_VAR * RTE_MAX_LCORE]; + struct lcore_var_buffer *prev; +}; + +static struct lcore_var_buffer *current_buffer; + +/* initialized to trigger buffer allocation on first allocation */ +static size_t offset = RTE_MAX_LCORE_VAR; + +static void * +lcore_var_alloc(size_t size, size_t align) +{ + void *handle; + unsigned int lcore_id; + void *value; + + offset = RTE_ALIGN_CEIL(offset, align); + + if (offset + size > RTE_MAX_LCORE_VAR) { + struct lcore_var_buffer *prev = current_buffer; + size_t alloc_size = + RTE_ALIGN_CEIL(sizeof(struct lcore_var_buffer), + RTE_CACHE_LINE_SIZE); +#ifdef RTE_EXEC_ENV_WINDOWS + current_buffer = _aligned_malloc(alloc_size, RTE_CACHE_LINE_SIZE); +#else + current_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE, alloc_size); + +#endif + RTE_VERIFY(current_buffer != NULL); + + current_buffer->prev = prev; + + offset = 0; + } + + handle = &current_buffer->data[offset]; + + offset += size; + + RTE_LCORE_VAR_FOREACH(lcore_id, value, handle) + memset(value, 0, size); + + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a " + "%"PRIuPTR"-byte alignment", size, align); + + return handle; +} + +void * +rte_lcore_var_alloc(size_t size, size_t align) +{ + /* Having the per-lcore buffer size aligned on cache lines, + * as well as having the base pointer aligned on cache line + * size, assures that 
aligned offsets also translate to aligned + * pointers across all values. + */ + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0); + RTE_VERIFY(align <= RTE_CACHE_LINE_SIZE); + RTE_VERIFY(size <= RTE_MAX_LCORE_VAR); + + /* '0' means asking for worst-case alignment requirements */ + if (align == 0) +#ifdef RTE_TOOLCHAIN_MSVC + /* MSVC is missing the max_align_t typedef */ + align = alignof(double); +#else + align = alignof(max_align_t); +#endif + + RTE_VERIFY(rte_is_power_of_2(align)); + + return lcore_var_alloc(size, align); +} + +void +eal_lcore_var_cleanup(void) +{ + while (current_buffer != NULL) { + struct lcore_var_buffer *prev = current_buffer->prev; + + free(current_buffer); + + current_buffer = prev; + } +} diff --git a/lib/eal/common/eal_lcore_var.h b/lib/eal/common/eal_lcore_var.h new file mode 100644 index 0000000000..de2c4e44a0 --- /dev/null +++ b/lib/eal/common/eal_lcore_var.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Ericsson AB. 
+ */ + +#ifndef EAL_LCORE_VAR_H +#define EAL_LCORE_VAR_H + +void +eal_lcore_var_cleanup(void); + +#endif diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build index c1bbf26654..e273745e93 100644 --- a/lib/eal/common/meson.build +++ b/lib/eal/common/meson.build @@ -18,6 +18,7 @@ sources += files( 'eal_common_interrupts.c', 'eal_common_launch.c', 'eal_common_lcore.c', + 'eal_common_lcore_var.c', 'eal_common_mcfg.c', 'eal_common_memalloc.c', 'eal_common_memory.c', diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c index 1229230063..796c9dbf2d 100644 --- a/lib/eal/freebsd/eal.c +++ b/lib/eal/freebsd/eal.c @@ -47,6 +47,7 @@ #include "eal_private.h" #include "eal_thread.h" +#include "eal_lcore_var.h" #include "eal_internal_cfg.h" #include "eal_filesystem.h" #include "eal_hugepages.h" @@ -941,6 +942,7 @@ rte_eal_cleanup(void) /* after this point, any DPDK pointers will become dangling */ rte_eal_memory_detach(); eal_cleanup_config(internal_conf); + eal_lcore_var_cleanup(); return 0; } diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build index 474097f211..d903577caa 100644 --- a/lib/eal/include/meson.build +++ b/lib/eal/include/meson.build @@ -28,6 +28,7 @@ headers += files( 'rte_keepalive.h', 'rte_launch.h', 'rte_lcore.h', + 'rte_lcore_var.h', 'rte_lock_annotations.h', 'rte_malloc.h', 'rte_mcslock.h', diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h new file mode 100644 index 0000000000..ea8b61cf7d --- /dev/null +++ b/lib/eal/include/rte_lcore_var.h @@ -0,0 +1,207 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Ericsson AB + */ + +#ifndef _RTE_LCORE_VAR_H_ +#define _RTE_LCORE_VAR_H_ + +/** + * @file + * + * Lcore variables + * + * This API provides a mechanism to create and access per-lcore id + * variables in a space- and cycle-efficient manner. + * + * Please refer to the lcore variables' programmer's guide for an + * overview of this API and its implementation. 
+ */ + +#include +#include + +#include +#include +#include + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * Given the lcore variable type, produces the type of the lcore + * variable handle. + */ +#define RTE_LCORE_VAR_HANDLE_TYPE(type) \ + type * + +/** + * Define an lcore variable handle. + * + * This macro defines a variable which is used as a handle to access + * the various instances of a per-lcore id variable. + * + * This macro clarifies that the declaration is an lcore handle, not a + * regular pointer. + * + * Add @b static as a prefix in case the lcore variable is only to be + * accessed from a particular translation unit. + */ +#define RTE_LCORE_VAR_HANDLE(type, name) \ + RTE_LCORE_VAR_HANDLE_TYPE(type) name + +/** + * Allocate space for an lcore variable, and initialize its handle. + * + * The values of the lcore variable are initialized to zero. + */ +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align) \ + handle = rte_lcore_var_alloc(size, align) + +/** + * Allocate space for an lcore variable, and initialize its handle, + * with values aligned for any type of object. + * + * The values of the lcore variable are initialized to zero. + */ +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \ + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0) + +/** + * Allocate space for an lcore variable of the size and alignment requirements + * suggested by the handle pointer type, and initialize its handle. + * + * The values of the lcore variable are initialized to zero. + */ +#define RTE_LCORE_VAR_ALLOC(handle) \ + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)), \ + alignof(typeof(*(handle)))) + +/** + * Allocate an explicitly-sized, explicitly-aligned lcore variable by + * means of a @ref RTE_INIT constructor. + * + * The values of the lcore variable are initialized to zero. 
+ */ +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align) \ + RTE_INIT(rte_lcore_var_init_ ## name) \ + { \ + RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align); \ + } + +/** + * Allocate an explicitly-sized lcore variable by means of a @ref + * RTE_INIT constructor. + * + * The values of the lcore variable are initialized to zero. + */ +#define RTE_LCORE_VAR_INIT_SIZE(name, size) \ + RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0) + +/** + * Allocate an lcore variable by means of a @ref RTE_INIT constructor. + * + * The values of the lcore variable are initialized to zero. + */ +#define RTE_LCORE_VAR_INIT(name) \ + RTE_INIT(rte_lcore_var_init_ ## name) \ + { \ + RTE_LCORE_VAR_ALLOC(name); \ + } + +/** + * Get void pointer to lcore variable instance with the specified + * lcore id. + * + * @param lcore_id + * The lcore id specifying which of the @c RTE_MAX_LCORE value + * instances should be accessed. The lcore id need not be valid + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer + * is also not valid (and thus should not be dereferenced). + * @param handle + * The lcore variable handle. + */ +static inline void * +rte_lcore_var_lcore(unsigned int lcore_id, void *handle) +{ + return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR); +} + +/** + * Get pointer to lcore variable instance with the specified lcore id. + * + * @param lcore_id + * The lcore id specifying which of the @c RTE_MAX_LCORE value + * instances should be accessed. The lcore id need not be valid + * (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer + * is also not valid (and thus should not be dereferenced). + * @param handle + * The lcore variable handle. + */ +#define RTE_LCORE_VAR_LCORE(lcore_id, handle) \ + ((typeof(handle))rte_lcore_var_lcore(lcore_id, handle)) + +/** + * Get pointer to lcore variable instance of the current thread. + * + * May only be used by EAL threads and registered non-EAL threads. 
+ */ +#define RTE_LCORE_VAR(handle) \ + RTE_LCORE_VAR_LCORE(rte_lcore_id(), handle) + +/** + * Iterate over each lcore id's value for an lcore variable. + * + * @param lcore_id + * An unsigned int variable successively set to the + * lcore id of every valid lcore id (up to @c RTE_MAX_LCORE). + * @param value + * A pointer variable successively set to point to lcore variable + * value instance of the current lcore id being processed. + * @param handle + * The lcore variable handle. + */ +#define RTE_LCORE_VAR_FOREACH(lcore_id, value, handle) \ + for ((lcore_id) = \ + (((value) = RTE_LCORE_VAR_LCORE(0, handle)), 0); \ + (lcore_id) < RTE_MAX_LCORE; \ + (lcore_id)++, (value) = RTE_LCORE_VAR_LCORE(lcore_id, \ + handle)) + +/** + * Allocate space in the per-lcore id buffers for an lcore variable. + * + * The pointer returned is only an opaque identifier of the variable. To + * get an actual pointer to a particular instance of the variable use + * @ref RTE_LCORE_VAR or @ref RTE_LCORE_VAR_LCORE. + * + * The lcore variable values' memory is set to zero. + * + * The allocation is always successful, barring a fatal exhaustion of + * the per-lcore id buffer space. + * + * rte_lcore_var_alloc() is not multi-thread safe. + * + * The allocated memory cannot be freed. + * + * @param size + * The size (in bytes) of the variable's per-lcore id value. Must be > 0. + * @param align + * If 0, the values will be suitably aligned for any kind of type + * (i.e., alignof(max_align_t)). Otherwise, the values will be aligned + * on a multiple of *align*, which must be a power of 2 and equal or + * less than @c RTE_CACHE_LINE_SIZE. + * @return + * The variable's handle, stored in a void pointer value. The value + * is always non-NULL. 
+__rte_experimental +void * +rte_lcore_var_alloc(size_t size, size_t align) + __rte_alloc_size(1) __rte_alloc_align(2); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_LCORE_VAR_H_ */ diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c index 54577b7718..d0f27315b9 100644 --- a/lib/eal/linux/eal.c +++ b/lib/eal/linux/eal.c @@ -45,6 +45,7 @@ #include #include "eal_private.h" #include "eal_thread.h" +#include "eal_lcore_var.h" #include "eal_internal_cfg.h" #include "eal_filesystem.h" #include "eal_hugepages.h" @@ -1371,6 +1372,7 @@ rte_eal_cleanup(void) rte_eal_malloc_heap_cleanup(); eal_cleanup_config(internal_conf); rte_eal_log_cleanup(); + eal_lcore_var_cleanup(); return 0; } diff --git a/lib/eal/version.map b/lib/eal/version.map index f493cd1ca7..94dc5b17d6 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -399,6 +399,7 @@ EXPERIMENTAL { # added in 24.11 rte_bitset_to_str; + rte_lcore_var_alloc; }; INTERNAL {

From patchwork Fri Oct 25 08:41:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mattias Rönnblom
X-Patchwork-Id: 147218
X-Patchwork-Delegate: thomas@monjalon.net
From: Mattias Rönnblom
CC: Morten Brørup, Stephen Hemminger, Konstantin Ananyev, David Marchand, Jerin Jacob, Luka Jankovic, Thomas Monjalon, Mattias Rönnblom, Chengwen Feng
Subject: [PATCH v17 2/8] eal: add lcore variable functional tests
Date: Fri, 25 Oct 2024 10:41:43 +0200
Message-ID: <20241025084149.873037-3-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com> <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
=?utf-8?q?hLkmHLX4i4/CzCWibXxrBRUsyb/SuuEXvOP1dN5xMxqFHpUelTMuMXOPtGQYImmTe?= =?utf-8?q?xzDLqhOqsPfcI0H3aceVaIcF8wUIlInlFNbWddD5vjp+UKKuhP837X709leRGldAx?= =?utf-8?q?oioX9FJ+6GutHeO+x+jAKQ9V6tnTz28RhX9sPQvkM5os3DbYkwRq55LvTXvjrYM/T?= =?utf-8?q?gQBqJ5DKxkIjJPCmutAk7RV0rhbQrj5LA2GBZTOZ9p97sE+jp/ZUExA48jx5rDCC6?= =?utf-8?q?8cXN0gHiNTMl7VppanIlE/zGN1NowZDSs08YUBVDfQsnuVSUOfs6qhetUsuq3J6xZ?= =?utf-8?q?3TccHgBSCsLI2HjtG8G20ZjJk0YO/Y94EQ9JHscsuvFYKdaK7Q0+MEd23w3+mEMof?= =?utf-8?q?3rBynFCy4ZFrMWmsirgbfNHWGrcLoCV/vqZNGoY2OkJ0r5XZVSKIJ/THMDmuaCQ6t?= =?utf-8?q?GbaQoxLTv42MmY/NrowFmy8y/JN520Tq5taUqbmA97LO3/LZtHtaQYCRmUMr4yZHo?= =?utf-8?q?JVfIoLlQOcDXqb+qWIMygJmtg5eavbwk5GxXPwK/PlXf6fPwct/GvWytUMl0DFf6n?= =?utf-8?q?btQSZlW3LmgM1Xpltvz5ZZ/xRXUr0ETKifqqbyGhQM4nJKujD0O80o/TjRHBPDyRQ?= =?utf-8?q?GSjVv/uHZbUKzpvD7Sq0pCWYjH8KxPJrpUyVVsUM7iOuHnpt4Y4r2H9nLBLgOg++p?= =?utf-8?q?NCeUA1WiVvZSyIJ853Ac3nOVApJ1gst4qnqw3eXS4LJUobsiRvEbfFegmNHuFtNVW?= =?utf-8?q?CHBN8pxSPU1yagmialq37Bv/pCgxC3Zohw=3D=3D?= X-Forefront-Antispam-Report: CIP:192.176.1.74; CTRY:SE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:oa.msg.ericsson.com; PTR:office365.se.ericsson.net; CAT:NONE; SFS:(13230040)(1800799024)(376014)(36860700013)(82310400026); DIR:OUT; SFP:1101; X-OriginatorOrg: ericsson.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Oct 2024 08:50:50.5041 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 67e5984d-d4d3-4764-6962-08dcf4d22087 X-MS-Exchange-CrossTenant-Id: 92e84ceb-fbfd-47ab-be52-080c6b87953f X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=92e84ceb-fbfd-47ab-be52-080c6b87953f; Ip=[192.176.1.74]; Helo=[oa.msg.ericsson.com] X-MS-Exchange-CrossTenant-AuthSource: DB3PEPF00008859.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR07MB7849 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add functional test suite to exercise the API. Signed-off-by: Mattias Rönnblom Acked-by: Morten Brørup Acked-by: Chengwen Feng Acked-by: Stephen Hemminger --- PATCH v6: * Update FOREACH invocations to match new API. RFC v5: * Adapt tests to reflect the removal of the GET() and SET() macros. RFC v4: * Check all lcore id's values for all variables in the many variables test case. * Introduce test case for max-sized lcore variables. RFC v2: * Improve alignment-related test coverage. --- app/test/meson.build | 1 + app/test/test_lcore_var.c | 432 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 433 insertions(+) create mode 100644 app/test/test_lcore_var.c diff --git a/app/test/meson.build b/app/test/meson.build index 0f7e11969a..7dccd197ac 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -104,6 +104,7 @@ source_file_deps = { 'test_ipsec_sad.c': ['ipsec'], 'test_kvargs.c': ['kvargs'], 'test_latencystats.c': ['ethdev', 'latencystats', 'metrics'] + sample_packet_forward_deps, + 'test_lcore_var.c': [], 'test_lcores.c': [], 'test_link_bonding.c': ['ethdev', 'net_bond', 'net'] + packet_burst_generator_deps + virtual_pmd_deps, diff --git a/app/test/test_lcore_var.c b/app/test/test_lcore_var.c new file mode 100644 index 0000000000..ddf70b03a0 --- /dev/null +++ b/app/test/test_lcore_var.c @@ -0,0 +1,432 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Ericsson AB + */ + +#include +#include +#include + +#include +#include +#include + +#include "test.h" + +#define MIN_LCORES 2 + +RTE_LCORE_VAR_HANDLE(int, test_int); +RTE_LCORE_VAR_HANDLE(char, test_char); +RTE_LCORE_VAR_HANDLE(long, test_long_sized); +RTE_LCORE_VAR_HANDLE(short, test_short); +RTE_LCORE_VAR_HANDLE(long, test_long_sized_aligned); + +struct int_checker_state { + int old_value; + int new_value; + bool success; +}; + +static void +rand_blk(void *blk, size_t size) +{ + size_t 
i; + + for (i = 0; i < size; i++) + ((unsigned char *)blk)[i] = (unsigned char)rte_rand(); +} + +static bool +is_ptr_aligned(const void *ptr, size_t align) +{ + return ptr != NULL ? (uintptr_t)ptr % align == 0 : false; +} + +static int +check_int(void *arg) +{ + struct int_checker_state *state = arg; + + int *ptr = RTE_LCORE_VAR(test_int); + + bool naturally_aligned = is_ptr_aligned(ptr, sizeof(int)); + + bool equal = *(RTE_LCORE_VAR(test_int)) == state->old_value; + + state->success = equal && naturally_aligned; + + *ptr = state->new_value; + + return 0; +} + +RTE_LCORE_VAR_INIT(test_int); +RTE_LCORE_VAR_INIT(test_char); +RTE_LCORE_VAR_INIT_SIZE(test_long_sized, 32); +RTE_LCORE_VAR_INIT(test_short); +RTE_LCORE_VAR_INIT_SIZE_ALIGN(test_long_sized_aligned, sizeof(long), + RTE_CACHE_LINE_SIZE); + +static int +test_int_lvar(void) +{ + unsigned int lcore_id; + + struct int_checker_state states[RTE_MAX_LCORE] = {}; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + struct int_checker_state *state = &states[lcore_id]; + + state->old_value = (int)rte_rand(); + state->new_value = (int)rte_rand(); + + *RTE_LCORE_VAR_LCORE(lcore_id, test_int) = state->old_value; + } + + RTE_LCORE_FOREACH_WORKER(lcore_id) + rte_eal_remote_launch(check_int, &states[lcore_id], lcore_id); + + rte_eal_mp_wait_lcore(); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + struct int_checker_state *state = &states[lcore_id]; + int value; + + TEST_ASSERT(state->success, "Unexpected value " + "encountered on lcore %d", lcore_id); + + value = *RTE_LCORE_VAR_LCORE(lcore_id, test_int); + TEST_ASSERT_EQUAL(state->new_value, value, + "Lcore %d failed to update int", lcore_id); + } + + /* take the opportunity to test the foreach macro */ + int *v; + unsigned int i = 0; + RTE_LCORE_VAR_FOREACH(lcore_id, v, test_int) { + TEST_ASSERT_EQUAL(i, lcore_id, "Encountered lcore id %d " + "while expecting %d during iteration", + lcore_id, i); + TEST_ASSERT_EQUAL(states[lcore_id].new_value, *v, + "Unexpected value on lcore %d during " + 
"iteration", lcore_id); + i++; + } + + return TEST_SUCCESS; +} + +static int +test_sized_alignment(void) +{ + unsigned int lcore_id; + long *v; + + RTE_LCORE_VAR_FOREACH(lcore_id, v, test_long_sized) { + TEST_ASSERT(is_ptr_aligned(v, alignof(long)), + "Type-derived alignment failed"); + } + + RTE_LCORE_VAR_FOREACH(lcore_id, v, test_long_sized_aligned) { + TEST_ASSERT(is_ptr_aligned(v, RTE_CACHE_LINE_SIZE), + "Explicit alignment failed"); + } + + return TEST_SUCCESS; +} + +/* private, larger, struct */ +#define TEST_STRUCT_DATA_SIZE 1234 + +struct test_struct { + uint8_t data[TEST_STRUCT_DATA_SIZE]; +}; + +static RTE_LCORE_VAR_HANDLE(char, before_struct); +static RTE_LCORE_VAR_HANDLE(struct test_struct, test_struct); +static RTE_LCORE_VAR_HANDLE(char, after_struct); + +struct struct_checker_state { + struct test_struct old_value; + struct test_struct new_value; + bool success; +}; + +static int check_struct(void *arg) +{ + struct struct_checker_state *state = arg; + + struct test_struct *lcore_struct = RTE_LCORE_VAR(test_struct); + + /* check the per-lcore pointer, not the handle itself */ + bool properly_aligned = + is_ptr_aligned(lcore_struct, alignof(struct test_struct)); + + bool equal = memcmp(lcore_struct->data, state->old_value.data, + TEST_STRUCT_DATA_SIZE) == 0; + + state->success = equal && properly_aligned; + + memcpy(lcore_struct->data, state->new_value.data, + TEST_STRUCT_DATA_SIZE); + + return 0; +} + +static int +test_struct_lvar(void) +{ + unsigned int lcore_id; + + RTE_LCORE_VAR_ALLOC(before_struct); + RTE_LCORE_VAR_ALLOC(test_struct); + RTE_LCORE_VAR_ALLOC(after_struct); + + struct struct_checker_state states[RTE_MAX_LCORE]; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + struct struct_checker_state *state = &states[lcore_id]; + + rand_blk(state->old_value.data, TEST_STRUCT_DATA_SIZE); + rand_blk(state->new_value.data, TEST_STRUCT_DATA_SIZE); + + memcpy(RTE_LCORE_VAR_LCORE(lcore_id, test_struct)->data, + state->old_value.data, TEST_STRUCT_DATA_SIZE); + } + + RTE_LCORE_FOREACH_WORKER(lcore_id) + 
rte_eal_remote_launch(check_struct, &states[lcore_id], + lcore_id); + + rte_eal_mp_wait_lcore(); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + struct struct_checker_state *state = &states[lcore_id]; + struct test_struct *lstruct = + RTE_LCORE_VAR_LCORE(lcore_id, test_struct); + + TEST_ASSERT(state->success, "Unexpected value encountered on " + "lcore %d", lcore_id); + + bool equal = memcmp(lstruct->data, state->new_value.data, + TEST_STRUCT_DATA_SIZE) == 0; + + TEST_ASSERT(equal, "Lcore %d failed to update struct", + lcore_id); + } + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + char before = + *RTE_LCORE_VAR_LCORE(lcore_id, before_struct); + char after = + *RTE_LCORE_VAR_LCORE(lcore_id, after_struct); + + TEST_ASSERT_EQUAL(before, 0, "Lcore variable before test " + "struct was modified on lcore %d", lcore_id); + TEST_ASSERT_EQUAL(after, 0, "Lcore variable after test " + "struct was modified on lcore %d", lcore_id); + } + + return TEST_SUCCESS; +} + +#define TEST_ARRAY_SIZE 99 + +typedef uint16_t test_array_t[TEST_ARRAY_SIZE]; + +static void test_array_init_rand(test_array_t a) +{ + size_t i; + for (i = 0; i < TEST_ARRAY_SIZE; i++) + a[i] = (uint16_t)rte_rand(); +} + +static bool test_array_equal(test_array_t a, test_array_t b) +{ + size_t i; + for (i = 0; i < TEST_ARRAY_SIZE; i++) { + if (a[i] != b[i]) + return false; + } + return true; +} + +static void test_array_copy(test_array_t dst, const test_array_t src) +{ + size_t i; + for (i = 0; i < TEST_ARRAY_SIZE; i++) + dst[i] = src[i]; +} + +static RTE_LCORE_VAR_HANDLE(char, before_array); +static RTE_LCORE_VAR_HANDLE(test_array_t, test_array); +static RTE_LCORE_VAR_HANDLE(char, after_array); + +struct array_checker_state { + test_array_t old_value; + test_array_t new_value; + bool success; +}; + +static int check_array(void *arg) +{ + struct array_checker_state *state = arg; + + test_array_t *lcore_array = RTE_LCORE_VAR(test_array); + + bool properly_aligned = + is_ptr_aligned(lcore_array, alignof(test_array_t)); + + bool 
equal = test_array_equal(*lcore_array, state->old_value); + + state->success = equal && properly_aligned; + + test_array_copy(*lcore_array, state->new_value); + + return 0; +} + +static int +test_array_lvar(void) +{ + unsigned int lcore_id; + + RTE_LCORE_VAR_ALLOC(before_array); + RTE_LCORE_VAR_ALLOC(test_array); + RTE_LCORE_VAR_ALLOC(after_array); + + struct array_checker_state states[RTE_MAX_LCORE]; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + struct array_checker_state *state = &states[lcore_id]; + + test_array_init_rand(state->new_value); + test_array_init_rand(state->old_value); + + test_array_copy(*RTE_LCORE_VAR_LCORE(lcore_id, test_array), + state->old_value); + } + + RTE_LCORE_FOREACH_WORKER(lcore_id) + rte_eal_remote_launch(check_array, &states[lcore_id], + lcore_id); + + rte_eal_mp_wait_lcore(); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + struct array_checker_state *state = &states[lcore_id]; + test_array_t *larray = RTE_LCORE_VAR_LCORE(lcore_id, test_array); + + TEST_ASSERT(state->success, "Unexpected value encountered on " + "lcore %d", lcore_id); + + bool equal = test_array_equal(*larray, state->new_value); + + TEST_ASSERT(equal, "Lcore %d failed to update array", + lcore_id); + } + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + char before = + *RTE_LCORE_VAR_LCORE(lcore_id, before_array); + char after = + *RTE_LCORE_VAR_LCORE(lcore_id, after_array); + + TEST_ASSERT_EQUAL(before, 0, "Lcore variable before test " + "array was modified on lcore %d", lcore_id); + TEST_ASSERT_EQUAL(after, 0, "Lcore variable after test " + "array was modified on lcore %d", lcore_id); + } + + return TEST_SUCCESS; +} + +#define MANY_LVARS (2 * RTE_MAX_LCORE_VAR / sizeof(uint32_t)) + +static int +test_many_lvars(void) +{ + uint32_t **handlers = malloc(sizeof(uint32_t *) * MANY_LVARS); + unsigned int i; + + TEST_ASSERT(handlers != NULL, "Unable to allocate memory"); + + for (i = 0; i < MANY_LVARS; i++) { + unsigned int lcore_id; + + RTE_LCORE_VAR_ALLOC(handlers[i]); + + for (lcore_id = 
0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + uint32_t *v = + RTE_LCORE_VAR_LCORE(lcore_id, handlers[i]); + *v = (uint32_t)(i * lcore_id); + } + } + + for (i = 0; i < MANY_LVARS; i++) { + unsigned int lcore_id; + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + uint32_t v = *RTE_LCORE_VAR_LCORE(lcore_id, handlers[i]); + TEST_ASSERT_EQUAL((uint32_t)(i * lcore_id), v, + "Unexpected lcore variable value on " + "lcore %d", lcore_id); + } + } + + free(handlers); + + return TEST_SUCCESS; +} + +static int +test_large_lvar(void) +{ + RTE_LCORE_VAR_HANDLE(unsigned char, large); + unsigned int lcore_id; + + RTE_LCORE_VAR_ALLOC_SIZE(large, RTE_MAX_LCORE_VAR); + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + unsigned char *ptr = RTE_LCORE_VAR_LCORE(lcore_id, large); + + memset(ptr, (unsigned char)lcore_id, RTE_MAX_LCORE_VAR); + } + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + unsigned char *ptr = RTE_LCORE_VAR_LCORE(lcore_id, large); + size_t i; + + for (i = 0; i < RTE_MAX_LCORE_VAR; i++) + TEST_ASSERT_EQUAL(ptr[i], (unsigned char)lcore_id, + "Large lcore variable value is " + "corrupted on lcore %d.", + lcore_id); + } + + return TEST_SUCCESS; +} + +static struct unit_test_suite lcore_var_testsuite = { + .suite_name = "lcore variable autotest", + .unit_test_cases = { + TEST_CASE(test_int_lvar), + TEST_CASE(test_sized_alignment), + TEST_CASE(test_struct_lvar), + TEST_CASE(test_array_lvar), + TEST_CASE(test_many_lvars), + TEST_CASE(test_large_lvar), + TEST_CASES_END() + }, +}; + +static int test_lcore_var(void) +{ + if (rte_lcore_count() < MIN_LCORES) { + printf("Not enough cores for lcore_var_autotest; expecting at " + "least %d.\n", MIN_LCORES); + return TEST_SKIPPED; + } + + return unit_test_suite_runner(&lcore_var_testsuite); +} + +REGISTER_FAST_TEST(lcore_var_autotest, true, false, test_lcore_var); From patchwork Fri Oct 25 08:41:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 
X-Patchwork-Submitter: =?utf-8?q?Mattias_R=C3=B6nnblom?= X-Patchwork-Id: 147217 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A4B8B45BD4; Fri, 25 Oct 2024 10:50:58 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 42237402BB; Fri, 25 Oct 2024 10:50:55 +0200 (CEST) Received: from EUR03-DBA-obe.outbound.protection.outlook.com (mail-dbaeur03on2080.outbound.protection.outlook.com [40.107.104.80]) by mails.dpdk.org (Postfix) with ESMTP id B9CA14003C for ; Fri, 25 Oct 2024 10:50:52 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=MNw+REV7AAlucAdMTiXkWlTAGFgD8p1HZUnJcFy4X+05Rmt1DVPe70ljDVwmPBLxxD4/Bk35BRIh2O7grYXwGyh2/jYChNvNXwkW0OrOF+qKRP1muWkF8Y1I3V9Tz00tnabEfvXGG8dLjA9lrOb/uIDw/1nIOJO0ArfDn3yx4QqrgbwUsACdMnjZEwXS+/I9h/aK33BJPZmykOl5SISZYiuMsmYZ4jOzS0+z1qZ8B+C5Mr7KWSpvhZAn/QW26De6LWG2ku6JVutf/MIjmB00lhu3IK/RxgkMuxjtEFp1tIJUIMIgVlxkLBE3VDENd/9K8M0DLlJLR8GB0XpQOkPY+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=o6b1HEcpM0KUhEWZJInRKcoUJDY2fcG+jTKaAEbm0uw=; b=I3Nro0BRLs+2EJszrFcODni7+rfxfrhATIlTTKrSOPwPI4jVMfqlnKArlU8m3jVHDLWV6jAQ7rq82sM0HWKySzVpogEkY7/7u/CIabZ1TQGJT0KZgaO7GWeRy44Glj5hjqCSAxK2CBP4sESbrV9YctfUqmTAjpJVS0zA5MQ4SB/NRZI8EHt8zo5QaQOMe77RiP55mr0Vng+TJ4DouUHDuxH/5js3MYHfYfmrsKgN/f7YuP5BBa5wvr6D84yaGK/Gf+zoGPU8WouKhJsKqaILdD+EvZc5m9Y6fRuVMCxKGsbo9AcMrcNYsePbvFEvCrY0dsj457WghziHUdHSiSVyug== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 192.176.1.74) smtp.rcpttodomain=dpdk.org smtp.mailfrom=ericsson.com; dmarc=pass 
(p=reject sp=reject pct=100) action=none header.from=ericsson.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ericsson.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=o6b1HEcpM0KUhEWZJInRKcoUJDY2fcG+jTKaAEbm0uw=; b=g3cJErUO0mucsaBk7UWOR7k5SWBeBm5WuppLzosXWShC1icvO5kvQEuX3nW+rBY/H9iBVC0DGAlUzfaVMUzqci4l5ifABCaZu94LAI2Gbs9uf58OjtlQDz0Y0f6n8nSlUCQ+UaytvLR5YlNEeZ/0gWV5wshZTRKgoh/uRzDY7aSgDVfkqj5j4nhTObvtHEpWkGg04MZTiiMeMhrqTkYxRO/hEp7bZ+Ocv2k48U99CN4lb0FXMeVNY/BuSufsuINXEE1jrb/dbt7SwmF3LbfAhD9asG/hWZ1klGpGW6Uw7PcU/9rrHEOxgljhocJJ3lKWMvKupyC7QHtSdesIvVhsSw== Received: from DUZPR01CA0271.eurprd01.prod.exchangelabs.com (2603:10a6:10:4b9::27) by AS8PR07MB7448.eurprd07.prod.outlook.com (2603:10a6:20b:28b::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8093.20; Fri, 25 Oct 2024 08:50:50 +0000 Received: from DB5PEPF00014B97.eurprd02.prod.outlook.com (2603:10a6:10:4b9:cafe::34) by DUZPR01CA0271.outlook.office365.com (2603:10a6:10:4b9::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8093.20 via Frontend Transport; Fri, 25 Oct 2024 08:50:50 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 192.176.1.74) smtp.mailfrom=ericsson.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=ericsson.com; Received-SPF: Pass (protection.outlook.com: domain of ericsson.com designates 192.176.1.74 as permitted sender) receiver=protection.outlook.com; client-ip=192.176.1.74; helo=oa.msg.ericsson.com; pr=C Received: from oa.msg.ericsson.com (192.176.1.74) by DB5PEPF00014B97.mail.protection.outlook.com (10.167.8.235) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8093.14 via Frontend Transport; Fri, 25 Oct 2024 08:50:50 +0000 Received: from 
seliicinfr00050.seli.gic.ericsson.se (153.88.142.248) by smtp-central.internal.ericsson.com (100.87.178.69) with Microsoft SMTP Server id 15.2.1544.11; Fri, 25 Oct 2024 10:50:49 +0200 Received: from breslau.. (seliicwb00002.seli.gic.ericsson.se [10.156.25.100]) by seliicinfr00050.seli.gic.ericsson.se (Postfix) with ESMTP id E55A31C009D; Fri, 25 Oct 2024 10:50:49 +0200 (CEST) From: =?utf-8?q?Mattias_R=C3=B6nnblom?= To: CC: , =?utf-8?q?Morten_Br=C3=B8rup?= , Stephen Hemminger , Konstantin Ananyev , David Marchand , Jerin Jacob , Luka Jankovic , Thomas Monjalon , =?utf-8?q?Mattias_R=C3=B6nnblom?= , "Chengwen Feng" Subject: [PATCH v17 3/8] eal: add lcore variable performance test Date: Fri, 25 Oct 2024 10:41:44 +0200 Message-ID: <20241025084149.873037-4-mattias.ronnblom@ericsson.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com> References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com> <20241025084149.873037-1-mattias.ronnblom@ericsson.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB5PEPF00014B97:EE_|AS8PR07MB7448:EE_ X-MS-Office365-Filtering-Correlation-Id: 68e33677-bc09-4dca-a295-08dcf4d22080 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; ARA:13230040|1800799024|82310400026|36860700013|376014; X-Microsoft-Antispam-Message-Info: =?utf-8?q?nC8fX78nFgN0F1OXlvfk/NrJdnTAE6Q?= =?utf-8?q?vzB81Vig+5FRo9ajPakHQ4n8v67N2cWhpY9J5rLONDzoMTrKcZ1laJ5+Oj+nZyvKq?= =?utf-8?q?fZmVnWBfb4kDREtrU3HO84OjJAFcs77/AdjcWRo7OENVnbIsykIXM4mO+uCOidHUV?= =?utf-8?q?TUknoRvVyWZbUQ5qr9C3EBqvEPmu594yP8qcQrdNbFnDtffvwSF6QUU2q+WN9/sAn?= =?utf-8?q?7OId56l+rNG/xo4liEpJYZ0SvJtBfC4dZAaFd4exQyloSvvsJxoQX+YMg7r58jmZp?= =?utf-8?q?ZdV2Z62cWxFt8B+uf1pGVAw9iSnrlrFHIzWGhiU1dCYhjlEDAGMKg/FCtWbWMcLdV?= =?utf-8?q?oqk9g0YLYN+EjLdEkXVRS10Z+XiIN9hJQWSQdYE7dkbpIV9yl+zMB6NKM4AF/OaeS?= 
=?utf-8?q?VbRfoG2hYZ/gCmD0D8ZBi/ND53mFSQiLcqhj1vKdjtO4eoyUyicQ5zlKbhDFqn9X5?= =?utf-8?q?8fAK+a9rme/wg31zfxkJhwPwzVJeT+oYVp+yWsmWhmXpD7ciHWlDow6IrGUbgzXE6?= =?utf-8?q?fiUOScGrnWHsqe0WiQuYk3fvknUMCI9pXjVrC+JdCG6ax//WRfV5wgxC3SPVeFUFh?= =?utf-8?q?D0Vzgqw9pwDG6yj+D6yEq/jx1ftIHnD/T+zO4wjxSKM04kTW9jKGSgAT86buL+yo2?= =?utf-8?q?AV9AWt/oshnkRp8JXRs98dyelkhz5kZDvK8ZDK2++510h+LWX7AbOW/rQRVR9lcG1?= =?utf-8?q?Xshnzl2YctysH0eg21MVGsgR6DZKRfdFiOAP4+ACL/ZLpN/LbOMbaUnNeIR8cZNVh?= =?utf-8?q?7b2zexShZtP72gK+eO23sAX2e1VA8n+M2l3wbU4BAhvLYEhrS1VHOTy8q+fi8cCWn?= =?utf-8?q?aIA3Fk02gAUUADy5G96sUyFxd3wumjm1z1nJcaTR3RelGWVHT4W7wfgUsB7El7AH5?= =?utf-8?q?crLVNUDx5+cI9Or0nAywipZODPbTQHpw4BSUGsEq0pdSpG/W1Qvz9PI07hvULe1B6?= =?utf-8?q?D7Kx9/Q35jNwTWsI+n+Sz2kC6KfwN8MJYTpwqqQkk8x2csx1iaTqSWjsf0GoCcB0Z?= =?utf-8?q?WxNGS1ClgivaqKQ768eyDr5ZU+p6EgMk2Tj1KCHzDt7e3m0KCHEJs8uPOPROZe67L?= =?utf-8?q?sLgzuBcfnvRO7Eelan5H2TGRLEprPltUhW+uqcJRHyxmroIVfJZGzmZVI8Uq14P6A?= =?utf-8?q?26bdUy195pFPXAkgpJYEeCXOWJoUFEGtpmCGAqXXx4ZJmb5hqJeW1sfygWt3S8mEB?= =?utf-8?q?I1N8/CZoqfUH4KJeTojpUIEq9gPqPX7r9FQAemo+AoeiS05Egl+j8w4LLtjLE4E6S?= =?utf-8?q?Zi3HS+EWPYV/56xzg7LUqWS649Z3ioLqhR3dP08WuAT1VXqfH+1YFtzqTv1R+zFLt?= =?utf-8?q?F3HdGIYSXzblezfdpfs+hRXi9k/MXxRBNw=3D=3D?= X-Forefront-Antispam-Report: CIP:192.176.1.74; CTRY:SE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:oa.msg.ericsson.com; PTR:office365.se.ericsson.net; CAT:NONE; SFS:(13230040)(1800799024)(82310400026)(36860700013)(376014); DIR:OUT; SFP:1101; X-OriginatorOrg: ericsson.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Oct 2024 08:50:50.4753 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 68e33677-bc09-4dca-a295-08dcf4d22080 X-MS-Exchange-CrossTenant-Id: 92e84ceb-fbfd-47ab-be52-080c6b87953f X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=92e84ceb-fbfd-47ab-be52-080c6b87953f; Ip=[192.176.1.74]; Helo=[oa.msg.ericsson.com] X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B97.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: 
Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS8PR07MB7448 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add a basic micro benchmark for lcore variables, in an attempt to ensure that the overhead isn't significantly greater than that of alternative approaches, in scenarios where the benefits aren't expected to show up (i.e., when plenty of cache is available compared to the working set size of the per-lcore data). Signed-off-by: Mattias Rönnblom Acked-by: Chengwen Feng Acked-by: Stephen Hemminger Acked-by: Morten Brørup --- PATCH v8: * Fix spelling. (Morten Brørup) PATCH v6: * Use floating point math when calculating per-update latency. (Morten Brørup) PATCH v5: * Add variant of thread-local storage with initialization performed at the time of thread creation to the benchmark scenarios. (Morten Brørup) PATCH v4: * Rework the tests to be a little less unrealistic. Instead of a single dummy module using a single variable, use a number of variables/modules. In this way, differences in cache effects may show up. * Add RTE_CACHE_GUARD to better mimic that static array pattern. (Morten Brørup) * Show latencies as TSC cycles. 
(Morten Brørup) --- app/test/meson.build | 1 + app/test/test_lcore_var_perf.c | 256 +++++++++++++++++++++++++++++++++ 2 files changed, 257 insertions(+) create mode 100644 app/test/test_lcore_var_perf.c diff --git a/app/test/meson.build b/app/test/meson.build index 7dccd197ac..40f22a54d5 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -105,6 +105,7 @@ source_file_deps = { 'test_kvargs.c': ['kvargs'], 'test_latencystats.c': ['ethdev', 'latencystats', 'metrics'] + sample_packet_forward_deps, 'test_lcore_var.c': [], + 'test_lcore_var_perf.c': [], 'test_lcores.c': [], 'test_link_bonding.c': ['ethdev', 'net_bond', 'net'] + packet_burst_generator_deps + virtual_pmd_deps, diff --git a/app/test/test_lcore_var_perf.c b/app/test/test_lcore_var_perf.c new file mode 100644 index 0000000000..6d9869f873 --- /dev/null +++ b/app/test/test_lcore_var_perf.c @@ -0,0 +1,256 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Ericsson AB + */ + +#define MAX_MODS 1024 + +#include + +#include +#include +#include +#include +#include + +#include "test.h" + +struct mod_lcore_state { + uint64_t a; + uint64_t b; + uint64_t sum; +}; + +static void +mod_init(struct mod_lcore_state *state) +{ + state->a = rte_rand(); + state->b = rte_rand(); + state->sum = 0; +} + +static __rte_always_inline void +mod_update(volatile struct mod_lcore_state *state) +{ + state->sum += state->a * state->b; +} + +struct __rte_cache_aligned mod_lcore_state_aligned { + struct mod_lcore_state mod_state; + + RTE_CACHE_GUARD; +}; + +static struct mod_lcore_state_aligned +sarray_lcore_state[MAX_MODS][RTE_MAX_LCORE]; + +static void +sarray_init(void) +{ + unsigned int lcore_id = rte_lcore_id(); + int mod; + + for (mod = 0; mod < MAX_MODS; mod++) { + struct mod_lcore_state *mod_state = + &sarray_lcore_state[mod][lcore_id].mod_state; + + mod_init(mod_state); + } +} + +static __rte_noinline void +sarray_update(unsigned int mod) +{ + unsigned int lcore_id = rte_lcore_id(); + struct 
mod_lcore_state *mod_state = + &sarray_lcore_state[mod][lcore_id].mod_state; + + mod_update(mod_state); +} + +struct mod_lcore_state_lazy { + struct mod_lcore_state mod_state; + bool initialized; +}; + +/* + * Note: it's usually a bad idea to have this much thread-local storage + * allocated in a real application, since it will incur a cost on + * thread creation and non-lcore thread memory usage. + */ +static RTE_DEFINE_PER_LCORE(struct mod_lcore_state_lazy, + tls_lcore_state)[MAX_MODS]; + +static inline void +tls_init(struct mod_lcore_state_lazy *state) +{ + mod_init(&state->mod_state); + + state->initialized = true; +} + +static __rte_noinline void +tls_lazy_update(unsigned int mod) +{ + struct mod_lcore_state_lazy *state = + &RTE_PER_LCORE(tls_lcore_state[mod]); + + /* With thread-local storage, initialization must usually be lazy */ + if (!state->initialized) + tls_init(state); + + mod_update(&state->mod_state); +} + +static __rte_noinline void +tls_update(unsigned int mod) +{ + struct mod_lcore_state_lazy *state = + &RTE_PER_LCORE(tls_lcore_state[mod]); + + mod_update(&state->mod_state); +} + +RTE_LCORE_VAR_HANDLE(struct mod_lcore_state, lvar_lcore_state)[MAX_MODS]; + +static void +lvar_init(void) +{ + unsigned int mod; + + for (mod = 0; mod < MAX_MODS; mod++) { + RTE_LCORE_VAR_ALLOC(lvar_lcore_state[mod]); + + struct mod_lcore_state *state = + RTE_LCORE_VAR(lvar_lcore_state[mod]); + + mod_init(state); + } +} + +static __rte_noinline void +lvar_update(unsigned int mod) +{ + struct mod_lcore_state *state = RTE_LCORE_VAR(lvar_lcore_state[mod]); + + mod_update(state); +} + +static void +shuffle(unsigned int *elems, size_t len) +{ + size_t i; + + for (i = len - 1; i > 0; i--) { + unsigned int other = rte_rand_max(i + 1); + + unsigned int tmp = elems[other]; + elems[other] = elems[i]; + elems[i] = tmp; + } +} + +#define ITERATIONS UINT64_C(10000000) + +static inline double +benchmark_access(const unsigned int *mods, unsigned int num_mods, + void (*init_fun)(void), 
void (*update_fun)(unsigned int)) +{ + unsigned int i; + double start; + double end; + double latency; + unsigned int num_mods_mask = num_mods - 1; + + RTE_VERIFY(rte_is_power_of_2(num_mods)); + + if (init_fun != NULL) + init_fun(); + + /* Warm up cache and make sure TLS variables are initialized */ + for (i = 0; i < num_mods; i++) + update_fun(i); + + start = rte_rdtsc(); + + for (i = 0; i < ITERATIONS; i++) + update_fun(mods[i & num_mods_mask]); + + end = rte_rdtsc(); + + latency = (end - start) / (double)ITERATIONS; + + return latency; +} + +static void +test_lcore_var_access_n(unsigned int num_mods) +{ + double sarray_latency; + double tls_latency; + double lazy_tls_latency; + double lvar_latency; + unsigned int mods[num_mods]; + unsigned int i; + + for (i = 0; i < num_mods; i++) + mods[i] = i; + + shuffle(mods, num_mods); + + sarray_latency = + benchmark_access(mods, num_mods, sarray_init, sarray_update); + + tls_latency = + benchmark_access(mods, num_mods, NULL, tls_update); + + lazy_tls_latency = + benchmark_access(mods, num_mods, NULL, tls_lazy_update); + + lvar_latency = + benchmark_access(mods, num_mods, lvar_init, lvar_update); + + printf("%17u %8.1f %14.1f %15.1f %10.1f\n", num_mods, sarray_latency, + tls_latency, lazy_tls_latency, lvar_latency); +} + +/* + * The potential performance benefit of lcore variables compared to + * the use of statically sized, lcore id-indexed arrays is not + * shorter latencies in a scenario with low cache pressure, but rather + * fewer cache misses in a real-world scenario, with extensive cache + * usage. These tests are a crude simulation of such, using dummy + * modules, each with a small, per-lcore state. Note however that + * these tests have very little non-lcore/thread local state, which is + * unrealistic. 
+ */ + +static int +test_lcore_var_access(void) +{ + unsigned int num_mods = 1; + + printf("- Latencies [TSC cycles/update] -\n"); + printf("Number of Static Thread-local Thread-local Lcore\n"); + printf("Modules/Variables Array Storage Storage (Lazy) Variables\n"); + + for (num_mods = 1; num_mods <= MAX_MODS; num_mods *= 2) + test_lcore_var_access_n(num_mods); + + return TEST_SUCCESS; +} + +static struct unit_test_suite lcore_var_testsuite = { + .suite_name = "lcore variable perf autotest", + .unit_test_cases = { + TEST_CASE(test_lcore_var_access), + TEST_CASES_END() + }, +}; + +static int +test_lcore_var_perf(void) +{ + return unit_test_suite_runner(&lcore_var_testsuite); +} + +REGISTER_PERF_TEST(lcore_var_perf_autotest, test_lcore_var_perf); From patchwork Fri Oct 25 08:41:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Mattias_R=C3=B6nnblom?= X-Patchwork-Id: 147221 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D90A445BD4; Fri, 25 Oct 2024 10:51:29 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D1B3A4065F; Fri, 25 Oct 2024 10:51:10 +0200 (CEST) Received: from EUR02-VI1-obe.outbound.protection.outlook.com (mail-vi1eur02on2070.outbound.protection.outlook.com [40.107.241.70]) by mails.dpdk.org (Postfix) with ESMTP id F23E6402CF for ; Fri, 25 Oct 2024 10:50:55 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
From: Mattias Rönnblom
CC: Morten Brørup, Stephen Hemminger, Konstantin Ananyev, David Marchand, Jerin Jacob, Luka Jankovic, Thomas Monjalon, Mattias Rönnblom
Subject: [PATCH v17 4/8] eal: add lcore variables' programmer's guide
Date: Fri, 25 Oct 2024 10:41:45 +0200
Message-ID: <20241025084149.873037-5-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com> <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
List-Id: DPDK patches and discussions

Add lcore variables programmer's guide.

This guide gives an overview of the API, its implementation, and
alternatives to the use of lcore variables for maintaining per-lcore
id data.

It has pictures, too.

Signed-off-by: Mattias Rönnblom
Reviewed-By: Luka Jankovic

---

PATCH v17:
 * Change the color used for padding in the diagram to improve
   contrast. (Luka Jankovic)
 * Mention ``RTE_LCORE_VAR_FOREACH``. (Luka Jankovic)
---
 .../prog_guide/img/lcore_var_mem_layout.svg   | 310 ++++++++++
 .../img/static_array_mem_layout.svg           | 278 +++++++++
 doc/guides/prog_guide/index.rst               |   1 +
 doc/guides/prog_guide/lcore_var.rst           | 551 ++++++++++++++++++
 4 files changed, 1140 insertions(+)
 create mode 100644 doc/guides/prog_guide/img/lcore_var_mem_layout.svg
 create mode 100644 doc/guides/prog_guide/img/static_array_mem_layout.svg
 create mode 100644 doc/guides/prog_guide/lcore_var.rst

diff --git a/doc/guides/prog_guide/img/lcore_var_mem_layout.svg b/doc/guides/prog_guide/img/lcore_var_mem_layout.svg
new file mode 100644
index 0000000000..c4b286316c
--- /dev/null
+++ b/doc/guides/prog_guide/img/lcore_var_mem_layout.svg
@@ -0,0 +1,310 @@
[SVG markup not preserved; only the figure's text labels survive. The figure shows the memory layout of struct lcore_var_buffer.data with RTE_MAX_LCORE 2 and RTE_MAX_LCORE_VAR 64, eight bytes per row. Lcore id 0 slice: struct x_lcore (int a, char b, padding) at offset 0, struct y_lcore (long c at offset 8, long d at offset 16), unallocated space from offset 24. Lcore id 1 slice: the same layout at offsets 64, 72, 80 and 88. Handle pointers: x_lcores, y_lcores.]
\ No newline at end of file
diff --git a/doc/guides/prog_guide/img/static_array_mem_layout.svg b/doc/guides/prog_guide/img/static_array_mem_layout.svg
new file mode 100644
index 0000000000..87aa5b26f5
--- /dev/null
+++ b/doc/guides/prog_guide/img/static_array_mem_layout.svg
@@ -0,0 +1,278 @@
[SVG markup not preserved; only the figure's text labels survive. The figure shows the memory layout of struct x_lcore x_lcores[RTE_MAX_LCORE], eight bytes per row. Lcore id 0: int a, char b and padding at offset 0, __rte_cache_aligned padding from offset 8, RTE_CACHE_GUARD padding from offset 64. Lcore id 1: the same layout starting at offset 128, with RTE_CACHE_GUARD padding from offset 192 through 255.]
\ No newline at end of file
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 7eb1a98d88..c4432c4b74 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -27,6 +27,7 @@ Memory Management
     mempool_lib
     mbuf_lib
     multi_proc_support
+    lcore_var
 
 
 CPU Management
diff --git a/doc/guides/prog_guide/lcore_var.rst b/doc/guides/prog_guide/lcore_var.rst
new file mode 100644
index 0000000000..d0558d23f4
--- /dev/null
+++ b/doc/guides/prog_guide/lcore_var.rst
@@ -0,0 +1,551 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2024 Ericsson AB
+
+Lcore Variables
+===============
+
+The ``rte_lcore_var.h`` API provides a mechanism to allocate and
+access per-lcore id variables in a space- and cycle-efficient manner.
+
+Lcore Variables API
+-------------------
+
+A per-lcore id variable (or lcore variable for short) holds a unique
+value for each EAL thread and registered non-EAL thread. Thus, there
+is one distinct value for each past, current and future lcore
+id-equipped thread, with a total of ``RTE_MAX_LCORE`` instances.
+
+The value of the lcore variable for one lcore id is independent of the
+values associated with other lcore ids within the same variable.
+
+For detailed information on the lcore variables API, please refer to
+the ``rte_lcore_var.h`` API documentation.
+
+Lcore Variable Handle
+^^^^^^^^^^^^^^^^^^^^^
+
+To allocate and access an lcore variable's values, a *handle* is
+used. The handle is represented by an opaque pointer, only to be
+dereferenced using the appropriate ``<rte_lcore_var.h>`` macros.
+
+The handle is a pointer to the value's type (e.g., for a ``uint32_t``
+lcore variable, the handle is a ``uint32_t *``).
+
+The reason the handle is typed (i.e., it's not a void pointer or an
+integer) is to enable type checking when accessing values of the lcore
+variable.
+
+A handle may be passed between modules and threads just like any other
+pointer.
+
+A valid (i.e., allocated) handle never has the value NULL. Thus, a
+handle set to NULL may be used to signify that allocation has not yet
+been done.
+
+Lcore Variable Allocation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+An lcore variable is created in two steps:
+
+1. Define an lcore variable handle by using ``RTE_LCORE_VAR_HANDLE``.
+2. Allocate lcore variable storage and initialize the handle by using
+   ``RTE_LCORE_VAR_ALLOC`` or ``RTE_LCORE_VAR_INIT``. Allocation
+   generally occurs at the time of module initialization, but may be
+   done at any time.
+
+The lifetime of an lcore variable is not tied to the thread that
+created it.
+
+Each lcore variable has ``RTE_MAX_LCORE`` values, one for each
+possible lcore id. All of an lcore variable's values may be accessed
+from the moment the lcore variable is created, throughout the lifetime
+of the EAL (i.e., until ``rte_eal_cleanup()``).
+
+Lcore variables do not need to be freed and cannot be freed.
+
+Access
+^^^^^^
+
+The value of any lcore variable for any lcore id may be accessed from
+any thread (including unregistered threads), but it should only be
+*frequently* read from or written to by the *owner*. A thread is
+considered the owner of a particular lcore variable value instance if
+it has the lcore id associated with that instance.
+
+Non-owner accesses result in *false sharing*. As long as non-owner
+accesses are rare, they will have only a very slight effect on
+performance. This property of the lcore variable memory organization
+is intentional. See the implementation section for more information.
+
+Values of the same lcore variable associated with different lcore ids
+may be frequently read or written by their respective owners without
+risking false sharing.
+
+An appropriate synchronization mechanism, such as atomic loads and
+stores, should be employed to prevent data races between the owning
+thread and any other thread accessing the same value instance.
+
+The value of the lcore variable for a particular lcore id is accessed
+via ``RTE_LCORE_VAR_LCORE``.
+
+A common pattern is for an EAL thread or a registered non-EAL
+thread to access its own lcore variable value. For this purpose, a
+shorthand exists as ``RTE_LCORE_VAR``.
+
+``RTE_LCORE_VAR_FOREACH`` may be used to iterate over all values of a
+particular lcore variable.
+
+The handle, defined by ``RTE_LCORE_VAR_HANDLE``, is a pointer of the
+same type as the value, but it must be treated as an opaque identifier
+and cannot be directly dereferenced.
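Editorial sketch (not part of the patch): the owner/non-owner discipline described in the Access section can be modelled in plain C11 without DPDK. Everything named here (`MODEL_MAX_LCORE`, `struct pkt_stats`, `stats_count_rx()`, `stats_total_rx()`) is hypothetical, and the storage is an ordinary array only to keep the sketch self-contained; with lcore variables the value pointers would instead come from ``RTE_LCORE_VAR`` (owner) and ``RTE_LCORE_VAR_LCORE`` or ``RTE_LCORE_VAR_FOREACH`` (non-owner).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_MAX_LCORE 4 /* hypothetical stand-in for RTE_MAX_LCORE */

/* One value instance per lcore id. */
struct pkt_stats {
	_Atomic uint64_t rx_pkts;
};

static struct pkt_stats stats[MODEL_MAX_LCORE];

/*
 * Owner-side update. Only the thread holding lcore_id writes this
 * instance, so a relaxed load followed by a relaxed store suffices;
 * no atomic read-modify-write is needed.
 */
static void
stats_count_rx(unsigned int lcore_id, uint64_t n)
{
	assert(lcore_id < MODEL_MAX_LCORE);

	struct pkt_stats *s = &stats[lcore_id];
	uint64_t cur = atomic_load_explicit(&s->rx_pkts, memory_order_relaxed);

	atomic_store_explicit(&s->rx_pkts, cur + n, memory_order_relaxed);
}

/*
 * Non-owner read. Atomic loads prevent a data race with the owners.
 * This should be done rarely, since it pulls the owners' cache lines
 * to the reading core.
 */
static uint64_t
stats_total_rx(void)
{
	uint64_t total = 0;
	unsigned int lcore_id;

	for (lcore_id = 0; lcore_id < MODEL_MAX_LCORE; lcore_id++)
		total += atomic_load_explicit(&stats[lcore_id].rx_pkts,
					      memory_order_relaxed);

	return total;
}
```

The relaxed ordering is a design choice that fits simple counters; state that must be observed consistently together with other data would need stronger ordering or another synchronization mechanism.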
+
+Lcore variable handles and value pointers may be freely passed
+between different threads.
+
+Storage
+^^^^^^^
+
+An lcore variable's values may be of a primitive type like ``int``,
+but are typically of a ``struct`` type.
+
+The lcore variable handle introduces a per-variable (not
+per-value/per-lcore id) overhead of ``sizeof(void *)`` bytes, so there
+are some memory footprint gains to be made by organizing all per-lcore
+id data for a particular module as one lcore variable (e.g., as a
+struct).
+
+An application may define an lcore variable handle without ever
+allocating the lcore variable.
+
+The size of an lcore variable's value cannot exceed the DPDK
+build-time constant ``RTE_MAX_LCORE_VAR``. An lcore variable's size is
+the size of one of its value instances, not the aggregate of all its
+``RTE_MAX_LCORE`` instances.
+
+Lcore variables should generally *not* be ``__rte_cache_aligned`` and
+need *not* include a ``RTE_CACHE_GUARD`` field, since these constructs
+are designed to avoid false sharing. With lcore variables, false
+sharing is largely avoided by other means. In the case of an lcore
+variable instance, the thread most recently accessing nearby data
+structures should almost always be the lcore variable's owner. Adding
+padding (e.g., with ``RTE_CACHE_GUARD``) will increase the effective
+memory working set size, potentially reducing performance.
+
+Lcore variable values are initialized to zero by default.
+
+Lcore variables are not stored in huge page memory.
+
+Example
+^^^^^^^
+
+Below is an example of the use of an lcore variable:
+
+.. code-block:: c
+
+    struct foo_lcore_state {
+        int a;
+        long b;
+    };
+
+    static RTE_LCORE_VAR_HANDLE(struct foo_lcore_state, lcore_states);
+
+    long foo_get_a_plus_b(void)
+    {
+        const struct foo_lcore_state *state = RTE_LCORE_VAR(lcore_states);
+
+        return state->a + state->b;
+    }
+
+    RTE_INIT(rte_foo_init)
+    {
+        RTE_LCORE_VAR_ALLOC(lcore_states);
+
+        unsigned int lcore_id;
+        struct foo_lcore_state *state;
+        RTE_LCORE_VAR_FOREACH(lcore_id, state, lcore_states) {
+            /* initialize state */
+        }
+
+        /* other initialization */
+    }
+
+
+Implementation
+--------------
+
+This section gives an overview of the implementation of lcore
+variables, and some background to its design.
+
+Lcore Variable Buffers
+^^^^^^^^^^^^^^^^^^^^^^
+
+Lcore variable values are kept in a set of ``lcore_var_buffer`` structs.
+
+.. code-block:: c
+
+    struct lcore_var_buffer {
+        char data[RTE_MAX_LCORE_VAR * RTE_MAX_LCORE];
+        struct lcore_var_buffer *prev;
+    };
+
+An lcore var buffer stores at a minimum one, but usually many, lcore
+variables.
+
+The value instances for all lcore ids are stored in the same
+buffer. However, each lcore id has its own slice of the ``data``
+array. Such a slice is ``RTE_MAX_LCORE_VAR`` bytes in size.
+
+In this way, the values associated with a particular lcore id are
+grouped spatially close (in memory). No padding is required to prevent
+false sharing.
+
+.. code-block:: c
+
+    static struct lcore_var_buffer *current_buffer;
+
+    /* initialized to trigger buffer allocation on first allocation */
+    static size_t offset = RTE_MAX_LCORE_VAR;
+
+The implementation maintains a current ``lcore_var_buffer`` and
+an ``offset``, where the latter tracks how many bytes of this
+current buffer have been allocated.
+
+The ``offset`` is progressively incremented (by the size of the
+just-allocated lcore variable), as lcore variables are being
+allocated.
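Editorial sketch (not part of the patch, and not the DPDK source): the bookkeeping described in this section can be modelled in standalone C. A current buffer and an offset are kept; when an allocation would push the offset past the slice size, a fresh buffer is chained in and the offset restarts. The value for a given lcore id lives one slice further on per lcore id. The `model_*` names, the scaled-down constants, the alignment rounding and the use of `calloc()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Scaled-down stand-ins for the DPDK build-time constants (assumed values). */
#define MODEL_MAX_LCORE 2
#define MODEL_MAX_LCORE_VAR 64

struct model_buffer {
	char data[MODEL_MAX_LCORE_VAR * MODEL_MAX_LCORE];
	struct model_buffer *prev;
};

static struct model_buffer *current_buffer;

/* Starts out "full", so that the first allocation creates a buffer. */
static size_t offset = MODEL_MAX_LCORE_VAR;

/* Hypothetical helper: returns the handle (the lcore id 0 value address). */
static void *
model_alloc(size_t size, size_t align)
{
	assert(size > 0 && size <= MODEL_MAX_LCORE_VAR);
	assert(align > 0 && (align & (align - 1)) == 0);

	/* Round the offset up to the requested power-of-2 alignment. */
	size_t start = (offset + align - 1) & ~(align - 1);

	if (current_buffer == NULL || start + size > MODEL_MAX_LCORE_VAR) {
		/* Slice exhausted: chain in a new, zeroed buffer. */
		struct model_buffer *buf = calloc(1, sizeof(*buf));

		if (buf == NULL)
			abort(); /* allocation failure is treated as fatal */

		buf->prev = current_buffer;
		current_buffer = buf;
		start = 0;
	}

	offset = start + size;

	return current_buffer->data + start;
}

/* Value pointer for an lcore id: step one slice per lcore id. */
static void *
model_value(unsigned int lcore_id, void *handle)
{
	assert(lcore_id < MODEL_MAX_LCORE);

	return (char *)handle + (size_t)lcore_id * MODEL_MAX_LCORE_VAR;
}
```

Allocating an 8-byte, 4-byte-aligned variable followed by a 16-byte, 8-byte-aligned one places them at slice offsets 0 and 8, with the lcore id 1 values 64 bytes further on. The `prev` chain exists only so that the buffers can be found and freed later.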
+
+If the allocation of a variable would result in an ``offset`` larger
+than ``RTE_MAX_LCORE_VAR`` (i.e., the slice size), the buffer is
+full. In that case, a new buffer is allocated off the heap, and the
+``offset`` is reset.
+
+The lcore var buffers are arranged in a linked list, to allow freeing
+them at the point of ``rte_eal_cleanup()``, thereby avoiding false
+positives from tools like valgrind memcheck.
+
+The lcore variable buffers are allocated off the regular C heap. There
+are a number of reasons for not using ``<rte_malloc.h>`` and huge
+pages for lcore variables:
+
+- The libc heap is available at any time, including early in the
+  DPDK initialization.
+- The amount of data kept in lcore variables is projected to be small,
+  and thus is unlikely to induce translation lookaside buffer (TLB)
+  misses.
+- The last (and potentially only) lcore buffer in the chain will
+  likely only partially be in use. Huge pages of the sort used by DPDK
+  are always resident in memory, and their use would result in a
+  significant amount of memory going to waste. An example: ~256 kB
+  worth of lcore variables are allocated by DPDK libraries, PMDs and
+  the application. ``RTE_MAX_LCORE_VAR`` is set to 1 MB and
+  ``RTE_MAX_LCORE`` to 128. With 4 kB OS pages, only the first ~64
+  pages of each of the 128 per-lcore id slices in the (only)
+  ``lcore_var_buffer`` will actually be resident (paged in). Here,
+  demand paging saves ~96 MB of memory (128 slices, each with ~768 kB
+  left untouched).
+
+Not residing in huge pages, lcore variables cannot be accessed from
+secondary processes.
+
+Heap allocation failures are treated as fatal. The reason for this
+unorthodox design is that a majority of the allocations are deemed to
+happen at initialization. An early heap allocation failure for a fixed
+amount of data is a situation not unlike one where there is not enough
+memory available for static variables (i.e., the BSS or data
+sections).
+
+Provided these assumptions hold true, it's deemed acceptable to leave
+the application out of handling memory allocation failures.
+
+The upside of this approach is that no error handling code is required
+on the API user side.
+
+Lcore Variable Handles
+^^^^^^^^^^^^^^^^^^^^^^
+
+Upon lcore variable allocation, the lcore variables API returns an
+opaque *handle* in the form of a pointer. The value of the pointer is
+``buffer->data + offset``.
+
+Translating a handle base pointer to a pointer to a value associated
+with a particular lcore id is straightforward:
+
+.. code-block:: c
+
+    static inline void *
+    rte_lcore_var_lcore(unsigned int lcore_id, void *handle)
+    {
+        return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+    }
+
+``RTE_MAX_LCORE_VAR`` is a public macro to allow the compiler to
+optimize the ``lcore_id * RTE_MAX_LCORE_VAR`` expression, and replace
+the multiplication with a less expensive arithmetic operation.
+
+To maintain type safety, the ``RTE_LCORE_VAR*()`` macros should be
+used, instead of directly invoking ``rte_lcore_var_lcore()``. The
+macros return a pointer of the same type as the handle (i.e., a
+pointer to the value's type).
+
+Memory Layout
+^^^^^^^^^^^^^
+
+This section describes how lcore variables are organized in memory.
+
+As an illustration, two example modules are used, ``rte_x`` and
+``rte_y``, both maintaining per-lcore id state as a part of their
+implementation.
+
+Two different methods will be used to maintain such state: lcore
+variables and, to serve as a reference, lcore id-indexed static
+arrays.
+
+Certain parameters are scaled down to make graphical depictions more
+practical.
+
+For the purpose of this exercise, a ``RTE_MAX_LCORE`` of 2 is
+assumed. In a real-world configuration the maximum number of EAL
+threads and registered threads will be much greater (e.g., 128).
+
+The lcore variables example assumes a ``RTE_MAX_LCORE_VAR`` of 64. In
+a real-world configuration (as controlled by ``rte_config.h``) the
+value of this compile-time constant will be much greater (e.g.,
+1048576).
+
+The per-lcore id state is also smaller than what most real-world
+modules would have.
+
+Lcore Variables Example
+"""""""""""""""""""""""
+
+When lcore variables are used, the parts of ``rte_x`` and ``rte_y``
+that deal with the declaration and allocation of per-lcore id data may
+look something like below.
+
+.. code-block:: c
+
+    /* -- Lcore variables -- */
+
+    /* rte_x.c */
+
+    struct x_lcore
+    {
+        int a;
+        char b;
+    };
+
+    static RTE_LCORE_VAR_HANDLE(struct x_lcore, x_lcores);
+    RTE_LCORE_VAR_INIT(x_lcores);
+
+    /../
+
+    /* rte_y.c */
+
+    struct y_lcore
+    {
+        long c;
+        long d;
+    };
+
+    static RTE_LCORE_VAR_HANDLE(struct y_lcore, y_lcores);
+    RTE_LCORE_VAR_INIT(y_lcores);
+
+    /../
+
+The resulting memory layout will look something like the following:
+
+.. _figure_lcore_var_mem_layout:
+
+.. figure:: img/lcore_var_mem_layout.*
+
+The above figure assumes that ``x_lcores`` is allocated prior to
+``y_lcores``. ``RTE_LCORE_VAR_INIT()`` relies on constructors, which
+run prior to ``main()`` in an undefined order.
+
+The use of lcore variables ensures that per-lcore id data is kept in
+close proximity, within a designated region of memory. This proximity
+enhances data locality and can improve performance.
+
+Lcore Id Indexed Static Array Example
+"""""""""""""""""""""""""""""""""""""
+
+Below is an example of the struct declarations and the resulting
+organization in memory when an lcore id-indexed static array of
+cache-line aligned, ``RTE_CACHE_GUARD``-ed structs is used to
+maintain per-lcore id state.
+
+This is a common pattern in DPDK, which lcore variables attempt to
+replace.
+
+.. code-block:: c
+
+    /* -- Cache-aligned static arrays -- */
+
+    /* rte_x.c */
+
+    struct x_lcore
+    {
+        int a;
+        char b;
+        RTE_CACHE_GUARD;
+    } __rte_cache_aligned;
+
+    static struct x_lcore x_lcores[RTE_MAX_LCORE];
+
+    /../
+
+    /* rte_y.c */
+
+    struct y_lcore
+    {
+        long c;
+        long d;
+        RTE_CACHE_GUARD;
+    } __rte_cache_aligned;
+
+    static struct y_lcore y_lcores[RTE_MAX_LCORE];
+
+    /../
+
+In this approach, accessing the state for a particular lcore id is
+merely a matter of retrieving the lcore id and looking up the correct
+struct instance.
+
+.. code-block:: c
+
+    struct x_lcore *my_lcore_state = &x_lcores[rte_lcore_id()];
+
+The address "0" at the top of the left-most column in the figure
+represents the base address for the ``x_lcores`` array (in the BSS
+segment in memory).
+
+The figure only includes the memory layout for the ``rte_x`` example
+module. ``rte_y`` would look very similar, with ``y_lcores`` being
+located at some other address in the BSS section.
+
+.. _figure_static_array_mem_layout:
+
+.. figure:: img/static_array_mem_layout.*
+
+The static array approach results in the per-lcore id data being
+organized around modules, not lcore ids. To avoid false sharing,
+extensive padding is employed, causing cache fragmentation.
+
+Because the padding is interspersed with the data, demand paging is
+unlikely to reduce the actual resident DRAM memory footprint. This is
+because the padding is smaller than a typical operating system memory
+page (usually 4 kB).
+
+Performance
+^^^^^^^^^^^
+
+One of the goals of lcore variables is to improve performance. This is
+achieved by packing often-used data into fewer cache lines, which
+reduces fragmentation in CPU caches and somewhat improves the
+effective cache size and cache hit rates.
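Editorial sketch (not part of the patch): to put numbers on the padding shown in the static array figure above, the following standalone C11 snippet compares the padded and unpadded forms of the ``x_lcore`` state. It assumes a 64-byte cache line and a 4-byte ``int``. `MODEL_CACHE_LINE`, `MODEL_MAX_LCORE` and the struct and function names are hypothetical, and `alignas` plus an explicit guard array stand in for ``__rte_cache_aligned`` and ``RTE_CACHE_GUARD``.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define MODEL_CACHE_LINE 64 /* assumed cache line size */
#define MODEL_MAX_LCORE 2   /* scaled down, as in the figures */

/* Per-lcore state as it would be stored in an lcore variable. */
struct x_lcore_plain {
	int a;
	char b;
};

/* Same state, cache-line aligned and followed by one guard cache line. */
struct x_lcore_padded {
	alignas(MODEL_CACHE_LINE) int a;
	char b;
	alignas(MODEL_CACHE_LINE) char guard[MODEL_CACHE_LINE];
};

/* One data cache line plus one guard cache line per lcore id. */
static_assert(sizeof(struct x_lcore_padded) == 2 * MODEL_CACHE_LINE,
	      "unexpected padded struct size");

/* Bytes occupied by the static array, padding included. */
static size_t
static_array_footprint(void)
{
	return MODEL_MAX_LCORE * sizeof(struct x_lcore_padded);
}

/* Bytes of actual per-lcore state, summed over all lcore ids. */
static size_t
payload_footprint(void)
{
	return MODEL_MAX_LCORE * sizeof(struct x_lcore_plain);
}
```

Under these assumptions each lcore id occupies 128 bytes in the static array while carrying 8 bytes of state, which matches the offsets in the figure (lcore id 1 starts at offset 128).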
+
+The application-level gains depend on how much data is kept in lcore
+variables, how often it is accessed, and how much pressure the
+application exerts on the CPU caches (i.e., how much other memory it
+accesses).
+
+The ``lcore_var_perf_autotest`` is an attempt at exploring the
+performance benefits (or drawbacks) of lcore variables compared to
+their alternatives. Being a micro benchmark, it needs to be taken with
+a grain of salt.
+
+Generally, one shouldn't expect more than some very modest gains in
+performance after a switch from lcore id indexed arrays to lcore
+variables.
+
+An additional benefit of the use of lcore variables is that it avoids
+certain tricky issues related to CPU core hardware prefetching (e.g.,
+next-N-lines prefetching) that may cause false sharing even when data
+used by two cores does not reside on the same cache line. Hardware
+prefetch behavior is generally not publicly documented and varies
+across CPU vendors, CPU generations and BIOS (or similar)
+configurations. For applications aiming to be portable, this may cause
+issues. Often, CPU hardware prefetch-induced issues are non-existent,
+except in some particular circumstances, where their adverse effects
+may be significant.
+
+Alternatives
+------------
+
+Lcore Id Indexed Static Arrays
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Lcore variables are designed to replace a pattern exemplified below:
+
+.. code-block:: c
+
+    struct __rte_cache_aligned foo_lcore_state {
+        int a;
+        long b;
+        RTE_CACHE_GUARD;
+    };
+
+    static struct foo_lcore_state lcore_states[RTE_MAX_LCORE];
+
+This scheme is simple and effective, but has one drawback: the data is
+organized so that objects related to all lcores for a particular
+module are kept close in memory. At a bare minimum, this requires
+sizing data structures (e.g., using ``__rte_cache_aligned``) to a
+whole number of cache lines and ensuring that allocation of such
+objects is cache line aligned to avoid false sharing. With CPU
+hardware prefetching and memory loads resulting from speculative
+execution (functions which seemingly are getting more eager faster
+than they are getting more intelligent), one or more "guard" cache
+lines may be required to separate one lcore's data from another's and
+prevent false sharing.
+
+Lcore variables offer the advantage of working with, rather than
+against, the CPU's assumptions. A next-line hardware prefetcher,
+for example, may function as intended (i.e., to the benefit, not
+detriment, of system performance).
+
+Thread Local Storage
+^^^^^^^^^^^^^^^^^^^^
+
+An alternative to ``rte_lcore_var.h`` is the ``rte_per_lcore.h`` API,
+which makes use of thread-local storage (TLS, e.g., GCC ``__thread`` or
+C11 ``_Thread_local``).
+
+There are a number of differences between using TLS and the use of
+lcore variables.
+
+The lifecycle of a thread-local variable instance is tied to that of
+the thread. The data cannot be accessed before the thread has been
+created, nor after it has terminated. As a result, thread-local
+variables must be initialized in a "lazy" manner (e.g., at the point
+of thread creation). Lcore variables may be accessed immediately after
+having been allocated (which may occur before any thread beyond the
+main thread is running).
+
+A thread-local variable is duplicated across all threads in the
+process, including unregistered non-EAL threads (i.e., "regular"
+threads). For DPDK applications heavily relying on multi-threading (in
+conjunction with DPDK's "one thread per core" pattern), either by
+having many concurrent threads or creating/destroying threads at a
+high rate, an excessive use of thread-local variables may cause
+inefficiencies (e.g., increased thread creation overhead due to
+thread-local storage initialization or increased memory
+footprint). Lcore variables *only* exist for threads with an lcore id.
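Editorial sketch (not part of the patch): the "lazy" initialization that thread-local variables typically require, as mentioned above, can be illustrated with plain C11 ``_Thread_local``. The names (`struct tls_state`, `tls_next_seq()`) and the starting value 100 are hypothetical. Each thread gets its own zero-initialized copy, so every access path has to check whether that copy has been set up yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct tls_state {
	bool initialized;
	uint64_t next_seq;
};

/* One zero-initialized instance per thread in the process. */
static _Thread_local struct tls_state tls_state;

/* Returns this thread's next sequence number, initializing on first use. */
static uint64_t
tls_next_seq(void)
{
	struct tls_state *state = &tls_state;

	/* No other thread can set this copy up in advance. */
	if (!state->initialized) {
		state->next_seq = 100; /* arbitrary non-zero starting value */
		state->initialized = true;
	}

	assert(state->next_seq >= 100);

	return state->next_seq++;
}
```

With an lcore variable, the equivalent state for all lcore ids could be set up once at allocation time, which removes the per-access check.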
+
+Whether data in thread-local storage can be shared between threads
+(i.e., whether a pointer to a thread-local variable can be passed to
+and successfully dereferenced by a non-owning thread) depends on the
+specifics of the TLS implementation. With GCC ``__thread`` and GCC
+``_Thread_local``, data sharing between threads is supported. In the
+C11 standard, accessing another thread's ``_Thread_local`` object is
+implementation-defined. Lcore variable instances may be accessed
+reliably by any thread.
+
+Lcore variables also rely on TLS to retrieve the thread's
+lcore id. However, the rest of the per-thread data is not kept in TLS.
+
+From a memory layout perspective, TLS is similar to lcore variables,
+and thus per-thread data structures need not be padded.
+
+In case the above-mentioned drawbacks of the use of TLS are of no
+significance to a particular application, TLS is a good alternative to
+lcore variables.

From patchwork Fri Oct 25 08:41:46 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mattias Rönnblom
X-Patchwork-Id: 147222
X-Patchwork-Delegate: thomas@monjalon.net
b=CGZK65mKhQhlZpNhqEcmk4SxBWzfMPKd5C2gLwFi6AtpoDMEgJFvbiHzOl5gLUAna0zzlmjVx+1EUxCCOPs+tQPD0FFHVLXy8IYPc+p2yDbY9BDcZTozlO+9Pp9NzW2NFiYRW4q1hFN2r1HaPhPv6SWee2N2QUPQXkqRaa9DUuRO7lI+3Sla6c2SoRpn6P51mVx1dGmmAz8Zw5XB+zqefdV9Pz0JQ2i1m8+73VBILBGnlQZvjkqQ2jO083HYHMaZUpzhEhg5CUKeFH02nurvOUhYwJZ+79CL6tKKG3aTa2BUTrgr6NtqTNZ6Hk/+wQQHPhlEJdjjSZL1C+8lhbDMQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=DESL2qFVuTQWm/n9l4enH+FDESyNnVazkJ0Sbs+rOpQ=; b=KOUrkQUglZ02vid9BkXsLh1usrVRPik6awjmBtjUX1i5QKBn0caPmXjGoifI+9Rhxc/AtAcCpWWbHAF6sLByp6rWLsu4+HaVvuYbiIMWE+u6RCFxzPDdw6L5xHItbmjuo8EE2AkocJhxw5n2t4cIdJX/+GfCAo3VJi+OBIqcyNBGV6ya+YeTERv1dsaRtmMZxmYcm/8Cc9tFtBFiP8SaE1hrafHT5N5eC+0sbTmTqCio/VGSbhdJr1M+AiFMAAtqbG7mkjJsYUWcEs6Z3dva0ikaELb9qZLN7pTGktwUE2rEtvAaZ76a2zp6EREjt+GzBu8YRVwBjXIPkpQPhidZxg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 192.176.1.74) smtp.rcpttodomain=dpdk.org smtp.mailfrom=ericsson.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=ericsson.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ericsson.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=DESL2qFVuTQWm/n9l4enH+FDESyNnVazkJ0Sbs+rOpQ=; b=nKqI5sCQGhOd/2tJh1G0N9aZpoaaaWKmXLVjA6zm3yWTfvA/qZBxtvYOFkCcSpc68xfSzkGTMSPB37M9Lzf+AQR8JHPGqsUiiJKaaJD8bgDzv8yDLoacHx1rwAf/J3/Ro9/TvlnNoeglNJuYA1G0lHJDeWnZjRVfVgJFrqxDDG6+TaVafW25LdV/vzseTIc5xYEFI2W9NZCypCFPHOvh01KaiqAcTrcSNRPsvG6KBQqUn/FPKNio2rrtiPMyHsjUPrZmTbWc8YDzXymzWGMiyUn5RfL1HuabLncGJqH6Bw4f/rsDoiyPQeIUkhgCED/WWQnUEcqt1uY98JQEoY2fSw== Received: from DU7P195CA0011.EURP195.PROD.OUTLOOK.COM (2603:10a6:10:54d::14) by PAWPR07MB10094.eurprd07.prod.outlook.com (2603:10a6:102:38f::15) 
with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.20; Fri, 25 Oct 2024 08:50:51 +0000 Received: from DB3PEPF00008859.eurprd02.prod.outlook.com (2603:10a6:10:54d:cafe::5) by DU7P195CA0011.outlook.office365.com (2603:10a6:10:54d::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8093.20 via Frontend Transport; Fri, 25 Oct 2024 08:50:51 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 192.176.1.74) smtp.mailfrom=ericsson.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=ericsson.com; Received-SPF: Pass (protection.outlook.com: domain of ericsson.com designates 192.176.1.74 as permitted sender) receiver=protection.outlook.com; client-ip=192.176.1.74; helo=oa.msg.ericsson.com; pr=C Received: from oa.msg.ericsson.com (192.176.1.74) by DB3PEPF00008859.mail.protection.outlook.com (10.167.242.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8093.14 via Frontend Transport; Fri, 25 Oct 2024 08:50:51 +0000 Received: from seliicinfr00050.seli.gic.ericsson.se (153.88.142.248) by smtp-central.internal.ericsson.com (100.87.178.61) with Microsoft SMTP Server id 15.2.1544.11; Fri, 25 Oct 2024 10:50:50 +0200 Received: from breslau.. 
(seliicwb00002.seli.gic.ericsson.se [10.156.25.100]) by seliicinfr00050.seli.gic.ericsson.se (Postfix) with ESMTP id 0A9C91C00A5; Fri, 25 Oct 2024 10:50:50 +0200 (CEST) From: =?utf-8?q?Mattias_R=C3=B6nnblom?= To: CC: , =?utf-8?q?Morten_Br=C3=B8rup?= , Stephen Hemminger , Konstantin Ananyev , David Marchand , Jerin Jacob , Luka Jankovic , Thomas Monjalon , =?utf-8?q?Mattias_R=C3=B6nnblom?= , Konstantin Ananyev , Chengwen Feng Subject: [PATCH v17 5/8] random: keep PRNG state in lcore variable Date: Fri, 25 Oct 2024 10:41:46 +0200 Message-ID: <20241025084149.873037-6-mattias.ronnblom@ericsson.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com> References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com> <20241025084149.873037-1-mattias.ronnblom@ericsson.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB3PEPF00008859:EE_|PAWPR07MB10094:EE_ X-MS-Office365-Filtering-Correlation-Id: a3a51026-9f29-4798-dbcf-08dcf4d220da X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; ARA:13230040|82310400026|36860700013|1800799024|376014|7416014; X-Microsoft-Antispam-Message-Info: =?utf-8?q?yJYsoniavxLLgu1HUHvVnTMtZb66i7v?= =?utf-8?q?VnIvWfg/4K+4p3nvTJksWp6VPJyhbXThXEwNFKFhpbfz+tVdFnuVV3mqYeW+4yeZ+?= =?utf-8?q?iAjGPTbZ94CWe+9onzaaYALglzFX8MadKgkFXEhHo6fAZRNepGHOeax7uv1/U5fny?= =?utf-8?q?cvmy5+vh58ks9MEmCBnJoKlL2Nb+wkcovBbaipZRo/KE3MfmYguUTww7l3hSheLKQ?= =?utf-8?q?OE05TsCFeKmXMgN56xb2ZYV1rUF30LOGBFjcL+azfkL/PqhqNQ82gZiRrt9hvIOn4?= =?utf-8?q?GRA0WwHPdIcaiYDj8aj7OdreECnO/eyFHQrkLhoQG6YfEIRE44Y2yuMjyRWrdLukF?= =?utf-8?q?tRptqVgHYq8HzmtWqy3xQMfDPj184ek3+dFzmDus+RtX4QYegX4x1KfyFL22PTsC0?= =?utf-8?q?vOy/HtDgANTTEaAutp5sgSB/Nl8U7BVRg4gHS7UMpDCcKHEwwlLaq70nMFqQg4A8Y?= =?utf-8?q?duNZpPZMbvArLidUzegWkYKR2axwKGbfGnY7+zXhJt69vmc9DiQW9fPW8hcmD2NIe?= =?utf-8?q?N7Fv1KI0ibI6C6ygodqwZWObya8bZFJDI60SUB9hYb4ff8VJvC+Zjdzv4EU8VrZhB?= 
=?utf-8?q?+sFJNXUmOMnO7i7xmhl5vH7BDKMG3rGurEKxbZlfOKea6wTT/Yaq3gW5hQwfHeZHs?= =?utf-8?q?W5oMIAAREAIBXP2ecYlUVfw08Z+V3hhVCLkKwMvpo3sMuShRFZshlKdFEDWK2embn?= =?utf-8?q?F98Q3PkY5o0eG6GHFrnxsiQ93jK0EF5FoqZAzhKT3FRI1khggtdI5fTpiXScItfNQ?= =?utf-8?q?5wcCTP4kGaa/U2P5cWeUWGbhFuH/lKrAQXE0mNLx3BezX1crMhdUHDtzRyvgcLLkz?= =?utf-8?q?aaWwl2MbOw1rRvWmutvI++1t40QTyegEnpWT6QUBZiS+CxLHnDcWY6Q1qZc+I5JSn?= =?utf-8?q?vo7BtERnipOG9OGBv49WO1zuIU5X3seU561Me7Tj3wkIsogloljxosfRxJymevsFO?= =?utf-8?q?3emYlLu6THjyCEUMguU6UBhFJ1QpwvIF0uAAhU61A2gXDfASG0zzD8DKYIzTq7WOU?= =?utf-8?q?rIAWxlyPTOF+pY5I27XB00ogFa4ezzptuyqugYHcKzzAe7KWMcv94xcQproT1BQJ9?= =?utf-8?q?FY+s2PUELGQ6cwcbFzzOUv4w+ypJNwJYz4rgVOZtUcAlQeYTIszhsUrp84zcp3tpp?= =?utf-8?q?IAFhGGxdJgPByko+ZFbFUP3S9sHhI0hzSL7dqs8zBZw6cLn5asmiDT975c64/6Z4n?= =?utf-8?q?EjwezuuEdLwgoGagY1BiRJong306eYyr77LNRy0qFuTnwyR2DH57+tUi9mdTjIeF9?= =?utf-8?q?t0PcsJULdn5NNbVXT2Lu+g33zJX4L5UnTyPHhjSYYCzhUfl3oz/KxcURjPRbh7pdq?= =?utf-8?q?eXdKFvPMSaUyBxXBFJxFwLdjDQ+eKGzPbQ=3D=3D?= X-Forefront-Antispam-Report: CIP:192.176.1.74; CTRY:SE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:oa.msg.ericsson.com; PTR:office365.se.ericsson.net; CAT:NONE; SFS:(13230040)(82310400026)(36860700013)(1800799024)(376014)(7416014); DIR:OUT; SFP:1101; X-OriginatorOrg: ericsson.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Oct 2024 08:50:51.0510 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: a3a51026-9f29-4798-dbcf-08dcf4d220da X-MS-Exchange-CrossTenant-Id: 92e84ceb-fbfd-47ab-be52-080c6b87953f X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=92e84ceb-fbfd-47ab-be52-080c6b87953f; Ip=[192.176.1.74]; Helo=[oa.msg.ericsson.com] X-MS-Exchange-CrossTenant-AuthSource: DB3PEPF00008859.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAWPR07MB10094 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and 
discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Replace keeping PRNG state in a RTE_MAX_LCORE-sized static array of cache-aligned and RTE_CACHE_GUARDed struct instances with keeping the same state in a more cache-friendly lcore variable. Signed-off-by: Mattias Rönnblom Acked-by: Morten Brørup Acked-by: Konstantin Ananyev Acked-by: Chengwen Feng Acked-by: Stephen Hemminger --- RFC v3: * Remove cache alignment on unregistered threads' rte_rand_state. (Morten Brørup) --- lib/eal/common/rte_random.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c index 90e91b3c4f..cf0756f26a 100644 --- a/lib/eal/common/rte_random.c +++ b/lib/eal/common/rte_random.c @@ -11,6 +11,7 @@ #include #include #include +#include #include struct __rte_cache_aligned rte_rand_state { @@ -19,14 +20,12 @@ struct __rte_cache_aligned rte_rand_state { uint64_t z3; uint64_t z4; uint64_t z5; - RTE_CACHE_GUARD; }; -/* One instance each for every lcore id-equipped thread, and one - * additional instance to be shared by all others threads (i.e., all - * unregistered non-EAL threads). 
- */ -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1]; +RTE_LCORE_VAR_HANDLE(struct rte_rand_state, rand_state); + +/* instance to be shared by all unregistered non-EAL threads */ +static struct rte_rand_state unregistered_rand_state; static uint32_t __rte_rand_lcg32(uint32_t *seed) @@ -85,8 +84,14 @@ rte_srand(uint64_t seed) unsigned int lcore_id; /* add lcore_id to seed to avoid having the same sequence */ - for (lcore_id = 0; lcore_id < RTE_DIM(rand_states); lcore_id++) - __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]); + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + struct rte_rand_state *lcore_state = + RTE_LCORE_VAR_LCORE(lcore_id, rand_state); + + __rte_srand_lfsr258(seed + lcore_id, lcore_state); + } + + __rte_srand_lfsr258(seed + lcore_id, &unregistered_rand_state); } static __rte_always_inline uint64_t @@ -124,11 +129,10 @@ struct rte_rand_state *__rte_rand_get_state(void) idx = rte_lcore_id(); - /* last instance reserved for unregistered non-EAL threads */ if (unlikely(idx == LCORE_ID_ANY)) - idx = RTE_MAX_LCORE; + return &unregistered_rand_state; - return &rand_states[idx]; + return RTE_LCORE_VAR(rand_state); } uint64_t @@ -228,6 +232,8 @@ RTE_INIT(rte_rand_init) { uint64_t seed; + RTE_LCORE_VAR_ALLOC(rand_state); + seed = __rte_random_initial_seed(); rte_srand(seed); From patchwork Fri Oct 25 08:41:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Mattias_R=C3=B6nnblom?= X-Patchwork-Id: 147216 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 48BFE45BD4; Fri, 25 Oct 2024 10:50:54 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 3BC5840156; Fri, 25 Oct 2024 10:50:54 +0200 (CEST) 
From: Mattias Rönnblom
To:
CC: , Morten Brørup , Stephen Hemminger , Konstantin Ananyev , David Marchand , Jerin Jacob , Luka Jankovic , Thomas Monjalon , Mattias Rönnblom , Konstantin Ananyev , Chengwen Feng
Subject: [PATCH v17 6/8] power: keep per-lcore state in lcore variable
Date: Fri, 25 Oct 2024 10:41:47 +0200
Message-ID: <20241025084149.873037-7-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com> <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and
discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Replace static array of cache-aligned structs with an lcore variable, to slightly benefit code simplicity and performance. Signed-off-by: Mattias Rönnblom Acked-by: Morten Brørup Acked-by: Konstantin Ananyev Acked-by: Chengwen Feng Acked-by: Stephen Hemminger --- PATCH v6: * Update FOREACH invocation to match new API. RFC v3: * Replace for loop with FOREACH macro. --- lib/power/rte_power_pmd_mgmt.c | 35 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 18 deletions(-) diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c index 5e50613f5b..a2fff3b765 100644 --- a/lib/power/rte_power_pmd_mgmt.c +++ b/lib/power/rte_power_pmd_mgmt.c @@ -5,6 +5,7 @@ #include #include +#include #include #include #include @@ -69,7 +70,7 @@ struct __rte_cache_aligned pmd_core_cfg { uint64_t sleep_target; /**< Prevent a queue from triggering sleep multiple times */ }; -static struct pmd_core_cfg lcore_cfgs[RTE_MAX_LCORE]; +static RTE_LCORE_VAR_HANDLE(struct pmd_core_cfg, lcore_cfgs); static inline bool queue_equal(const union queue *l, const union queue *r) @@ -252,12 +253,11 @@ clb_multiwait(uint16_t port_id __rte_unused, uint16_t qidx __rte_unused, struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, uint16_t max_pkts __rte_unused, void *arg) { - const unsigned int lcore = rte_lcore_id(); struct queue_list_entry *queue_conf = arg; struct pmd_core_cfg *lcore_conf; const bool empty = nb_rx == 0; - lcore_conf = &lcore_cfgs[lcore]; + lcore_conf = RTE_LCORE_VAR(lcore_cfgs); /* early exit */ if (likely(!empty)) @@ -317,13 +317,12 @@ clb_pause(uint16_t port_id __rte_unused, uint16_t qidx __rte_unused, struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, uint16_t max_pkts __rte_unused, void *arg) { - const unsigned int lcore = rte_lcore_id(); struct queue_list_entry *queue_conf = arg; struct pmd_core_cfg *lcore_conf; const bool empty = 
nb_rx == 0; uint32_t pause_duration = rte_power_pmd_mgmt_get_pause_duration(); - lcore_conf = &lcore_cfgs[lcore]; + lcore_conf = RTE_LCORE_VAR(lcore_cfgs); if (likely(!empty)) /* early exit */ @@ -358,9 +357,8 @@ clb_scale_freq(uint16_t port_id __rte_unused, uint16_t qidx __rte_unused, struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, uint16_t max_pkts __rte_unused, void *arg) { - const unsigned int lcore = rte_lcore_id(); const bool empty = nb_rx == 0; - struct pmd_core_cfg *lcore_conf = &lcore_cfgs[lcore]; + struct pmd_core_cfg *lcore_conf = RTE_LCORE_VAR(lcore_cfgs); struct queue_list_entry *queue_conf = arg; if (likely(!empty)) { @@ -519,7 +517,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id, goto end; } - lcore_cfg = &lcore_cfgs[lcore_id]; + lcore_cfg = RTE_LCORE_VAR_LCORE(lcore_id, lcore_cfgs); /* check if other queues are stopped as well */ ret = cfg_queues_stopped(lcore_cfg); @@ -620,7 +618,7 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id, } /* no need to check queue id as wrong queue id would not be enabled */ - lcore_cfg = &lcore_cfgs[lcore_id]; + lcore_cfg = RTE_LCORE_VAR_LCORE(lcore_id, lcore_cfgs); /* check if other queues are stopped as well */ ret = cfg_queues_stopped(lcore_cfg); @@ -770,21 +768,22 @@ rte_power_pmd_mgmt_get_scaling_freq_max(unsigned int lcore) } RTE_INIT(rte_power_ethdev_pmgmt_init) { - size_t i; - int j; + unsigned int lcore_id; + struct pmd_core_cfg *lcore_cfg; + int i; + + RTE_LCORE_VAR_ALLOC(lcore_cfgs); /* initialize all tailqs */ - for (i = 0; i < RTE_DIM(lcore_cfgs); i++) { - struct pmd_core_cfg *cfg = &lcore_cfgs[i]; - TAILQ_INIT(&cfg->head); - } + RTE_LCORE_VAR_FOREACH(lcore_id, lcore_cfg, lcore_cfgs) + TAILQ_INIT(&lcore_cfg->head); /* initialize config defaults */ emptypoll_max = 512; pause_duration = 1; /* scaling defaults out of range to ensure not used unless set by user or app */ - for (j = 0; j < RTE_MAX_LCORE; j++) { - scale_freq_min[j] = 0; - scale_freq_max[j] = 
UINT32_MAX; + for (i = 0; i < RTE_MAX_LCORE; i++) { + scale_freq_min[i] = 0; + scale_freq_max[i] = UINT32_MAX; } }
From patchwork Fri Oct 25 08:41:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mattias Rönnblom
X-Patchwork-Id: 147223
X-Patchwork-Delegate: thomas@monjalon.net
From: Mattias Rönnblom
To:
CC: , Morten Brørup , Stephen Hemminger , Konstantin Ananyev , David Marchand , Jerin Jacob , Luka Jankovic , Thomas Monjalon , Mattias Rönnblom , Konstantin Ananyev , Chengwen Feng
Subject: [PATCH v17 7/8] service: keep per-lcore state in lcore variable
Date: Fri, 25 Oct 2024 10:41:48 +0200
Message-ID: <20241025084149.873037-8-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com> <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

Replace static array of cache-aligned structs with an lcore variable,
to slightly benefit code simplicity and performance.

Signed-off-by: Mattias Rönnblom
Acked-by: Morten Brørup
Acked-by: Konstantin Ananyev
Acked-by: Chengwen Feng
Acked-by: Stephen Hemminger
---
PATCH v14:
 * Merge with bitset-related changes.

PATCH v7:
 * Update to match new FOREACH API.

RFC v6:
 * Remove a now-redundant lcore variable value memset().

RFC v5:
 * Fix lcore value pointer bug introduced by RFC v4.

RFC v4:
 * Remove strange-looking lcore value lookup potentially containing
   invalid lcore id. (Morten Brørup)
 * Replace misplaced tab with space.
(Morten Brørup) --- lib/eal/common/rte_service.c | 116 ++++++++++++++++++++--------------- 1 file changed, 65 insertions(+), 51 deletions(-) diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c index 324471e897..dad3150df9 100644 --- a/lib/eal/common/rte_service.c +++ b/lib/eal/common/rte_service.c @@ -11,6 +11,7 @@ #include #include +#include #include #include #include @@ -78,7 +79,7 @@ struct __rte_cache_aligned core_state { static uint32_t rte_service_count; static struct rte_service_spec_impl *rte_services; -static struct core_state *lcore_states; +static RTE_LCORE_VAR_HANDLE(struct core_state, lcore_states); static uint32_t rte_service_library_initialized; int32_t @@ -99,12 +100,8 @@ rte_service_init(void) goto fail_mem; } - lcore_states = rte_calloc("rte_service_core_states", RTE_MAX_LCORE, - sizeof(struct core_state), RTE_CACHE_LINE_SIZE); - if (!lcore_states) { - EAL_LOG(ERR, "error allocating core states array"); - goto fail_mem; - } + if (lcore_states == NULL) + RTE_LCORE_VAR_ALLOC(lcore_states); int i; struct rte_config *cfg = rte_eal_get_configuration(); @@ -120,7 +117,6 @@ rte_service_init(void) return 0; fail_mem: rte_free(rte_services); - rte_free(lcore_states); return -ENOMEM; } @@ -134,7 +130,6 @@ rte_service_finalize(void) rte_eal_mp_wait_lcore(); rte_free(rte_services); - rte_free(lcore_states); rte_service_library_initialized = 0; } @@ -284,7 +279,6 @@ rte_service_component_register(const struct rte_service_spec *spec, int32_t rte_service_component_unregister(uint32_t id) { - uint32_t i; struct rte_service_spec_impl *s; SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL); @@ -292,9 +286,11 @@ rte_service_component_unregister(uint32_t id) s->internal_flags &= ~(SERVICE_F_REGISTERED); + unsigned int lcore_id; + struct core_state *cs; /* clear the run-bit in all cores */ - for (i = 0; i < RTE_MAX_LCORE; i++) - rte_bitset_clear(lcore_states[i].mapped_services, id); + RTE_LCORE_VAR_FOREACH(lcore_id, cs, lcore_states) + 
rte_bitset_clear(cs->mapped_services, id); memset(&rte_services[id], 0, sizeof(struct rte_service_spec_impl)); @@ -463,7 +459,10 @@ rte_service_may_be_active(uint32_t id) return -EINVAL; for (i = 0; i < lcore_count; i++) { - if (rte_bitset_test(lcore_states[ids[i]].service_active_on_lcore, id)) + struct core_state *cs = + RTE_LCORE_VAR_LCORE(ids[i], lcore_states); + + if (rte_bitset_test(cs->service_active_on_lcore, id)) return 1; } @@ -473,7 +472,7 @@ rte_service_may_be_active(uint32_t id) int32_t rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t serialize_mt_unsafe) { - struct core_state *cs = &lcore_states[rte_lcore_id()]; + struct core_state *cs = RTE_LCORE_VAR(lcore_states); struct rte_service_spec_impl *s; SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL); @@ -496,8 +495,7 @@ static int32_t service_runner_func(void *arg) { RTE_SET_USED(arg); - const int lcore = rte_lcore_id(); - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR(lcore_states); rte_atomic_store_explicit(&cs->thread_active, 1, rte_memory_order_seq_cst); @@ -533,13 +531,15 @@ service_runner_func(void *arg) int32_t rte_service_lcore_may_be_active(uint32_t lcore) { - if (lcore >= RTE_MAX_LCORE || !lcore_states[lcore].is_service_core) + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); + + if (lcore >= RTE_MAX_LCORE || !cs->is_service_core) return -EINVAL; /* Load thread_active using ACQUIRE to avoid instructions dependent on * the result being re-ordered before this load completes. 
*/ - return rte_atomic_load_explicit(&lcore_states[lcore].thread_active, + return rte_atomic_load_explicit(&cs->thread_active, rte_memory_order_acquire); } @@ -547,9 +547,12 @@ int32_t rte_service_lcore_count(void) { int32_t count = 0; - uint32_t i; - for (i = 0; i < RTE_MAX_LCORE; i++) - count += lcore_states[i].is_service_core; + + unsigned int lcore_id; + struct core_state *cs; + RTE_LCORE_VAR_FOREACH(lcore_id, cs, lcore_states) + count += cs->is_service_core; + return count; } @@ -566,7 +569,8 @@ rte_service_lcore_list(uint32_t array[], uint32_t n) uint32_t i; uint32_t idx = 0; for (i = 0; i < RTE_MAX_LCORE; i++) { - struct core_state *cs = &lcore_states[i]; + struct core_state *cs = + RTE_LCORE_VAR_LCORE(i, lcore_states); if (cs->is_service_core) { array[idx] = i; idx++; @@ -582,7 +586,7 @@ rte_service_lcore_count_services(uint32_t lcore) if (lcore >= RTE_MAX_LCORE) return -EINVAL; - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); if (!cs->is_service_core) return -ENOTSUP; @@ -634,28 +638,30 @@ rte_service_start_with_defaults(void) static int32_t service_update(uint32_t sid, uint32_t lcore, uint32_t *set, uint32_t *enabled) { + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); + /* validate ID, or return error value */ if (!service_valid(sid) || lcore >= RTE_MAX_LCORE || - !lcore_states[lcore].is_service_core) + !cs->is_service_core) return -EINVAL; if (set) { - uint64_t lcore_mapped = rte_bitset_test(lcore_states[lcore].mapped_services, sid); + bool lcore_mapped = rte_bitset_test(cs->mapped_services, sid); if (*set && !lcore_mapped) { - rte_bitset_set(lcore_states[lcore].mapped_services, sid); + rte_bitset_set(cs->mapped_services, sid); rte_atomic_fetch_add_explicit(&rte_services[sid].num_mapped_cores, 1, rte_memory_order_relaxed); } if (!*set && lcore_mapped) { - rte_bitset_clear(lcore_states[lcore].mapped_services, sid); + rte_bitset_clear(cs->mapped_services, sid); 
rte_atomic_fetch_sub_explicit(&rte_services[sid].num_mapped_cores, 1, rte_memory_order_relaxed); } } if (enabled) - *enabled = rte_bitset_test(lcore_states[lcore].mapped_services, sid); + *enabled = rte_bitset_test(cs->mapped_services, sid); return 0; } @@ -683,13 +689,14 @@ set_lcore_state(uint32_t lcore, int32_t state) { /* mark core state in hugepage backed config */ struct rte_config *cfg = rte_eal_get_configuration(); + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); cfg->lcore_role[lcore] = state; /* mark state in process local lcore_config */ lcore_config[lcore].core_role = state; /* update per-lcore optimized state tracking */ - lcore_states[lcore].is_service_core = (state == ROLE_SERVICE); + cs->is_service_core = (state == ROLE_SERVICE); rte_eal_trace_service_lcore_state_change(lcore, state); } @@ -700,14 +707,16 @@ rte_service_lcore_reset_all(void) /* loop over cores, reset all mapped services */ uint32_t i; for (i = 0; i < RTE_MAX_LCORE; i++) { - if (lcore_states[i].is_service_core) { - rte_bitset_clear_all(lcore_states[i].mapped_services, RTE_SERVICE_NUM_MAX); + struct core_state *cs = RTE_LCORE_VAR_LCORE(i, lcore_states); + + if (cs->is_service_core) { + rte_bitset_clear_all(cs->mapped_services, RTE_SERVICE_NUM_MAX); set_lcore_state(i, ROLE_RTE); /* runstate act as guard variable Use * store-release memory order here to synchronize * with load-acquire in runstate read functions. 
*/ - rte_atomic_store_explicit(&lcore_states[i].runstate, + rte_atomic_store_explicit(&cs->runstate, RUNSTATE_STOPPED, rte_memory_order_release); } } @@ -723,17 +732,19 @@ rte_service_lcore_add(uint32_t lcore) { if (lcore >= RTE_MAX_LCORE) return -EINVAL; - if (lcore_states[lcore].is_service_core) + + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); + if (cs->is_service_core) return -EALREADY; set_lcore_state(lcore, ROLE_SERVICE); /* ensure that after adding a core the mask and state are defaults */ - rte_bitset_clear_all(lcore_states[lcore].mapped_services, RTE_SERVICE_NUM_MAX); + rte_bitset_clear_all(cs->mapped_services, RTE_SERVICE_NUM_MAX); /* Use store-release memory order here to synchronize with * load-acquire in runstate read functions. */ - rte_atomic_store_explicit(&lcore_states[lcore].runstate, RUNSTATE_STOPPED, + rte_atomic_store_explicit(&cs->runstate, RUNSTATE_STOPPED, rte_memory_order_release); return rte_eal_wait_lcore(lcore); @@ -745,7 +756,7 @@ rte_service_lcore_del(uint32_t lcore) if (lcore >= RTE_MAX_LCORE) return -EINVAL; - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); if (!cs->is_service_core) return -EINVAL; @@ -769,7 +780,7 @@ rte_service_lcore_start(uint32_t lcore) if (lcore >= RTE_MAX_LCORE) return -EINVAL; - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); if (!cs->is_service_core) return -EINVAL; @@ -799,6 +810,8 @@ rte_service_lcore_start(uint32_t lcore) int32_t rte_service_lcore_stop(uint32_t lcore) { + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); + if (lcore >= RTE_MAX_LCORE) return -EINVAL; @@ -806,12 +819,11 @@ rte_service_lcore_stop(uint32_t lcore) * memory order here to synchronize with store-release * in runstate update functions. 
*/ - if (rte_atomic_load_explicit(&lcore_states[lcore].runstate, rte_memory_order_acquire) == + if (rte_atomic_load_explicit(&cs->runstate, rte_memory_order_acquire) == RUNSTATE_STOPPED) return -EALREADY; uint32_t i; - struct core_state *cs = &lcore_states[lcore]; for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) { bool enabled = rte_bitset_test(cs->mapped_services, i); @@ -831,7 +843,7 @@ rte_service_lcore_stop(uint32_t lcore) /* Use store-release memory order here to synchronize with * load-acquire in runstate read functions. */ - rte_atomic_store_explicit(&lcore_states[lcore].runstate, RUNSTATE_STOPPED, + rte_atomic_store_explicit(&cs->runstate, RUNSTATE_STOPPED, rte_memory_order_release); rte_eal_trace_service_lcore_stop(lcore); @@ -842,7 +854,7 @@ rte_service_lcore_stop(uint32_t lcore) static uint64_t lcore_attr_get_loops(unsigned int lcore) { - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); return rte_atomic_load_explicit(&cs->loops, rte_memory_order_relaxed); } @@ -850,7 +862,7 @@ lcore_attr_get_loops(unsigned int lcore) static uint64_t lcore_attr_get_cycles(unsigned int lcore) { - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); return rte_atomic_load_explicit(&cs->cycles, rte_memory_order_relaxed); } @@ -858,7 +870,7 @@ lcore_attr_get_cycles(unsigned int lcore) static uint64_t lcore_attr_get_service_calls(uint32_t service_id, unsigned int lcore) { - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); return rte_atomic_load_explicit(&cs->service_stats[service_id].calls, rte_memory_order_relaxed); @@ -885,7 +897,7 @@ lcore_attr_get_service_error_calls(uint32_t service_id, unsigned int lcore) static uint64_t lcore_attr_get_service_cycles(uint32_t service_id, unsigned int lcore) { - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = 
RTE_LCORE_VAR_LCORE(lcore, lcore_states); return rte_atomic_load_explicit(&cs->service_stats[service_id].cycles, rte_memory_order_relaxed); @@ -901,7 +913,10 @@ attr_get(uint32_t id, lcore_attr_get_fun lcore_attr_get) uint64_t sum = 0; for (lcore = 0; lcore < RTE_MAX_LCORE; lcore++) { - if (lcore_states[lcore].is_service_core) + struct core_state *cs = + RTE_LCORE_VAR_LCORE(lcore, lcore_states); + + if (cs->is_service_core) sum += lcore_attr_get(id, lcore); } @@ -963,12 +978,11 @@ int32_t rte_service_lcore_attr_get(uint32_t lcore, uint32_t attr_id, uint64_t *attr_value) { - struct core_state *cs; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); if (lcore >= RTE_MAX_LCORE || !attr_value) return -EINVAL; - cs = &lcore_states[lcore]; if (!cs->is_service_core) return -ENOTSUP; @@ -993,7 +1007,8 @@ rte_service_attr_reset_all(uint32_t id) return -EINVAL; for (lcore = 0; lcore < RTE_MAX_LCORE; lcore++) { - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = + RTE_LCORE_VAR_LCORE(lcore, lcore_states); cs->service_stats[id] = (struct service_stats) {}; } @@ -1004,12 +1019,11 @@ rte_service_attr_reset_all(uint32_t id) int32_t rte_service_lcore_attr_reset_all(uint32_t lcore) { - struct core_state *cs; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); if (lcore >= RTE_MAX_LCORE) return -EINVAL; - cs = &lcore_states[lcore]; if (!cs->is_service_core) return -ENOTSUP; @@ -1044,7 +1058,7 @@ static void service_dump_calls_per_lcore(FILE *f, uint32_t lcore) { uint32_t i; - struct core_state *cs = &lcore_states[lcore]; + struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states); fprintf(f, "%02d\t", lcore); for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) { From patchwork Fri Oct 25 08:41:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Mattias_R=C3=B6nnblom?= X-Patchwork-Id: 147219 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: 
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
From: =?utf-8?q?Mattias_R=C3=B6nnblom?=
To:
CC: =?utf-8?q?Morten_Br=C3=B8rup?=, Stephen Hemminger,
 Konstantin Ananyev, David Marchand, Jerin Jacob, Luka Jankovic,
 Thomas Monjalon, =?utf-8?q?Mattias_R=C3=B6nnblom?=,
 Konstantin Ananyev, Chengwen Feng
Subject: [PATCH v17 8/8] eal: keep per-lcore power intrinsics state in lcore
 variable
Date: Fri, 25 Oct 2024 10:41:49 +0200
Message-ID: <20241025084149.873037-9-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
References: <20241023075302.869008-1-mattias.ronnblom@ericsson.com>
 <20241025084149.873037-1-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Keep per-lcore power intrinsics state in a lcore variable to reduce
cache working set size and avoid any CPU next-line-prefetching causing
false sharing.

Signed-off-by: Mattias Rönnblom
Acked-by: Morten Brørup
Acked-by: Konstantin Ananyev
Acked-by: Chengwen Feng
Acked-by: Stephen Hemminger
---
 lib/eal/x86/rte_power_intrinsics.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 6d9b64240c..98a2cbc611 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -6,6 +6,7 @@
 
 #include
 #include
+#include
 #include
 #include
 
@@ -14,10 +15,14 @@
 /*
  * Per-lcore structure holding current status of C0.2 sleeps.
  */
-static alignas(RTE_CACHE_LINE_SIZE) struct power_wait_status {
+struct power_wait_status {
 	rte_spinlock_t lock;
 	volatile void *monitor_addr; /**< NULL if not currently sleeping */
-} wait_status[RTE_MAX_LCORE];
+};
+
+RTE_LCORE_VAR_HANDLE(struct power_wait_status, wait_status);
+
+RTE_LCORE_VAR_INIT(wait_status);
 
 /*
  * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
@@ -172,7 +177,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (pmc->fn == NULL)
 		return -EINVAL;
 
-	s = &wait_status[lcore_id];
+	s = RTE_LCORE_VAR_LCORE(lcore_id, wait_status);
 
 	/* update sleep address */
 	rte_spinlock_lock(&s->lock);
@@ -264,7 +269,7 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
 	if (lcore_id >= RTE_MAX_LCORE)
 		return -EINVAL;
 
-	s = &wait_status[lcore_id];
+	s = RTE_LCORE_VAR_LCORE(lcore_id, wait_status);
 
 	/*
 	 * There is a race condition between sleep, wakeup and locking, but we
@@ -303,8 +308,8 @@ int
 rte_power_monitor_multi(const struct rte_power_monitor_cond pmc[],
 		const uint32_t num, const uint64_t tsc_timestamp)
 {
-	const unsigned int lcore_id = rte_lcore_id();
-	struct power_wait_status *s = &wait_status[lcore_id];
+	struct power_wait_status *s = RTE_LCORE_VAR(wait_status);
+
 	uint32_t i, rc;
 
 	/* check if supported */