From patchwork Fri Oct 11 09:49:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 145753 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id C669245AC4; Fri, 11 Oct 2024 11:50:04 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id B348140430; Fri, 11 Oct 2024 11:50:04 +0200 (CEST) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id 05E1440655 for ; Fri, 11 Oct 2024 11:50:01 +0200 (CEST) Received: from pps.filterd (m0431383.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 49B9MIfu014110; Fri, 11 Oct 2024 02:49:59 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=pfpt0220; bh=e H3HSaExMKVxTq5FU+4DEfBXoMorQzFr0mhVO+yW/Ss=; b=CWW06Gb9HvZEnG9RG 3kqYUpTGCarQAKD9IocNWLnCng8EYCcM9LFA7B15QDLYdmok8ZbiqVjUyvbwN4IJ YwHXS/Jx4ebOqiNHigBn7Sjiw3MxPuZWKcKaLo3PKAt1yODTOgdcutnmC2HVmAAe 7WSVbKoXLj7UGej9NvGiasBq4jzG5dd5hUm6EctMPa77u2rTU8KweYToot8ETHj+ 8X+BIft9Emu2TpZidWdHGNfJ9kV2Qu6owZ2R4+eVd2IuWJRfipX94jTkIlcaDHER tg/qg2X6VQgTP2Bug3aV9hMs5BVYbHUEnVyn5Ibo1gQuKIYQa3p9/onocJ6d9LIh Buarg== Received: from dc5-exch05.marvell.com ([199.233.59.128]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 4271b201ma-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 11 Oct 2024 02:49:58 -0700 (PDT) Received: from DC5-EXCH05.marvell.com (10.69.176.209) by DC5-EXCH05.marvell.com (10.69.176.209) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Fri, 11 Oct 2024 02:49:57 -0700 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH05.marvell.com (10.69.176.209) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Fri, 11 Oct 2024 02:49:57 -0700 Received: from cavium-optiplex-3070-BM15.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 414E23F704E; Fri, 11 Oct 2024 02:49:53 -0700 (PDT) From: Tomasz Duszynski To: , Thomas Monjalon CC: , , , , , , , , , , Subject: [PATCH v14 1/4] lib: add generic support for reading PMU events Date: Fri, 11 Oct 2024 11:49:41 +0200 Message-ID: <20241011094944.3586051-2-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241011094944.3586051-1-tduszynski@marvell.com> References: <20241009112308.2973903-1-tduszynski@marvell.com> <20241011094944.3586051-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: C0nadbjsLVyrzg2go20hCFvppNu4qP4J X-Proofpoint-ORIG-GUID: C0nadbjsLVyrzg2go20hCFvppNu4qP4J X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.687,Hydra:6.0.235,FMLib:17.0.607.475 definitions=2020-10-13_15,2020-10-13_02,2020-04-07_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for programming PMU counters and reading their values in runtime bypassing kernel completely. This is especially useful in cases where CPU cores are isolated i.e run dedicated tasks. In such cases one cannot use standard perf utility without sacrificing latency and performance. Signed-off-by: Tomasz Duszynski --- MAINTAINERS | 5 + app/test/meson.build | 1 + app/test/test_pmu.c | 53 +++ doc/api/doxy-api-index.md | 3 +- doc/api/doxy-api.conf.in | 1 + doc/guides/prog_guide/profile_app.rst | 29 ++ doc/guides/rel_notes/release_24_11.rst | 7 + lib/meson.build | 1 + lib/pmu/meson.build | 12 + lib/pmu/pmu_private.h | 32 ++ lib/pmu/rte_pmu.c | 463 +++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 227 ++++++++++++ lib/pmu/version.map | 14 + 13 files changed, 847 insertions(+), 1 deletion(-) create mode 100644 app/test/test_pmu.c create mode 100644 lib/pmu/meson.build create mode 100644 lib/pmu/pmu_private.h create mode 100644 lib/pmu/rte_pmu.c create mode 100644 lib/pmu/rte_pmu.h create mode 100644 lib/pmu/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 7d171e3d45..fbd46905b6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1811,6 +1811,11 @@ M: Nithin Dabilpuram M: Pavan Nikhilesh F: lib/node/ +PMU - EXPERIMENTAL +M: Tomasz Duszynski +F: lib/pmu/ +F: app/test/test_pmu* + Test Applications ----------------- diff --git a/app/test/meson.build b/app/test/meson.build index e29258e6ec..45f56d8aae 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -139,6 +139,7 @@ source_file_deps = { 'test_pmd_perf.c': ['ethdev', 'net'] + packet_burst_generator_deps, 'test_pmd_ring.c': ['net_ring', 'ethdev', 'bus_vdev'], 'test_pmd_ring_perf.c': ['ethdev', 'net_ring', 'bus_vdev'], + 'test_pmu.c': ['pmu'], 'test_power.c': ['power'], 'test_power_cpufreq.c': ['power'], 'test_power_intel_uncore.c': ['power'], diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c new file mode 100644 index 0000000000..79376ea2e8 --- /dev/null +++ b/app/test/test_pmu.c @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell International Ltd. + */ + +#include + +#include "test.h" + +static int +test_pmu_read(void) +{ + const char *name = NULL; + int tries = 10, event; + uint64_t val = 0; + + if (name == NULL) { + printf("PMU not supported on this arch\n"); + return TEST_SKIPPED; + } + + if (rte_pmu_init() == -ENOTSUP) { + printf("pmu_autotest only supported on Linux, skipping test\n"); + return TEST_SKIPPED; + } + if (rte_pmu_init() < 0) + return TEST_SKIPPED; + + event = rte_pmu_add_event(name); + while (tries--) + val += rte_pmu_read(event); + + rte_pmu_fini(); + + return val ? TEST_SUCCESS : TEST_FAILED; +} + +static struct unit_test_suite pmu_tests = { + .suite_name = "pmu autotest", + .setup = NULL, + .teardown = NULL, + .unit_test_cases = { + TEST_CASE(test_pmu_read), + TEST_CASES_END() + } +}; + +static int +test_pmu(void) +{ + return unit_test_suite_runner(&pmu_tests); +} + +REGISTER_FAST_TEST(pmu_autotest, true, true, test_pmu); diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index f9f0300126..805efc6520 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -237,7 +237,8 @@ The public API headers are grouped by topics: [log](@ref rte_log.h), [errno](@ref rte_errno.h), [trace](@ref rte_trace.h), - [trace_point](@ref rte_trace_point.h) + [trace_point](@ref rte_trace_point.h), + [pmu](@ref rte_pmu.h) - **misc**: [EAL config](@ref rte_eal.h), diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index a8823c046f..658490b6a2 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -69,6 +69,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/pdcp \ @TOPDIR@/lib/pdump \ @TOPDIR@/lib/pipeline \ + @TOPDIR@/lib/pmu \ @TOPDIR@/lib/port \ @TOPDIR@/lib/power \ @TOPDIR@/lib/ptr_compress \ diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index a6b5fb4d5e..854c515a61 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -7,6 +7,35 @@ Profile Your Application The following sections describe methods of profiling DPDK applications on different architectures. +Performance counter based profiling +----------------------------------- + +Majority of architectures support some performance monitoring unit (PMU). +Such unit provides programmable counters that monitor specific events. + +Different tools gather that information, like for example perf. +However, in some scenarios when CPU cores are isolated and run +dedicated tasks interrupting those tasks with perf may be undesirable. + +In such cases, an application can use the PMU library to read such events via ``rte_pmu_read()``. + +By default, userspace applications are not allowed to access PMU internals. That can be changed +by setting ``/sys/kernel/perf_event_paranoid`` to 2 (that should be a default value anyway) and +adding ``CAP_PERFMON`` capability to DPDK application. Please refer to +``Documentation/admin-guide/perf-security.rst`` under Linux sources for more information. Fairly +recent kernel, i.e >= 5.9, is advised too. + +As of now implementation imposes certain limitations: + +* Management APIs that normally return a non-negative value will return error (``-ENOTSUP``) while + ``rte_pmu_read()`` will return ``UINT64_MAX`` if running under unsupported operating system. + +* Only EAL lcores are supported + +* EAL lcores must not share a cpu + +* Each EAL lcore measures same group of events + Profiling on x86 ---------------- diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst index 53d8661365..06fa294787 100644 --- a/doc/guides/rel_notes/release_24_11.rst +++ b/doc/guides/rel_notes/release_24_11.rst @@ -150,6 +150,13 @@ New Features * Added independent enqueue feature. +* **Added PMU library.** + + Added a new performance monitoring unit (PMU) library which allows applications + to perform self monitoring activities without depending on external utilities like perf. + After integration with :doc:`../prog_guide/trace_lib` data gathered from hardware counters + can be stored in CTF format for further analysis. + Removed Items ------------- diff --git a/lib/meson.build b/lib/meson.build index 162287753f..cc7a4bd535 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -13,6 +13,7 @@ libraries = [ 'kvargs', # eal depends on kvargs 'argparse', 'telemetry', # basic info querying + 'pmu', 'eal', # everything depends on eal 'ptr_compress', 'ring', diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build new file mode 100644 index 0000000000..386232e5c7 --- /dev/null +++ b/lib/pmu/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(C) 2024 Marvell International Ltd. + +headers = files('rte_pmu.h') + +if not is_linux + subdir_done() +endif + +sources = files('rte_pmu.c') + +deps += ['log'] diff --git a/lib/pmu/pmu_private.h b/lib/pmu/pmu_private.h new file mode 100644 index 0000000000..d2b15615bf --- /dev/null +++ b/lib/pmu/pmu_private.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Marvell + */ + +#ifndef _PMU_PRIVATE_H_ +#define _PMU_PRIVATE_H_ + +/** + * Architecture specific PMU init callback. + * + * @return + * 0 in case of success, negative value otherwise. + */ +int +pmu_arch_init(void); + +/** + * Architecture specific PMU cleanup callback. + */ +void +pmu_arch_fini(void); + +/** + * Apply architecture specific settings to config before passing it to syscall. + * + * @param config + * Architecture specific event configuration. Consult kernel sources for available options. + */ +void +pmu_arch_fixup_config(uint64_t config[3]); + +#endif /* _PMU_PRIVATE_H_ */ diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c new file mode 100644 index 0000000000..3a861bca71 --- /dev/null +++ b/lib/pmu/rte_pmu.c @@ -0,0 +1,463 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell International Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include "pmu_private.h" + +#define EVENT_SOURCE_DEVICES_PATH "/sys/bus/event_source/devices" + +#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h))))) +#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m)) + +/* A structure describing an event */ +struct rte_pmu_event { + char *name; + unsigned int index; + TAILQ_ENTRY(rte_pmu_event) next; +}; + +RTE_DEFINE_PER_LCORE(struct rte_pmu_event_group, _event_group); +struct rte_pmu rte_pmu; + +/* + * Following __rte_weak functions provide default no-op. Architectures should override them if + * necessary. + */ + +int +__rte_weak pmu_arch_init(void) +{ + return 0; +} + +void +__rte_weak pmu_arch_fini(void) +{ +} + +void +__rte_weak pmu_arch_fixup_config(uint64_t __rte_unused config[3]) +{ +} + +static int +get_term_format(const char *name, int *num, uint64_t *mask) +{ + char path[PATH_MAX]; + char *config = NULL; + int high, low, ret; + FILE *fp; + + *num = *mask = 0; + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/format/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + errno = 0; + ret = fscanf(fp, "%m[^:]:%d-%d", &config, &low, &high); + if (ret < 2) { + ret = -ENODATA; + goto out; + } + if (errno) { + ret = -errno; + goto out; + } + + if (ret == 2) + high = low; + + *mask = GENMASK_ULL(high, low); + /* Last digit should be [012]. If last digit is missing 0 is implied. */ + *num = config[strlen(config) - 1]; + *num = isdigit(*num) ? *num - '0' : 0; + + ret = 0; +out: + free(config); + fclose(fp); + + return ret; +} + +static int +parse_event(char *buf, uint64_t config[3]) +{ + char *token, *term; + int num, ret, val; + uint64_t mask; + + config[0] = config[1] = config[2] = 0; + + token = strtok(buf, ","); + while (token) { + errno = 0; + /* = */ + ret = sscanf(token, "%m[^=]=%i", &term, &val); + if (ret < 1) + return -ENODATA; + if (errno) + return -errno; + if (ret == 1) + val = 1; + + ret = get_term_format(term, &num, &mask); + free(term); + if (ret) + return ret; + + config[num] |= FIELD_PREP(mask, val); + token = strtok(NULL, ","); + } + + return 0; +} + +static int +get_event_config(const char *name, uint64_t config[3]) +{ + char path[PATH_MAX], buf[BUFSIZ]; + FILE *fp; + int ret; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + ret = fread(buf, 1, sizeof(buf), fp); + if (ret == 0) { + fclose(fp); + + return -EINVAL; + } + fclose(fp); + buf[ret] = '\0'; + + return parse_event(buf, config); +} + +static int +do_perf_event_open(uint64_t config[3], int group_fd) +{ + struct perf_event_attr attr = { + .size = sizeof(struct perf_event_attr), + .type = PERF_TYPE_RAW, + .exclude_kernel = 1, + .exclude_hv = 1, + .disabled = 1, + .pinned = group_fd == -1, + }; + + pmu_arch_fixup_config(config); + + attr.config = config[0]; + attr.config1 = config[1]; + attr.config2 = config[2]; + + return syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0); +} + +static int +open_events(struct rte_pmu_event_group *group) +{ + struct rte_pmu_event *event; + uint64_t config[3]; + int num = 0, ret; + + /* group leader gets created first, with fd = -1 */ + group->fds[0] = -1; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + ret = get_event_config(event->name, config); + if (ret) + continue; + + ret = do_perf_event_open(config, group->fds[0]); + if (ret == -1) { + ret = -errno; + goto out; + } + + group->fds[event->index] = ret; + num++; + } + + return 0; +out: + for (--num; num >= 0; num--) { + close(group->fds[num]); + group->fds[num] = -1; + } + + + return ret; +} + +static int +mmap_events(struct rte_pmu_event_group *group) +{ + long page_size = sysconf(_SC_PAGE_SIZE); + unsigned int i; + void *addr; + int ret; + + for (i = 0; i < rte_pmu.num_group_events; i++) { + addr = mmap(0, page_size, PROT_READ, MAP_SHARED, group->fds[i], 0); + if (addr == MAP_FAILED) { + ret = -errno; + goto out; + } + + group->mmap_pages[i] = addr; + } + + return 0; +out: + for (; i; i--) { + munmap(group->mmap_pages[i - 1], page_size); + group->mmap_pages[i - 1] = NULL; + } + + return ret; +} + +static void +cleanup_events(struct rte_pmu_event_group *group) +{ + unsigned int i; + + if (group->fds[0] != -1) + ioctl(group->fds[0], PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP); + + for (i = 0; i < rte_pmu.num_group_events; i++) { + if (group->mmap_pages[i]) { + munmap(group->mmap_pages[i], sysconf(_SC_PAGE_SIZE)); + group->mmap_pages[i] = NULL; + } + + if (group->fds[i] != -1) { + close(group->fds[i]); + group->fds[i] = -1; + } + } + + group->enabled = false; +} + +int +__rte_pmu_enable_group(void) +{ + struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group); + int ret; + + if (rte_pmu.num_group_events == 0) + return -ENODEV; + + ret = open_events(group); + if (ret) + goto out; + + ret = mmap_events(group); + if (ret) + goto out; + + if (ioctl(group->fds[0], PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + if (ioctl(group->fds[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + rte_spinlock_lock(&rte_pmu.lock); + TAILQ_INSERT_TAIL(&rte_pmu.event_group_list, group, next); + rte_spinlock_unlock(&rte_pmu.lock); + group->enabled = true; + + return 0; + +out: + cleanup_events(group); + + return ret; +} + +static int +scan_pmus(void) +{ + char path[PATH_MAX]; + struct dirent *dent; + const char *name; + DIR *dirp; + + dirp = opendir(EVENT_SOURCE_DEVICES_PATH); + if (dirp == NULL) + return -errno; + + while ((dent = readdir(dirp))) { + name = dent->d_name; + if (name[0] == '.') + continue; + + /* sysfs entry should either contain cpus or be a cpu */ + if (!strcmp(name, "cpu")) + break; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/cpus", name); + if (access(path, F_OK) == 0) + break; + } + + if (dent) { + rte_pmu.name = strdup(name); + if (rte_pmu.name == NULL) { + closedir(dirp); + + return -ENOMEM; + } + } + + closedir(dirp); + + return rte_pmu.name ? 0 : -ENODEV; +} + +static struct rte_pmu_event * +new_event(const char *name) +{ + struct rte_pmu_event *event; + + event = calloc(1, sizeof(*event)); + if (event == NULL) + goto out; + + event->name = strdup(name); + if (event->name == NULL) { + free(event); + event = NULL; + } + +out: + return event; +} + +static void +free_event(struct rte_pmu_event *event) +{ + free(event->name); + free(event); +} + +int +rte_pmu_add_event(const char *name) +{ + struct rte_pmu_event *event; + char path[PATH_MAX]; + + if (rte_pmu.name == NULL) + return -ENODEV; + + if (rte_pmu.num_group_events + 1 >= RTE_MAX_NUM_GROUP_EVENTS) + return -ENOSPC; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + if (access(path, R_OK)) + return -ENODEV; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + if (!strcmp(event->name, name)) + return event->index; + continue; + } + + event = new_event(name); + if (event == NULL) + return -ENOMEM; + + event->index = rte_pmu.num_group_events++; + TAILQ_INSERT_TAIL(&rte_pmu.event_list, event, next); + + return event->index; +} + +int +rte_pmu_init(void) +{ + int ret; + + /* Allow calling init from multiple contexts within a single thread. This simplifies + * resource management a bit e.g in case fast-path tracepoint has already been enabled + * via command line but application doesn't care enough and performs init/fini again. + */ + if (rte_atomic_fetch_add_explicit(&rte_pmu.initialized, 1, rte_memory_order_seq_cst) != 0) + return 0; + + ret = scan_pmus(); + if (ret) + goto out; + + ret = pmu_arch_init(); + if (ret) + goto out; + + TAILQ_INIT(&rte_pmu.event_list); + TAILQ_INIT(&rte_pmu.event_group_list); + rte_spinlock_init(&rte_pmu.lock); + + return 0; +out: + free(rte_pmu.name); + rte_pmu.name = NULL; + + return ret; +} + +void +rte_pmu_fini(void) +{ + struct rte_pmu_event_group *group, *tmp_group; + struct rte_pmu_event *event, *tmp_event; + + /* cleanup once init count drops to zero */ + if (rte_atomic_fetch_sub_explicit(&rte_pmu.initialized, 1, + rte_memory_order_seq_cst) - 1 != 0) + return; + + RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event) { + TAILQ_REMOVE(&rte_pmu.event_list, event, next); + free_event(event); + } + + RTE_TAILQ_FOREACH_SAFE(group, &rte_pmu.event_group_list, next, tmp_group) { + TAILQ_REMOVE(&rte_pmu.event_group_list, group, next); + cleanup_events(group); + } + + pmu_arch_fini(); + free(rte_pmu.name); + rte_pmu.name = NULL; + rte_pmu.num_group_events = 0; +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h new file mode 100644 index 0000000000..09238ee33d --- /dev/null +++ b/lib/pmu/rte_pmu.h @@ -0,0 +1,227 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Marvell + */ + +#ifndef _RTE_PMU_H_ +#define _RTE_PMU_H_ + +/** + * @file + * + * PMU event tracing operations + * + * This file defines generic API and types necessary to setup PMU and + * read selected counters in runtime. + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include + +#include +#include + +#ifdef RTE_EXEC_ENV_LINUX + +#include + +#include +#include +#include + +/** Maximum number of events in a group */ +#define RTE_MAX_NUM_GROUP_EVENTS 8 + +/** + * A structure describing a group of events. + */ +struct __rte_cache_aligned rte_pmu_event_group { + /** array of user pages */ + struct perf_event_mmap_page *mmap_pages[RTE_MAX_NUM_GROUP_EVENTS]; + int fds[RTE_MAX_NUM_GROUP_EVENTS]; /**< array of event descriptors */ + bool enabled; /**< true if group was enabled on particular lcore */ + TAILQ_ENTRY(rte_pmu_event_group) next; /**< list entry */ +}; + +/** + * A PMU state container. + */ +struct rte_pmu { + char *name; /**< name of core PMU listed under /sys/bus/event_source/devices */ + rte_spinlock_t lock; /**< serialize access to event group list */ + TAILQ_HEAD(, rte_pmu_event_group) event_group_list; /**< list of event groups */ + unsigned int num_group_events; /**< number of events in a group */ + TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */ + unsigned int initialized; /**< initialization counter */ +}; + +/** lcore event group */ +RTE_DECLARE_PER_LCORE(struct rte_pmu_event_group, _event_group); + +/** PMU state container */ +extern struct rte_pmu rte_pmu; + +/** Each architecture supporting PMU needs to provide its own version */ +#ifndef rte_pmu_pmc_read +#define rte_pmu_pmc_read(index) ({ (void)(index); 0; }) +#endif + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Read PMU counter. + * + * @warning This should be not called directly. + * + * @param pc + * Pointer to the mmapped user page. + * @return + * Counter value read from hardware. + */ +__rte_experimental +static __rte_always_inline uint64_t +__rte_pmu_read_userpage(struct perf_event_mmap_page *pc) +{ +#define __RTE_PMU_READ_ONCE(x) (*(const volatile typeof(x) *)&(x)) + uint64_t width, offset; + uint32_t seq, index; + int64_t pmc; + + for (;;) { + seq = __RTE_PMU_READ_ONCE(pc->lock); + rte_compiler_barrier(); + index = __RTE_PMU_READ_ONCE(pc->index); + offset = __RTE_PMU_READ_ONCE(pc->offset); + width = __RTE_PMU_READ_ONCE(pc->pmc_width); + + /* index set to 0 means that particular counter cannot be used */ + if (likely(pc->cap_user_rdpmc && index)) { + pmc = rte_pmu_pmc_read(index - 1); + pmc <<= 64 - width; + pmc >>= 64 - width; + offset += pmc; + } + + rte_compiler_barrier(); + + if (likely(__RTE_PMU_READ_ONCE(pc->lock) == seq)) + return offset; + } + + return 0; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Enable group of events on the calling lcore. + * + * @warning This should be not called directly. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_experimental +int +__rte_pmu_enable_group(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Initialize PMU library. + * + * @warning This should be not called directly. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_init(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Finalize PMU library. This should be called after PMU counters are no longer being read. + */ +__rte_experimental +void +rte_pmu_fini(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add event to the group of enabled events. + * + * @param name + * Name of an event listed under /sys/bus/event_source/devices/pmu/events. + * @return + * Event index in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_add_event(const char *name); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Read hardware counter configured to count occurrences of an event. + * + * @param index + * Index of an event to be read. + * @return + * Event value read from register. In case of errors or lack of support + * 0 is returned. In other words, stream of zeros in a trace file + * indicates problem with reading particular PMU event register. + */ +__rte_experimental +static __rte_always_inline uint64_t +rte_pmu_read(unsigned int index) +{ + struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group); + int ret; + + if (unlikely(!rte_pmu.initialized)) + return 0; + + if (unlikely(!group->enabled)) { + ret = __rte_pmu_enable_group(); + if (ret) + return 0; + } + + if (unlikely(index >= rte_pmu.num_group_events)) + return 0; + + return __rte_pmu_read_userpage(group->mmap_pages[index]); +} + +#else /* !RTE_EXEC_ENV_LINUX */ + +__rte_experimental +static inline int rte_pmu_init(void) { return -ENOTSUP; } + +__rte_experimental +static inline void rte_pmu_fini(void) { } + +__rte_experimental +static inline int rte_pmu_add_event(const char *name __rte_unused) { return -ENOTSUP; } + +__rte_experimental +static inline uint64_t rte_pmu_read(unsigned int index __rte_unused) { return UINT64_MAX; } + +#endif /* RTE_EXEC_ENV_LINUX */ + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PMU_H_ */ diff --git a/lib/pmu/version.map b/lib/pmu/version.map new file mode 100644 index 0000000000..d7c80ce4ce --- /dev/null +++ b/lib/pmu/version.map @@ -0,0 +1,14 @@ +EXPERIMENTAL { + global: + + # added in 24.11 + __rte_pmu_enable_group; + per_lcore__event_group; + rte_pmu; + rte_pmu_add_event; + rte_pmu_fini; + rte_pmu_init; + rte_pmu_read; + + local: *; +}; From patchwork Fri Oct 11 09:49:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 145754 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id C93A145A6D; Fri, 11 Oct 2024 11:50:12 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1EB3F40655; Fri, 11 Oct 2024 11:50:08 +0200 (CEST) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id ACB9E4067A for ; Fri, 11 Oct 2024 11:50:06 +0200 (CEST) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 49B5umbd005710; Fri, 11 Oct 2024 02:50:03 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=pfpt0220; bh=b pYemAiOuLvS/evF6yGuWIwUImMmqPx/WmwsTie+4j0=; b=BxW92yOsZmWLg2Q8c vPg9YHvPUpWDLcItY+q5V7TrKpr8dXQ3b4vrKO7slyWKAzJyC7W1OPpKUPv72Xev bPsmpFqbhOXifBkR6Nk5wnH5w+Q4JxGRvvQCSK+UNI/Iy4I3w7/Z8zxyi6NQUsaY 9H1T9qD7wjJgIEf27mQoEWs42ygDzqnlucpSV8z0lRVfC+QuldN3IT1EYPqPOWLw MtHuGVydHORuBQPij7t3e/38qGIZA3hc1hHKD8HRgnnrG3dKlLq/avMu/MsOiwQi L64Z4+/tsd7PXdKCwzln28OtuGJKefKqT5JEHqKPVpoC4xgsqkE3YD1Qa80PuQU/ P6oZA== Received: from dc5-exch05.marvell.com ([199.233.59.128]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 426x9v8cev-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 11 Oct 2024 02:50:03 -0700 (PDT) Received: from DC5-EXCH05.marvell.com (10.69.176.209) by DC5-EXCH05.marvell.com (10.69.176.209) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Fri, 11 Oct 2024 02:50:01 -0700 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH05.marvell.com (10.69.176.209) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Fri, 11 Oct 2024 02:50:01 -0700 Received: from cavium-optiplex-3070-BM15.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 9AA553F704E; Fri, 11 Oct 2024 02:49:57 -0700 (PDT) From: Tomasz Duszynski To: , Wathsala Vithanage CC: , , , , , , , , , , , Subject: [PATCH v14 2/4] pmu: support reading ARM PMU events in runtime Date: Fri, 11 Oct 2024 11:49:42 +0200 Message-ID: <20241011094944.3586051-3-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241011094944.3586051-1-tduszynski@marvell.com> References: <20241009112308.2973903-1-tduszynski@marvell.com> <20241011094944.3586051-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: S515-FCkxXgc21FwKIXkTrvyayFXT3jL X-Proofpoint-GUID: S515-FCkxXgc21FwKIXkTrvyayFXT3jL X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.60.29 definitions=2024-09-06_09,2024-09-06_01,2024-09-02_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for reading ARM PMU events in runtime. Signed-off-by: Tomasz Duszynski --- app/test/test_pmu.c | 4 ++ lib/pmu/meson.build | 8 ++++ lib/pmu/pmu_arm64.c | 94 +++++++++++++++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 4 ++ lib/pmu/rte_pmu_pmc_arm64.h | 30 ++++++++++++ 5 files changed, 140 insertions(+) create mode 100644 lib/pmu/pmu_arm64.c create mode 100644 lib/pmu/rte_pmu_pmc_arm64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index 79376ea2e8..e0809a0f93 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -13,6 +13,10 @@ test_pmu_read(void) int tries = 10, event; uint64_t val = 0; +#if defined(RTE_ARCH_ARM64) + name = "cpu_cycles"; +#endif + if (name == NULL) { printf("PMU not supported on this arch\n"); return TEST_SKIPPED; diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index 386232e5c7..0d9270eaca 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -9,4 +9,12 @@ endif sources = files('rte_pmu.c') +indirect_headers += files( + 'rte_pmu_pmc_arm64.h', +) + +if dpdk_conf.has('RTE_ARCH_ARM64') + sources += files('pmu_arm64.c') +endif + deps += ['log'] diff --git a/lib/pmu/pmu_arm64.c b/lib/pmu/pmu_arm64.c new file mode 100644 index 0000000000..3b72009cff --- /dev/null +++ b/lib/pmu/pmu_arm64.c @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell International Ltd. + */ + +#include +#include +#include +#include + +#include +#include + +#include "pmu_private.h" + +#define PERF_USER_ACCESS_PATH "/proc/sys/kernel/perf_user_access" + +static int restore_uaccess; + +static int +read_attr_int(const char *path, int *val) +{ + char buf[BUFSIZ]; + int ret, fd; + + fd = open(path, O_RDONLY); + if (fd == -1) + return -errno; + + ret = read(fd, buf, sizeof(buf)); + if (ret == -1) { + close(fd); + + return -errno; + } + + *val = strtol(buf, NULL, 10); + close(fd); + + return 0; +} + +static int +write_attr_int(const char *path, int val) +{ + char buf[BUFSIZ]; + int num, ret, fd; + + fd = open(path, O_WRONLY); + if (fd == -1) + return -errno; + + num = snprintf(buf, sizeof(buf), "%d", val); + ret = write(fd, buf, num); + if (ret == -1) { + close(fd); + + return -errno; + } + + close(fd); + + return 0; +} + +int +pmu_arch_init(void) +{ + int ret; + + ret = read_attr_int(PERF_USER_ACCESS_PATH, &restore_uaccess); + if (ret) + return ret; + + /* user access already enabled */ + if (restore_uaccess == 1) + return 0; + + return write_attr_int(PERF_USER_ACCESS_PATH, 1); +} + +void +pmu_arch_fini(void) +{ + write_attr_int(PERF_USER_ACCESS_PATH, restore_uaccess); +} + +void +pmu_arch_fixup_config(uint64_t config[3]) +{ + /* select 64 bit counters */ + config[1] |= RTE_BIT64(0); + /* enable userspace access */ + config[1] |= RTE_BIT64(1); +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index 09238ee33d..771fad31d0 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -31,6 +31,10 @@ extern "C" { #include #include +#if defined(RTE_ARCH_ARM64) +#include "rte_pmu_pmc_arm64.h" +#endif + /** Maximum number of events in a group */ #define RTE_MAX_NUM_GROUP_EVENTS 8 diff --git a/lib/pmu/rte_pmu_pmc_arm64.h b/lib/pmu/rte_pmu_pmc_arm64.h new file mode 100644 index 0000000000..39165bbc20 --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_arm64.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Marvell. + */ +#ifndef _RTE_PMU_PMC_ARM64_H_ +#define _RTE_PMU_PMC_ARM64_H_ + +#include + +static __rte_always_inline uint64_t +rte_pmu_pmc_read(int index) +{ + uint64_t val; + + if (index == 31) { + /* CPU Cycles (0x11) must be read via pmccntr_el0 */ + asm volatile("mrs %0, pmccntr_el0" : "=r" (val)); + } else { + asm volatile( + "msr pmselr_el0, %x0\n" + "mrs %0, pmxevcntr_el0\n" + : "=r" (val) + : "rZ" (index) + ); + } + + return val; +} +#define rte_pmu_pmc_read rte_pmu_pmc_read + +#endif /* _RTE_PMU_PMC_ARM64_H_ */ From patchwork Fri Oct 11 09:49:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 145755 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0096545A6D; Fri, 11 Oct 2024 11:50:17 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2692040687; Fri, 11 Oct 2024 11:50:11 +0200 (CEST) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id D69FA4067A for ; Fri, 11 Oct 2024 11:50:09 +0200 (CEST) Received: from pps.filterd (m0431383.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 49B9MI5x014123; Fri, 11 Oct 2024 02:50:07 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=pfpt0220; bh=Y IcekUR8bh0m6r8x04PWs/xv4lPTHOKa0bwcCKY6RLQ=; b=Suj3okFVLXaBBcAxf zHwpV7tFiANI/bX4K9Dk0yRkz+oc/kHwjWQ2R/+mp4P48cjnZWgsWqm+HQGpN1wQ JBl88p/nhp+TLLi45VBu71KtPj7LJfvew3jNPIV/LjEUwCfssZad3kRFiNOdjeLZ D6MwHzzo/v5IhlzNrrKrHtd0bPx6ZbL0hOIIn1A6dDpmxGaoQ/tsCUi5yO6YMddK 4muVkzJbMToyJ9n2gUWX/vLoDiCqBiQOmlZ32PbUMtcJwwvcxOInEBwngG20CFrJ b7y1Y08t6jfRPPcyrbW7BnL2QVkcpb84BL5DpcYKscnqThBamkng38L1OuWnyKOi 4v5fA== Received: from dc6wp-exch02.marvell.com ([4.21.29.225]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 4271b201ne-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 11 Oct 2024 02:50:07 -0700 (PDT) Received: from DC6WP-EXCH02.marvell.com (10.76.176.209) by DC6WP-EXCH02.marvell.com (10.76.176.209) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Fri, 11 Oct 2024 02:50:06 -0700 Received: from maili.marvell.com (10.69.176.80) by DC6WP-EXCH02.marvell.com (10.76.176.209) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Fri, 11 Oct 2024 02:50:06 -0700 Received: from cavium-optiplex-3070-BM15.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 4496C3F7050; Fri, 11 Oct 2024 02:50:02 -0700 (PDT) From: Tomasz Duszynski To: CC: , , , , , , , , , , , Subject: [PATCH v14 3/4] pmu: support reading Intel x86_64 PMU events in runtime Date: Fri, 11 Oct 2024 11:49:43 +0200 Message-ID: <20241011094944.3586051-4-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241011094944.3586051-1-tduszynski@marvell.com> References: <20241009112308.2973903-1-tduszynski@marvell.com> <20241011094944.3586051-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: phP1A1LLxzoYOm1S6gKwDiHr5DLVeBS1 X-Proofpoint-ORIG-GUID: phP1A1LLxzoYOm1S6gKwDiHr5DLVeBS1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.687,Hydra:6.0.235,FMLib:17.0.607.475 definitions=2020-10-13_15,2020-10-13_02,2020-04-07_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for reading Intel x86_64 PMU events in runtime. Signed-off-by: Tomasz Duszynski --- app/test/test_pmu.c | 2 ++ lib/pmu/meson.build | 1 + lib/pmu/rte_pmu.h | 2 ++ lib/pmu/rte_pmu_pmc_x86_64.h | 24 ++++++++++++++++++++++++ 4 files changed, 29 insertions(+) create mode 100644 lib/pmu/rte_pmu_pmc_x86_64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index e0809a0f93..2d185574ca 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -15,6 +15,8 @@ test_pmu_read(void) #if defined(RTE_ARCH_ARM64) name = "cpu_cycles"; +#elif defined(RTE_ARCH_X86_64) + name = "cpu-cycles"; #endif if (name == NULL) { diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index 0d9270eaca..c2e99384e5 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -11,6 +11,7 @@ sources = files('rte_pmu.c') indirect_headers += files( 'rte_pmu_pmc_arm64.h', + 'rte_pmu_pmc_x86_64.h', ) if dpdk_conf.has('RTE_ARCH_ARM64') diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index 771fad31d0..248bb55024 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -33,6 +33,8 @@ extern "C" { #if defined(RTE_ARCH_ARM64) #include "rte_pmu_pmc_arm64.h" +#elif defined(RTE_ARCH_X86_64) +#include "rte_pmu_pmc_x86_64.h" #endif /** Maximum number of events in a group */ diff --git a/lib/pmu/rte_pmu_pmc_x86_64.h b/lib/pmu/rte_pmu_pmc_x86_64.h new file mode 100644 index 0000000000..5eb8f73c11 --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_x86_64.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2024 Marvell. + */ +#ifndef _RTE_PMU_PMC_X86_64_H_ +#define _RTE_PMU_PMC_X86_64_H_ + +#include + +static __rte_always_inline uint64_t +rte_pmu_pmc_read(int index) +{ + uint64_t low, high; + + asm volatile( + "rdpmc\n" + : "=a" (low), "=d" (high) + : "c" (index) + ); + + return low | (high << 32); +} +#define rte_pmu_pmc_read rte_pmu_pmc_read + +#endif /* _RTE_PMU_PMC_X86_64_H_ */ From patchwork Fri Oct 11 09:49:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 145756 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4567C45A6D; Fri, 11 Oct 2024 11:50:23 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2D5CD40648; Fri, 11 Oct 2024 11:50:20 +0200 (CEST) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id CE6404029A for ; Fri, 11 Oct 2024 11:50:18 +0200 (CEST) Received: from pps.filterd (m0431383.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 49B9MI68014123; Fri, 11 Oct 2024 02:50:16 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=pfpt0220; bh=+ bmSDzGPcWQdD3F/iVEM/XJHc/nSA2YkR69QeZc+c1Q=; b=KQG+mBpHlsBJHAAhH O2viTbDrsqms5RFOJmD7wAc8yOn/rBebMzgzo9gHW88W9bEX/vPUFPo277/RBa2G s4PHqilNN0DYyAmOUZee/F2lc5iPr3iQ1M7q07Fudin9/goSlzpIYXuHqF7KcyK7 J9725CRxqXoEY6h99V4B7i7bMvWQI0KefoHX+BKVML9iBGjgyhOWb8itkIQ0jIzU gMc1yStpSlTD7a1kFnvBZelmZVN2bkVxTYCRUYzQFn/XXeqsgz6KCnNGEMMtXH3F gMyjVHFONkkONH8m7e3EuxBpzag0nF7vHFf40/I1LwTKYi8qVNLIm9Q5XZIjzRE4 +x/+g== Received: from dc5-exch05.marvell.com ([199.233.59.128]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 4271b201nr-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 11 Oct 2024 02:50:15 -0700 (PDT) Received: from DC5-EXCH05.marvell.com (10.69.176.209) by DC5-EXCH05.marvell.com (10.69.176.209) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Fri, 11 Oct 2024 02:50:10 -0700 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH05.marvell.com (10.69.176.209) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Fri, 11 Oct 2024 02:50:10 -0700 Received: from cavium-optiplex-3070-BM15.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 87FD03F704F; Fri, 11 Oct 2024 02:50:06 -0700 (PDT) From: Tomasz Duszynski To: , Jerin Jacob , "Sunil Kumar Kori" , Tyler Retzlaff CC: , , , , , , , , , Subject: [PATCH v14 4/4] eal: add PMU support to tracing library Date: Fri, 11 Oct 2024 11:49:44 +0200 Message-ID: <20241011094944.3586051-5-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241011094944.3586051-1-tduszynski@marvell.com> References: <20241009112308.2973903-1-tduszynski@marvell.com> <20241011094944.3586051-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: 6APawc0aH9-MMJJ8oodnR67eA9BaUdCa X-Proofpoint-ORIG-GUID: 6APawc0aH9-MMJJ8oodnR67eA9BaUdCa X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.687,Hydra:6.0.235,FMLib:17.0.607.475 definitions=2020-10-13_15,2020-10-13_02,2020-04-07_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In order to profile app one needs to store significant amount of samples somewhere for an analysis later on. Since trace library supports storing data in a CTF format lets take advantage of that and add a dedicated PMU tracepoint. Signed-off-by: Tomasz Duszynski --- app/test/test_trace_perf.c | 4 ++ doc/guides/prog_guide/profile_app.rst | 5 ++ doc/guides/prog_guide/trace_lib.rst | 32 +++++++++++++ lib/eal/common/eal_common_trace.c | 7 ++- lib/eal/common/eal_common_trace_points.c | 5 ++ lib/eal/include/rte_eal_trace.h | 9 ++++ lib/eal/meson.build | 2 +- lib/eal/version.map | 3 ++ lib/pmu/rte_pmu.c | 61 ++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 18 +++++++ lib/pmu/version.map | 1 + 11 files changed, 145 insertions(+), 2 deletions(-) diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c index 8257cc02be..37af2037b4 100644 --- a/app/test/test_trace_perf.c +++ b/app/test/test_trace_perf.c @@ -114,6 +114,8 @@ worker_fn_##func(void *arg) \ #define GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666) #define GENERIC_STR rte_eal_trace_generic_str("hello world") #define VOID_FP app_dpdk_test_fp() +/* 0 corresponds first event passed via --trace= */ +#define READ_PMU rte_eal_trace_pmu_read(0) WORKER_DEFINE(GENERIC_VOID) WORKER_DEFINE(GENERIC_U64) @@ -122,6 +124,7 @@ WORKER_DEFINE(GENERIC_FLOAT) WORKER_DEFINE(GENERIC_DOUBLE) WORKER_DEFINE(GENERIC_STR) WORKER_DEFINE(VOID_FP) +WORKER_DEFINE(READ_PMU) static void run_test(const char *str, lcore_function_t f, struct test_data *data, size_t sz) @@ -174,6 +177,7 @@ test_trace_perf(void) run_test("double", worker_fn_GENERIC_DOUBLE, data, sz); run_test("string", worker_fn_GENERIC_STR, data, sz); run_test("void_fp", worker_fn_VOID_FP, data, sz); + run_test("read_pmu", worker_fn_READ_PMU, data, sz); rte_free(data); return TEST_SUCCESS; diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index 854c515a61..1ab6fb9eaa 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -36,6 +36,11 @@ As of now implementation imposes certain limitations: * Each EAL lcore measures same group of events +Alternatively tracing library can be used which offers dedicated tracepoint +``rte_eal_trace_pmu_event()``. + +Refer to :doc:`../prog_guide/trace_lib` for more details. + Profiling on x86 ---------------- diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst index d9b17abe90..378abccd72 100644 --- a/doc/guides/prog_guide/trace_lib.rst +++ b/doc/guides/prog_guide/trace_lib.rst @@ -46,6 +46,7 @@ DPDK tracing library features trace format and is compatible with ``LTTng``. For detailed information, refer to `Common Trace Format `_. +- Support reading PMU events on ARM64 and x86-64 (Intel) How to add a tracepoint? ------------------------ @@ -139,6 +140,37 @@ the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``. ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be enabled using the ``enable_trace_fp`` option for meson build. +PMU tracepoint +-------------- + +Performance monitoring unit (PMU) event values can be read from hardware +registers using predefined ``rte_pmu_read`` tracepoint. + +Tracing is enabled via ``--trace`` EAL option by passing both expression +matching PMU tracepoint name i.e ``lib.eal.pmu.read`` and expression +``e=ev1[,ev2,...]`` matching particular events:: + + --trace='.*pmu.read\|e=cpu_cycles,l1d_cache' + +Event names are available under ``/sys/bus/event_source/devices/PMU/events`` +directory, where ``PMU`` is a placeholder for either a ``cpu`` or a directory +containing ``cpus``. + +In contrary to other tracepoints this does not need any extra variables +added to source files. Instead, caller passes index which follows the order of +events specified via ``--trace`` parameter. In the following example index ``0`` +corresponds to ``cpu_cyclces`` while index ``1`` corresponds to ``l1d_cache``. + +.. code-block:: c + + ... + rte_eal_trace_pmu_read(0); + rte_eal_trace_pmu_read(1); + ... + +PMU tracing support must be explicitly enabled using the ``enable_trace_fp`` +option for meson build. + Event record mode ----------------- diff --git a/lib/eal/common/eal_common_trace.c b/lib/eal/common/eal_common_trace.c index 918f49bf4f..ffeb822fb7 100644 --- a/lib/eal/common/eal_common_trace.c +++ b/lib/eal/common/eal_common_trace.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include "eal_trace.h" @@ -72,8 +73,11 @@ eal_trace_init(void) goto free_meta; /* Apply global configurations */ - STAILQ_FOREACH(arg, &trace.args, next) + STAILQ_FOREACH(arg, &trace.args, next) { trace_args_apply(arg->val); + if (rte_pmu_init() == 0) + rte_pmu_add_events_by_pattern(arg->val); + } rte_trace_mode_set(trace.mode); @@ -89,6 +93,7 @@ eal_trace_init(void) void eal_trace_fini(void) { + rte_pmu_fini(); trace_mem_free(); trace_metadata_destroy(); eal_trace_args_free(); diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c index 0f1240ea3a..05f7ed83d5 100644 --- a/lib/eal/common/eal_common_trace_points.c +++ b/lib/eal/common/eal_common_trace_points.c @@ -100,3 +100,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable, lib.eal.intr.enable) RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable, lib.eal.intr.disable) + +#ifdef RTE_EXEC_ENV_LINUX +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read, + lib.eal.pmu.read) +#endif diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h index 9ad2112801..6c1ed8e8e3 100644 --- a/lib/eal/include/rte_eal_trace.h +++ b/lib/eal/include/rte_eal_trace.h @@ -11,6 +11,7 @@ * API for EAL trace support */ +#include #include #ifdef __cplusplus @@ -127,6 +128,14 @@ RTE_TRACE_POINT( #define RTE_EAL_TRACE_GENERIC_FUNC rte_eal_trace_generic_func(__func__) +RTE_TRACE_POINT_FP( + rte_eal_trace_pmu_read, + RTE_TRACE_POINT_ARGS(unsigned int index), + uint64_t val; + val = rte_pmu_read(index); + rte_trace_point_emit_u64(val); +) + #ifdef __cplusplus } #endif diff --git a/lib/eal/meson.build b/lib/eal/meson.build index e1d6c4cf17..2b1a1eb283 100644 --- a/lib/eal/meson.build +++ b/lib/eal/meson.build @@ -14,7 +14,7 @@ subdir(exec_env) subdir(arch_subdir) -deps += ['log', 'kvargs'] +deps += ['log', 'kvargs', 'pmu'] if not is_windows deps += ['telemetry'] endif diff --git a/lib/eal/version.map b/lib/eal/version.map index e3ff412683..17c56f6bf7 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -396,6 +396,9 @@ EXPERIMENTAL { # added in 24.03 rte_vfio_get_device_info; # WINDOWS_NO_EXPORT + + # added in 24.11 + __rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT }; INTERNAL { diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c index 3a861bca71..0fbff25c92 100644 --- a/lib/pmu/rte_pmu.c +++ b/lib/pmu/rte_pmu.c @@ -403,6 +403,67 @@ rte_pmu_add_event(const char *name) return event->index; } +static int +add_events(const char *pattern) +{ + char *token, *copy; + int ret = 0; + + copy = strdup(pattern); + if (copy == NULL) + return -ENOMEM; + + token = strtok(copy, ","); + while (token) { + ret = rte_pmu_add_event(token); + if (ret < 0) + break; + + token = strtok(NULL, ","); + } + + free(copy); + + return ret >= 0 ? 0 : ret; +} + +int +rte_pmu_add_events_by_pattern(const char *pattern) +{ + regmatch_t rmatch; + char buf[BUFSIZ]; + unsigned int num; + regex_t reg; + int ret; + + /* events are matched against occurrences of e=ev1[,ev2,..] pattern */ + ret = regcomp(®, "e=([_[:alnum:]-],?)+", REG_EXTENDED); + if (ret) + return -EINVAL; + + for (;;) { + if (regexec(®, pattern, 1, &rmatch, 0)) + break; + + num = rmatch.rm_eo - rmatch.rm_so; + if (num > sizeof(buf)) + num = sizeof(buf); + + /* skip e= pattern prefix */ + memcpy(buf, pattern + rmatch.rm_so + 2, num - 2); + buf[num - 2] = '\0'; + ret = add_events(buf); + if (ret) + break; + + pattern += rmatch.rm_eo; + } + + regfree(®); + + return ret; +} + int rte_pmu_init(void) { diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index 248bb55024..26264b3b8f 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -175,6 +175,20 @@ __rte_experimental int rte_pmu_add_event(const char *name); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add events matching pattern to the group of enabled events. + * + * @param pattern + * Pattern e=ev1[,ev2,...] matching events, where evX is a placeholder for an event listed under + * /sys/bus/event_source/devices/pmu/events. + */ +__rte_experimental +int +rte_pmu_add_events_by_pattern(const char *pattern); + /** * @warning * @b EXPERIMENTAL: this API may change without prior notice @@ -221,6 +235,10 @@ static inline void rte_pmu_fini(void) { } __rte_experimental static inline int rte_pmu_add_event(const char *name __rte_unused) { return -ENOTSUP; } +__rte_experimental +static inline int +rte_pmu_add_events_by_pattern(const char *pattern __rte_unused) { return -ENOTSUP; } + __rte_experimental static inline uint64_t rte_pmu_read(unsigned int index __rte_unused) { return UINT64_MAX; } diff --git a/lib/pmu/version.map b/lib/pmu/version.map index d7c80ce4ce..0c98209b4a 100644 --- a/lib/pmu/version.map +++ b/lib/pmu/version.map @@ -6,6 +6,7 @@ EXPERIMENTAL { per_lcore__event_group; rte_pmu; rte_pmu_add_event; + rte_pmu_add_events_by_pattern; rte_pmu_fini; rte_pmu_init; rte_pmu_read;