From patchwork Mon Feb 13 11:31:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 123785 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D8BDC41C88; Mon, 13 Feb 2023 12:32:33 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8FA5B41151; Mon, 13 Feb 2023 12:32:32 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id CFEBD41151 for ; Mon, 13 Feb 2023 12:32:30 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 31D8H4sD030769; Mon, 13 Feb 2023 03:32:28 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=pfpt0220; bh=c9t/U2G+mlcmOEsOmAZNdthrw2ffGRJZ1uceFdz0vHU=; b=comRInNwoG7IYfOUw/Nc2M8JhnBTeOpVMT+LN8lFzWDuNeGOnRET1cuCKmo+pcGiu9CP 8wf29csn8JFfwVNgY6HjimxyBgAin2/Jfj4dquAi8WetBIbsrzpqiQl+Ur/YB4/tHCxo c4NA/PdrdxEfY8yYTDpkKkU2IgwR/+l2HehiAvO4vKZRMURWEGz7hVda/C9KYZm2t6A1 uQFUlY4opR0tokxUS2/4RhRcRm5+NeiSJYSjlg3gd/44l2DhiW19OiCTpCnPVXaQUn/l ebdOxq1YJUDrVcxUbKcSTTsPh1YwN0wzD1Ez/piAfCj3+8ptwO95SkrO06NmfEeKxesB tw== Received: from dc5-exch01.marvell.com ([199.233.59.181]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3npbe1y5np-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Mon, 13 Feb 2023 03:32:27 -0800 Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.42; Mon, 13 Feb 2023 03:32:25 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.42 via Frontend Transport; Mon, 13 Feb 2023 03:32:25 -0800 Received: from cavium-DT10.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 8ADF73F704D; Mon, 13 Feb 2023 03:32:22 -0800 (PST) From: Tomasz Duszynski To: , Thomas Monjalon , Tomasz Duszynski CC: , , , , , , , Subject: [PATCH v10 1/4] lib: add generic support for reading PMU events Date: Mon, 13 Feb 2023 12:31:53 +0100 Message-ID: <20230213113156.385482-2-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230213113156.385482-1-tduszynski@marvell.com> References: <20230202124951.2915770-1-tduszynski@marvell.com> <20230213113156.385482-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: 0kJPHAxPdGZ4gAODkzctuNArJq_GSsA3 X-Proofpoint-ORIG-GUID: 0kJPHAxPdGZ4gAODkzctuNArJq_GSsA3 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.170.22 definitions=2023-02-13_06,2023-02-13_01,2023-02-09_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for programming PMU counters and reading their values in runtime bypassing kernel completely. This is especially useful in cases where CPU cores are isolated i.e run dedicated tasks. In such cases one cannot use standard perf utility without sacrificing latency and performance. Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- MAINTAINERS | 5 + app/test/meson.build | 2 + app/test/test_pmu.c | 62 ++++ doc/api/doxy-api-index.md | 3 +- doc/api/doxy-api.conf.in | 1 + doc/guides/prog_guide/profile_app.rst | 12 + doc/guides/rel_notes/release_23_03.rst | 7 + lib/meson.build | 1 + lib/pmu/meson.build | 13 + lib/pmu/pmu_private.h | 32 ++ lib/pmu/rte_pmu.c | 460 +++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 212 ++++++++++++ lib/pmu/version.map | 15 + 13 files changed, 824 insertions(+), 1 deletion(-) create mode 100644 app/test/test_pmu.c create mode 100644 lib/pmu/meson.build create mode 100644 lib/pmu/pmu_private.h create mode 100644 lib/pmu/rte_pmu.c create mode 100644 lib/pmu/rte_pmu.h create mode 100644 lib/pmu/version.map -- 2.34.1 diff --git a/MAINTAINERS b/MAINTAINERS index 3495946d0f..d37f242120 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1697,6 +1697,11 @@ M: Nithin Dabilpuram M: Pavan Nikhilesh F: lib/node/ +PMU - EXPERIMENTAL +M: Tomasz Duszynski +F: lib/pmu/ +F: app/test/test_pmu* + Test Applications ----------------- diff --git a/app/test/meson.build b/app/test/meson.build index f34d19e3c3..6b61b7fc32 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -111,6 +111,7 @@ test_sources = files( 'test_reciprocal_division_perf.c', 'test_red.c', 'test_pie.c', + 'test_pmu.c', 'test_reorder.c', 'test_rib.c', 'test_rib6.c', @@ -239,6 +240,7 @@ fast_tests = [ ['kni_autotest', false, true], ['kvargs_autotest', true, true], ['member_autotest', true, true], + ['pmu_autotest', true, true], ['power_cpufreq_autotest', false, true], ['power_autotest', true, true], ['power_kvm_vm_autotest', false, true], diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c new file mode 100644 index 0000000000..a64564b5f5 --- /dev/null +++ b/app/test/test_pmu.c @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. + */ + +#include "test.h" + +#ifndef RTE_EXEC_ENV_LINUX + +static int +test_pmu(void) +{ + printf("pmu_autotest only supported on Linux, skipping test\n"); + return TEST_SKIPPED; +} + +#else + +#include + +static int +test_pmu_read(void) +{ + const char *name = NULL; + int tries = 10, event; + uint64_t val = 0; + + if (name == NULL) { + printf("PMU not supported on this arch\n"); + return TEST_SKIPPED; + } + + if (rte_pmu_init() < 0) + return TEST_FAILED; + + event = rte_pmu_add_event(name); + while (tries--) + val += rte_pmu_read(event); + + rte_pmu_fini(); + + return val ? TEST_SUCCESS : TEST_FAILED; +} + +static struct unit_test_suite pmu_tests = { + .suite_name = "pmu autotest", + .setup = NULL, + .teardown = NULL, + .unit_test_cases = { + TEST_CASE(test_pmu_read), + TEST_CASES_END() + } +}; + +static int +test_pmu(void) +{ + return unit_test_suite_runner(&pmu_tests); +} + +#endif /* RTE_EXEC_ENV_LINUX */ + +REGISTER_TEST_COMMAND(pmu_autotest, test_pmu); diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index 2deec7ea19..a8e04a195d 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -223,7 +223,8 @@ The public API headers are grouped by topics: [log](@ref rte_log.h), [errno](@ref rte_errno.h), [trace](@ref rte_trace.h), - [trace_point](@ref rte_trace_point.h) + [trace_point](@ref rte_trace_point.h), + [pmu](@ref rte_pmu.h) - **misc**: [EAL config](@ref rte_eal.h), diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index e859426099..350b5a8c94 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -63,6 +63,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/pci \ @TOPDIR@/lib/pdump \ @TOPDIR@/lib/pipeline \ + @TOPDIR@/lib/pmu \ @TOPDIR@/lib/port \ @TOPDIR@/lib/power \ @TOPDIR@/lib/rawdev \ diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index 14292d4c25..89e38cd301 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -7,6 +7,18 @@ Profile Your Application The following sections describe methods of profiling DPDK applications on different architectures. +Performance counter based profiling +----------------------------------- + +Majority of architectures support some performance monitoring unit (PMU). +Such unit provides programmable counters that monitor specific events. + +Different tools gather that information, like for example perf. +However, in some scenarios when CPU cores are isolated and run +dedicated tasks interrupting those tasks with perf may be undesirable. + +In such cases, an application can use the PMU library to read such events via ``rte_pmu_read()``. + Profiling on x86 ---------------- diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst index ab998a5357..20622efe58 100644 --- a/doc/guides/rel_notes/release_23_03.rst +++ b/doc/guides/rel_notes/release_23_03.rst @@ -147,6 +147,13 @@ New Features * Added support to capture packets at each graph node with packet metadata and node name. +* **Added PMU library.** + + Added a new performance monitoring unit (PMU) library which allows applications + to perform self monitoring activities without depending on external utilities like perf. + After integration with :doc:`../prog_guide/trace_lib` data gathered from hardware counters + can be stored in CTF format for further analysis. + Removed Items ------------- diff --git a/lib/meson.build b/lib/meson.build index 450c061d2b..8a42d45d20 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -11,6 +11,7 @@ libraries = [ 'kvargs', # eal depends on kvargs 'telemetry', # basic info querying + 'pmu', 'eal', # everything depends on eal 'ring', 'rcu', # rcu depends on ring diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build new file mode 100644 index 0000000000..a4160b494e --- /dev/null +++ b/lib/pmu/meson.build @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(C) 2023 Marvell International Ltd. + +if not is_linux + build = false + reason = 'only supported on Linux' + subdir_done() +endif + +includes = [global_inc] + +sources = files('rte_pmu.c') +headers = files('rte_pmu.h') diff --git a/lib/pmu/pmu_private.h b/lib/pmu/pmu_private.h new file mode 100644 index 0000000000..b9f8c1ddc8 --- /dev/null +++ b/lib/pmu/pmu_private.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell + */ + +#ifndef _PMU_PRIVATE_H_ +#define _PMU_PRIVATE_H_ + +/** + * Architecture specific PMU init callback. + * + * @return + * 0 in case of success, negative value otherwise. + */ +int +pmu_arch_init(void); + +/** + * Architecture specific PMU cleanup callback. + */ +void +pmu_arch_fini(void); + +/** + * Apply architecture specific settings to config before passing it to syscall. + * + * @param config + * Architecture specific event configuration. Consult kernel sources for available options. + */ +void +pmu_arch_fixup_config(uint64_t config[3]); + +#endif /* _PMU_PRIVATE_H_ */ diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c new file mode 100644 index 0000000000..950f999cb7 --- /dev/null +++ b/lib/pmu/rte_pmu.c @@ -0,0 +1,460 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pmu_private.h" + +#define EVENT_SOURCE_DEVICES_PATH "/sys/bus/event_source/devices" + +#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h))))) +#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m)) + +RTE_DEFINE_PER_LCORE(struct rte_pmu_event_group, _event_group); +struct rte_pmu rte_pmu; + +/* + * Following __rte_weak functions provide default no-op. Architectures should override them if + * necessary. + */ + +int +__rte_weak pmu_arch_init(void) +{ + return 0; +} + +void +__rte_weak pmu_arch_fini(void) +{ +} + +void +__rte_weak pmu_arch_fixup_config(uint64_t __rte_unused config[3]) +{ +} + +static int +get_term_format(const char *name, int *num, uint64_t *mask) +{ + char path[PATH_MAX]; + char *config = NULL; + int high, low, ret; + FILE *fp; + + *num = *mask = 0; + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/format/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + errno = 0; + ret = fscanf(fp, "%m[^:]:%d-%d", &config, &low, &high); + if (ret < 2) { + ret = -ENODATA; + goto out; + } + if (errno) { + ret = -errno; + goto out; + } + + if (ret == 2) + high = low; + + *mask = GENMASK_ULL(high, low); + /* Last digit should be [012]. If last digit is missing 0 is implied. */ + *num = config[strlen(config) - 1]; + *num = isdigit(*num) ? *num - '0' : 0; + + ret = 0; +out: + free(config); + fclose(fp); + + return ret; +} + +static int +parse_event(char *buf, uint64_t config[3]) +{ + char *token, *term; + int num, ret, val; + uint64_t mask; + + config[0] = config[1] = config[2] = 0; + + token = strtok(buf, ","); + while (token) { + errno = 0; + /* = */ + ret = sscanf(token, "%m[^=]=%i", &term, &val); + if (ret < 1) + return -ENODATA; + if (errno) + return -errno; + if (ret == 1) + val = 1; + + ret = get_term_format(term, &num, &mask); + free(term); + if (ret) + return ret; + + config[num] |= FIELD_PREP(mask, val); + token = strtok(NULL, ","); + } + + return 0; +} + +static int +get_event_config(const char *name, uint64_t config[3]) +{ + char path[PATH_MAX], buf[BUFSIZ]; + FILE *fp; + int ret; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + ret = fread(buf, 1, sizeof(buf), fp); + if (ret == 0) { + fclose(fp); + + return -EINVAL; + } + fclose(fp); + buf[ret] = '\0'; + + return parse_event(buf, config); +} + +static int +do_perf_event_open(uint64_t config[3], int group_fd) +{ + struct perf_event_attr attr = { + .size = sizeof(struct perf_event_attr), + .type = PERF_TYPE_RAW, + .exclude_kernel = 1, + .exclude_hv = 1, + .disabled = 1, + }; + + pmu_arch_fixup_config(config); + + attr.config = config[0]; + attr.config1 = config[1]; + attr.config2 = config[2]; + + return syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0); +} + +static int +open_events(struct rte_pmu_event_group *group) +{ + struct rte_pmu_event *event; + uint64_t config[3]; + int num = 0, ret; + + /* group leader gets created first, with fd = -1 */ + group->fds[0] = -1; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + ret = get_event_config(event->name, config); + if (ret) + continue; + + ret = do_perf_event_open(config, group->fds[0]); + if (ret == -1) { + ret = -errno; + goto out; + } + + group->fds[event->index] = ret; + num++; + } + + return 0; +out: + for (--num; num >= 0; num--) { + close(group->fds[num]); + group->fds[num] = -1; + } + + + return ret; +} + +static int +mmap_events(struct rte_pmu_event_group *group) +{ + long page_size = sysconf(_SC_PAGE_SIZE); + unsigned int i; + void *addr; + int ret; + + for (i = 0; i < rte_pmu.num_group_events; i++) { + addr = mmap(0, page_size, PROT_READ, MAP_SHARED, group->fds[i], 0); + if (addr == MAP_FAILED) { + ret = -errno; + goto out; + } + + group->mmap_pages[i] = addr; + if (!group->mmap_pages[i]->cap_user_rdpmc) { + ret = -EPERM; + goto out; + } + } + + return 0; +out: + for (; i; i--) { + munmap(group->mmap_pages[i - 1], page_size); + group->mmap_pages[i - 1] = NULL; + } + + return ret; +} + +static void +cleanup_events(struct rte_pmu_event_group *group) +{ + unsigned int i; + + if (group->fds[0] != -1) + ioctl(group->fds[0], PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP); + + for (i = 0; i < rte_pmu.num_group_events; i++) { + if (group->mmap_pages[i]) { + munmap(group->mmap_pages[i], sysconf(_SC_PAGE_SIZE)); + group->mmap_pages[i] = NULL; + } + + if (group->fds[i] != -1) { + close(group->fds[i]); + group->fds[i] = -1; + } + } + + group->enabled = false; +} + +int +__rte_pmu_enable_group(void) +{ + struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group); + int ret; + + if (rte_pmu.num_group_events == 0) + return -ENODEV; + + ret = open_events(group); + if (ret) + goto out; + + ret = mmap_events(group); + if (ret) + goto out; + + if (ioctl(group->fds[0], PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + if (ioctl(group->fds[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + rte_spinlock_lock(&rte_pmu.lock); + TAILQ_INSERT_TAIL(&rte_pmu.event_group_list, group, next); + rte_spinlock_unlock(&rte_pmu.lock); + group->enabled = true; + + return 0; + +out: + cleanup_events(group); + + return ret; +} + +static int +scan_pmus(void) +{ + char path[PATH_MAX]; + struct dirent *dent; + const char *name; + DIR *dirp; + + dirp = opendir(EVENT_SOURCE_DEVICES_PATH); + if (dirp == NULL) + return -errno; + + while ((dent = readdir(dirp))) { + name = dent->d_name; + if (name[0] == '.') + continue; + + /* sysfs entry should either contain cpus or be a cpu */ + if (!strcmp(name, "cpu")) + break; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/cpus", name); + if (access(path, F_OK) == 0) + break; + } + + if (dent) { + rte_pmu.name = strdup(name); + if (rte_pmu.name == NULL) { + closedir(dirp); + + return -ENOMEM; + } + } + + closedir(dirp); + + return rte_pmu.name ? 0 : -ENODEV; +} + +static struct rte_pmu_event * +new_event(const char *name) +{ + struct rte_pmu_event *event; + + event = calloc(1, sizeof(*event)); + if (event == NULL) + goto out; + + event->name = strdup(name); + if (event->name == NULL) { + free(event); + event = NULL; + } + +out: + return event; +} + +static void +free_event(struct rte_pmu_event *event) +{ + free(event->name); + free(event); +} + +int +rte_pmu_add_event(const char *name) +{ + struct rte_pmu_event *event; + char path[PATH_MAX]; + + if (rte_pmu.name == NULL) + return -ENODEV; + + if (rte_pmu.num_group_events + 1 >= MAX_NUM_GROUP_EVENTS) + return -ENOSPC; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + if (access(path, R_OK)) + return -ENODEV; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + if (!strcmp(event->name, name)) + return event->index; + continue; + } + + event = new_event(name); + if (event == NULL) + return -ENOMEM; + + event->index = rte_pmu.num_group_events++; + TAILQ_INSERT_TAIL(&rte_pmu.event_list, event, next); + + return event->index; +} + +int +rte_pmu_init(void) +{ + int ret; + + /* Allow calling init from multiple contexts within a single thread. This simplifies + * resource management a bit e.g in case fast-path tracepoint has already been enabled + * via command line but application doesn't care enough and performs init/fini again. + */ + if (rte_pmu.initialized != 0) { + rte_pmu.initialized++; + return 0; + } + + ret = scan_pmus(); + if (ret) + goto out; + + ret = pmu_arch_init(); + if (ret) + goto out; + + TAILQ_INIT(&rte_pmu.event_list); + TAILQ_INIT(&rte_pmu.event_group_list); + rte_spinlock_init(&rte_pmu.lock); + rte_pmu.initialized = 1; + + return 0; +out: + free(rte_pmu.name); + rte_pmu.name = NULL; + + return ret; +} + +void +rte_pmu_fini(void) +{ + struct rte_pmu_event_group *group, *tmp_group; + struct rte_pmu_event *event, *tmp_event; + + /* cleanup once init count drops to zero */ + if (rte_pmu.initialized == 0 || --rte_pmu.initialized != 0) + return; + + RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event) { + TAILQ_REMOVE(&rte_pmu.event_list, event, next); + free_event(event); + } + + RTE_TAILQ_FOREACH_SAFE(group, &rte_pmu.event_group_list, next, tmp_group) { + TAILQ_REMOVE(&rte_pmu.event_group_list, group, next); + cleanup_events(group); + } + + pmu_arch_fini(); + free(rte_pmu.name); + rte_pmu.name = NULL; + rte_pmu.num_group_events = 0; +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h new file mode 100644 index 0000000000..6b664c3336 --- /dev/null +++ b/lib/pmu/rte_pmu.h @@ -0,0 +1,212 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell + */ + +#ifndef _RTE_PMU_H_ +#define _RTE_PMU_H_ + +/** + * @file + * + * PMU event tracing operations + * + * This file defines generic API and types necessary to setup PMU and + * read selected counters in runtime. + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include + +#include +#include +#include +#include +#include + +/** Maximum number of events in a group */ +#define MAX_NUM_GROUP_EVENTS 8 + +/** + * A structure describing a group of events. + */ +struct rte_pmu_event_group { + struct perf_event_mmap_page *mmap_pages[MAX_NUM_GROUP_EVENTS]; /**< array of user pages */ + int fds[MAX_NUM_GROUP_EVENTS]; /**< array of event descriptors */ + bool enabled; /**< true if group was enabled on particular lcore */ + TAILQ_ENTRY(rte_pmu_event_group) next; /**< list entry */ +} __rte_cache_aligned; + +/** + * A structure describing an event. + */ +struct rte_pmu_event { + char *name; /**< name of an event */ + unsigned int index; /**< event index into fds/mmap_pages */ + TAILQ_ENTRY(rte_pmu_event) next; /**< list entry */ +}; + +/** + * A PMU state container. + */ +struct rte_pmu { + char *name; /**< name of core PMU listed under /sys/bus/event_source/devices */ + rte_spinlock_t lock; /**< serialize access to event group list */ + TAILQ_HEAD(, rte_pmu_event_group) event_group_list; /**< list of event groups */ + unsigned int num_group_events; /**< number of events in a group */ + TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */ + unsigned int initialized; /**< initialization counter */ +}; + +/** lcore event group */ +RTE_DECLARE_PER_LCORE(struct rte_pmu_event_group, _event_group); + +/** PMU state container */ +extern struct rte_pmu rte_pmu; + +/** Each architecture supporting PMU needs to provide its own version */ +#ifndef rte_pmu_pmc_read +#define rte_pmu_pmc_read(index) ({ 0; }) +#endif + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Read PMU counter. + * + * @warning This should be not called directly. + * + * @param pc + * Pointer to the mmapped user page. + * @return + * Counter value read from hardware. + */ +static __rte_always_inline uint64_t +__rte_pmu_read_userpage(struct perf_event_mmap_page *pc) +{ + uint64_t width, offset; + uint32_t seq, index; + int64_t pmc; + + for (;;) { + seq = pc->lock; + rte_compiler_barrier(); + index = pc->index; + offset = pc->offset; + width = pc->pmc_width; + + /* index set to 0 means that particular counter cannot be used */ + if (likely(pc->cap_user_rdpmc && index)) { + pmc = rte_pmu_pmc_read(index - 1); + pmc <<= 64 - width; + pmc >>= 64 - width; + offset += pmc; + } + + rte_compiler_barrier(); + + if (likely(pc->lock == seq)) + return offset; + } + + return 0; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Enable group of events on the calling lcore. + * + * @warning This should be not called directly. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_experimental +int +__rte_pmu_enable_group(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Initialize PMU library. + * + * @warning This should be not called directly. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_init(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Finalize PMU library. This should be called after PMU counters are no longer being read. + */ +__rte_experimental +void +rte_pmu_fini(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add event to the group of enabled events. + * + * @param name + * Name of an event listed under /sys/bus/event_source/devices/pmu/events. + * @return + * Event index in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_add_event(const char *name); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Read hardware counter configured to count occurrences of an event. + * + * @param index + * Index of an event to be read. + * @return + * Event value read from register. In case of errors or lack of support + * 0 is returned. In other words, stream of zeros in a trace file + * indicates problem with reading particular PMU event register. + */ +__rte_experimental +static __rte_always_inline uint64_t +rte_pmu_read(unsigned int index) +{ + struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group); + int ret; + + if (unlikely(!rte_pmu.initialized)) + return 0; + + if (unlikely(!group->enabled)) { + ret = __rte_pmu_enable_group(); + if (ret) + return 0; + } + + if (unlikely(index >= rte_pmu.num_group_events)) + return 0; + + return __rte_pmu_read_userpage(group->mmap_pages[index]); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PMU_H_ */ diff --git a/lib/pmu/version.map b/lib/pmu/version.map new file mode 100644 index 0000000000..39a4f279c1 --- /dev/null +++ b/lib/pmu/version.map @@ -0,0 +1,15 @@ +DPDK_23 { + local: *; +}; + +EXPERIMENTAL { + global: + + __rte_pmu_enable_group; + per_lcore__event_group; + rte_pmu; + rte_pmu_add_event; + rte_pmu_fini; + rte_pmu_init; + rte_pmu_read; +}; From patchwork Mon Feb 13 11:31:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 123786 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 6C50C41C88; Mon, 13 Feb 2023 12:32:43 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 4B3E942D0D; Mon, 13 Feb 2023 12:32:36 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id 9C02642D13 for ; Mon, 13 Feb 2023 12:32:34 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 31D8HJdM031336; Mon, 13 Feb 2023 03:32:32 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=pfpt0220; bh=bYCxMHLxjksrb5IP8q65C4mfg5gEVHbHjYKQmN0grcc=; b=VpHdkVzSOVlbr27yujvZ6tpo1W2pkDiHrIRMY3mgdDLjRcf1/8f7EuFz50rXIf1LP/yM wLKTuIi82B3PUv3dD3ewG+1q+1jf1hqUsVGn9qKjxSpkNiafdpOkmWrvpenyggqwjPBf 4VeXdeEHTgZCurf36H3OZApvjxncSRr/qy3urTHJeGviDi6cdRZnuTlPZetwM62V1Fmz NGxi1la8vmWBIloyKltcscrzdP/rlSF+LZJKMyGh/BqR3MIOuyq4vXLdRZlCqlAsM4uv kusbpPj2q92X1fXCrk6gGVPvRTG9qPT+t9i1N+rCmqeUXM8fjV5GCD6hZogUvfxL07WF mA== Received: from dc5-exch01.marvell.com ([199.233.59.181]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3npbe1y5nx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Mon, 13 Feb 2023 03:32:31 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.42; Mon, 13 Feb 2023 03:32:29 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.42 via Frontend Transport; Mon, 13 Feb 2023 03:32:29 -0800 Received: from cavium-DT10.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 5F4FD3F706F; Mon, 13 Feb 2023 03:32:26 -0800 (PST) From: Tomasz Duszynski To: , Tomasz Duszynski , Ruifeng Wang CC: , , , , , , , , Subject: [PATCH v10 2/4] pmu: support reading ARM PMU events in runtime Date: Mon, 13 Feb 2023 12:31:54 +0100 Message-ID: <20230213113156.385482-3-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230213113156.385482-1-tduszynski@marvell.com> References: <20230202124951.2915770-1-tduszynski@marvell.com> <20230213113156.385482-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: BYblHkNXJXIC3x9dR1E6YXOvkII8tilz X-Proofpoint-ORIG-GUID: BYblHkNXJXIC3x9dR1E6YXOvkII8tilz X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.170.22 definitions=2023-02-13_06,2023-02-13_01,2023-02-09_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for reading ARM PMU events in runtime. Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup Acked-by: Ruifeng Wang --- app/test/test_pmu.c | 4 ++ lib/pmu/meson.build | 7 +++ lib/pmu/pmu_arm64.c | 94 +++++++++++++++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 4 ++ lib/pmu/rte_pmu_pmc_arm64.h | 30 ++++++++++++ 5 files changed, 139 insertions(+) create mode 100644 lib/pmu/pmu_arm64.c create mode 100644 lib/pmu/rte_pmu_pmc_arm64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index a64564b5f5..7550223dc0 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -24,6 +24,10 @@ test_pmu_read(void) int tries = 10, event; uint64_t val = 0; +#if defined(RTE_ARCH_ARM64) + name = "cpu_cycles"; +#endif + if (name == NULL) { printf("PMU not supported on this arch\n"); return TEST_SKIPPED; diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index a4160b494e..e857681137 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -11,3 +11,10 @@ includes = [global_inc] sources = files('rte_pmu.c') headers = files('rte_pmu.h') +indirect_headers += files( + 'rte_pmu_pmc_arm64.h', +) + +if dpdk_conf.has('RTE_ARCH_ARM64') + sources += files('pmu_arm64.c') +endif diff --git a/lib/pmu/pmu_arm64.c b/lib/pmu/pmu_arm64.c new file mode 100644 index 0000000000..9e15727948 --- /dev/null +++ b/lib/pmu/pmu_arm64.c @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. + */ + +#include +#include +#include +#include + +#include +#include + +#include "pmu_private.h" + +#define PERF_USER_ACCESS_PATH "/proc/sys/kernel/perf_user_access" + +static int restore_uaccess; + +static int +read_attr_int(const char *path, int *val) +{ + char buf[BUFSIZ]; + int ret, fd; + + fd = open(path, O_RDONLY); + if (fd == -1) + return -errno; + + ret = read(fd, buf, sizeof(buf)); + if (ret == -1) { + close(fd); + + return -errno; + } + + *val = strtol(buf, NULL, 10); + close(fd); + + return 0; +} + +static int +write_attr_int(const char *path, int val) +{ + char buf[BUFSIZ]; + int num, ret, fd; + + fd = open(path, O_WRONLY); + if (fd == -1) + return -errno; + + num = snprintf(buf, sizeof(buf), "%d", val); + ret = write(fd, buf, num); + if (ret == -1) { + close(fd); + + return -errno; + } + + close(fd); + + return 0; +} + +int +pmu_arch_init(void) +{ + int ret; + + ret = read_attr_int(PERF_USER_ACCESS_PATH, &restore_uaccess); + if (ret) + return ret; + + /* user access already enabled */ + if (restore_uaccess == 1) + return 0; + + return write_attr_int(PERF_USER_ACCESS_PATH, 1); +} + +void +pmu_arch_fini(void) +{ + write_attr_int(PERF_USER_ACCESS_PATH, restore_uaccess); +} + +void +pmu_arch_fixup_config(uint64_t config[3]) +{ + /* select 64 bit counters */ + config[1] |= RTE_BIT64(0); + /* enable userspace access */ + config[1] |= RTE_BIT64(1); +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index 6b664c3336..bcc8e3b22d 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -26,6 +26,10 @@ extern "C" { #include #include +#if defined(RTE_ARCH_ARM64) +#include "rte_pmu_pmc_arm64.h" +#endif + /** Maximum number of events in a group */ #define MAX_NUM_GROUP_EVENTS 8 diff --git a/lib/pmu/rte_pmu_pmc_arm64.h b/lib/pmu/rte_pmu_pmc_arm64.h new file mode 100644 index 0000000000..10648f0c5f --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_arm64.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell. + */ +#ifndef _RTE_PMU_PMC_ARM64_H_ +#define _RTE_PMU_PMC_ARM64_H_ + +#include + +static __rte_always_inline uint64_t +rte_pmu_pmc_read(int index) +{ + uint64_t val; + + if (index == 31) { + /* CPU Cycles (0x11) must be read via pmccntr_el0 */ + asm volatile("mrs %0, pmccntr_el0" : "=r" (val)); + } else { + asm volatile( + "msr pmselr_el0, %x0\n" + "mrs %0, pmxevcntr_el0\n" + : "=r" (val) + : "rZ" (index) + ); + } + + return val; +} +#define rte_pmu_pmc_read rte_pmu_pmc_read + +#endif /* _RTE_PMU_PMC_ARM64_H_ */ From patchwork Mon Feb 13 11:31:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 123787 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id C465341C88; Mon, 13 Feb 2023 12:32:49 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 843FC42D1A; Mon, 13 Feb 2023 12:32:39 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id 7E13142D17 for ; Mon, 13 Feb 2023 12:32:38 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 31D8HJdQ031336; Mon, 13 Feb 2023 03:32:35 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=pfpt0220; bh=vNQcoFx7kZE2Bjh+xiVlYd1k1LDktrrstIAae4F9Hgs=; b=QCckyatmG7HiTEOROJWhOX0MsgzcF9YP9/QtEHGYdd8MkFzTs7OG4g6Wb27nRNtTMA86 mkiV+T/BV6SXDJe0KNN39u8IUMYEUH6jZ5vi0VYGgn/0hQr2OH5ncjDoRu+2CNuR+R6W z4ycMyd8wr3g3IrHGSMw5WDy1fteAM8OlaitYbNK3DDIKs+PDfCRhXSA3akEG6t/aE48 s/iDLDEifZuRZCKigdpx2xo7oHEwBvt15wK1kON4BAWXRDAQiPfKbOynQeY78DJ97k+T MnpLGuByzYG1dmjUhgbj3qmNqQvQxcuzo9SFFq3p3ibxD6KohPWqZ/5i9rGMF0p4WhKt 5Q== Received: from dc5-exch02.marvell.com ([199.233.59.182]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3npbe1y5pj-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Mon, 13 Feb 2023 03:32:35 -0800 Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.42; Mon, 13 Feb 2023 03:32:33 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.42 via Frontend Transport; Mon, 13 Feb 2023 03:32:33 -0800 Received: from cavium-DT10.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 644E53F704D; Mon, 13 Feb 2023 03:32:30 -0800 (PST) From: Tomasz Duszynski To: , Tomasz Duszynski CC: , , , , , , , , Subject: [PATCH v10 3/4] pmu: support reading Intel x86_64 PMU events in runtime Date: Mon, 13 Feb 2023 12:31:55 +0100 Message-ID: <20230213113156.385482-4-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230213113156.385482-1-tduszynski@marvell.com> References: <20230202124951.2915770-1-tduszynski@marvell.com> <20230213113156.385482-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: ZHnu9NgIgowhvamolhs6ZB0oSlSL2Qf9 X-Proofpoint-ORIG-GUID: ZHnu9NgIgowhvamolhs6ZB0oSlSL2Qf9 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.170.22 definitions=2023-02-13_06,2023-02-13_01,2023-02-09_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for reading Intel x86_64 PMU events in runtime. Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- app/test/test_pmu.c | 2 ++ lib/pmu/meson.build | 1 + lib/pmu/rte_pmu.h | 2 ++ lib/pmu/rte_pmu_pmc_x86_64.h | 24 ++++++++++++++++++++++++ 4 files changed, 29 insertions(+) create mode 100644 lib/pmu/rte_pmu_pmc_x86_64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index 7550223dc0..cf6de85d88 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -26,6 +26,8 @@ test_pmu_read(void) #if defined(RTE_ARCH_ARM64) name = "cpu_cycles"; +#elif defined(RTE_ARCH_X86_64) + name = "cpu-cycles"; #endif if (name == NULL) { diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index e857681137..5b92e5c4e3 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -13,6 +13,7 @@ sources = files('rte_pmu.c') headers = files('rte_pmu.h') indirect_headers += files( 'rte_pmu_pmc_arm64.h', + 'rte_pmu_pmc_x86_64.h', ) if dpdk_conf.has('RTE_ARCH_ARM64') diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index bcc8e3b22d..b1d1c17bc5 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -28,6 +28,8 @@ extern "C" { #if defined(RTE_ARCH_ARM64) #include "rte_pmu_pmc_arm64.h" +#elif defined(RTE_ARCH_X86_64) +#include "rte_pmu_pmc_x86_64.h" #endif /** Maximum number of events in a group */ diff --git a/lib/pmu/rte_pmu_pmc_x86_64.h b/lib/pmu/rte_pmu_pmc_x86_64.h new file mode 100644 index 0000000000..7b67466960 --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_x86_64.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell. + */ +#ifndef _RTE_PMU_PMC_X86_64_H_ +#define _RTE_PMU_PMC_X86_64_H_ + +#include + +static __rte_always_inline uint64_t +rte_pmu_pmc_read(int index) +{ + uint64_t low, high; + + asm volatile( + "rdpmc\n" + : "=a" (low), "=d" (high) + : "c" (index) + ); + + return low | (high << 32); +} +#define rte_pmu_pmc_read rte_pmu_pmc_read + +#endif /* _RTE_PMU_PMC_X86_64_H_ */ From patchwork Mon Feb 13 11:31:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 123788 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 457D441C88; Mon, 13 Feb 2023 12:32:55 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id CA46D42D2D; Mon, 13 Feb 2023 12:32:43 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id AB66142D17 for ; Mon, 13 Feb 2023 12:32:42 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 31D8GS1K027784; Mon, 13 Feb 2023 03:32:40 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=pfpt0220; bh=x89aeO2WldMH7UlROninllCpgj7e5GK8UmF9lk1zpDk=; b=JZAA4C5tgEzYCVZpOq2Eyk3x+9P9XXRkrTsoeF3DtaY0d5/XESIVwH8Iaoi9piWG87Qo 8DMl/ogQE/ydDw0Fm8cWP6upBMn+X/g593wkSUvPzlKqJHU3U77dYkdABSXqJtPjSL1Y TKC6xtyo9qorLQXSpS9al8GKojITzO0SQZcJ0yuoUcjt/R/vRSwf2QJG3MtYaYegNlOo rOyYrg+q0TzxDK7X4QrAjjfcU9DtABOS/o1udkdv/TWdLLiDIrYCJwqfrVMyD9aBH66A bckBFCz5HfkqtGJry2VoJWlB3A+ZlesYB1s2DiiI1l5AVQtojQkcRGJVxuuM9xE7OpWy Hw== Received: from dc5-exch01.marvell.com ([199.233.59.181]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3npbe1y5pw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Mon, 13 Feb 2023 03:32:39 -0800 Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.42; Mon, 13 Feb 2023 03:32:37 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.42 via Frontend Transport; Mon, 13 Feb 2023 03:32:37 -0800 Received: from cavium-DT10.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 32A573F706B; Mon, 13 Feb 2023 03:32:33 -0800 (PST) From: Tomasz Duszynski To: , Jerin Jacob , Sunil Kumar Kori , Tomasz Duszynski CC: , , , , , , , Subject: [PATCH v10 4/4] eal: add PMU support to tracing library Date: Mon, 13 Feb 2023 12:31:56 +0100 Message-ID: <20230213113156.385482-5-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230213113156.385482-1-tduszynski@marvell.com> References: <20230202124951.2915770-1-tduszynski@marvell.com> <20230213113156.385482-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-GUID: Kt3t1bpjBFn1hUYpA49xzkUVI_LEzgN_ X-Proofpoint-ORIG-GUID: Kt3t1bpjBFn1hUYpA49xzkUVI_LEzgN_ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.170.22 definitions=2023-02-13_06,2023-02-13_01,2023-02-09_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In order to profile app one needs to store significant amount of samples somewhere for an analysis latern on. Since trace library supports storing data in a CTF format lets take adventage of that and add a dedicated PMU tracepoint. Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- app/test/test_trace_perf.c | 10 ++++ doc/guides/prog_guide/profile_app.rst | 5 ++ doc/guides/prog_guide/trace_lib.rst | 32 +++++++++++++ lib/eal/common/eal_common_trace.c | 13 ++++- lib/eal/common/eal_common_trace_points.c | 5 ++ lib/eal/include/rte_eal_trace.h | 13 +++++ lib/eal/meson.build | 3 ++ lib/eal/version.map | 1 + lib/pmu/rte_pmu.c | 61 ++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 14 ++++++ lib/pmu/version.map | 1 + 11 files changed, 157 insertions(+), 1 deletion(-) diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c index 46ae7d8074..f1929f2734 100644 --- a/app/test/test_trace_perf.c +++ b/app/test/test_trace_perf.c @@ -114,6 +114,10 @@ worker_fn_##func(void *arg) \ #define GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666) #define GENERIC_STR rte_eal_trace_generic_str("hello world") #define VOID_FP app_dpdk_test_fp() +#ifdef RTE_EXEC_ENV_LINUX +/* 0 corresponds first event passed via --trace= */ +#define READ_PMU rte_eal_trace_pmu_read(0) +#endif WORKER_DEFINE(GENERIC_VOID) WORKER_DEFINE(GENERIC_U64) @@ -122,6 +126,9 @@ WORKER_DEFINE(GENERIC_FLOAT) WORKER_DEFINE(GENERIC_DOUBLE) WORKER_DEFINE(GENERIC_STR) WORKER_DEFINE(VOID_FP) +#ifdef RTE_EXEC_ENV_LINUX +WORKER_DEFINE(READ_PMU) +#endif static void run_test(const char *str, lcore_function_t f, struct test_data *data, size_t sz) @@ -174,6 +181,9 @@ test_trace_perf(void) run_test("double", worker_fn_GENERIC_DOUBLE, data, sz); run_test("string", worker_fn_GENERIC_STR, data, sz); run_test("void_fp", worker_fn_VOID_FP, data, sz); +#ifdef RTE_EXEC_ENV_LINUX + run_test("read_pmu", worker_fn_READ_PMU, data, sz); +#endif rte_free(data); return TEST_SUCCESS; diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index 89e38cd301..c4dfe85c3b 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -19,6 +19,11 @@ dedicated tasks interrupting those tasks with perf may be undesirable. In such cases, an application can use the PMU library to read such events via ``rte_pmu_read()``. +Alternatively tracing library can be used which offers dedicated tracepoint +``rte_eal_trace_pmu_event()``. + +Refer to :doc:`../prog_guide/trace_lib` for more details. + Profiling on x86 ---------------- diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst index 3e0ea5835c..9c81936e35 100644 --- a/doc/guides/prog_guide/trace_lib.rst +++ b/doc/guides/prog_guide/trace_lib.rst @@ -46,6 +46,7 @@ DPDK tracing library features trace format and is compatible with ``LTTng``. For detailed information, refer to `Common Trace Format `_. +- Support reading PMU events on ARM64 and x86-64 (Intel) How to add a tracepoint? ------------------------ @@ -137,6 +138,37 @@ the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``. ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be enabled using the ``enable_trace_fp`` option for meson build. +PMU tracepoint +-------------- + +Performance monitoring unit (PMU) event values can be read from hardware +registers using predefined ``rte_pmu_read`` tracepoint. + +Tracing is enabled via ``--trace`` EAL option by passing both expression +matching PMU tracepoint name i.e ``lib.eal.pmu.read`` and expression +``e=ev1[,ev2,...]`` matching particular events:: + + --trace='.*pmu.read\|e=cpu_cycles,l1d_cache' + +Event names are available under ``/sys/bus/event_source/devices/PMU/events`` +directory, where ``PMU`` is a placeholder for either a ``cpu`` or a directory +containing ``cpus``. + +In contrary to other tracepoints this does not need any extra variables +added to source files. Instead, caller passes index which follows the order of +events specified via ``--trace`` parameter. In the following example index ``0`` +corresponds to ``cpu_cyclces`` while index ``1`` corresponds to ``l1d_cache``. + +.. code-block:: c + + ... + rte_eal_trace_pmu_read(0); + rte_eal_trace_pmu_read(1); + ... + +PMU tracing support must be explicitly enabled using the ``enable_trace_fp`` +option for meson build. + Event record mode ----------------- diff --git a/lib/eal/common/eal_common_trace.c b/lib/eal/common/eal_common_trace.c index 75162b722d..8796052d0c 100644 --- a/lib/eal/common/eal_common_trace.c +++ b/lib/eal/common/eal_common_trace.c @@ -11,6 +11,9 @@ #include #include #include +#ifdef RTE_EXEC_ENV_LINUX +#include +#endif #include #include "eal_trace.h" @@ -71,8 +74,13 @@ eal_trace_init(void) goto free_meta; /* Apply global configurations */ - STAILQ_FOREACH(arg, &trace.args, next) + STAILQ_FOREACH(arg, &trace.args, next) { trace_args_apply(arg->val); +#ifdef RTE_EXEC_ENV_LINUX + if (rte_pmu_init() == 0) + rte_pmu_add_events_by_pattern(arg->val); +#endif + } rte_trace_mode_set(trace.mode); @@ -88,6 +96,9 @@ eal_trace_init(void) void eal_trace_fini(void) { +#ifdef RTE_EXEC_ENV_LINUX + rte_pmu_fini(); +#endif trace_mem_free(); trace_metadata_destroy(); eal_trace_args_free(); diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c index 051f89809c..9d6faa19ed 100644 --- a/lib/eal/common/eal_common_trace_points.c +++ b/lib/eal/common/eal_common_trace_points.c @@ -77,3 +77,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable, lib.eal.intr.enable) RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable, lib.eal.intr.disable) + +#ifdef RTE_EXEC_ENV_LINUX +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read, + lib.eal.pmu.read) +#endif diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h index 6f5c022558..c7da83c480 100644 --- a/lib/eal/include/rte_eal_trace.h +++ b/lib/eal/include/rte_eal_trace.h @@ -17,6 +17,9 @@ extern "C" { #include #include +#ifdef RTE_EXEC_ENV_LINUX +#include +#endif #include #include "eal_interrupts.h" @@ -285,6 +288,16 @@ RTE_TRACE_POINT( rte_trace_point_emit_string(cpuset); ) +#ifdef RTE_EXEC_ENV_LINUX +RTE_TRACE_POINT_FP( + rte_eal_trace_pmu_read, + RTE_TRACE_POINT_ARGS(unsigned int index), + uint64_t val; + val = rte_pmu_read(index); + rte_trace_point_emit_u64(val); +) +#endif + #ifdef __cplusplus } #endif diff --git a/lib/eal/meson.build b/lib/eal/meson.build index 056beb9461..f5865dbcd9 100644 --- a/lib/eal/meson.build +++ b/lib/eal/meson.build @@ -26,6 +26,9 @@ deps += ['kvargs'] if not is_windows deps += ['telemetry'] endif +if is_linux + deps += ['pmu'] +endif if dpdk_conf.has('RTE_USE_LIBBSD') ext_deps += libbsd endif diff --git a/lib/eal/version.map b/lib/eal/version.map index 2ae57ee78a..01e7a099d2 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -441,6 +441,7 @@ EXPERIMENTAL { rte_thread_join; # added in 23.03 + __rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT rte_lcore_register_usage_cb; rte_thread_create_control; rte_thread_set_name; diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c index 950f999cb7..862edcb1e3 100644 --- a/lib/pmu/rte_pmu.c +++ b/lib/pmu/rte_pmu.c @@ -398,6 +398,67 @@ rte_pmu_add_event(const char *name) return event->index; } +static int +add_events(const char *pattern) +{ + char *token, *copy; + int ret = 0; + + copy = strdup(pattern); + if (copy == NULL) + return -ENOMEM; + + token = strtok(copy, ","); + while (token) { + ret = rte_pmu_add_event(token); + if (ret < 0) + break; + + token = strtok(NULL, ","); + } + + free(copy); + + return ret >= 0 ? 0 : ret; +} + +int +rte_pmu_add_events_by_pattern(const char *pattern) +{ + regmatch_t rmatch; + char buf[BUFSIZ]; + unsigned int num; + regex_t reg; + int ret; + + /* events are matched against occurrences of e=ev1[,ev2,..] pattern */ + ret = regcomp(®, "e=([_[:alnum:]-],?)+", REG_EXTENDED); + if (ret) + return -EINVAL; + + for (;;) { + if (regexec(®, pattern, 1, &rmatch, 0)) + break; + + num = rmatch.rm_eo - rmatch.rm_so; + if (num > sizeof(buf)) + num = sizeof(buf); + + /* skip e= pattern prefix */ + memcpy(buf, pattern + rmatch.rm_so + 2, num - 2); + buf[num - 2] = '\0'; + ret = add_events(buf); + if (ret) + break; + + pattern += rmatch.rm_eo; + } + + regfree(®); + + return ret; +} + int rte_pmu_init(void) { diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index b1d1c17bc5..e1c3bb5e56 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -176,6 +176,20 @@ __rte_experimental int rte_pmu_add_event(const char *name); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add events matching pattern to the group of enabled events. + * + * @param pattern + * Pattern e=ev1[,ev2,...] matching events, where evX is a placeholder for an event listed under + * /sys/bus/event_source/devices/pmu/events. + */ +__rte_experimental +int +rte_pmu_add_events_by_pattern(const char *pattern); + /** * @warning * @b EXPERIMENTAL: this API may change without prior notice diff --git a/lib/pmu/version.map b/lib/pmu/version.map index 39a4f279c1..e16b3ff009 100644 --- a/lib/pmu/version.map +++ b/lib/pmu/version.map @@ -9,6 +9,7 @@ EXPERIMENTAL { per_lcore__event_group; rte_pmu; rte_pmu_add_event; + rte_pmu_add_events_by_pattern; rte_pmu_fini; rte_pmu_init; rte_pmu_read;