From patchwork Thu Feb 16 17:54:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 124086
X-Patchwork-Delegate: david.marchand@redhat.com
From: Tomasz Duszynski
To: Thomas Monjalon, Tomasz Duszynski
Subject: [PATCH v11 1/4] lib: add generic support for reading PMU events
Date: Thu, 16 Feb 2023 18:54:59 +0100
Message-ID: <20230216175502.3164820-2-tduszynski@marvell.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230216175502.3164820-1-tduszynski@marvell.com>
References: <20230213113156.385482-1-tduszynski@marvell.com> <20230216175502.3164820-1-tduszynski@marvell.com>

Add support for programming PMU counters and reading their values at
runtime, bypassing the kernel completely.

This is especially useful in cases where CPU cores are isolated, i.e. run
dedicated tasks. In such cases one cannot use the standard perf utility
without sacrificing latency and performance.

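As a rough illustration of the event-programming step (a sketch, not code
from this series): each term of a sysfs event description, such as
event=0x11, is packed into one of the three perf config words according to
the matching file under format/. The helper name apply_term() and the
fixed-size buffer are assumptions made for this example; get_term_format()
in the patch reads the format text from sysfs and uses the %m allocation
modifier instead.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same bit helpers as defined in rte_pmu.c. */
#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h)))))
#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m))

/*
 * Hypothetical helper: apply one term value to config[], given the text of
 * its format file (e.g. "config:0-7" or "config1:8-15").
 * Returns 0 on success, -1 if the format text cannot be parsed.
 */
static int
apply_term(const char *format, uint64_t val, uint64_t config[3])
{
	char reg[16];
	int low, high, num, ret;

	ret = sscanf(format, "%15[^:]:%d-%d", reg, &low, &high);
	if (ret < 2)
		return -1;
	if (ret == 2)
		high = low; /* single-bit field such as "config:23" */
	if (low < 0 || high > 63 || low > high)
		return -1;

	/* Trailing digit selects config1/config2; no digit means config. */
	num = (unsigned char)reg[strlen(reg) - 1];
	num = isdigit(num) ? num - '0' : 0;
	if (num > 2)
		return -1;

	config[num] |= FIELD_PREP(GENMASK_ULL(high, low), val);

	return 0;
}
```

Unlike the patch, the sketch range-checks the parsed bit positions and the
config index before touching config[]; whether the library needs the same
checks depends on how much the sysfs contents are trusted.
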
Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- MAINTAINERS | 5 + app/test/meson.build | 2 + app/test/test_pmu.c | 62 ++++ doc/api/doxy-api-index.md | 3 +- doc/api/doxy-api.conf.in | 1 + doc/guides/prog_guide/profile_app.rst | 12 + doc/guides/rel_notes/release_23_03.rst | 7 + lib/meson.build | 1 + lib/pmu/meson.build | 13 + lib/pmu/pmu_private.h | 32 ++ lib/pmu/rte_pmu.c | 460 +++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 212 ++++++++++++ lib/pmu/version.map | 15 + 13 files changed, 824 insertions(+), 1 deletion(-) create mode 100644 app/test/test_pmu.c create mode 100644 lib/pmu/meson.build create mode 100644 lib/pmu/pmu_private.h create mode 100644 lib/pmu/rte_pmu.c create mode 100644 lib/pmu/rte_pmu.h create mode 100644 lib/pmu/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 3495946d0f..d37f242120 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1697,6 +1697,11 @@ M: Nithin Dabilpuram M: Pavan Nikhilesh F: lib/node/ +PMU - EXPERIMENTAL +M: Tomasz Duszynski +F: lib/pmu/ +F: app/test/test_pmu* + Test Applications ----------------- diff --git a/app/test/meson.build b/app/test/meson.build index f34d19e3c3..6b61b7fc32 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -111,6 +111,7 @@ test_sources = files( 'test_reciprocal_division_perf.c', 'test_red.c', 'test_pie.c', + 'test_pmu.c', 'test_reorder.c', 'test_rib.c', 'test_rib6.c', @@ -239,6 +240,7 @@ fast_tests = [ ['kni_autotest', false, true], ['kvargs_autotest', true, true], ['member_autotest', true, true], + ['pmu_autotest', true, true], ['power_cpufreq_autotest', false, true], ['power_autotest', true, true], ['power_kvm_vm_autotest', false, true], diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c new file mode 100644 index 0000000000..c257638e8b --- /dev/null +++ b/app/test/test_pmu.c @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include "test.h" + +#ifndef RTE_EXEC_ENV_LINUX + +static int +test_pmu(void) +{ + printf("pmu_autotest only supported on Linux, skipping test\n"); + return TEST_SKIPPED; +} + +#else + +#include + +static int +test_pmu_read(void) +{ + const char *name = NULL; + int tries = 10, event; + uint64_t val = 0; + + if (name == NULL) { + printf("PMU not supported on this arch\n"); + return TEST_SKIPPED; + } + + if (rte_pmu_init() < 0) + return TEST_SKIPPED; + + event = rte_pmu_add_event(name); + while (tries--) + val += rte_pmu_read(event); + + rte_pmu_fini(); + + return val ? TEST_SUCCESS : TEST_FAILED; +} + +static struct unit_test_suite pmu_tests = { + .suite_name = "pmu autotest", + .setup = NULL, + .teardown = NULL, + .unit_test_cases = { + TEST_CASE(test_pmu_read), + TEST_CASES_END() + } +}; + +static int +test_pmu(void) +{ + return unit_test_suite_runner(&pmu_tests); +} + +#endif /* RTE_EXEC_ENV_LINUX */ + +REGISTER_TEST_COMMAND(pmu_autotest, test_pmu); diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index 2deec7ea19..a8e04a195d 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -223,7 +223,8 @@ The public API headers are grouped by topics: [log](@ref rte_log.h), [errno](@ref rte_errno.h), [trace](@ref rte_trace.h), - [trace_point](@ref rte_trace_point.h) + [trace_point](@ref rte_trace_point.h), + [pmu](@ref rte_pmu.h) - **misc**: [EAL config](@ref rte_eal.h), diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index e859426099..350b5a8c94 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -63,6 +63,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/pci \ @TOPDIR@/lib/pdump \ @TOPDIR@/lib/pipeline \ + @TOPDIR@/lib/pmu \ @TOPDIR@/lib/port \ @TOPDIR@/lib/power \ @TOPDIR@/lib/rawdev \ diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index 14292d4c25..89e38cd301 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ 
b/doc/guides/prog_guide/profile_app.rst @@ -7,6 +7,18 @@ Profile Your Application The following sections describe methods of profiling DPDK applications on different architectures. +Performance counter based profiling +----------------------------------- + +Majority of architectures support some performance monitoring unit (PMU). +Such unit provides programmable counters that monitor specific events. + +Different tools gather that information, like for example perf. +However, in some scenarios when CPU cores are isolated and run +dedicated tasks interrupting those tasks with perf may be undesirable. + +In such cases, an application can use the PMU library to read such events via ``rte_pmu_read()``. + Profiling on x86 ---------------- diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst index ab998a5357..20622efe58 100644 --- a/doc/guides/rel_notes/release_23_03.rst +++ b/doc/guides/rel_notes/release_23_03.rst @@ -147,6 +147,13 @@ New Features * Added support to capture packets at each graph node with packet metadata and node name. +* **Added PMU library.** + + Added a new performance monitoring unit (PMU) library which allows applications + to perform self monitoring activities without depending on external utilities like perf. + After integration with :doc:`../prog_guide/trace_lib` data gathered from hardware counters + can be stored in CTF format for further analysis. 
+ Removed Items ------------- diff --git a/lib/meson.build b/lib/meson.build index 450c061d2b..8a42d45d20 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -11,6 +11,7 @@ libraries = [ 'kvargs', # eal depends on kvargs 'telemetry', # basic info querying + 'pmu', 'eal', # everything depends on eal 'ring', 'rcu', # rcu depends on ring diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build new file mode 100644 index 0000000000..a4160b494e --- /dev/null +++ b/lib/pmu/meson.build @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(C) 2023 Marvell International Ltd. + +if not is_linux + build = false + reason = 'only supported on Linux' + subdir_done() +endif + +includes = [global_inc] + +sources = files('rte_pmu.c') +headers = files('rte_pmu.h') diff --git a/lib/pmu/pmu_private.h b/lib/pmu/pmu_private.h new file mode 100644 index 0000000000..b9f8c1ddc8 --- /dev/null +++ b/lib/pmu/pmu_private.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell + */ + +#ifndef _PMU_PRIVATE_H_ +#define _PMU_PRIVATE_H_ + +/** + * Architecture specific PMU init callback. + * + * @return + * 0 in case of success, negative value otherwise. + */ +int +pmu_arch_init(void); + +/** + * Architecture specific PMU cleanup callback. + */ +void +pmu_arch_fini(void); + +/** + * Apply architecture specific settings to config before passing it to syscall. + * + * @param config + * Architecture specific event configuration. Consult kernel sources for available options. + */ +void +pmu_arch_fixup_config(uint64_t config[3]); + +#endif /* _PMU_PRIVATE_H_ */ diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c new file mode 100644 index 0000000000..950f999cb7 --- /dev/null +++ b/lib/pmu/rte_pmu.c @@ -0,0 +1,460 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pmu_private.h" + +#define EVENT_SOURCE_DEVICES_PATH "/sys/bus/event_source/devices" + +#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h))))) +#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m)) + +RTE_DEFINE_PER_LCORE(struct rte_pmu_event_group, _event_group); +struct rte_pmu rte_pmu; + +/* + * Following __rte_weak functions provide default no-op. Architectures should override them if + * necessary. + */ + +int +__rte_weak pmu_arch_init(void) +{ + return 0; +} + +void +__rte_weak pmu_arch_fini(void) +{ +} + +void +__rte_weak pmu_arch_fixup_config(uint64_t __rte_unused config[3]) +{ +} + +static int +get_term_format(const char *name, int *num, uint64_t *mask) +{ + char path[PATH_MAX]; + char *config = NULL; + int high, low, ret; + FILE *fp; + + *num = *mask = 0; + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/format/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + errno = 0; + ret = fscanf(fp, "%m[^:]:%d-%d", &config, &low, &high); + if (ret < 2) { + ret = -ENODATA; + goto out; + } + if (errno) { + ret = -errno; + goto out; + } + + if (ret == 2) + high = low; + + *mask = GENMASK_ULL(high, low); + /* Last digit should be [012]. If last digit is missing 0 is implied. */ + *num = config[strlen(config) - 1]; + *num = isdigit(*num) ? 
*num - '0' : 0; + + ret = 0; +out: + free(config); + fclose(fp); + + return ret; +} + +static int +parse_event(char *buf, uint64_t config[3]) +{ + char *token, *term; + int num, ret, val; + uint64_t mask; + + config[0] = config[1] = config[2] = 0; + + token = strtok(buf, ","); + while (token) { + errno = 0; + /* = */ + ret = sscanf(token, "%m[^=]=%i", &term, &val); + if (ret < 1) + return -ENODATA; + if (errno) + return -errno; + if (ret == 1) + val = 1; + + ret = get_term_format(term, &num, &mask); + free(term); + if (ret) + return ret; + + config[num] |= FIELD_PREP(mask, val); + token = strtok(NULL, ","); + } + + return 0; +} + +static int +get_event_config(const char *name, uint64_t config[3]) +{ + char path[PATH_MAX], buf[BUFSIZ]; + FILE *fp; + int ret; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + fp = fopen(path, "r"); + if (fp == NULL) + return -errno; + + ret = fread(buf, 1, sizeof(buf), fp); + if (ret == 0) { + fclose(fp); + + return -EINVAL; + } + fclose(fp); + buf[ret] = '\0'; + + return parse_event(buf, config); +} + +static int +do_perf_event_open(uint64_t config[3], int group_fd) +{ + struct perf_event_attr attr = { + .size = sizeof(struct perf_event_attr), + .type = PERF_TYPE_RAW, + .exclude_kernel = 1, + .exclude_hv = 1, + .disabled = 1, + }; + + pmu_arch_fixup_config(config); + + attr.config = config[0]; + attr.config1 = config[1]; + attr.config2 = config[2]; + + return syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0); +} + +static int +open_events(struct rte_pmu_event_group *group) +{ + struct rte_pmu_event *event; + uint64_t config[3]; + int num = 0, ret; + + /* group leader gets created first, with fd = -1 */ + group->fds[0] = -1; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + ret = get_event_config(event->name, config); + if (ret) + continue; + + ret = do_perf_event_open(config, group->fds[0]); + if (ret == -1) { + ret = -errno; + goto out; + } + + group->fds[event->index] = 
ret; + num++; + } + + return 0; +out: + for (--num; num >= 0; num--) { + close(group->fds[num]); + group->fds[num] = -1; + } + + + return ret; +} + +static int +mmap_events(struct rte_pmu_event_group *group) +{ + long page_size = sysconf(_SC_PAGE_SIZE); + unsigned int i; + void *addr; + int ret; + + for (i = 0; i < rte_pmu.num_group_events; i++) { + addr = mmap(0, page_size, PROT_READ, MAP_SHARED, group->fds[i], 0); + if (addr == MAP_FAILED) { + ret = -errno; + goto out; + } + + group->mmap_pages[i] = addr; + if (!group->mmap_pages[i]->cap_user_rdpmc) { + ret = -EPERM; + goto out; + } + } + + return 0; +out: + for (; i; i--) { + munmap(group->mmap_pages[i - 1], page_size); + group->mmap_pages[i - 1] = NULL; + } + + return ret; +} + +static void +cleanup_events(struct rte_pmu_event_group *group) +{ + unsigned int i; + + if (group->fds[0] != -1) + ioctl(group->fds[0], PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP); + + for (i = 0; i < rte_pmu.num_group_events; i++) { + if (group->mmap_pages[i]) { + munmap(group->mmap_pages[i], sysconf(_SC_PAGE_SIZE)); + group->mmap_pages[i] = NULL; + } + + if (group->fds[i] != -1) { + close(group->fds[i]); + group->fds[i] = -1; + } + } + + group->enabled = false; +} + +int +__rte_pmu_enable_group(void) +{ + struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group); + int ret; + + if (rte_pmu.num_group_events == 0) + return -ENODEV; + + ret = open_events(group); + if (ret) + goto out; + + ret = mmap_events(group); + if (ret) + goto out; + + if (ioctl(group->fds[0], PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + if (ioctl(group->fds[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP) == -1) { + ret = -errno; + goto out; + } + + rte_spinlock_lock(&rte_pmu.lock); + TAILQ_INSERT_TAIL(&rte_pmu.event_group_list, group, next); + rte_spinlock_unlock(&rte_pmu.lock); + group->enabled = true; + + return 0; + +out: + cleanup_events(group); + + return ret; +} + +static int +scan_pmus(void) +{ + char 
path[PATH_MAX]; + struct dirent *dent; + const char *name; + DIR *dirp; + + dirp = opendir(EVENT_SOURCE_DEVICES_PATH); + if (dirp == NULL) + return -errno; + + while ((dent = readdir(dirp))) { + name = dent->d_name; + if (name[0] == '.') + continue; + + /* sysfs entry should either contain cpus or be a cpu */ + if (!strcmp(name, "cpu")) + break; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/cpus", name); + if (access(path, F_OK) == 0) + break; + } + + if (dent) { + rte_pmu.name = strdup(name); + if (rte_pmu.name == NULL) { + closedir(dirp); + + return -ENOMEM; + } + } + + closedir(dirp); + + return rte_pmu.name ? 0 : -ENODEV; +} + +static struct rte_pmu_event * +new_event(const char *name) +{ + struct rte_pmu_event *event; + + event = calloc(1, sizeof(*event)); + if (event == NULL) + goto out; + + event->name = strdup(name); + if (event->name == NULL) { + free(event); + event = NULL; + } + +out: + return event; +} + +static void +free_event(struct rte_pmu_event *event) +{ + free(event->name); + free(event); +} + +int +rte_pmu_add_event(const char *name) +{ + struct rte_pmu_event *event; + char path[PATH_MAX]; + + if (rte_pmu.name == NULL) + return -ENODEV; + + if (rte_pmu.num_group_events + 1 >= MAX_NUM_GROUP_EVENTS) + return -ENOSPC; + + snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", rte_pmu.name, name); + if (access(path, R_OK)) + return -ENODEV; + + TAILQ_FOREACH(event, &rte_pmu.event_list, next) { + if (!strcmp(event->name, name)) + return event->index; + continue; + } + + event = new_event(name); + if (event == NULL) + return -ENOMEM; + + event->index = rte_pmu.num_group_events++; + TAILQ_INSERT_TAIL(&rte_pmu.event_list, event, next); + + return event->index; +} + +int +rte_pmu_init(void) +{ + int ret; + + /* Allow calling init from multiple contexts within a single thread. 
This simplifies + * resource management a bit e.g in case fast-path tracepoint has already been enabled + * via command line but application doesn't care enough and performs init/fini again. + */ + if (rte_pmu.initialized != 0) { + rte_pmu.initialized++; + return 0; + } + + ret = scan_pmus(); + if (ret) + goto out; + + ret = pmu_arch_init(); + if (ret) + goto out; + + TAILQ_INIT(&rte_pmu.event_list); + TAILQ_INIT(&rte_pmu.event_group_list); + rte_spinlock_init(&rte_pmu.lock); + rte_pmu.initialized = 1; + + return 0; +out: + free(rte_pmu.name); + rte_pmu.name = NULL; + + return ret; +} + +void +rte_pmu_fini(void) +{ + struct rte_pmu_event_group *group, *tmp_group; + struct rte_pmu_event *event, *tmp_event; + + /* cleanup once init count drops to zero */ + if (rte_pmu.initialized == 0 || --rte_pmu.initialized != 0) + return; + + RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event) { + TAILQ_REMOVE(&rte_pmu.event_list, event, next); + free_event(event); + } + + RTE_TAILQ_FOREACH_SAFE(group, &rte_pmu.event_group_list, next, tmp_group) { + TAILQ_REMOVE(&rte_pmu.event_group_list, group, next); + cleanup_events(group); + } + + pmu_arch_fini(); + free(rte_pmu.name); + rte_pmu.name = NULL; + rte_pmu.num_group_events = 0; +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h new file mode 100644 index 0000000000..6b664c3336 --- /dev/null +++ b/lib/pmu/rte_pmu.h @@ -0,0 +1,212 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell + */ + +#ifndef _RTE_PMU_H_ +#define _RTE_PMU_H_ + +/** + * @file + * + * PMU event tracing operations + * + * This file defines generic API and types necessary to setup PMU and + * read selected counters in runtime. + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include + +#include +#include +#include +#include +#include + +/** Maximum number of events in a group */ +#define MAX_NUM_GROUP_EVENTS 8 + +/** + * A structure describing a group of events. 
+ */ +struct rte_pmu_event_group { + struct perf_event_mmap_page *mmap_pages[MAX_NUM_GROUP_EVENTS]; /**< array of user pages */ + int fds[MAX_NUM_GROUP_EVENTS]; /**< array of event descriptors */ + bool enabled; /**< true if group was enabled on particular lcore */ + TAILQ_ENTRY(rte_pmu_event_group) next; /**< list entry */ +} __rte_cache_aligned; + +/** + * A structure describing an event. + */ +struct rte_pmu_event { + char *name; /**< name of an event */ + unsigned int index; /**< event index into fds/mmap_pages */ + TAILQ_ENTRY(rte_pmu_event) next; /**< list entry */ +}; + +/** + * A PMU state container. + */ +struct rte_pmu { + char *name; /**< name of core PMU listed under /sys/bus/event_source/devices */ + rte_spinlock_t lock; /**< serialize access to event group list */ + TAILQ_HEAD(, rte_pmu_event_group) event_group_list; /**< list of event groups */ + unsigned int num_group_events; /**< number of events in a group */ + TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */ + unsigned int initialized; /**< initialization counter */ +}; + +/** lcore event group */ +RTE_DECLARE_PER_LCORE(struct rte_pmu_event_group, _event_group); + +/** PMU state container */ +extern struct rte_pmu rte_pmu; + +/** Each architecture supporting PMU needs to provide its own version */ +#ifndef rte_pmu_pmc_read +#define rte_pmu_pmc_read(index) ({ 0; }) +#endif + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Read PMU counter. + * + * @warning This should be not called directly. + * + * @param pc + * Pointer to the mmapped user page. + * @return + * Counter value read from hardware. 
+ */ +static __rte_always_inline uint64_t +__rte_pmu_read_userpage(struct perf_event_mmap_page *pc) +{ + uint64_t width, offset; + uint32_t seq, index; + int64_t pmc; + + for (;;) { + seq = pc->lock; + rte_compiler_barrier(); + index = pc->index; + offset = pc->offset; + width = pc->pmc_width; + + /* index set to 0 means that particular counter cannot be used */ + if (likely(pc->cap_user_rdpmc && index)) { + pmc = rte_pmu_pmc_read(index - 1); + pmc <<= 64 - width; + pmc >>= 64 - width; + offset += pmc; + } + + rte_compiler_barrier(); + + if (likely(pc->lock == seq)) + return offset; + } + + return 0; +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Enable group of events on the calling lcore. + * + * @warning This should be not called directly. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_experimental +int +__rte_pmu_enable_group(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Initialize PMU library. + * + * @warning This should be not called directly. + * + * @return + * 0 in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_init(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Finalize PMU library. This should be called after PMU counters are no longer being read. + */ +__rte_experimental +void +rte_pmu_fini(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add event to the group of enabled events. + * + * @param name + * Name of an event listed under /sys/bus/event_source/devices/pmu/events. + * @return + * Event index in case of success, negative value otherwise. + */ +__rte_experimental +int +rte_pmu_add_event(const char *name); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Read hardware counter configured to count occurrences of an event. 
+ *
+ * @param index
+ *   Index of an event to be read.
+ * @return
+ *   Event value read from register. In case of errors or lack of support
+ *   0 is returned. In other words, stream of zeros in a trace file
+ *   indicates problem with reading particular PMU event register.
+ */
+__rte_experimental
+static __rte_always_inline uint64_t
+rte_pmu_read(unsigned int index)
+{
+	struct rte_pmu_event_group *group = &RTE_PER_LCORE(_event_group);
+	int ret;
+
+	if (unlikely(!rte_pmu.initialized))
+		return 0;
+
+	if (unlikely(!group->enabled)) {
+		ret = __rte_pmu_enable_group();
+		if (ret)
+			return 0;
+	}
+
+	if (unlikely(index >= rte_pmu.num_group_events))
+		return 0;
+
+	return __rte_pmu_read_userpage(group->mmap_pages[index]);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PMU_H_ */
diff --git a/lib/pmu/version.map b/lib/pmu/version.map
new file mode 100644
index 0000000000..39a4f279c1
--- /dev/null
+++ b/lib/pmu/version.map
@@ -0,0 +1,15 @@
+DPDK_23 {
+	local: *;
+};
+
+EXPERIMENTAL {
+	global:
+
+	__rte_pmu_enable_group;
+	per_lcore__event_group;
+	rte_pmu;
+	rte_pmu_add_event;
+	rte_pmu_fini;
+	rte_pmu_init;
+	rte_pmu_read;
+};

From patchwork Thu Feb 16 17:55:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 124087
X-Patchwork-Delegate: david.marchand@redhat.com
From: Tomasz Duszynski
To: Tomasz Duszynski, Ruifeng Wang
Subject: [PATCH v11 2/4] pmu: support reading ARM PMU events in runtime
Date: Thu, 16 Feb 2023 18:55:00 +0100
Message-ID: <20230216175502.3164820-3-tduszynski@marvell.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230216175502.3164820-1-tduszynski@marvell.com>
References: <20230213113156.385482-1-tduszynski@marvell.com> <20230216175502.3164820-1-tduszynski@marvell.com>

Add support for reading ARM PMU events at runtime.

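On arm64, userspace counter access is gated by a small integer attribute
(/proc/sys/kernel/perf_user_access in this patch), which is read at init
time and restored at fini time. A minimal standalone sketch of reading such
an attribute follows; the helper name read_int_attr() and the buffer size
are assumptions for illustration. Compared with a plain read() followed by
strtol(), it terminates the buffer explicitly and saves errno before
close() can overwrite it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a decimal integer from a small text file.
 * Returns 0 on success, negative errno value otherwise.
 */
static int
read_int_attr(const char *path, int *val)
{
	char buf[32];
	ssize_t n;
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -errno;

	/* Leave room for the terminator added below. */
	n = read(fd, buf, sizeof(buf) - 1);
	err = errno; /* save before close() can overwrite it */
	close(fd);
	if (n == -1)
		return -err;

	buf[n] = '\0'; /* strtol() needs a terminated string */
	*val = (int)strtol(buf, NULL, 10);

	return 0;
}
```

A caller would pass the attribute path and keep the value it reads so that
the previous setting can be written back later.
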
Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup Acked-by: Ruifeng Wang --- app/test/test_pmu.c | 4 ++ lib/pmu/meson.build | 7 +++ lib/pmu/pmu_arm64.c | 94 +++++++++++++++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 4 ++ lib/pmu/rte_pmu_pmc_arm64.h | 30 ++++++++++++ 5 files changed, 139 insertions(+) create mode 100644 lib/pmu/pmu_arm64.c create mode 100644 lib/pmu/rte_pmu_pmc_arm64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index c257638e8b..e0220e3c59 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -24,6 +24,10 @@ test_pmu_read(void) int tries = 10, event; uint64_t val = 0; +#if defined(RTE_ARCH_ARM64) + name = "cpu_cycles"; +#endif + if (name == NULL) { printf("PMU not supported on this arch\n"); return TEST_SKIPPED; diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index a4160b494e..e857681137 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -11,3 +11,10 @@ includes = [global_inc] sources = files('rte_pmu.c') headers = files('rte_pmu.h') +indirect_headers += files( + 'rte_pmu_pmc_arm64.h', +) + +if dpdk_conf.has('RTE_ARCH_ARM64') + sources += files('pmu_arm64.c') +endif diff --git a/lib/pmu/pmu_arm64.c b/lib/pmu/pmu_arm64.c new file mode 100644 index 0000000000..9e15727948 --- /dev/null +++ b/lib/pmu/pmu_arm64.c @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include + +#include +#include + +#include "pmu_private.h" + +#define PERF_USER_ACCESS_PATH "/proc/sys/kernel/perf_user_access" + +static int restore_uaccess; + +static int +read_attr_int(const char *path, int *val) +{ + char buf[BUFSIZ]; + int ret, fd; + + fd = open(path, O_RDONLY); + if (fd == -1) + return -errno; + + ret = read(fd, buf, sizeof(buf)); + if (ret == -1) { + close(fd); + + return -errno; + } + + *val = strtol(buf, NULL, 10); + close(fd); + + return 0; +} + +static int +write_attr_int(const char *path, int val) +{ + char buf[BUFSIZ]; + int num, ret, fd; + + fd = open(path, O_WRONLY); + if (fd == -1) + return -errno; + + num = snprintf(buf, sizeof(buf), "%d", val); + ret = write(fd, buf, num); + if (ret == -1) { + close(fd); + + return -errno; + } + + close(fd); + + return 0; +} + +int +pmu_arch_init(void) +{ + int ret; + + ret = read_attr_int(PERF_USER_ACCESS_PATH, &restore_uaccess); + if (ret) + return ret; + + /* user access already enabled */ + if (restore_uaccess == 1) + return 0; + + return write_attr_int(PERF_USER_ACCESS_PATH, 1); +} + +void +pmu_arch_fini(void) +{ + write_attr_int(PERF_USER_ACCESS_PATH, restore_uaccess); +} + +void +pmu_arch_fixup_config(uint64_t config[3]) +{ + /* select 64 bit counters */ + config[1] |= RTE_BIT64(0); + /* enable userspace access */ + config[1] |= RTE_BIT64(1); +} diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index 6b664c3336..bcc8e3b22d 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -26,6 +26,10 @@ extern "C" { #include #include +#if defined(RTE_ARCH_ARM64) +#include "rte_pmu_pmc_arm64.h" +#endif + /** Maximum number of events in a group */ #define MAX_NUM_GROUP_EVENTS 8 diff --git a/lib/pmu/rte_pmu_pmc_arm64.h b/lib/pmu/rte_pmu_pmc_arm64.h new file mode 100644 index 0000000000..10648f0c5f --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_arm64.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell. 
+ */
+#ifndef _RTE_PMU_PMC_ARM64_H_
+#define _RTE_PMU_PMC_ARM64_H_
+
+#include
+
+static __rte_always_inline uint64_t
+rte_pmu_pmc_read(int index)
+{
+	uint64_t val;
+
+	if (index == 31) {
+		/* CPU Cycles (0x11) must be read via pmccntr_el0 */
+		asm volatile("mrs %0, pmccntr_el0" : "=r" (val));
+	} else {
+		asm volatile(
+			"msr pmselr_el0, %x0\n"
+			"mrs %0, pmxevcntr_el0\n"
+			: "=r" (val)
+			: "rZ" (index)
+		);
+	}
+
+	return val;
+}
+#define rte_pmu_pmc_read rte_pmu_pmc_read
+
+#endif /* _RTE_PMU_PMC_ARM64_H_ */

From patchwork Thu Feb 16 17:55:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 124088
X-Patchwork-Delegate: david.marchand@redhat.com
From: Tomasz Duszynski
To: Tomasz Duszynski
Subject: [PATCH v11 3/4] pmu: support reading Intel x86_64 PMU events in runtime
Date: Thu, 16 Feb 2023 18:55:01 +0100
Message-ID: <20230216175502.3164820-4-tduszynski@marvell.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230216175502.3164820-1-tduszynski@marvell.com>
References: <20230213113156.385482-1-tduszynski@marvell.com> <20230216175502.3164820-1-tduszynski@marvell.com>

Add support for reading Intel x86_64 PMU events at runtime.

Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- app/test/test_pmu.c | 2 ++ lib/pmu/meson.build | 1 + lib/pmu/rte_pmu.h | 2 ++ lib/pmu/rte_pmu_pmc_x86_64.h | 24 ++++++++++++++++++++++++ 4 files changed, 29 insertions(+) create mode 100644 lib/pmu/rte_pmu_pmc_x86_64.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index e0220e3c59..bed7101a5d 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -26,6 +26,8 @@ test_pmu_read(void) #if defined(RTE_ARCH_ARM64) name = "cpu_cycles"; +#elif defined(RTE_ARCH_X86_64) + name = "cpu-cycles"; #endif if (name == NULL) { diff --git a/lib/pmu/meson.build b/lib/pmu/meson.build index e857681137..5b92e5c4e3 100644 --- a/lib/pmu/meson.build +++ b/lib/pmu/meson.build @@ -13,6 +13,7 @@ sources = files('rte_pmu.c') headers = files('rte_pmu.h') indirect_headers += files( 'rte_pmu_pmc_arm64.h', + 'rte_pmu_pmc_x86_64.h', ) if dpdk_conf.has('RTE_ARCH_ARM64') diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index bcc8e3b22d..b1d1c17bc5 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -28,6 +28,8 @@ extern "C" { #if defined(RTE_ARCH_ARM64) #include "rte_pmu_pmc_arm64.h" +#elif defined(RTE_ARCH_X86_64) +#include "rte_pmu_pmc_x86_64.h" #endif /** Maximum number of events in a group */ diff --git a/lib/pmu/rte_pmu_pmc_x86_64.h b/lib/pmu/rte_pmu_pmc_x86_64.h new file mode 100644 index 0000000000..7b67466960 --- /dev/null +++ b/lib/pmu/rte_pmu_pmc_x86_64.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Marvell. 
+ */ +#ifndef _RTE_PMU_PMC_X86_64_H_ +#define _RTE_PMU_PMC_X86_64_H_ + +#include + +static __rte_always_inline uint64_t +rte_pmu_pmc_read(int index) +{ + uint64_t low, high; + + asm volatile( + "rdpmc\n" + : "=a" (low), "=d" (high) + : "c" (index) + ); + + return low | (high << 32); +} +#define rte_pmu_pmc_read rte_pmu_pmc_read + +#endif /* _RTE_PMU_PMC_X86_64_H_ */ From patchwork Thu Feb 16 17:55:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 124089 X-Patchwork-Delegate: david.marchand@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 459C141CB3; Thu, 16 Feb 2023 18:55:59 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 459AB42DB5; Thu, 16 Feb 2023 18:55:54 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by mails.dpdk.org (Postfix) with ESMTP id 3BB9242DB5 for ; Thu, 16 Feb 2023 18:55:50 +0100 (CET) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 31GH0Tca001078; Thu, 16 Feb 2023 09:55:46 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=pfpt0220; bh=x89aeO2WldMH7UlROninllCpgj7e5GK8UmF9lk1zpDk=; b=XvWNLb2UVAeu/5lRzz4SulytAzGa6/gI2aJwK8wrq+i0sgj16Fa4OkwtfmEXiOCzZrrm Uh1nOXkVrJ+R5IxeywyH+suTEUr2EYQ7hDrN5gKvZub0dWeED4GWHrR0/lLf4VRRVR3g mJjMM99CCVW7/3UQJ97Zl7Y7kQk0rj3U50ErOvQDC5iruJ1y+UYFtlNz2r57PIPX3YgR o+/8yB/A/ltR8swq5DhJSzrUMR4rYXhou8wrZ92dsLbXA9HmG5GwCc0bVzMmbNlbTrYb L39r0ycSNIkCHIvJ1prKoY/jHyPLrNHx2z3+Y3UAD+32BmhjqqkzEX4J/BIiOwkdWoZn yA== Received: from 
dc5-exch01.marvell.com ([199.233.59.181]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3nsg6wb0dt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Thu, 16 Feb 2023 09:55:46 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.42; Thu, 16 Feb 2023 09:55:43 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.42 via Frontend Transport; Thu, 16 Feb 2023 09:55:43 -0800 Received: from cavium-DT10.. (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 55AF03F7097; Thu, 16 Feb 2023 09:55:40 -0800 (PST) From: Tomasz Duszynski To: , Jerin Jacob , Sunil Kumar Kori , Tomasz Duszynski CC: , , , , , , , Subject: [PATCH v11 4/4] eal: add PMU support to tracing library Date: Thu, 16 Feb 2023 18:55:02 +0100 Message-ID: <20230216175502.3164820-5-tduszynski@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216175502.3164820-1-tduszynski@marvell.com> References: <20230213113156.385482-1-tduszynski@marvell.com> <20230216175502.3164820-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: _tNUmxrqHK-NkuuJeShKjGymbIn5R9Ng X-Proofpoint-GUID: _tNUmxrqHK-NkuuJeShKjGymbIn5R9Ng X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.170.22 definitions=2023-02-16_14,2023-02-16_01,2023-02-09_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In order to profile an app, one needs to store a significant number of samples somewhere for later analysis. Since the trace library supports storing data in the CTF format, let's take advantage of that and add a dedicated PMU tracepoint. 
Signed-off-by: Tomasz Duszynski Acked-by: Morten Brørup --- app/test/test_trace_perf.c | 10 ++++ doc/guides/prog_guide/profile_app.rst | 5 ++ doc/guides/prog_guide/trace_lib.rst | 32 +++++++++++++ lib/eal/common/eal_common_trace.c | 13 ++++- lib/eal/common/eal_common_trace_points.c | 5 ++ lib/eal/include/rte_eal_trace.h | 13 +++++ lib/eal/meson.build | 3 ++ lib/eal/version.map | 1 + lib/pmu/rte_pmu.c | 61 ++++++++++++++++++++++++ lib/pmu/rte_pmu.h | 14 ++++++ lib/pmu/version.map | 1 + 11 files changed, 157 insertions(+), 1 deletion(-) diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c index 46ae7d8074..f1929f2734 100644 --- a/app/test/test_trace_perf.c +++ b/app/test/test_trace_perf.c @@ -114,6 +114,10 @@ worker_fn_##func(void *arg) \ #define GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666) #define GENERIC_STR rte_eal_trace_generic_str("hello world") #define VOID_FP app_dpdk_test_fp() +#ifdef RTE_EXEC_ENV_LINUX +/* 0 corresponds to the first event passed via --trace= */ +#define READ_PMU rte_eal_trace_pmu_read(0) +#endif WORKER_DEFINE(GENERIC_VOID) WORKER_DEFINE(GENERIC_U64) @@ -122,6 +126,9 @@ WORKER_DEFINE(GENERIC_FLOAT) WORKER_DEFINE(GENERIC_DOUBLE) WORKER_DEFINE(GENERIC_STR) WORKER_DEFINE(VOID_FP) +#ifdef RTE_EXEC_ENV_LINUX +WORKER_DEFINE(READ_PMU) +#endif static void run_test(const char *str, lcore_function_t f, struct test_data *data, size_t sz) @@ -174,6 +181,9 @@ test_trace_perf(void) run_test("double", worker_fn_GENERIC_DOUBLE, data, sz); run_test("string", worker_fn_GENERIC_STR, data, sz); run_test("void_fp", worker_fn_VOID_FP, data, sz); +#ifdef RTE_EXEC_ENV_LINUX + run_test("read_pmu", worker_fn_READ_PMU, data, sz); +#endif rte_free(data); return TEST_SUCCESS; diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index 89e38cd301..c4dfe85c3b 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -19,6 +19,11 @@ dedicated tasks interrupting those tasks with 
perf may be undesirable. In such cases, an application can use the PMU library to read such events via ``rte_pmu_read()``. +Alternatively, the tracing library can be used, which offers a dedicated tracepoint +``rte_eal_trace_pmu_read()``. + +Refer to :doc:`../prog_guide/trace_lib` for more details. + Profiling on x86 ---------------- diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst index 3e0ea5835c..9c81936e35 100644 --- a/doc/guides/prog_guide/trace_lib.rst +++ b/doc/guides/prog_guide/trace_lib.rst @@ -46,6 +46,7 @@ DPDK tracing library features trace format and is compatible with ``LTTng``. For detailed information, refer to `Common Trace Format `_. +- Support reading PMU events on ARM64 and x86-64 (Intel) How to add a tracepoint? ------------------------ @@ -137,6 +138,37 @@ the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``. ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be enabled using the ``enable_trace_fp`` option for meson build. +PMU tracepoint +-------------- + +Performance monitoring unit (PMU) event values can be read from hardware +registers using the predefined ``rte_eal_trace_pmu_read`` tracepoint. + +Tracing is enabled via the ``--trace`` EAL option by passing both an expression +matching the PMU tracepoint name, i.e. ``lib.eal.pmu.read``, and an expression +``e=ev1[,ev2,...]`` matching particular events:: + + --trace='.*pmu.read\|e=cpu_cycles,l1d_cache' + +Event names are available under ``/sys/bus/event_source/devices/PMU/events`` +directory, where ``PMU`` is a placeholder for either a ``cpu`` or a directory +containing ``cpus``. + +Contrary to other tracepoints, this one does not need any extra variables +added to source files. Instead, the caller passes an index that follows the order of +events specified via the ``--trace`` parameter. In the following example index ``0`` +corresponds to ``cpu_cycles`` while index ``1`` corresponds to ``l1d_cache``. + +.. code-block:: c + + ... 
+ rte_eal_trace_pmu_read(0); + rte_eal_trace_pmu_read(1); + ... + +PMU tracing support must be explicitly enabled using the ``enable_trace_fp`` +option for meson build. + Event record mode ----------------- diff --git a/lib/eal/common/eal_common_trace.c b/lib/eal/common/eal_common_trace.c index 75162b722d..8796052d0c 100644 --- a/lib/eal/common/eal_common_trace.c +++ b/lib/eal/common/eal_common_trace.c @@ -11,6 +11,9 @@ #include #include #include +#ifdef RTE_EXEC_ENV_LINUX +#include +#endif #include #include "eal_trace.h" @@ -71,8 +74,13 @@ eal_trace_init(void) goto free_meta; /* Apply global configurations */ - STAILQ_FOREACH(arg, &trace.args, next) + STAILQ_FOREACH(arg, &trace.args, next) { trace_args_apply(arg->val); +#ifdef RTE_EXEC_ENV_LINUX + if (rte_pmu_init() == 0) + rte_pmu_add_events_by_pattern(arg->val); +#endif + } rte_trace_mode_set(trace.mode); @@ -88,6 +96,9 @@ eal_trace_init(void) void eal_trace_fini(void) { +#ifdef RTE_EXEC_ENV_LINUX + rte_pmu_fini(); +#endif trace_mem_free(); trace_metadata_destroy(); eal_trace_args_free(); diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c index 051f89809c..9d6faa19ed 100644 --- a/lib/eal/common/eal_common_trace_points.c +++ b/lib/eal/common/eal_common_trace_points.c @@ -77,3 +77,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable, lib.eal.intr.enable) RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable, lib.eal.intr.disable) + +#ifdef RTE_EXEC_ENV_LINUX +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read, + lib.eal.pmu.read) +#endif diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h index 6f5c022558..c7da83c480 100644 --- a/lib/eal/include/rte_eal_trace.h +++ b/lib/eal/include/rte_eal_trace.h @@ -17,6 +17,9 @@ extern "C" { #include #include +#ifdef RTE_EXEC_ENV_LINUX +#include +#endif #include #include "eal_interrupts.h" @@ -285,6 +288,16 @@ RTE_TRACE_POINT( rte_trace_point_emit_string(cpuset); ) +#ifdef RTE_EXEC_ENV_LINUX 
+RTE_TRACE_POINT_FP( + rte_eal_trace_pmu_read, + RTE_TRACE_POINT_ARGS(unsigned int index), + uint64_t val; + val = rte_pmu_read(index); + rte_trace_point_emit_u64(val); +) +#endif + #ifdef __cplusplus } #endif diff --git a/lib/eal/meson.build b/lib/eal/meson.build index 056beb9461..f5865dbcd9 100644 --- a/lib/eal/meson.build +++ b/lib/eal/meson.build @@ -26,6 +26,9 @@ deps += ['kvargs'] if not is_windows deps += ['telemetry'] endif +if is_linux + deps += ['pmu'] +endif if dpdk_conf.has('RTE_USE_LIBBSD') ext_deps += libbsd endif diff --git a/lib/eal/version.map b/lib/eal/version.map index 2ae57ee78a..01e7a099d2 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -441,6 +441,7 @@ EXPERIMENTAL { rte_thread_join; # added in 23.03 + __rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT rte_lcore_register_usage_cb; rte_thread_create_control; rte_thread_set_name; diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c index 950f999cb7..862edcb1e3 100644 --- a/lib/pmu/rte_pmu.c +++ b/lib/pmu/rte_pmu.c @@ -398,6 +398,67 @@ rte_pmu_add_event(const char *name) return event->index; } +static int +add_events(const char *pattern) +{ + char *token, *copy; + int ret = 0; + + copy = strdup(pattern); + if (copy == NULL) + return -ENOMEM; + + token = strtok(copy, ","); + while (token) { + ret = rte_pmu_add_event(token); + if (ret < 0) + break; + + token = strtok(NULL, ","); + } + + free(copy); + + return ret >= 0 ? 0 : ret; +} + +int +rte_pmu_add_events_by_pattern(const char *pattern) +{ + regmatch_t rmatch; + char buf[BUFSIZ]; + unsigned int num; + regex_t reg; + int ret; + + /* events are matched against occurrences of e=ev1[,ev2,..] 
pattern */ + ret = regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED); + if (ret) + return -EINVAL; + + for (;;) { + if (regexec(&reg, pattern, 1, &rmatch, 0)) + break; + + num = rmatch.rm_eo - rmatch.rm_so; + if (num > sizeof(buf)) + num = sizeof(buf); + + /* skip e= pattern prefix */ + memcpy(buf, pattern + rmatch.rm_so + 2, num - 2); + buf[num - 2] = '\0'; + ret = add_events(buf); + if (ret) + break; + + pattern += rmatch.rm_eo; + } + + regfree(&reg); + + return ret; +} + int rte_pmu_init(void) { diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index b1d1c17bc5..e1c3bb5e56 100644 --- a/lib/pmu/rte_pmu.h +++ b/lib/pmu/rte_pmu.h @@ -176,6 +176,20 @@ __rte_experimental int rte_pmu_add_event(const char *name); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Add events matching pattern to the group of enabled events. + * + * @param pattern + * Pattern e=ev1[,ev2,...] matching events, where evX is a placeholder for an event listed under + * /sys/bus/event_source/devices/pmu/events. + */ +__rte_experimental +int +rte_pmu_add_events_by_pattern(const char *pattern); + /** * @warning * @b EXPERIMENTAL: this API may change without prior notice diff --git a/lib/pmu/version.map b/lib/pmu/version.map index 39a4f279c1..e16b3ff009 100644 --- a/lib/pmu/version.map +++ b/lib/pmu/version.map @@ -9,6 +9,7 @@ EXPERIMENTAL { per_lcore__event_group; rte_pmu; rte_pmu_add_event; + rte_pmu_add_events_by_pattern; rte_pmu_fini; rte_pmu_init; rte_pmu_read;
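
The e=ev1[,ev2,...] matching performed by rte_pmu_add_events_by_pattern() above can be tried outside of DPDK. What follows is a minimal sketch, assuming only POSIX regex.h. The names collect_events(), MAX_EVENTS and MAX_NAME are hypothetical and are not part of this series. The sketch reuses the regular expression from the patch, but stores the event names in an array instead of passing them to rte_pmu_add_event(), and it splits on commas by hand rather than with strtok().

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

#define MAX_EVENTS 8   /* hypothetical limit, chosen for illustration only */
#define MAX_NAME   64  /* hypothetical maximum name length incl. NUL */

/*
 * Hypothetical helper (not a DPDK API): collect event names from every
 * "e=ev1[,ev2,...]" occurrence found in a --trace style argument.
 * Returns the number of names stored, or -1 if the regex fails to compile.
 */
static int
collect_events(const char *pattern, char names[][MAX_NAME], int max)
{
	regmatch_t m;
	regex_t reg;
	int count = 0;

	assert(pattern != NULL && names != NULL);

	/* same expression as used by the patch */
	if (regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED))
		return -1;

	while (count < max && regexec(&reg, pattern, 1, &m, 0) == 0) {
		const char *p = pattern + m.rm_so + 2; /* skip "e=" prefix */
		const char *end = pattern + m.rm_eo;

		while (p < end && count < max) {
			size_t len = 0, n;

			/* measure one comma-separated name within the match */
			while (p + len < end && p[len] != ',')
				len++;

			if (len > 0) {
				/* truncate overly long names instead of overflowing */
				n = len < MAX_NAME - 1 ? len : MAX_NAME - 1;
				memcpy(names[count], p, n);
				names[count][n] = '\0';
				count++;
			}

			p += len;
			if (p < end)
				p++; /* skip the comma */
		}

		/* continue searching after the current match */
		pattern += m.rm_eo;
	}

	regfree(&reg);

	return count;
}
```

Called with the argument from the documentation example, '.*pmu.read\|e=cpu_cycles,l1d_cache', the helper stores the two names in the order given, which is the order the index passed to rte_eal_trace_pmu_read() follows.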