From patchwork Fri Nov 11 09:43:35 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 119794
X-Patchwork-Delegate: thomas@monjalon.net
From: Tomasz Duszynski
To: ,
CC: ,
Subject: [PATCH 1/4] eal: add generic support for reading PMU events
Date: Fri, 11 Nov 2022 10:43:35 +0100
Message-ID: <20221111094338.2736065-2-tduszynski@marvell.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221111094338.2736065-1-tduszynski@marvell.com>
References: <20221111094338.2736065-1-tduszynski@marvell.com>
List-Id: DPDK patches and discussions

Add support for programming PMU counters and reading their values at
runtime, bypassing the kernel completely.

This is especially useful in cases where CPU cores are isolated
(nohz_full), i.e. run dedicated tasks. In such cases one cannot use the
standard perf utility without sacrificing latency and performance.
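The kernel-bypassing read added by this patch (rte_pmu_read_userpage()) follows the usual perf user page protocol: sample the sequence count, read the raw counter, truncate it to the advertised width, then retry if the sequence count moved. A minimal self-contained sketch of that loop is given below. It is an illustration only: struct mock_userpage, raw_read_fn, fake_raw_read() and read_counter() are hypothetical names that do not exist in the patch, the cap_user_rdpmc check is left out, and the compiler barrier is expressed with C11 atomic_signal_fence() rather than rte_compiler_barrier().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf_event_mmap_page fields the reader touches. */
struct mock_userpage {
	volatile uint32_t lock; /* sequence count, changed by the kernel on every update */
	uint32_t index;         /* hardware counter index + 1, or 0 if not readable */
	int64_t offset;         /* count accumulated by the kernel so far */
	uint16_t pmc_width;     /* number of valid bits in the raw counter */
};

/* Hypothetical hook standing in for the per-architecture rte_pmu_pmc_read(). */
typedef uint64_t (*raw_read_fn)(uint32_t hw_index);

/* Fake hardware counter with garbage above bit 47, used only for illustration. */
static uint64_t
fake_raw_read(uint32_t hw_index)
{
	(void)hw_index;
	return UINT64_C(0xFFFF000000001234);
}

static uint64_t
read_counter(const struct mock_userpage *pc, raw_read_fn raw_read)
{
	int tries = 100;

	assert(pc != NULL && raw_read != NULL);

	while (tries--) {
		uint32_t seq = pc->lock;

		atomic_signal_fence(memory_order_seq_cst); /* compiler barrier */

		uint32_t index = pc->index;
		uint64_t offset = (uint64_t)pc->offset;
		unsigned int width = pc->pmc_width;
		uint64_t pmc = 0;

		if (index != 0 && width > 0 && width <= 64) {
			pmc = raw_read(index - 1);
			/* drop the bits above the counter width */
			pmc <<= 64 - width;
			pmc >>= 64 - width;
		}

		atomic_signal_fence(memory_order_seq_cst); /* compiler barrier */

		/* unchanged sequence count means the snapshot is consistent */
		if (pc->lock == seq)
			return pmc + offset;
	}

	return 0; /* gave up, same convention as the patch */
}
```

With a 48-bit counter the fake raw value above is truncated to 0x1234 before the kernel-maintained offset is added. When index is 0 only the offset is returned.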
Signed-off-by: Tomasz Duszynski
---
 app/test/meson.build                  |   1 +
 app/test/test_pmu.c                   |  41 +++
 doc/guides/prog_guide/profile_app.rst |   8 +
 lib/eal/common/meson.build            |   3 +
 lib/eal/common/pmu_private.h          |  41 +++
 lib/eal/common/rte_pmu.c              | 455 ++++++++++++++++++++++++++
 lib/eal/include/meson.build           |   1 +
 lib/eal/include/rte_pmu.h             | 204 ++++++++++++
 lib/eal/linux/eal.c                   |   4 +
 lib/eal/version.map                   |   3 +
 10 files changed, 761 insertions(+)
 create mode 100644 app/test/test_pmu.c
 create mode 100644 lib/eal/common/pmu_private.h
 create mode 100644 lib/eal/common/rte_pmu.c
 create mode 100644 lib/eal/include/rte_pmu.h

diff --git a/app/test/meson.build b/app/test/meson.build
index f34d19e3c3..93b3300309 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -143,6 +143,7 @@ test_sources = files(
         'test_timer_racecond.c',
         'test_timer_secondary.c',
         'test_ticketlock.c',
+        'test_pmu.c',
         'test_trace.c',
         'test_trace_register.c',
         'test_trace_perf.c',
diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c
new file mode 100644
index 0000000000..fd331af9ee
--- /dev/null
+++ b/app/test/test_pmu.c
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Marvell International Ltd.
+ */
+
+#include
+
+#include "test.h"
+
+static int
+test_pmu_read(void)
+{
+	uint64_t val = 0;
+	int tries = 10;
+	int event = -1;
+
+	while (tries--)
+		val += rte_pmu_read(event);
+
+	if (val == 0)
+		return TEST_FAILED;
+
+	return TEST_SUCCESS;
+}
+
+static struct unit_test_suite pmu_tests = {
+	.suite_name = "pmu autotest",
+	.setup = NULL,
+	.teardown = NULL,
+	.unit_test_cases = {
+		TEST_CASE(test_pmu_read),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_pmu(void)
+{
+	return unit_test_suite_runner(&pmu_tests);
+}
+
+REGISTER_TEST_COMMAND(pmu_autotest, test_pmu);
diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst
index bd6700ef85..8fc1b20cab 100644
--- a/doc/guides/prog_guide/profile_app.rst
+++ b/doc/guides/prog_guide/profile_app.rst
@@ -7,6 +7,14 @@ Profile Your Application
 The following sections describe methods of profiling DPDK applications on
 different architectures.
 
+Performance counter based profiling
+-----------------------------------
+
+The majority of architectures support some sort of hardware measurement unit which provides a
+set of programmable counters that monitor specific events. There are different tools which can
+gather that information, perf being one example. However, in some scenarios, e.g. when CPU cores
+are isolated (nohz_full) and run dedicated tasks, using perf is less than ideal. In such cases
+one can read specific events directly from the application via ``rte_pmu_read()``.
 
 Profiling on x86
 ----------------
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 917758cc65..d6d05b56f3 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -38,6 +38,9 @@ sources += files(
         'rte_service.c',
         'rte_version.c',
 )
+if is_linux
+    sources += files('rte_pmu.c')
+endif
 if is_linux or is_windows
     sources += files('eal_common_dynmem.c')
 endif
diff --git a/lib/eal/common/pmu_private.h b/lib/eal/common/pmu_private.h
new file mode 100644
index 0000000000..cade4245e6
--- /dev/null
+++ b/lib/eal/common/pmu_private.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Marvell
+ */
+
+#ifndef _PMU_PRIVATE_H_
+#define _PMU_PRIVATE_H_
+
+/**
+ * Architecture specific PMU init callback.
+ *
+ * @return
+ *   0 in case of success, negative value otherwise.
+ */
+int
+pmu_arch_init(void);
+
+/**
+ * Architecture specific PMU cleanup callback.
+ */
+void
+pmu_arch_fini(void);
+
+/**
+ * Apply architecture specific settings to config before passing it to syscall.
+ */
+void
+pmu_arch_fixup_config(uint64_t config[3]);
+
+/**
+ * Initialize PMU tracing internals.
+ */
+void
+eal_pmu_init(void);
+
+/**
+ * Cleanup PMU internals.
+ */
+void
+eal_pmu_fini(void);
+
+#endif /* _PMU_PRIVATE_H_ */
diff --git a/lib/eal/common/rte_pmu.c b/lib/eal/common/rte_pmu.c
new file mode 100644
index 0000000000..7d3bd57d1d
--- /dev/null
+++ b/lib/eal/common/rte_pmu.c
@@ -0,0 +1,455 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Marvell International Ltd.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+#include "pmu_private.h"
+
+#define EVENT_SOURCE_DEVICES_PATH "/sys/bus/event_source/devices"
+
+#ifndef GENMASK_ULL
+#define GENMASK_ULL(h, l) ((~0ULL - (1ULL << (l)) + 1) & (~0ULL >> ((64 - 1 - (h)))))
+#endif
+
+#ifndef FIELD_PREP
+#define FIELD_PREP(m, v) (((uint64_t)(v) << (__builtin_ffsll(m) - 1)) & (m))
+#endif
+
+struct rte_pmu *pmu;
+
+/*
+ * Following __rte_weak functions provide default no-op. Architectures should override them if
+ * necessary.
+ */
+
+int
+__rte_weak pmu_arch_init(void)
+{
+	return 0;
+}
+
+void
+__rte_weak pmu_arch_fini(void)
+{
+}
+
+void
+__rte_weak pmu_arch_fixup_config(uint64_t config[3])
+{
+	RTE_SET_USED(config);
+}
+
+static int
+get_term_format(const char *name, int *num, uint64_t *mask)
+{
+	char *config = NULL;
+	char path[PATH_MAX];
+	int high, low, ret;
+	FILE *fp;
+
+	/* quiesce -Wmaybe-uninitialized warning */
+	*num = 0;
+	*mask = 0;
+
+	snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/format/%s", pmu->name, name);
+	fp = fopen(path, "r");
+	if (!fp)
+		return -errno;
+
+	errno = 0;
+	ret = fscanf(fp, "%m[^:]:%d-%d", &config, &low, &high);
+	if (ret < 2) {
+		ret = -ENODATA;
+		goto out;
+	}
+	if (errno) {
+		ret = -errno;
+		goto out;
+	}
+
+	if (ret == 2)
+		high = low;
+
+	*mask = GENMASK_ULL(high, low);
+	/* Last digit should be [012]. If last digit is missing 0 is implied. */
+	*num = config[strlen(config) - 1];
+	*num = isdigit(*num) ? *num - '0' : 0;
+
+	ret = 0;
+out:
+	free(config);
+	fclose(fp);
+
+	return ret;
+}
+
+static int
+parse_event(char *buf, uint64_t config[3])
+{
+	char *token, *term;
+	int num, ret, val;
+	uint64_t mask;
+
+	config[0] = config[1] = config[2] = 0;
+
+	token = strtok(buf, ",");
+	while (token) {
+		errno = 0;
+		/* = */
+		ret = sscanf(token, "%m[^=]=%i", &term, &val);
+		if (ret < 1)
+			return -ENODATA;
+		if (errno)
+			return -errno;
+		if (ret == 1)
+			val = 1;
+
+		ret = get_term_format(term, &num, &mask);
+		free(term);
+		if (ret)
+			return ret;
+
+		config[num] |= FIELD_PREP(mask, val);
+		token = strtok(NULL, ",");
+	}
+
+	return 0;
+}
+
+static int
+get_event_config(const char *name, uint64_t config[3])
+{
+	char path[PATH_MAX], buf[BUFSIZ];
+	FILE *fp;
+	int ret;
+
+	snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", pmu->name, name);
+	fp = fopen(path, "r");
+	if (!fp)
+		return -errno;
+
+	ret = fread(buf, 1, sizeof(buf) - 1, fp); /* leave room for the terminating NUL */
+	if (ret == 0) {
+		fclose(fp);
+
+		return -EINVAL;
+	}
+	fclose(fp);
+	buf[ret] = '\0';
+
+	return parse_event(buf, config);
+}
+
+static int
+do_perf_event_open(uint64_t config[3], int lcore_id, int group_fd)
+{
+	struct perf_event_attr attr = {
+		.size = sizeof(struct perf_event_attr),
+		.type = PERF_TYPE_RAW,
+		.exclude_kernel = 1,
+		.exclude_hv = 1,
+		.disabled = 1,
+	};
+
+	pmu_arch_fixup_config(config);
+
+	attr.config = config[0];
+	attr.config1 = config[1];
+	attr.config2 = config[2];
+
+	return syscall(SYS_perf_event_open, &attr, rte_gettid(), rte_lcore_to_cpu_id(lcore_id),
+		       group_fd, 0);
+}
+
+static int
+open_events(int lcore_id)
+{
+	struct rte_pmu_event_group *group = &pmu->group[lcore_id];
+	struct rte_pmu_event *event;
+	uint64_t config[3];
+	int num = 0, ret;
+
+	/* group leader gets created first, with fd = -1 */
+	group->fds[0] = -1;
+
+	TAILQ_FOREACH(event, &pmu->event_list, next) {
+		ret = get_event_config(event->name, config);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "failed to get %s event config\n", event->name);
+			continue;
+		}
+
+		ret = do_perf_event_open(config, lcore_id, group->fds[0]);
+		if (ret == -1) {
+			if (errno == EOPNOTSUPP)
+				RTE_LOG(ERR, EAL, "64 bit counters not supported\n");
+
+			ret = -errno;
+			goto out;
+		}
+
+		group->fds[event->index] = ret;
+		num++;
+	}
+
+	return 0;
+out:
+	for (--num; num >= 0; num--) {
+		close(group->fds[num]);
+		group->fds[num] = -1;
+	}
+
+
+	return ret;
+}
+
+static int
+mmap_events(int lcore_id)
+{
+	struct rte_pmu_event_group *group = &pmu->group[lcore_id];
+	void *addr;
+	int ret, i;
+
+	for (i = 0; i < pmu->num_group_events; i++) {
+		addr = mmap(0, rte_mem_page_size(), PROT_READ, MAP_SHARED, group->fds[i], 0);
+		if (addr == MAP_FAILED) {
+			ret = -errno;
+			goto out;
+		}
+
+		group->mmap_pages[i] = addr;
+	}
+
+	return 0;
+out:
+	for (; i; i--) {
+		munmap(group->mmap_pages[i - 1], rte_mem_page_size());
+		group->mmap_pages[i - 1] = NULL;
+	}
+
+	return ret;
+}
+
+static void
+cleanup_events(int lcore_id)
+{
+	struct rte_pmu_event_group *group = &pmu->group[lcore_id];
+	int i;
+
+	if (!group->fds)
+		return;
+
+	if (group->fds[0] != -1)
+		ioctl(group->fds[0], PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);
+
+	for (i = 0; i < pmu->num_group_events; i++) {
+		if (group->mmap_pages[i]) {
+			munmap(group->mmap_pages[i], rte_mem_page_size());
+			group->mmap_pages[i] = NULL;
+		}
+
+		if (group->fds[i] != -1) {
+			close(group->fds[i]);
+			group->fds[i] = -1;
+		}
+	}
+
+	rte_free(group->mmap_pages);
+	rte_free(group->fds);
+
+	group->mmap_pages = NULL;
+	group->fds = NULL;
+	group->enabled = false;
+}
+
+int __rte_noinline
+rte_pmu_enable_group(int lcore_id)
+{
+	struct rte_pmu_event_group *group = &pmu->group[lcore_id];
+	int ret;
+
+	if (pmu->num_group_events == 0) {
+		RTE_LOG(DEBUG, EAL, "no matching PMU events\n");
+
+		return 0;
+	}
+
+	group->fds = rte_calloc(NULL, pmu->num_group_events, sizeof(*group->fds), 0);
+	if (!group->fds) {
+		RTE_LOG(ERR, EAL, "failed to alloc descriptor memory\n");
+
+		return -ENOMEM;
+	}
+
+	group->mmap_pages = rte_calloc(NULL, pmu->num_group_events, sizeof(*group->mmap_pages), 0);
+	if (!group->mmap_pages) {
+		RTE_LOG(ERR, EAL, "failed to alloc userpage memory\n");
+
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	ret = open_events(lcore_id);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "failed to open events on lcore-worker-%d\n", lcore_id);
+		goto out;
+	}
+
+	ret = mmap_events(lcore_id);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "failed to map events on lcore-worker-%d\n", lcore_id);
+		goto out;
+	}
+
+	if (ioctl(group->fds[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP) == -1) {
+		RTE_LOG(ERR, EAL, "failed to enable events on lcore-worker-%d\n", lcore_id);
+
+		ret = -errno;
+		goto out;
+	}
+
+	return 0;
+
+out:
+	cleanup_events(lcore_id);
+
+	return ret;
+}
+
+static int
+scan_pmus(void)
+{
+	char path[PATH_MAX];
+	struct dirent *dent;
+	const char *name;
+	DIR *dirp;
+
+	dirp = opendir(EVENT_SOURCE_DEVICES_PATH);
+	if (!dirp)
+		return -errno;
+
+	while ((dent = readdir(dirp))) {
+		name = dent->d_name;
+		if (name[0] == '.')
+			continue;
+
+		/* sysfs entry should either contain cpus or be a cpu */
+		if (!strcmp(name, "cpu"))
+			break;
+
+		snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/cpus", name);
+		if (access(path, F_OK) == 0)
+			break;
+	}
+
+	/* dent points into dirp, so copy the name before closedir() */
+	if (dent)
+		pmu->name = strdup(name);
+	closedir(dirp);
+
+	if (dent && !pmu->name)
+		return -ENOMEM;
+
+	return pmu->name ? 0 : -ENODEV;
+}
+
+int
+rte_pmu_add_event(const char *name)
+{
+	struct rte_pmu_event *event;
+	char path[PATH_MAX];
+
+	snprintf(path, sizeof(path), EVENT_SOURCE_DEVICES_PATH "/%s/events/%s", pmu->name, name);
+	if (access(path, R_OK))
+		return -ENODEV;
+
+	TAILQ_FOREACH(event, &pmu->event_list, next) {
+		if (!strcmp(event->name, name))
+			return event->index;
+		continue;
+	}
+
+	event = rte_zmalloc(NULL, sizeof(*event), 0);
+	if (!event)
+		return -ENOMEM;
+
+	event->name = strdup(name);
+	if (!event->name) {
+		rte_free(event);
+
+		return -ENOMEM;
+	}
+
+	event->index = pmu->num_group_events++;
+	TAILQ_INSERT_TAIL(&pmu->event_list, event, next);
+
+	RTE_LOG(DEBUG, EAL, "%s event added at index %d\n", name, event->index);
+
+	return event->index;
+}
+
+void
+eal_pmu_init(void)
+{
+	int ret;
+
+	pmu = rte_calloc(NULL, 1, sizeof(*pmu), RTE_CACHE_LINE_SIZE);
+	if (!pmu) {
+		RTE_LOG(ERR, EAL, "failed to alloc PMU\n");
+
+		return;
+	}
+
+	TAILQ_INIT(&pmu->event_list);
+
+	ret = scan_pmus();
+	if (ret) {
+		RTE_LOG(ERR, EAL, "failed to find core pmu\n");
+		goto out;
+	}
+
+	ret = pmu_arch_init();
+	if (ret) {
+		RTE_LOG(ERR, EAL, "failed to setup arch for PMU\n");
+		goto out;
+	}
+
+	return;
+out:
+	free(pmu->name);
+	rte_free(pmu);
+}
+
+void
+eal_pmu_fini(void)
+{
+	struct rte_pmu_event *event, *tmp;
+	int lcore_id;
+
+	RTE_TAILQ_FOREACH_SAFE(event, &pmu->event_list, next, tmp) {
+		TAILQ_REMOVE(&pmu->event_list, event, next);
+		free(event->name);
+		rte_free(event);
+	}
+
+	RTE_LCORE_FOREACH_WORKER(lcore_id)
+		cleanup_events(lcore_id);
+
+	pmu_arch_fini();
+	free(pmu->name);
+	rte_free(pmu);
+}
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index cfcd40aaed..3bf830adee 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -36,6 +36,7 @@ headers += files(
         'rte_pci_dev_features.h',
         'rte_per_lcore.h',
         'rte_pflock.h',
+        'rte_pmu.h',
         'rte_random.h',
         'rte_reciprocal.h',
         'rte_seqcount.h',
diff --git a/lib/eal/include/rte_pmu.h b/lib/eal/include/rte_pmu.h
new file mode 100644
index 0000000000..5955c22779
--- /dev/null
+++ b/lib/eal/include/rte_pmu.h
@@ -0,0 +1,204 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Marvell
+ */
+
+#ifndef _RTE_PMU_H_
+#define _RTE_PMU_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include
+#include
+
+#ifdef RTE_EXEC_ENV_LINUX
+
+#include
+
+#include
+#include
+#include
+#include
+
+/**
+ * @file
+ *
+ * PMU event tracing operations
+ *
+ * This file defines generic API and types necessary to setup PMU and
+ * read selected counters in runtime.
+ */
+
+/**
+ * A structure describing a group of events.
+ */
+struct rte_pmu_event_group {
+	int *fds; /**< array of event descriptors */
+	void **mmap_pages; /**< array of pointers to mmapped perf_event_mmap_page structures */
+	bool enabled; /**< true if group was enabled on particular lcore */
+};
+
+/**
+ * A structure describing an event.
+ */
+struct rte_pmu_event {
+	char *name; /**< name of an event */
+	int index; /**< event index into fds/mmap_pages */
+	TAILQ_ENTRY(rte_pmu_event) next; /**< list entry */
+};
+
+/**
+ * A PMU state container.
+ */
+struct rte_pmu {
+	char *name; /**< name of core PMU listed under /sys/bus/event_source/devices */
+	struct rte_pmu_event_group group[RTE_MAX_LCORE]; /**< per lcore event group data */
+	int num_group_events; /**< number of events in a group */
+	TAILQ_HEAD(, rte_pmu_event) event_list; /**< list of matching events */
+};
+
+/** Pointer to the PMU state container */
+extern struct rte_pmu *pmu;
+
+/** Each architecture supporting PMU needs to provide its own version */
+#ifndef rte_pmu_pmc_read
+#define rte_pmu_pmc_read(index) ({ 0; })
+#endif
+
+/**
+ * @internal
+ *
+ * Read PMU counter.
+ *
+ * @param pc
+ *   Pointer to the mmapped user page.
+ * @return
+ *   Counter value read from hardware.
+ */
+__rte_internal
+static __rte_always_inline uint64_t
+rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
+{
+	uint64_t offset, width, pmc = 0;
+	uint32_t seq, index;
+	int tries = 100;
+
+	for (;;) {
+		seq = pc->lock;
+		rte_compiler_barrier();
+		index = pc->index;
+		offset = pc->offset;
+		width = pc->pmc_width;
+
+		if (likely(pc->cap_user_rdpmc && index)) {
+			pmc = rte_pmu_pmc_read(index - 1);
+			pmc <<= 64 - width;
+			pmc >>= 64 - width;
+		}
+
+		rte_compiler_barrier();
+
+		if (likely(pc->lock == seq))
+			return pmc + offset;
+
+		if (--tries == 0) {
+			RTE_LOG(DEBUG, EAL, "failed to get perf_event_mmap_page lock\n");
+			break;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * @internal
+ *
+ * Enable group of events for a given lcore.
+ *
+ * @param lcore_id
+ *   The identifier of the lcore.
+ * @return
+ *   0 in case of success, negative value otherwise.
+ */
+__rte_internal
+int
+rte_pmu_enable_group(int lcore_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add event to the group of enabled events.
+ *
+ * @param name
+ *   Name of an event listed under /sys/bus/event_source/devices/pmu/events.
+ * @return
+ *   Event index in case of success, negative value otherwise.
+ */
+__rte_experimental
+int
+rte_pmu_add_event(const char *name);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Read hardware counter configured to count occurrences of an event.
+ *
+ * @param index
+ *   Index of an event to be read.
+ * @return
+ *   Event value read from register. In case of errors or lack of support
+ *   0 is returned. In other words, stream of zeros in a trace file
+ *   indicates problem with reading particular PMU event register.
+ */
+__rte_experimental
+static __rte_always_inline uint64_t
+rte_pmu_read(int index)
+{
+	int lcore_id = rte_lcore_id();
+	struct rte_pmu_event_group *group;
+	int ret;
+
+	if (!pmu)
+		return 0;
+
+	group = &pmu->group[lcore_id];
+	if (!group->enabled) {
+		ret = rte_pmu_enable_group(lcore_id);
+		if (ret)
+			return 0;
+
+		group->enabled = true;
+	}
+
+	if (index < 0 || index >= pmu->num_group_events)
+		return 0;
+
+	return rte_pmu_read_userpage(group->mmap_pages[index]);
+}
+
+#else /* !RTE_EXEC_ENV_LINUX */
+
+__rte_experimental
+static int __rte_unused
+rte_pmu_add_event(__rte_unused const char *name)
+{
+	return -1;
+}
+
+__rte_experimental
+static __rte_always_inline uint64_t
+rte_pmu_read(__rte_unused int index)
+{
+	return 0;
+}
+
+#endif /* RTE_EXEC_ENV_LINUX */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PMU_H_ */
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 8c118d0d9f..751a13b597 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -53,6 +53,7 @@
 #include "eal_options.h"
 #include "eal_vfio.h"
 #include "hotplug_mp.h"
+#include "pmu_private.h"
 
 #define MEMSIZE_IF_NO_HUGE_PAGE (64ULL * 1024ULL * 1024ULL)
 
@@ -1206,6 +1207,8 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	eal_pmu_init();
+
 	if (rte_eal_tailqs_init() < 0) {
 		rte_eal_init_alert("Cannot init tail queues for objects");
 		rte_errno = EFAULT;
@@ -1372,6 +1375,7 @@ rte_eal_cleanup(void)
 	eal_bus_cleanup();
 	rte_trace_save();
 	eal_trace_fini();
+	eal_pmu_fini();
 	/* after this point, any DPDK pointers will become dangling */
 	rte_eal_memory_detach();
 	eal_mp_dev_hotplug_cleanup();
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 7ad12a7dc9..e870c87493 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -432,6 +432,8 @@ EXPERIMENTAL {
 	rte_thread_set_priority;
 
 	# added in 22.11
+	rte_pmu_add_event; # WINDOWS_NO_EXPORT
+	rte_pmu_read; # WINDOWS_NO_EXPORT
 	rte_thread_attr_get_affinity;
 	rte_thread_attr_init;
 	rte_thread_attr_set_affinity;
@@ -483,4 +485,5 @@ INTERNAL {
 	rte_mem_map;
 	rte_mem_page_size;
 	rte_mem_unmap;
+	rte_pmu_enable_group;
 };

From patchwork Fri Nov 11 09:43:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 119795
X-Patchwork-Delegate: thomas@monjalon.net
From: Tomasz Duszynski
To: , , Ruifeng Wang
CC: ,
Subject: [PATCH 2/4] eal/arm: support reading ARM PMU events in runtime
Date: Fri, 11 Nov 2022 10:43:36 +0100
Message-ID: <20221111094338.2736065-3-tduszynski@marvell.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221111094338.2736065-1-tduszynski@marvell.com>
References: <20221111094338.2736065-1-tduszynski@marvell.com>
List-Id: DPDK patches and discussions

Add support for reading ARM PMU events at runtime.
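Two ARM-specific details of this patch are easy to miss: pmu_arch_fixup_config() sets bits 0 and 1 of config1 (64 bit counters and userspace access, going by the comments in the patch), and rte_pmu_pmc_read() treats counter index 31 as the cycle counter read through pmccntr_el0. The sketch below restates both in plain C so they can be checked without ARM hardware. The names BIT64, ARM_CYCLE_COUNTER_INDEX, arm_fixup_config() and arm_pmc_source_for() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BIT64(n) (UINT64_C(1) << (n))

/* Per the patch: counter index 31 is the cycle counter, read via pmccntr_el0. */
#define ARM_CYCLE_COUNTER_INDEX 31

enum arm_pmc_source {
	ARM_PMC_EVENT,  /* selected through pmselr_el0, read from pmxevcntr_el0 */
	ARM_PMC_CYCLES, /* read from pmccntr_el0 */
};

/* Mirrors what the patch does in pmu_arch_fixup_config() on ARM. */
static void
arm_fixup_config(uint64_t config[3])
{
	assert(config);

	config[1] |= BIT64(0); /* select 64 bit counters */
	config[1] |= BIT64(1); /* enable userspace access */
}

/*
 * The generic reader passes (user page index - 1) to the arch hook, so a
 * user page index of 32 ends up on the cycle counter path.
 */
static enum arm_pmc_source
arm_pmc_source_for(uint32_t userpage_index)
{
	assert(userpage_index != 0);

	return (userpage_index - 1 == ARM_CYCLE_COUNTER_INDEX) ?
		ARM_PMC_CYCLES : ARM_PMC_EVENT;
}
```

Only config1 is modified. The event selector in config[0] and the contents of config[2] are passed to perf_event_open unchanged.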
Signed-off-by: Tomasz Duszynski
---
 app/test/test_pmu.c               |   4 ++
 lib/eal/arm/include/meson.build   |   1 +
 lib/eal/arm/include/rte_pmu_pmc.h |  37 +++++++++++
 lib/eal/arm/meson.build           |   4 ++
 lib/eal/arm/rte_pmu.c             | 103 ++++++++++++++++++++++++++++++
 lib/eal/include/rte_pmu.h         |   3 +
 6 files changed, 152 insertions(+)
 create mode 100644 lib/eal/arm/include/rte_pmu_pmc.h
 create mode 100644 lib/eal/arm/rte_pmu.c

diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c
index fd331af9ee..f94866dff9 100644
--- a/app/test/test_pmu.c
+++ b/app/test/test_pmu.c
@@ -13,6 +13,10 @@ test_pmu_read(void)
 	int tries = 10;
 	int event = -1;
 
+#if defined(RTE_ARCH_ARM64)
+	event = rte_pmu_add_event("cpu_cycles");
+#endif
+
 	while (tries--)
 		val += rte_pmu_read(event);
 
diff --git a/lib/eal/arm/include/meson.build b/lib/eal/arm/include/meson.build
index 657bf58569..ab13b0220a 100644
--- a/lib/eal/arm/include/meson.build
+++ b/lib/eal/arm/include/meson.build
@@ -20,6 +20,7 @@ arch_headers = files(
         'rte_pause_32.h',
         'rte_pause_64.h',
         'rte_pause.h',
+        'rte_pmu_pmc.h',
         'rte_power_intrinsics.h',
         'rte_prefetch_32.h',
         'rte_prefetch_64.h',
diff --git a/lib/eal/arm/include/rte_pmu_pmc.h b/lib/eal/arm/include/rte_pmu_pmc.h
new file mode 100644
index 0000000000..5efc851cb8
--- /dev/null
+++ b/lib/eal/arm/include/rte_pmu_pmc.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Marvell.
+ */
+
+#ifndef _RTE_PMU_PMC_ARM_H_
+#define _RTE_PMU_PMC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+static __rte_always_inline uint64_t
+rte_pmu_pmc_read(int index)
+{
+	uint64_t val;
+
+	if (index == 31) {
+		/* CPU Cycles (0x11) must be read via pmccntr_el0 */
+		asm volatile("mrs %0, pmccntr_el0" : "=r" (val));
+	} else {
+		asm volatile(
+			"msr pmselr_el0, %x1\n"
+			"mrs %0, pmxevcntr_el0\n"
+			: "=r" (val)
+			: "rZ" (index)
+		);
+	}
+
+	return val;
+}
+#define rte_pmu_pmc_read rte_pmu_pmc_read
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PMU_PMC_ARM_H_ */
diff --git a/lib/eal/arm/meson.build b/lib/eal/arm/meson.build
index dca1106aae..0c5575b197 100644
--- a/lib/eal/arm/meson.build
+++ b/lib/eal/arm/meson.build
@@ -9,3 +9,7 @@ sources += files(
         'rte_hypervisor.c',
         'rte_power_intrinsics.c',
 )
+
+if is_linux
+    sources += files('rte_pmu.c')
+endif
diff --git a/lib/eal/arm/rte_pmu.c b/lib/eal/arm/rte_pmu.c
new file mode 100644
index 0000000000..6c50a1b3c4
--- /dev/null
+++ b/lib/eal/arm/rte_pmu.c
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Marvell International Ltd.
+ */
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+#include "pmu_private.h"
+
+#define PERF_USER_ACCESS_PATH "/proc/sys/kernel/perf_user_access"
+
+static int restore_uaccess;
+
+static int
+read_attr_int(const char *path, int *val)
+{
+	char buf[BUFSIZ] = "";
+	int ret, fd;
+
+	fd = open(path, O_RDONLY);
+	if (fd == -1)
+		return -errno;
+
+	ret = read(fd, buf, sizeof(buf) - 1); /* keep buf NUL-terminated for strtol() */
+	if (ret == -1) {
+		close(fd);
+
+		return -errno;
+	}
+
+	*val = strtol(buf, NULL, 10);
+	close(fd);
+
+	return 0;
+}
+
+static int
+write_attr_int(const char *path, int val)
+{
+	char buf[BUFSIZ];
+	int num, ret, fd;
+
+	fd = open(path, O_WRONLY);
+	if (fd == -1)
+		return -errno;
+
+	num = snprintf(buf, sizeof(buf), "%d", val);
+	ret = write(fd, buf, num);
+	if (ret == -1) {
+		close(fd);
+
+		return -errno;
+	}
+
+	close(fd);
+
+	return 0;
+}
+
+int
+pmu_arch_init(void)
+{
+	int ret;
+
+	ret = read_attr_int(PERF_USER_ACCESS_PATH, &restore_uaccess);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "failed to read %s\n", PERF_USER_ACCESS_PATH);
+
+		return ret;
+	}
+
+	ret = write_attr_int(PERF_USER_ACCESS_PATH, 1);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "failed to enable perf user access\n"
+			"try enabling manually 'echo 1 > %s'\n",
+			PERF_USER_ACCESS_PATH);
+
+		return ret;
+	}
+
+	return 0;
+}
+
+void
+pmu_arch_fini(void)
+{
+	write_attr_int(PERF_USER_ACCESS_PATH, restore_uaccess);
+}
+
+void
+pmu_arch_fixup_config(uint64_t config[3])
+{
+	/* select 64 bit counters */
+	config[1] |= RTE_BIT64(0);
+	/* enable userspace access */
+	config[1] |= RTE_BIT64(1);
+}
diff --git a/lib/eal/include/rte_pmu.h b/lib/eal/include/rte_pmu.h
index 5955c22779..67b1194a2a 100644
--- a/lib/eal/include/rte_pmu.h
+++ b/lib/eal/include/rte_pmu.h
@@ -20,6 +20,9 @@ extern "C" {
 #include
 #include
 #include
+#if defined(RTE_ARCH_ARM64)
+#include
+#endif
 
 /**
  * @file

From patchwork Fri Nov 11 09:43:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomasz Duszynski
X-Patchwork-Id: 119796
X-Patchwork-Delegate: thomas@monjalon.net
From: Tomasz Duszynski
To: , , Bruce Richardson, Konstantin Ananyev
CC: ,
Subject: [PATCH 3/4] eal/x86: support reading Intel PMU events in runtime
Date: Fri, 11 Nov 2022 10:43:37 +0100
Message-ID: <20221111094338.2736065-4-tduszynski@marvell.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221111094338.2736065-1-tduszynski@marvell.com>
References: <20221111094338.2736065-1-tduszynski@marvell.com>
List-Id: DPDK patches and discussions

Add support for reading Intel PMU events at runtime.
Signed-off-by: Tomasz Duszynski --- app/test/test_pmu.c | 2 ++ lib/eal/include/rte_pmu.h | 2 +- lib/eal/x86/include/meson.build | 1 + lib/eal/x86/include/rte_pmu_pmc.h | 32 +++++++++++++++++++++++++++++++ 4 files changed, 36 insertions(+), 1 deletion(-) create mode 100644 lib/eal/x86/include/rte_pmu_pmc.h diff --git a/app/test/test_pmu.c b/app/test/test_pmu.c index f94866dff9..016204c083 100644 --- a/app/test/test_pmu.c +++ b/app/test/test_pmu.c @@ -15,6 +15,8 @@ test_pmu_read(void) #if defined(RTE_ARCH_ARM64) event = rte_pmu_add_event("cpu_cycles"); +#elif defined(RTE_ARCH_X86_64) + event = rte_pmu_add_event("cpu-cycles"); #endif while (tries--) diff --git a/lib/eal/include/rte_pmu.h b/lib/eal/include/rte_pmu.h index 67b1194a2a..bbe12d100d 100644 --- a/lib/eal/include/rte_pmu.h +++ b/lib/eal/include/rte_pmu.h @@ -20,7 +20,7 @@ extern "C" { #include #include #include -#if defined(RTE_ARCH_ARM64) +#if defined(RTE_ARCH_ARM64) || defined(RTE_ARCH_X86_64) #include #endif diff --git a/lib/eal/x86/include/meson.build b/lib/eal/x86/include/meson.build index 52d2f8e969..03d286ed25 100644 --- a/lib/eal/x86/include/meson.build +++ b/lib/eal/x86/include/meson.build @@ -9,6 +9,7 @@ arch_headers = files( 'rte_io.h', 'rte_memcpy.h', 'rte_pause.h', + 'rte_pmu_pmc.h', 'rte_power_intrinsics.h', 'rte_prefetch.h', 'rte_rtm.h', diff --git a/lib/eal/x86/include/rte_pmu_pmc.h b/lib/eal/x86/include/rte_pmu_pmc.h new file mode 100644 index 0000000000..6ecb27a1eb --- /dev/null +++ b/lib/eal/x86/include/rte_pmu_pmc.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2022 Marvell. 
+ */ + +#ifndef _RTE_PMU_PMC_X86_H_ +#define _RTE_PMU_PMC_X86_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +__rte_internal +static __rte_always_inline uint64_t +rte_pmu_pmc_read(int index) +{ + uint32_t high, low; + + asm volatile( + "rdpmc\n" + : "=a" (low), "=d" (high) + : "c" (index) + ); + + return ((uint64_t)high << 32) | (uint64_t)low; +} +#define rte_pmu_pmc_read rte_pmu_pmc_read + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PMU_PMC_X86_H_ */ From patchwork Fri Nov 11 09:43:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tomasz Duszynski X-Patchwork-Id: 119797 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 97506A0542; Fri, 11 Nov 2022 10:44:15 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5787442D37; Fri, 11 Nov 2022 10:44:00 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0a-0016f401.pphosted.com [67.231.148.174]) by mails.dpdk.org (Postfix) with ESMTP id BBDBA42D34 for ; Fri, 11 Nov 2022 10:43:58 +0100 (CET) Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AB959ST021637; Fri, 11 Nov 2022 01:43:57 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=pfpt0220; bh=yMjdXhDj6aMAai62vfrcP7moSkEPuHrJD/JI1EYYTKA=; b=Zs7lEOQuH0sGbboDGEVU/mz0tNZdxBmdV6Egpl5mFdIt6rHGZ7HHo2XP2oxt179PG/RR lyE/oPOgD45/NxHDnETOo3TSIcIf7TWDYExewhBzZOTKVaiXDeARliqQm5OP2AXDVgMN gAfI9qhy5FcqwSVgWcyDH5hdHikJlCeUseykUuvXuwFX2+fT9gECe5KfNV4ZpSTGNcJw uto2EKkMN+xuRYy8bfQZI7Gp375SPhTGoV2afrjXdkouqRC0PUpJieeTqaHQsSSLsVPL 
ULZ4JUpK/6Yu3pvm8D4rA37qamm+5tyauTnf0WJ7fiKnuw5m8LJDYCIuPLsffDJr9UOi FA== Received: from dc5-exch01.marvell.com ([199.233.59.181]) by mx0a-0016f401.pphosted.com (PPS) with ESMTPS id 3kskf0r34j-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Fri, 11 Nov 2022 01:43:57 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 11 Nov 2022 01:43:56 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Fri, 11 Nov 2022 01:43:56 -0800 Received: from localhost.localdomain (unknown [10.28.34.39]) by maili.marvell.com (Postfix) with ESMTP id 2F1A13F70B4; Fri, 11 Nov 2022 01:43:53 -0800 (PST) From: Tomasz Duszynski To: , , Jerin Jacob , Sunil Kumar Kori CC: Subject: [PATCH 4/4] eal: add PMU support to tracing library Date: Fri, 11 Nov 2022 10:43:38 +0100 Message-ID: <20221111094338.2736065-5-tduszynski@marvell.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221111094338.2736065-1-tduszynski@marvell.com> References: <20221111094338.2736065-1-tduszynski@marvell.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: HfR4oBN9vk_y-vqiKqqakG_ZsEt5jacl X-Proofpoint-GUID: HfR4oBN9vk_y-vqiKqqakG_ZsEt5jacl X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-11_05,2022-11-09_01,2022-06-22_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In order to profile an app, one needs to store a significant number of samples somewhere for analysis later on. Since the trace library supports storing data in CTF format, let's take advantage of that and add a dedicated PMU tracepoint.
Signed-off-by: Tomasz Duszynski --- app/test/test_trace_perf.c | 4 ++ doc/guides/prog_guide/profile_app.rst | 5 ++ doc/guides/prog_guide/trace_lib.rst | 32 ++++++++++++ lib/eal/common/eal_common_trace_points.c | 3 ++ lib/eal/common/rte_pmu.c | 63 ++++++++++++++++++++++++ lib/eal/include/rte_eal_trace.h | 11 +++++ lib/eal/version.map | 1 + 7 files changed, 119 insertions(+) diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c index 46ae7d8074..4851b6852f 100644 --- a/app/test/test_trace_perf.c +++ b/app/test/test_trace_perf.c @@ -114,6 +114,8 @@ worker_fn_##func(void *arg) \ #define GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666) #define GENERIC_STR rte_eal_trace_generic_str("hello world") #define VOID_FP app_dpdk_test_fp() +/* 0 corresponds first event passed via --trace= */ +#define READ_PMU rte_eal_trace_pmu_read(0) WORKER_DEFINE(GENERIC_VOID) WORKER_DEFINE(GENERIC_U64) @@ -122,6 +124,7 @@ WORKER_DEFINE(GENERIC_FLOAT) WORKER_DEFINE(GENERIC_DOUBLE) WORKER_DEFINE(GENERIC_STR) WORKER_DEFINE(VOID_FP) +WORKER_DEFINE(READ_PMU) static void run_test(const char *str, lcore_function_t f, struct test_data *data, size_t sz) @@ -174,6 +177,7 @@ test_trace_perf(void) run_test("double", worker_fn_GENERIC_DOUBLE, data, sz); run_test("string", worker_fn_GENERIC_STR, data, sz); run_test("void_fp", worker_fn_VOID_FP, data, sz); + run_test("read_pmu", worker_fn_READ_PMU, data, sz); rte_free(data); return TEST_SUCCESS; diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst index 8fc1b20cab..977800ea01 100644 --- a/doc/guides/prog_guide/profile_app.rst +++ b/doc/guides/prog_guide/profile_app.rst @@ -16,6 +16,11 @@ that information, perf being an example here. Though in some scenarios, eg. when isolated (nohz_full) and run dedicated tasks, using perf is less than ideal. In such cases one can read specific events directly from application via ``rte_pmu_read()``. 
+Alternatively, the tracing library can be used, which offers a dedicated tracepoint +``rte_eal_trace_pmu_read()``. + +Refer to :doc:`../prog_guide/trace_lib` for more details. + Profiling on x86 ---------------- diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst index 9a8f38073d..9a845fd86f 100644 --- a/doc/guides/prog_guide/trace_lib.rst +++ b/doc/guides/prog_guide/trace_lib.rst @@ -46,6 +46,7 @@ DPDK tracing library features trace format and is compatible with ``LTTng``. For detailed information, refer to `Common Trace Format `_. +- Support reading PMU events on ARM64 and x86 (Intel) How to add a tracepoint? ------------------------ @@ -137,6 +138,37 @@ the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``. ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be enabled using the ``enable_trace_fp`` option for meson build. +PMU tracepoint +-------------- + +Performance monitoring unit (PMU) event values can be read from hardware +registers using the predefined ``rte_eal_trace_pmu_read`` tracepoint. + +Tracing is enabled via the ``--trace`` EAL option by passing both an expression +matching the PMU tracepoint name, i.e. ``lib.eal.pmu.read``, and an expression +``e=ev1[,ev2,...]`` matching particular events:: + + --trace='*pmu.read\|e=cpu_cycles,l1d_cache' + +Event names are available under the ``/sys/bus/event_source/devices/PMU/events`` +directory, where ``PMU`` is a placeholder for either a ``cpu`` or a directory +containing ``cpus``. + +Contrary to other tracepoints, this one does not need any extra variables +added to source files. Instead, the caller passes an index which follows the order of +events specified via the ``--trace`` parameter. In the following example index ``0`` +corresponds to ``cpu_cycles`` while index ``1`` corresponds to ``l1d_cache``. + +.. code-block:: c + + ... + rte_eal_trace_pmu_read(0); + rte_eal_trace_pmu_read(1); + ... + +PMU tracing support must be explicitly enabled using the ``enable_trace_fp`` +option for meson build.
+ Event record mode ----------------- diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c index 0b0b254615..de918ca618 100644 --- a/lib/eal/common/eal_common_trace_points.c +++ b/lib/eal/common/eal_common_trace_points.c @@ -75,3 +75,6 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable, lib.eal.intr.enable) RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable, lib.eal.intr.disable) + +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read, + lib.eal.pmu.read) diff --git a/lib/eal/common/rte_pmu.c b/lib/eal/common/rte_pmu.c index 7d3bd57d1d..40c454f92a 100644 --- a/lib/eal/common/rte_pmu.c +++ b/lib/eal/common/rte_pmu.c @@ -18,6 +18,7 @@ #include #include "pmu_private.h" +#include "eal_trace.h" #define EVENT_SOURCE_DEVICES_PATH "/sys/bus/event_source/devices" @@ -402,11 +403,70 @@ rte_pmu_add_event(const char *name) return event->index; } +static void +add_events(const char *pattern) +{ + char *token, *copy; + int ret; + + copy = strdup(pattern); + if (!copy) + return; + + token = strtok(copy, ","); + while (token) { + ret = rte_pmu_add_event(token); + if (ret < 0) + RTE_LOG(ERR, EAL, "failed to add %s event\n", token); + + token = strtok(NULL, ","); + } + + free(copy); +} + +static void +add_events_by_pattern(const char *pattern) +{ + regmatch_t rmatch; + char buf[BUFSIZ]; + unsigned int num; + regex_t reg; + + /* events are matched against occurrences of e=ev1[,ev2,..] 
pattern */ + if (regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED)) + return; + + for (;;) { + if (regexec(&reg, pattern, 1, &rmatch, 0)) + break; + + num = rmatch.rm_eo - rmatch.rm_so; + if (num > sizeof(buf)) + num = sizeof(buf); + + /* skip e= pattern prefix */ + memcpy(buf, pattern + rmatch.rm_so + 2, num - 2); + buf[num - 2] = '\0'; + add_events(buf); + + pattern += rmatch.rm_eo; + } + + regfree(&reg); +} + void eal_pmu_init(void) { + struct trace_arg *arg; + struct trace *trace; int ret; + trace = trace_obj_get(); + if (!trace) + RTE_LOG(WARNING, EAL, "tracing not initialized\n"); + pmu = rte_calloc(NULL, 1, sizeof(*pmu), RTE_CACHE_LINE_SIZE); if (!pmu) { RTE_LOG(ERR, EAL, "failed to alloc PMU\n"); @@ -428,6 +488,9 @@ eal_pmu_init(void) goto out; } + STAILQ_FOREACH(arg, &trace->args, next) + add_events_by_pattern(arg->val); + return; out: free(pmu->name); diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h index 5ef4398230..2a10f63e97 100644 --- a/lib/eal/include/rte_eal_trace.h +++ b/lib/eal/include/rte_eal_trace.h @@ -17,6 +17,7 @@ extern "C" { #include #include +#include #include #include "eal_interrupts.h" @@ -279,6 +280,16 @@ RTE_TRACE_POINT( rte_trace_point_emit_string(cpuset); ) +/* PMU */ +RTE_TRACE_POINT_FP( + rte_eal_trace_pmu_read, + RTE_TRACE_POINT_ARGS(int index), + uint64_t val; + rte_trace_point_emit_int(index); + val = rte_pmu_read(index); + rte_trace_point_emit_u64(val); +) + #ifdef __cplusplus } #endif diff --git a/lib/eal/version.map b/lib/eal/version.map index e870c87493..d6ec3f3b0e 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -432,6 +432,7 @@ EXPERIMENTAL { rte_thread_set_priority; # added in 22.11 + __rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT rte_pmu_add_event; # WINDOWS_NO_EXPORT rte_pmu_read; # WINDOWS_NO_EXPORT rte_thread_attr_get_affinity;
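
The e=ev1[,ev2,...] matching done by add_events_by_pattern() in patch 4/4 can be tried outside of DPDK. The following is only a minimal standalone sketch, not code from the series: collect_events(), MAX_EVENTS and MAX_NAME are hypothetical names introduced for illustration, and the names are stored in a caller-supplied array instead of being passed to rte_pmu_add_event(). It assumes a POSIX system providing regex.h and strtok_r().

```c
#define _POSIX_C_SOURCE 200809L

#include <regex.h>
#include <stdio.h>
#include <string.h>

#define MAX_EVENTS 8	/* hypothetical limit, illustration only */
#define MAX_NAME 64	/* hypothetical limit, illustration only */

/*
 * Hypothetical helper (not part of the series): collect event names from
 * every "e=ev1[,ev2,...]" occurrence found in arg. Returns the number of
 * names stored, or -1 if the regular expression fails to compile.
 */
static int
collect_events(const char *arg, char names[][MAX_NAME], int max)
{
	char buf[256], *tok, *save;
	regmatch_t m;
	regex_t reg;
	size_t len;
	int n = 0;

	if (regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED))
		return -1;

	while (n < max && regexec(&reg, arg, 1, &m, 0) == 0) {
		/* length of the match without the leading "e=" */
		len = (size_t)(m.rm_eo - m.rm_so) - 2;
		if (len >= sizeof(buf))
			len = sizeof(buf) - 1;

		memcpy(buf, arg + m.rm_so + 2, len);
		buf[len] = '\0';

		/* split the comma separated list into single names */
		for (tok = strtok_r(buf, ",", &save); tok != NULL && n < max;
		     tok = strtok_r(NULL, ",", &save)) {
			snprintf(names[n], MAX_NAME, "%s", tok);
			n++;
		}

		/* continue searching right after the current match */
		arg += m.rm_eo;
	}

	regfree(&reg);

	return n;
}
```

Calling collect_events() on the string from the documentation example, "*pmu.read|e=cpu_cycles,l1d_cache", yields the names in the order given on the command line, which is the order the index passed to rte_eal_trace_pmu_read() is expected to follow. The sketch uses strtok_r() rather than strtok() only to stay reentrant.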