[v15,4/4] eal: add PMU support to tracing library

Message ID 20241025085414.3412068-5-tduszynski@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series add support for self monitoring |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/github-robot: build fail github build: failed
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-marvell-Functional success Functional Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-compile-amd64-testing warning Testing issues

Commit Message

Tomasz Duszynski Oct. 25, 2024, 8:54 a.m. UTC
In order to profile app one needs to store significant amount of samples
somewhere for an analysis later on. Since trace library supports
storing data in a CTF format lets take advantage of that and add a
dedicated PMU tracepoint.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
---
 app/test/test_trace_perf.c               | 10 ++++
 doc/guides/prog_guide/profile_app.rst    |  5 ++
 doc/guides/prog_guide/trace_lib.rst      | 32 +++++++++++
 lib/eal/common/eal_common_trace.c        |  5 +-
 lib/eal/common/eal_common_trace_pmu.c    | 38 ++++++++++++++
 lib/eal/common/eal_common_trace_points.c |  5 ++
 lib/eal/common/eal_trace.h               |  4 ++
 lib/eal/common/meson.build               |  1 +
 lib/eal/include/rte_eal_trace.h          | 11 ++++
 lib/eal/version.map                      |  1 +
 lib/pmu/rte_pmu.c                        | 67 +++++++++++++++++++++++-
 lib/pmu/rte_pmu.h                        | 24 +++++++--
 lib/pmu/version.map                      |  1 +
 13 files changed, 198 insertions(+), 6 deletions(-)
 create mode 100644 lib/eal/common/eal_common_trace_pmu.c
  

Comments

Jerin Jacob Oct. 25, 2024, 11:02 a.m. UTC | #1
On Fri, Oct 25, 2024 at 2:25 PM Tomasz Duszynski <tduszynski@marvell.com> wrote:
>
> In order to profile app one needs to store significant amount of samples
> somewhere for an analysis later on. Since trace library supports
> storing data in a CTF format lets take advantage of that and add a
> dedicated PMU tracepoint.
>
> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>


--------------START-------------------------
> diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c
> index 0f1240ea3a..c99eab92f4 100644
> --- a/lib/eal/common/eal_common_trace_points.c
> +++ b/lib/eal/common/eal_common_trace_points.c
> @@ -100,3 +100,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable,
>         lib.eal.intr.enable)
>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable,
>         lib.eal.intr.disable)
> +
> +#ifdef RTE_LIB_PMU
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read,
> +       lib.eal.pmu.read)
> +#endif
>
>  #define RTE_EAL_TRACE_GENERIC_FUNC rte_eal_trace_generic_func(__func__)
>
> +#ifdef RTE_LIB_PMU
> +#include <rte_pmu.h>
> +
> +RTE_TRACE_POINT_FP(
> +       rte_eal_trace_pmu_read,
> +       RTE_TRACE_POINT_ARGS(unsigned int index),
> +       uint64_t val = rte_pmu_read(index);
> +       rte_trace_point_emit_u64(val);
> +)
> +#endif
> +
--------------END-------------------------

All of the above changes can go to lib/pmu. Right? Like ethdev is
adding its trace points in ethdev library?
and make trace point name as rte_pmu_trace_read
  
Tomasz Duszynski Oct. 28, 2024, 10:32 a.m. UTC | #2
>On Fri, Oct 25, 2024 at 2:25 PM Tomasz Duszynski <tduszynski@marvell.com> wrote:
>>
>> In order to profile app one needs to store significant amount of
>> samples somewhere for an analysis later on. Since trace library
>> supports storing data in a CTF format lets take advantage of that and
>> add a dedicated PMU tracepoint.
>>
>> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
>
>
>--------------START-------------------------
>> diff --git a/lib/eal/common/eal_common_trace_points.c
>> b/lib/eal/common/eal_common_trace_points.c
>> index 0f1240ea3a..c99eab92f4 100644
>> --- a/lib/eal/common/eal_common_trace_points.c
>> +++ b/lib/eal/common/eal_common_trace_points.c
>> @@ -100,3 +100,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable,
>>         lib.eal.intr.enable)
>>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable,
>>         lib.eal.intr.disable)
>> +
>> +#ifdef RTE_LIB_PMU
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read,
>> +       lib.eal.pmu.read)
>> +#endif
>>
>>  #define RTE_EAL_TRACE_GENERIC_FUNC
>> rte_eal_trace_generic_func(__func__)
>>
>> +#ifdef RTE_LIB_PMU
>> +#include <rte_pmu.h>
>> +
>> +RTE_TRACE_POINT_FP(
>> +       rte_eal_trace_pmu_read,
>> +       RTE_TRACE_POINT_ARGS(unsigned int index),
>> +       uint64_t val = rte_pmu_read(index);
>> +       rte_trace_point_emit_u64(val);
>> +)
>> +#endif
>> +
>--------------END-------------------------
>
>All of the above changes can go to lib/pmu. Right? Like ethdev is adding its trace points in ethdev
>library?
>and make trace point name as rte_pmu_trace_read

That's because libpmu is higher than eal in hierarchy of libraries so it won't see some symbols, for example __rte_trace_point_register(). 
So first tracing itself needs to be moved out to a separate library and then approach other libs take would become natural for libpmu too.
  
Morten Brørup Nov. 5, 2024, 7:41 a.m. UTC | #3
> From: Tomasz Duszynski [mailto:tduszynski@marvell.com]
> Sent: Monday, 28 October 2024 11.32
> 
> >On Fri, Oct 25, 2024 at 2:25 PM Tomasz Duszynski
> <tduszynski@marvell.com> wrote:
> >>
> >> In order to profile app one needs to store significant amount of
> >> samples somewhere for an analysis later on. Since trace library
> >> supports storing data in a CTF format lets take advantage of that
> and
> >> add a dedicated PMU tracepoint.
> >>
> >> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
> >
> >
> >--------------START-------------------------
> >> diff --git a/lib/eal/common/eal_common_trace_points.c
> >> b/lib/eal/common/eal_common_trace_points.c
> >> index 0f1240ea3a..c99eab92f4 100644
> >> --- a/lib/eal/common/eal_common_trace_points.c
> >> +++ b/lib/eal/common/eal_common_trace_points.c
> >> @@ -100,3 +100,8 @@
> RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable,
> >>         lib.eal.intr.enable)
> >>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable,
> >>         lib.eal.intr.disable)
> >> +
> >> +#ifdef RTE_LIB_PMU
> >> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read,
> >> +       lib.eal.pmu.read)
> >> +#endif
> >>
> >>  #define RTE_EAL_TRACE_GENERIC_FUNC
> >> rte_eal_trace_generic_func(__func__)
> >>
> >> +#ifdef RTE_LIB_PMU
> >> +#include <rte_pmu.h>
> >> +
> >> +RTE_TRACE_POINT_FP(
> >> +       rte_eal_trace_pmu_read,
> >> +       RTE_TRACE_POINT_ARGS(unsigned int index),
> >> +       uint64_t val = rte_pmu_read(index);
> >> +       rte_trace_point_emit_u64(val);
> >> +)
> >> +#endif
> >> +
> >--------------END-------------------------
> >
> >All of the above changes can go to lib/pmu. Right? Like ethdev is
> adding its trace points in ethdev
> >library?
> >and make trace point name as rte_pmu_trace_read
> 
> That's because libpmu is higher than eal in hierarchy of libraries so
> it won't see some symbols, for example __rte_trace_point_register().
> So first tracing itself needs to be moved out to a separate library and
> then approach other libs take would become natural for libpmu too.

OK; I suppose the required change of tracing library is not going to happen right away, so Ack to keeping it here (for now).

Please rename trace point to rte_pmu_trace_read / lib.pmu.read as requested by Jerin. Although it is temporarily in EAL (due to the above trace library issue), it really is PMU library, not EAL library.
  
Konstantin Ananyev Nov. 5, 2024, 11:04 a.m. UTC | #4
> In order to profile app one needs to store significant amount of samples
> somewhere for an analysis later on. Since trace library supports
> storing data in a CTF format lets take advantage of that and add a
> dedicated PMU tracepoint.
> 
> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
> ---
>  app/test/test_trace_perf.c               | 10 ++++
>  doc/guides/prog_guide/profile_app.rst    |  5 ++
>  doc/guides/prog_guide/trace_lib.rst      | 32 +++++++++++
>  lib/eal/common/eal_common_trace.c        |  5 +-
>  lib/eal/common/eal_common_trace_pmu.c    | 38 ++++++++++++++
>  lib/eal/common/eal_common_trace_points.c |  5 ++
>  lib/eal/common/eal_trace.h               |  4 ++
>  lib/eal/common/meson.build               |  1 +
>  lib/eal/include/rte_eal_trace.h          | 11 ++++
>  lib/eal/version.map                      |  1 +
>  lib/pmu/rte_pmu.c                        | 67 +++++++++++++++++++++++-
>  lib/pmu/rte_pmu.h                        | 24 +++++++--
>  lib/pmu/version.map                      |  1 +
>  13 files changed, 198 insertions(+), 6 deletions(-)
>  create mode 100644 lib/eal/common/eal_common_trace_pmu.c
> 
> diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c
> index 8257cc02be..8a0730943e 100644
> --- a/app/test/test_trace_perf.c
> +++ b/app/test/test_trace_perf.c
> @@ -114,6 +114,10 @@ worker_fn_##func(void *arg) \
>  #define GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666)
>  #define GENERIC_STR rte_eal_trace_generic_str("hello world")
>  #define VOID_FP app_dpdk_test_fp()
> +#ifdef RTE_LIB_PMU
> +/* 0 corresponds first event passed via --trace= */
> +#define READ_PMU rte_eal_trace_pmu_read(0)
> +#endif
> 
>  WORKER_DEFINE(GENERIC_VOID)
>  WORKER_DEFINE(GENERIC_U64)
> @@ -122,6 +126,9 @@ WORKER_DEFINE(GENERIC_FLOAT)
>  WORKER_DEFINE(GENERIC_DOUBLE)
>  WORKER_DEFINE(GENERIC_STR)
>  WORKER_DEFINE(VOID_FP)
> +#ifdef RTE_LIB_PMU
> +WORKER_DEFINE(READ_PMU)
> +#endif
> 
>  static void
>  run_test(const char *str, lcore_function_t f, struct test_data *data, size_t sz)
> @@ -174,6 +181,9 @@ test_trace_perf(void)
>  	run_test("double", worker_fn_GENERIC_DOUBLE, data, sz);
>  	run_test("string", worker_fn_GENERIC_STR, data, sz);
>  	run_test("void_fp", worker_fn_VOID_FP, data, sz);
> +#ifdef RTE_LIB_PMU
> +	run_test("read_pmu", worker_fn_READ_PMU, data, sz);
> +#endif
> 
>  	rte_free(data);
>  	return TEST_SUCCESS;
> diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst
> index 854c515a61..1ab6fb9eaa 100644
> --- a/doc/guides/prog_guide/profile_app.rst
> +++ b/doc/guides/prog_guide/profile_app.rst
> @@ -36,6 +36,11 @@ As of now implementation imposes certain limitations:
> 
>  * Each EAL lcore measures same group of events
> 
> +Alternatively tracing library can be used which offers dedicated tracepoint
> +``rte_eal_trace_pmu_event()``.
> +
> +Refer to :doc:`../prog_guide/trace_lib` for more details.
> +
> 
>  Profiling on x86
>  ----------------
> diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst
> index d9b17abe90..378abccd72 100644
> --- a/doc/guides/prog_guide/trace_lib.rst
> +++ b/doc/guides/prog_guide/trace_lib.rst
> @@ -46,6 +46,7 @@ DPDK tracing library features
>    trace format and is compatible with ``LTTng``.
>    For detailed information, refer to
>    `Common Trace Format <https://diamon.org/ctf/>`_.
> +- Support reading PMU events on ARM64 and x86-64 (Intel)
> 
>  How to add a tracepoint?
>  ------------------------
> @@ -139,6 +140,37 @@ the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``.
>  ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be enabled using
>  the ``enable_trace_fp`` option for meson build.
> 
> +PMU tracepoint
> +--------------
> +
> +Performance monitoring unit (PMU) event values can be read from hardware
> +registers using predefined ``rte_pmu_read`` tracepoint.
> +
> +Tracing is enabled via ``--trace`` EAL option by passing both expression
> +matching PMU tracepoint name i.e ``lib.eal.pmu.read`` and expression
> +``e=ev1[,ev2,...]`` matching particular events::
> +
> +    --trace='.*pmu.read\|e=cpu_cycles,l1d_cache'
> +
> +Event names are available under ``/sys/bus/event_source/devices/PMU/events``
> +directory, where ``PMU`` is a placeholder for either a ``cpu`` or a directory
> +containing ``cpus``.
> +
> +In contrary to other tracepoints this does not need any extra variables
> +added to source files. Instead, caller passes index which follows the order of
> +events specified via ``--trace`` parameter. In the following example index ``0``
> +corresponds to ``cpu_cyclces`` while index ``1`` corresponds to ``l1d_cache``.
> +
> +.. code-block:: c
> +
> + ...
> + rte_eal_trace_pmu_read(0);
> + rte_eal_trace_pmu_read(1);
> + ...
> +
> +PMU tracing support must be explicitly enabled using the ``enable_trace_fp``
> +option for meson build.
> +
>  Event record mode
>  -----------------
> 
> diff --git a/lib/eal/common/eal_common_trace.c b/lib/eal/common/eal_common_trace.c
> index 918f49bf4f..9be8724ec4 100644
> --- a/lib/eal/common/eal_common_trace.c
> +++ b/lib/eal/common/eal_common_trace.c
> @@ -72,8 +72,10 @@ eal_trace_init(void)
>  		goto free_meta;
> 
>  	/* Apply global configurations */
> -	STAILQ_FOREACH(arg, &trace.args, next)
> +	STAILQ_FOREACH(arg, &trace.args, next) {
>  		trace_args_apply(arg->val);
> +		trace_pmu_args_apply(arg->val);
> +	}
> 
>  	rte_trace_mode_set(trace.mode);
> 
> @@ -89,6 +91,7 @@ eal_trace_init(void)
>  void
>  eal_trace_fini(void)
>  {
> +	trace_pmu_args_free();
>  	trace_mem_free();
>  	trace_metadata_destroy();
>  	eal_trace_args_free();
> diff --git a/lib/eal/common/eal_common_trace_pmu.c b/lib/eal/common/eal_common_trace_pmu.c
> new file mode 100644
> index 0000000000..b3ab41e8a2
> --- /dev/null
> +++ b/lib/eal/common/eal_common_trace_pmu.c
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2024 Marvell International Ltd.
> + */
> +
> +#include <rte_common.h>
> +
> +#include "eal_trace.h"
> +
> +#ifdef RTE_LIB_PMU
> +
> +#include <rte_pmu.h>
> +
> +void
> +trace_pmu_args_apply(const char *arg)
> +{
> +	static bool once;
> +
> +	if (!once) {
> +		if (rte_pmu_init())
> +			return;
> +		once = true;
> +	}
> +
> +	rte_pmu_add_events_by_pattern(arg);
> +}
> +
> +void
> +trace_pmu_args_free(void)
> +{
> +	rte_pmu_fini();
> +}
> +
> +#else /* !RTE_LIB_PMU */
> +
> +void trace_pmu_args_apply(const char *arg __rte_unused) { return; }
> +void trace_pmu_args_free(void) { return; }
> +
> +#endif /* RTE_LIB_PMU */
> diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c
> index 0f1240ea3a..c99eab92f4 100644
> --- a/lib/eal/common/eal_common_trace_points.c
> +++ b/lib/eal/common/eal_common_trace_points.c
> @@ -100,3 +100,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable,
>  	lib.eal.intr.enable)
>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable,
>  	lib.eal.intr.disable)
> +
> +#ifdef RTE_LIB_PMU
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read,
> +	lib.eal.pmu.read)
> +#endif
> diff --git a/lib/eal/common/eal_trace.h b/lib/eal/common/eal_trace.h
> index 55262677e0..58fa43472a 100644
> --- a/lib/eal/common/eal_trace.h
> +++ b/lib/eal/common/eal_trace.h
> @@ -104,6 +104,10 @@ int trace_epoch_time_save(void);
>  void trace_mem_free(void);
>  void trace_mem_per_thread_free(void);
> 
> +/* PMU wrappers */
> +void trace_pmu_args_apply(const char *arg);
> +void trace_pmu_args_free(void);
> +
>  /* EAL interface */
>  int eal_trace_init(void);
>  void eal_trace_fini(void);
> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
> index c1bbf26654..59d5b15708 100644
> --- a/lib/eal/common/meson.build
> +++ b/lib/eal/common/meson.build
> @@ -27,6 +27,7 @@ sources += files(
>          'eal_common_tailqs.c',
>          'eal_common_thread.c',
>          'eal_common_timer.c',
> +        'eal_common_trace_pmu.c',
>          'eal_common_trace_points.c',
>          'eal_common_uuid.c',
>          'malloc_elem.c',
> diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h
> index 9ad2112801..9c78f63ff5 100644
> --- a/lib/eal/include/rte_eal_trace.h
> +++ b/lib/eal/include/rte_eal_trace.h
> @@ -127,6 +127,17 @@ RTE_TRACE_POINT(
> 
>  #define RTE_EAL_TRACE_GENERIC_FUNC rte_eal_trace_generic_func(__func__)
> 
> +#ifdef RTE_LIB_PMU
> +#include <rte_pmu.h>
> +
> +RTE_TRACE_POINT_FP(
> +	rte_eal_trace_pmu_read,
> +	RTE_TRACE_POINT_ARGS(unsigned int index),
> +	uint64_t val = rte_pmu_read(index);
> +	rte_trace_point_emit_u64(val);
> +)
> +#endif
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/eal/version.map b/lib/eal/version.map
> index f493cd1ca7..3dc147f848 100644
> --- a/lib/eal/version.map
> +++ b/lib/eal/version.map
> @@ -399,6 +399,7 @@ EXPERIMENTAL {
> 
>  	# added in 24.11
>  	rte_bitset_to_str;
> +	__rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT
>  };
> 
>  INTERNAL {
> diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c
> index dd57961627..b65f08c75b 100644
> --- a/lib/pmu/rte_pmu.c
> +++ b/lib/pmu/rte_pmu.c
> @@ -412,12 +412,75 @@ rte_pmu_add_event(const char *name)
>  	return event->index;
>  }
> 
> +static int
> +add_events(const char *pattern)
> +{
> +	char *token, *copy;
> +	int ret = 0;
> +
> +	copy = strdup(pattern);
> +	if (copy == NULL)
> +		return -ENOMEM;
> +
> +	token = strtok(copy, ",");
> +	while (token) {
> +		ret = rte_pmu_add_event(token);
> +		if (ret < 0)
> +			break;
> +
> +		token = strtok(NULL, ",");
> +	}
> +
> +	free(copy);
> +
> +	return ret >= 0 ? 0 : ret;
> +}
> +
> +int
> +rte_pmu_add_events_by_pattern(const char *pattern)
> +{
> +	regmatch_t rmatch;
> +	char buf[BUFSIZ];
> +	unsigned int num;
> +	regex_t reg;
> +	int ret;
> +
> +	/* events are matched against occurrences of e=ev1[,ev2,..] pattern */
> +	ret = regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED);
> +	if (ret) {
> +		PMU_LOG(ERR, "Failed to compile event matching regexp");
> +		return -EINVAL;
> +	}
> +
> +	for (;;) {
> +		if (regexec(&reg, pattern, 1, &rmatch, 0))
> +			break;
> +
> +		num = rmatch.rm_eo - rmatch.rm_so;
> +		if (num > sizeof(buf))
> +			num = sizeof(buf);
> +
> +		/* skip e= pattern prefix */
> +		memcpy(buf, pattern + rmatch.rm_so + 2, num - 2);
> +		buf[num - 2] = '\0';
> +		ret = add_events(buf);
> +		if (ret)
> +			break;
> +
> +		pattern += rmatch.rm_eo;
> +	}
> +
> +	regfree(&reg);
> +
> +	return ret;
> +}
> +
>  int
>  rte_pmu_init(void)
>  {
>  	int ret;
> 
> -	if (rte_pmu.initialized)
> +	if (rte_pmu.initialized && ++rte_pmu.initialized)

Stupid q, why not:
if (rte_pmu.initialized++ > 0) return 0?

Also increment/decrement counter implies that init()/fini()
can be called multiple times, correct?
If so, there is still an implicit assumption that it always will
happen in a sequential manner? 

>  		return 0;
> 
>  	ret = scan_pmus();
> @@ -450,7 +513,7 @@ rte_pmu_fini(void)
>  	struct rte_pmu_event_group *group;
>  	unsigned int i;
> 
> -	if (!rte_pmu.initialized)
> +	if (!rte_pmu.initialized || --rte_pmu.initialized)
>  		return;
> 
>  	RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event) {
> diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h
> index 85f9127911..df571f0a2f 100644
> --- a/lib/pmu/rte_pmu.h
> +++ b/lib/pmu/rte_pmu.h
> @@ -135,7 +135,7 @@ __rte_pmu_enable_group(struct rte_pmu_event_group *group);
>   * @warning
>   * @b EXPERIMENTAL: this API may change without prior notice
>   *
> - * Initialize PMU library.
> + * Initialize PMU library. It's safe to call it multiple times.
>   *
>   * @return
>   *   0 in case of success, negative value otherwise.
> @@ -148,7 +148,9 @@ rte_pmu_init(void);
>   * @warning
>   * @b EXPERIMENTAL: this API may change without prior notice
>   *
> - * Finalize PMU library.
> + * Finalize PMU library. Number of calls must match number
> + * of times rte_pmu_init() was called. Otherwise memory
> + * won't be freed properly.
>   */
>  __rte_experimental
>  void
> @@ -173,7 +175,23 @@ rte_pmu_add_event(const char *name);
>   * @warning
>   * @b EXPERIMENTAL: this API may change without prior notice
>   *
> - * Read hardware counter configured to count occurrences of an event.
> + * Add events matching pattern to the group of enabled events.
> + *
> + * @param pattern
> + *   Pattern e=ev1[,ev2,...] matching events, where evX is a placeholder for an event listed under
> + *   /sys/bus/event_source/devices/<pmu>/events.
> + */
> +__rte_experimental
> +int
> +rte_pmu_add_events_by_pattern(const char *pattern);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Read hardware counter configured to count occurrences of an event. This is called by an lcore
> + * binded exclusively to particular cpu and may not work as expected if gets migrated elsewhere.
> + * Reason being event group is pinned hence not supposed to be multiplexed with any other events.
>   *
>   * @param index
>   *   Index of an event to be read.
> diff --git a/lib/pmu/version.map b/lib/pmu/version.map
> index d0f907d13d..f14d498b54 100644
> --- a/lib/pmu/version.map
> +++ b/lib/pmu/version.map
> @@ -5,6 +5,7 @@ EXPERIMENTAL {
>  	__rte_pmu_enable_group;
>  	rte_pmu;
>  	rte_pmu_add_event;
> +	rte_pmu_add_events_by_pattern;
>  	rte_pmu_fini;
>  	rte_pmu_init;
>  	rte_pmu_read;
> --
> 2.34.1
  
Tomasz Duszynski Nov. 8, 2024, 10:36 a.m. UTC | #5
>> From: Tomasz Duszynski [mailto:tduszynski@marvell.com]
>> Sent: Monday, 28 October 2024 11.32
>>
>> >On Fri, Oct 25, 2024 at 2:25 PM Tomasz Duszynski
>> <tduszynski@marvell.com> wrote:
>> >>
>> >> In order to profile app one needs to store significant amount of
>> >> samples somewhere for an analysis later on. Since trace library
>> >> supports storing data in a CTF format lets take advantage of that
>> and
>> >> add a dedicated PMU tracepoint.
>> >>
>> >> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
>> >
>> >
>> >--------------START-------------------------
>> >> diff --git a/lib/eal/common/eal_common_trace_points.c
>> >> b/lib/eal/common/eal_common_trace_points.c
>> >> index 0f1240ea3a..c99eab92f4 100644
>> >> --- a/lib/eal/common/eal_common_trace_points.c
>> >> +++ b/lib/eal/common/eal_common_trace_points.c
>> >> @@ -100,3 +100,8 @@
>> RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable,
>> >>         lib.eal.intr.enable)
>> >>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable,
>> >>         lib.eal.intr.disable)
>> >> +
>> >> +#ifdef RTE_LIB_PMU
>> >> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read,
>> >> +       lib.eal.pmu.read)
>> >> +#endif
>> >>
>> >>  #define RTE_EAL_TRACE_GENERIC_FUNC
>> >> rte_eal_trace_generic_func(__func__)
>> >>
>> >> +#ifdef RTE_LIB_PMU
>> >> +#include <rte_pmu.h>
>> >> +
>> >> +RTE_TRACE_POINT_FP(
>> >> +       rte_eal_trace_pmu_read,
>> >> +       RTE_TRACE_POINT_ARGS(unsigned int index),
>> >> +       uint64_t val = rte_pmu_read(index);
>> >> +       rte_trace_point_emit_u64(val);
>> >> +)
>> >> +#endif
>> >> +
>> >--------------END-------------------------
>> >
>> >All of the above changes can go to lib/pmu. Right? Like ethdev is
>> adding its trace points in ethdev
>> >library?
>> >and make trace point name as rte_pmu_trace_read
>>
>> That's because libpmu is higher than eal in hierarchy of libraries so
>> it won't see some symbols, for example __rte_trace_point_register().
>> So first tracing itself needs to be moved out to a separate library
>> and then approach other libs take would become natural for libpmu too.
>
>OK; I suppose the required change of tracing library is not going to happen right away, so Ack to
>keeping it here (for now).
>
>Please rename trace point to rte_pmu_trace_read / lib.pmu.read as requested by Jerin. Although it
>is temporarily in EAL (due to the above trace library issue), it really is PMU library, not EAL
>library.
>

Sure, no problem.
  
Tomasz Duszynski Nov. 8, 2024, 11:44 a.m. UTC | #6
>> In order to profile app one needs to store significant amount of
>> samples somewhere for an analysis later on. Since trace library
>> supports storing data in a CTF format lets take advantage of that and
>> add a dedicated PMU tracepoint.
>>
>> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
>> ---
>>  app/test/test_trace_perf.c               | 10 ++++
>>  doc/guides/prog_guide/profile_app.rst    |  5 ++
>>  doc/guides/prog_guide/trace_lib.rst      | 32 +++++++++++
>>  lib/eal/common/eal_common_trace.c        |  5 +-
>>  lib/eal/common/eal_common_trace_pmu.c    | 38 ++++++++++++++
>>  lib/eal/common/eal_common_trace_points.c |  5 ++
>>  lib/eal/common/eal_trace.h               |  4 ++
>>  lib/eal/common/meson.build               |  1 +
>>  lib/eal/include/rte_eal_trace.h          | 11 ++++
>>  lib/eal/version.map                      |  1 +
>>  lib/pmu/rte_pmu.c                        | 67 +++++++++++++++++++++++-
>>  lib/pmu/rte_pmu.h                        | 24 +++++++--
>>  lib/pmu/version.map                      |  1 +
>>  13 files changed, 198 insertions(+), 6 deletions(-)  create mode
>> 100644 lib/eal/common/eal_common_trace_pmu.c
>>
>> diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c
>> index 8257cc02be..8a0730943e 100644
>> --- a/app/test/test_trace_perf.c
>> +++ b/app/test/test_trace_perf.c
>> @@ -114,6 +114,10 @@ worker_fn_##func(void *arg) \  #define
>> GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666)
>>  #define GENERIC_STR rte_eal_trace_generic_str("hello world")  #define
>> VOID_FP app_dpdk_test_fp()
>> +#ifdef RTE_LIB_PMU
>> +/* 0 corresponds first event passed via --trace= */ #define READ_PMU
>> +rte_eal_trace_pmu_read(0) #endif
>>
>>  WORKER_DEFINE(GENERIC_VOID)
>>  WORKER_DEFINE(GENERIC_U64)
>> @@ -122,6 +126,9 @@ WORKER_DEFINE(GENERIC_FLOAT)
>>  WORKER_DEFINE(GENERIC_DOUBLE)
>>  WORKER_DEFINE(GENERIC_STR)
>>  WORKER_DEFINE(VOID_FP)
>> +#ifdef RTE_LIB_PMU
>> +WORKER_DEFINE(READ_PMU)
>> +#endif
>>
>>  static void
>>  run_test(const char *str, lcore_function_t f, struct test_data *data,
>> size_t sz) @@ -174,6 +181,9 @@ test_trace_perf(void)
>>  	run_test("double", worker_fn_GENERIC_DOUBLE, data, sz);
>>  	run_test("string", worker_fn_GENERIC_STR, data, sz);
>>  	run_test("void_fp", worker_fn_VOID_FP, data, sz);
>> +#ifdef RTE_LIB_PMU
>> +	run_test("read_pmu", worker_fn_READ_PMU, data, sz); #endif
>>
>>  	rte_free(data);
>>  	return TEST_SUCCESS;
>> diff --git a/doc/guides/prog_guide/profile_app.rst
>> b/doc/guides/prog_guide/profile_app.rst
>> index 854c515a61..1ab6fb9eaa 100644
>> --- a/doc/guides/prog_guide/profile_app.rst
>> +++ b/doc/guides/prog_guide/profile_app.rst
>> @@ -36,6 +36,11 @@ As of now implementation imposes certain limitations:
>>
>>  * Each EAL lcore measures same group of events
>>
>> +Alternatively tracing library can be used which offers dedicated
>> +tracepoint ``rte_eal_trace_pmu_event()``.
>> +
>> +Refer to :doc:`../prog_guide/trace_lib` for more details.
>> +
>>
>>  Profiling on x86
>>  ----------------
>> diff --git a/doc/guides/prog_guide/trace_lib.rst
>> b/doc/guides/prog_guide/trace_lib.rst
>> index d9b17abe90..378abccd72 100644
>> --- a/doc/guides/prog_guide/trace_lib.rst
>> +++ b/doc/guides/prog_guide/trace_lib.rst
>> @@ -46,6 +46,7 @@ DPDK tracing library features
>>    trace format and is compatible with ``LTTng``.
>>    For detailed information, refer to
>>    `Common Trace Format <https://urldefense.proofpoint.com/v2/url?u=https-
>3A__diamon.org_ctf_&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=PZNXgrbjdlXxVEEGYkxIxRndyEUwWU_ad5ce22YI6Is
>&m=sUalf-TsyEi-
>hdbFm4eACDHmnQ9BrqmS_1Df11kPLuDK3_xxQZoVjxp2ZFNsHPnb&s=JntIuAUv6IrRBlS92bo8iPsQ66IVHHPouJ4S82GCNyI&
>e=>`_.
>> +- Support reading PMU events on ARM64 and x86-64 (Intel)
>>
>>  How to add a tracepoint?
>>  ------------------------
>> @@ -139,6 +140,37 @@ the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``.
>>  ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be
>> enabled using  the ``enable_trace_fp`` option for meson build.
>>
>> +PMU tracepoint
>> +--------------
>> +
>> +Performance monitoring unit (PMU) event values can be read from
>> +hardware registers using predefined ``rte_pmu_read`` tracepoint.
>> +
>> +Tracing is enabled via ``--trace`` EAL option by passing both
>> +expression matching PMU tracepoint name i.e ``lib.eal.pmu.read`` and
>> +expression ``e=ev1[,ev2,...]`` matching particular events::
>> +
>> +    --trace='.*pmu.read\|e=cpu_cycles,l1d_cache'
>> +
>> +Event names are available under
>> +``/sys/bus/event_source/devices/PMU/events``
>> +directory, where ``PMU`` is a placeholder for either a ``cpu`` or a
>> +directory containing ``cpus``.
>> +
>> +In contrary to other tracepoints this does not need any extra
>> +variables added to source files. Instead, caller passes index which
>> +follows the order of events specified via ``--trace`` parameter. In
>> +the following example index ``0`` corresponds to ``cpu_cyclces`` while index ``1`` corresponds
>to ``l1d_cache``.
>> +
>> +.. code-block:: c
>> +
>> + ...
>> + rte_eal_trace_pmu_read(0);
>> + rte_eal_trace_pmu_read(1);
>> + ...
>> +
>> +PMU tracing support must be explicitly enabled using the
>> +``enable_trace_fp`` option for meson build.
>> +
>>  Event record mode
>>  -----------------
>>
>> diff --git a/lib/eal/common/eal_common_trace.c
>> b/lib/eal/common/eal_common_trace.c
>> index 918f49bf4f..9be8724ec4 100644
>> --- a/lib/eal/common/eal_common_trace.c
>> +++ b/lib/eal/common/eal_common_trace.c
>> @@ -72,8 +72,10 @@ eal_trace_init(void)
>>  		goto free_meta;
>>
>>  	/* Apply global configurations */
>> -	STAILQ_FOREACH(arg, &trace.args, next)
>> +	STAILQ_FOREACH(arg, &trace.args, next) {
>>  		trace_args_apply(arg->val);
>> +		trace_pmu_args_apply(arg->val);
>> +	}
>>
>>  	rte_trace_mode_set(trace.mode);
>>
>> @@ -89,6 +91,7 @@ eal_trace_init(void)  void
>>  eal_trace_fini(void)
>>  {
>> +	trace_pmu_args_free();
>>  	trace_mem_free();
>>  	trace_metadata_destroy();
>>  	eal_trace_args_free();
>> diff --git a/lib/eal/common/eal_common_trace_pmu.c
>> b/lib/eal/common/eal_common_trace_pmu.c
>> new file mode 100644
>> index 0000000000..b3ab41e8a2
>> --- /dev/null
>> +++ b/lib/eal/common/eal_common_trace_pmu.c
>> @@ -0,0 +1,38 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(C) 2024 Marvell International Ltd.
>> + */
>> +
>> +#include <rte_common.h>
>> +
>> +#include "eal_trace.h"
>> +
>> +#ifdef RTE_LIB_PMU
>> +
>> +#include <rte_pmu.h>
>> +
>> +void
>> +trace_pmu_args_apply(const char *arg) {
>> +	static bool once;
>> +
>> +	if (!once) {
>> +		if (rte_pmu_init())
>> +			return;
>> +		once = true;
>> +	}
>> +
>> +	rte_pmu_add_events_by_pattern(arg);
>> +}
>> +
>> +void
>> +trace_pmu_args_free(void)
>> +{
>> +	rte_pmu_fini();
>> +}
>> +
>> +#else /* !RTE_LIB_PMU */
>> +
>> +void trace_pmu_args_apply(const char *arg __rte_unused) { return; }
>> +void trace_pmu_args_free(void) { return; }
>> +
>> +#endif /* RTE_LIB_PMU */
>> diff --git a/lib/eal/common/eal_common_trace_points.c
>> b/lib/eal/common/eal_common_trace_points.c
>> index 0f1240ea3a..c99eab92f4 100644
>> --- a/lib/eal/common/eal_common_trace_points.c
>> +++ b/lib/eal/common/eal_common_trace_points.c
>> @@ -100,3 +100,8 @@ RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable,
>>  	lib.eal.intr.enable)
>>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable,
>>  	lib.eal.intr.disable)
>> +
>> +#ifdef RTE_LIB_PMU
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read,
>> +	lib.eal.pmu.read)
>> +#endif
>> diff --git a/lib/eal/common/eal_trace.h b/lib/eal/common/eal_trace.h
>> index 55262677e0..58fa43472a 100644
>> --- a/lib/eal/common/eal_trace.h
>> +++ b/lib/eal/common/eal_trace.h
>> @@ -104,6 +104,10 @@ int trace_epoch_time_save(void);  void
>> trace_mem_free(void);  void trace_mem_per_thread_free(void);
>>
>> +/* PMU wrappers */
>> +void trace_pmu_args_apply(const char *arg); void
>> +trace_pmu_args_free(void);
>> +
>>  /* EAL interface */
>>  int eal_trace_init(void);
>>  void eal_trace_fini(void);
>> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
>> index c1bbf26654..59d5b15708 100644
>> --- a/lib/eal/common/meson.build
>> +++ b/lib/eal/common/meson.build
>> @@ -27,6 +27,7 @@ sources += files(
>>          'eal_common_tailqs.c',
>>          'eal_common_thread.c',
>>          'eal_common_timer.c',
>> +        'eal_common_trace_pmu.c',
>>          'eal_common_trace_points.c',
>>          'eal_common_uuid.c',
>>          'malloc_elem.c',
>> diff --git a/lib/eal/include/rte_eal_trace.h
>> b/lib/eal/include/rte_eal_trace.h index 9ad2112801..9c78f63ff5 100644
>> --- a/lib/eal/include/rte_eal_trace.h
>> +++ b/lib/eal/include/rte_eal_trace.h
>> @@ -127,6 +127,17 @@ RTE_TRACE_POINT(
>>
>>  #define RTE_EAL_TRACE_GENERIC_FUNC
>> rte_eal_trace_generic_func(__func__)
>>
>> +#ifdef RTE_LIB_PMU
>> +#include <rte_pmu.h>
>> +
>> +RTE_TRACE_POINT_FP(
>> +	rte_eal_trace_pmu_read,
>> +	RTE_TRACE_POINT_ARGS(unsigned int index),
>> +	uint64_t val = rte_pmu_read(index);
>> +	rte_trace_point_emit_u64(val);
>> +)
>> +#endif
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> diff --git a/lib/eal/version.map b/lib/eal/version.map index
>> f493cd1ca7..3dc147f848 100644
>> --- a/lib/eal/version.map
>> +++ b/lib/eal/version.map
>> @@ -399,6 +399,7 @@ EXPERIMENTAL {
>>
>>  	# added in 24.11
>>  	rte_bitset_to_str;
>> +	__rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT
>>  };
>>
>>  INTERNAL {
>> diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c index
>> dd57961627..b65f08c75b 100644
>> --- a/lib/pmu/rte_pmu.c
>> +++ b/lib/pmu/rte_pmu.c
>> @@ -412,12 +412,75 @@ rte_pmu_add_event(const char *name)
>>  	return event->index;
>>  }
>>
>> +static int
>> +add_events(const char *pattern)
>> +{
>> +	char *token, *copy;
>> +	int ret = 0;
>> +
>> +	copy = strdup(pattern);
>> +	if (copy == NULL)
>> +		return -ENOMEM;
>> +
>> +	token = strtok(copy, ",");
>> +	while (token) {
>> +		ret = rte_pmu_add_event(token);
>> +		if (ret < 0)
>> +			break;
>> +
>> +		token = strtok(NULL, ",");
>> +	}
>> +
>> +	free(copy);
>> +
>> +	return ret >= 0 ? 0 : ret;
>> +}
>> +
>> +int
>> +rte_pmu_add_events_by_pattern(const char *pattern) {
>> +	regmatch_t rmatch;
>> +	char buf[BUFSIZ];
>> +	unsigned int num;
>> +	regex_t reg;
>> +	int ret;
>> +
>> +	/* events are matched against occurrences of e=ev1[,ev2,..] pattern */
>> +	ret = regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED);
>> +	if (ret) {
>> +		PMU_LOG(ERR, "Failed to compile event matching regexp");
>> +		return -EINVAL;
>> +	}
>> +
>> +	for (;;) {
>> +		if (regexec(&reg, pattern, 1, &rmatch, 0))
>> +			break;
>> +
>> +		num = rmatch.rm_eo - rmatch.rm_so;
>> +		if (num > sizeof(buf))
>> +			num = sizeof(buf);
>> +
>> +		/* skip e= pattern prefix */
>> +		memcpy(buf, pattern + rmatch.rm_so + 2, num - 2);
>> +		buf[num - 2] = '\0';
>> +		ret = add_events(buf);
>> +		if (ret)
>> +			break;
>> +
>> +		pattern += rmatch.rm_eo;
>> +	}
>> +
>> +	regfree(&reg);
>> +
>> +	return ret;
>> +}
>> +
>>  int
>>  rte_pmu_init(void)
>>  {
>>  	int ret;
>>
>> -	if (rte_pmu.initialized)
>> +	if (rte_pmu.initialized && ++rte_pmu.initialized)
>
>Stupid q, why not:
>if (rte_pmu.initialized++ > 0) return 0?
>

If you call rte_pmu_init() and it fails, then rte_pmu.initialized stays at 1. It means that libpmu is initialized
but obviously that's not the case. On the other hand 1 means that there's one user, since this variable serves 
both as a flag and reference counter, which is true in fact. So only half of the story is true but we want the 
whole story to be true. 

Besides, if first call fails then second one will return early hence caller won’t be able to initialize libpmu but this time due to a different reason. 

And last thing, that expression resembles one used in rte_pmu_fini(). 

>Also increment/decrement counter implies that init()/fini() can be called multiple times, correct?
>If so, there is still an implicit assumption that it always will happen in a sequential manner?
>

Yes, that should happen in sequential manner. Frankly, that counter should stay at 1 for the most of the time.
The only exception is when you configure DPDK with -Denable_trace_fp=true and run functional test (pmu_autotest). 
Then that counter jumps to 2 during test and gets pulled down to 1 at the end. 

>>  		return 0;
>>
>>  	ret = scan_pmus();
>> @@ -450,7 +513,7 @@ rte_pmu_fini(void)
>>  	struct rte_pmu_event_group *group;
>>  	unsigned int i;
>>
>> -	if (!rte_pmu.initialized)
>> +	if (!rte_pmu.initialized || --rte_pmu.initialized)
>>  		return;
>>
>>  	RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event)
>> { diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h index
>> 85f9127911..df571f0a2f 100644
>> --- a/lib/pmu/rte_pmu.h
>> +++ b/lib/pmu/rte_pmu.h
>> @@ -135,7 +135,7 @@ __rte_pmu_enable_group(struct rte_pmu_event_group *group);
>>   * @warning
>>   * @b EXPERIMENTAL: this API may change without prior notice
>>   *
>> - * Initialize PMU library.
>> + * Initialize PMU library. It's safe to call it multiple times.
>>   *
>>   * @return
>>   *   0 in case of success, negative value otherwise.
>> @@ -148,7 +148,9 @@ rte_pmu_init(void);
>>   * @warning
>>   * @b EXPERIMENTAL: this API may change without prior notice
>>   *
>> - * Finalize PMU library.
>> + * Finalize PMU library. Number of calls must match number
>> + * of times rte_pmu_init() was called. Otherwise memory
>> + * won't be freed properly.
>>   */
>>  __rte_experimental
>>  void
>> @@ -173,7 +175,23 @@ rte_pmu_add_event(const char *name);
>>   * @warning
>>   * @b EXPERIMENTAL: this API may change without prior notice
>>   *
>> - * Read hardware counter configured to count occurrences of an event.
>> + * Add events matching pattern to the group of enabled events.
>> + *
>> + * @param pattern
>> + *   Pattern e=ev1[,ev2,...] matching events, where evX is a placeholder for an event listed
>under
>> + *   /sys/bus/event_source/devices/<pmu>/events.
>> + */
>> +__rte_experimental
>> +int
>> +rte_pmu_add_events_by_pattern(const char *pattern);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice
>> + *
>> + * Read hardware counter configured to count occurrences of an event.
>> +This is called by an lcore
>> + * binded exclusively to particular cpu and may not work as expected if gets migrated elsewhere.
>> + * Reason being event group is pinned hence not supposed to be multiplexed with any other
>events.
>>   *
>>   * @param index
>>   *   Index of an event to be read.
>> diff --git a/lib/pmu/version.map b/lib/pmu/version.map index
>> d0f907d13d..f14d498b54 100644
>> --- a/lib/pmu/version.map
>> +++ b/lib/pmu/version.map
>> @@ -5,6 +5,7 @@ EXPERIMENTAL {
>>  	__rte_pmu_enable_group;
>>  	rte_pmu;
>>  	rte_pmu_add_event;
>> +	rte_pmu_add_events_by_pattern;
>>  	rte_pmu_fini;
>>  	rte_pmu_init;
>>  	rte_pmu_read;
>> --
>> 2.34.1
  
Stephen Hemminger Nov. 12, 2024, 11:09 p.m. UTC | #7
On Fri, 25 Oct 2024 10:54:14 +0200
Tomasz Duszynski <tduszynski@marvell.com> wrote:

> In order to profile app one needs to store significant amount of samples
> somewhere for an analysis later on. Since trace library supports
> storing data in a CTF format lets take advantage of that and add a
> dedicated PMU tracepoint.
> 
> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
> ---
>  app/test/test_trace_perf.c               | 10 ++++
>  doc/guides/prog_guide/profile_app.rst    |  5 ++
>  doc/guides/prog_guide/trace_lib.rst      | 32 +++++++++++
>  lib/eal/common/eal_common_trace.c        |  5 +-
>  lib/eal/common/eal_common_trace_pmu.c    | 38 ++++++++++++++
>  lib/eal/common/eal_common_trace_points.c |  5 ++
>  lib/eal/common/eal_trace.h               |  4 ++
>  lib/eal/common/meson.build               |  1 +
>  lib/eal/include/rte_eal_trace.h          | 11 ++++
>  lib/eal/version.map                      |  1 +
>  lib/pmu/rte_pmu.c                        | 67 +++++++++++++++++++++++-
>  lib/pmu/rte_pmu.h                        | 24 +++++++--
>  lib/pmu/version.map                      |  1 +
>  13 files changed, 198 insertions(+), 6 deletions(-)
>  create mode 100644 lib/eal/common/eal_common_trace_pmu.c


There is an issue with calling a rte_experimental function.

-------------------------------BEGIN LOGS----------------------------
####################################################################################
#### [Begin job log] "ubuntu-22.04-gcc-debug+doc+examples+tests" at step Build and test
####################################################################################
[3384/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o
FAILED: buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o 
ccache gcc -Ibuildtools/chkincs/chkincs.p -Ibuildtools/chkincs -I../buildtools/chkincs -Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/pmu -I../lib/pmu -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats -Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev -I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder -Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost -I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -I../lib/pipeline -Ilib/graph -I../lib/graph -Ilib/node -I../lib/node -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned -Wno-missing-field-initializers -D_GNU_SOURCE -march=corei7 -mrtm -MD -MQ buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -MF buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o.d -o buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -c buildtools/chkincs/chkincs.p/rte_pmu.c
In file included from buildtools/chkincs/chkincs.p/rte_pmu.c:1:
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h: In function ‘rte_pmu_read’:
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:214:17: error: ‘__rte_pmu_enable_group’ is deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
  214 |                 ret = __rte_pmu_enable_group(group);
      |                 ^~~
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:132:1: note: declared here
  132 | __rte_pmu_enable_group(struct rte_pmu_event_group *group);
      | ^~~~~~~~~~~~~~~~~~~~~~
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:222:9: error: ‘__rte_pmu_read_userpage’ is deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
  222 |         return __rte_pmu_read_userpage(group->mmap_pages[index]);
      |         ^~~~~~
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:86:1: note: declared here
   86 | __rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
      | ^~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
[3385/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_byteorder.c.o
[3386/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_atomic.c.o
[3387/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_rtm.c.o
[3388/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_memcpy.c.o
[3389/6468] Compiling C object app/dpdk-test.p/test_test_memcpy_perf.c.o
ninja: build stopped: subcommand failed.
##[error]Process completed with exit code 1.
  
Tomasz Duszynski Nov. 15, 2024, 10:24 a.m. UTC | #8
>-----Original Message-----
>From: Stephen Hemminger <stephen@networkplumber.org>
>Sent: Wednesday, November 13, 2024 12:10 AM
>To: Tomasz Duszynski <tduszynski@marvell.com>
>Cc: Jerin Jacob <jerinj@marvell.com>; Sunil Kumar Kori <skori@marvell.com>; Tyler Retzlaff
><roretzla@linux.microsoft.com>; Ruifeng.Wang@arm.com; bruce.richardson@intel.com;
>david.marchand@redhat.com; dev@dpdk.org; konstantin.v.ananyev@yandex.ru;
>mattias.ronnblom@ericsson.com; mb@smartsharesystems.com; thomas@monjalon.net; zhoumin@loongson.cn
>Subject: [EXTERNAL] Re: [PATCH v15 4/4] eal: add PMU support to tracing library
>
>On Fri, 25 Oct 2024 10: 54: 14 +0200 Tomasz Duszynski <tduszynski@ marvell. com> wrote: > In order
>to profile app one needs to store significant amount of samples > somewhere for an analysis later
>on. Since trace library supports > 
>On Fri, 25 Oct 2024 10:54:14 +0200
>Tomasz Duszynski <tduszynski@marvell.com> wrote:
>
>> In order to profile app one needs to store significant amount of
>> samples somewhere for an analysis later on. Since trace library
>> supports storing data in a CTF format lets take advantage of that and
>> add a dedicated PMU tracepoint.
>>
>> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
>> ---
>>  app/test/test_trace_perf.c               | 10 ++++
>>  doc/guides/prog_guide/profile_app.rst    |  5 ++
>>  doc/guides/prog_guide/trace_lib.rst      | 32 +++++++++++
>>  lib/eal/common/eal_common_trace.c        |  5 +-
>>  lib/eal/common/eal_common_trace_pmu.c    | 38 ++++++++++++++
>>  lib/eal/common/eal_common_trace_points.c |  5 ++
>>  lib/eal/common/eal_trace.h               |  4 ++
>>  lib/eal/common/meson.build               |  1 +
>>  lib/eal/include/rte_eal_trace.h          | 11 ++++
>>  lib/eal/version.map                      |  1 +
>>  lib/pmu/rte_pmu.c                        | 67 +++++++++++++++++++++++-
>>  lib/pmu/rte_pmu.h                        | 24 +++++++--
>>  lib/pmu/version.map                      |  1 +
>>  13 files changed, 198 insertions(+), 6 deletions(-)  create mode
>> 100644 lib/eal/common/eal_common_trace_pmu.c
>
>
>There is an issue with calling a rte_experimental function.
>
>-------------------------------BEGIN LOGS----------------------------
>####################################################################################
>#### [Begin job log] "ubuntu-22.04-gcc-debug+doc+examples+tests" at step Build and test
>####################################################################################
>[3384/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o
>FAILED: buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o
>ccache gcc -Ibuildtools/chkincs/chkincs.p -Ibuildtools/chkincs -I../buildtools/chkincs -
>Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -
>I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -
>I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -
>I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -
>Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/pmu -I../lib/pmu -
>Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -
>Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -
>I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -
>I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -
>Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -
>I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -
>Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -
>Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -
>I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -
>I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -
>I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats
>-Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -
>Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev
>-I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder
>-Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost
>-I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -
>Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -
>I../lib/pipeline -Ilib/graph -I../lib/graph -Ilib/node -I../lib/node -fdiagnostics-color=always -
>pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -g -include rte_config.h -
>Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -
>Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -
>Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned -
>Wno-missing-field-initializers -D_GNU_SOURCE -march=corei7 -mrtm -MD -MQ
>buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -MF buildtools/chkincs/chkincs.p/meson-
>generated_rte_pmu.c.o.d -o buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -c
>buildtools/chkincs/chkincs.p/rte_pmu.c
>In file included from buildtools/chkincs/chkincs.p/rte_pmu.c:1:
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h: In function ‘rte_pmu_read’:
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:214:17: error: ‘__rte_pmu_enable_group’ is
>deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
>  214 |                 ret = __rte_pmu_enable_group(group);
>      |                 ^~~
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:132:1: note: declared here
>  132 | __rte_pmu_enable_group(struct rte_pmu_event_group *group);
>      | ^~~~~~~~~~~~~~~~~~~~~~
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:222:9: error: ‘__rte_pmu_read_userpage’ is
>deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
>  222 |         return __rte_pmu_read_userpage(group->mmap_pages[index]);
>      |         ^~~~~~
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:86:1: note: declared here
>   86 | __rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
>      | ^~~~~~~~~~~~~~~~~~~~~~~
>cc1: all warnings being treated as errors [3385/6468] Compiling C object
>buildtools/chkincs/chkincs.p/meson-generated_rte_byteorder.c.o
>[3386/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_atomic.c.o
>[3387/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_rtm.c.o
>[3388/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_memcpy.c.o
>[3389/6468] Compiling C object app/dpdk-test.p/test_test_memcpy_perf.c.o
>ninja: build stopped: subcommand failed.
>##[error]Process completed with exit code 1.

Right, this indeed pops up with -Dcheck_includes=true. Will fix this in v16. 

Thanks.
  

Patch

diff --git a/app/test/test_trace_perf.c b/app/test/test_trace_perf.c
index 8257cc02be..8a0730943e 100644
--- a/app/test/test_trace_perf.c
+++ b/app/test/test_trace_perf.c
@@ -114,6 +114,10 @@  worker_fn_##func(void *arg) \
 #define GENERIC_DOUBLE rte_eal_trace_generic_double(3.66666)
 #define GENERIC_STR rte_eal_trace_generic_str("hello world")
 #define VOID_FP app_dpdk_test_fp()
+#ifdef RTE_LIB_PMU
+/* 0 corresponds first event passed via --trace= */
+#define READ_PMU rte_eal_trace_pmu_read(0)
+#endif
 
 WORKER_DEFINE(GENERIC_VOID)
 WORKER_DEFINE(GENERIC_U64)
@@ -122,6 +126,9 @@  WORKER_DEFINE(GENERIC_FLOAT)
 WORKER_DEFINE(GENERIC_DOUBLE)
 WORKER_DEFINE(GENERIC_STR)
 WORKER_DEFINE(VOID_FP)
+#ifdef RTE_LIB_PMU
+WORKER_DEFINE(READ_PMU)
+#endif
 
 static void
 run_test(const char *str, lcore_function_t f, struct test_data *data, size_t sz)
@@ -174,6 +181,9 @@  test_trace_perf(void)
 	run_test("double", worker_fn_GENERIC_DOUBLE, data, sz);
 	run_test("string", worker_fn_GENERIC_STR, data, sz);
 	run_test("void_fp", worker_fn_VOID_FP, data, sz);
+#ifdef RTE_LIB_PMU
+	run_test("read_pmu", worker_fn_READ_PMU, data, sz);
+#endif
 
 	rte_free(data);
 	return TEST_SUCCESS;
diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst
index 854c515a61..1ab6fb9eaa 100644
--- a/doc/guides/prog_guide/profile_app.rst
+++ b/doc/guides/prog_guide/profile_app.rst
@@ -36,6 +36,11 @@  As of now implementation imposes certain limitations:
 
 * Each EAL lcore measures same group of events
 
+Alternatively tracing library can be used which offers dedicated tracepoint
+``rte_eal_trace_pmu_event()``.
+
+Refer to :doc:`../prog_guide/trace_lib` for more details.
+
 
 Profiling on x86
 ----------------
diff --git a/doc/guides/prog_guide/trace_lib.rst b/doc/guides/prog_guide/trace_lib.rst
index d9b17abe90..378abccd72 100644
--- a/doc/guides/prog_guide/trace_lib.rst
+++ b/doc/guides/prog_guide/trace_lib.rst
@@ -46,6 +46,7 @@  DPDK tracing library features
   trace format and is compatible with ``LTTng``.
   For detailed information, refer to
   `Common Trace Format <https://diamon.org/ctf/>`_.
+- Support reading PMU events on ARM64 and x86-64 (Intel)
 
 How to add a tracepoint?
 ------------------------
@@ -139,6 +140,37 @@  the user must use ``RTE_TRACE_POINT_FP`` instead of ``RTE_TRACE_POINT``.
 ``RTE_TRACE_POINT_FP`` is compiled out by default and it can be enabled using
 the ``enable_trace_fp`` option for meson build.
 
+PMU tracepoint
+--------------
+
+Performance monitoring unit (PMU) event values can be read from hardware
+registers using predefined ``rte_pmu_read`` tracepoint.
+
+Tracing is enabled via ``--trace`` EAL option by passing both expression
+matching PMU tracepoint name i.e ``lib.eal.pmu.read`` and expression
+``e=ev1[,ev2,...]`` matching particular events::
+
+    --trace='.*pmu.read\|e=cpu_cycles,l1d_cache'
+
+Event names are available under ``/sys/bus/event_source/devices/PMU/events``
+directory, where ``PMU`` is a placeholder for either a ``cpu`` or a directory
+containing ``cpus``.
+
+In contrary to other tracepoints this does not need any extra variables
+added to source files. Instead, caller passes index which follows the order of
+events specified via ``--trace`` parameter. In the following example index ``0``
+corresponds to ``cpu_cyclces`` while index ``1`` corresponds to ``l1d_cache``.
+
+.. code-block:: c
+
+ ...
+ rte_eal_trace_pmu_read(0);
+ rte_eal_trace_pmu_read(1);
+ ...
+
+PMU tracing support must be explicitly enabled using the ``enable_trace_fp``
+option for meson build.
+
 Event record mode
 -----------------
 
diff --git a/lib/eal/common/eal_common_trace.c b/lib/eal/common/eal_common_trace.c
index 918f49bf4f..9be8724ec4 100644
--- a/lib/eal/common/eal_common_trace.c
+++ b/lib/eal/common/eal_common_trace.c
@@ -72,8 +72,10 @@  eal_trace_init(void)
 		goto free_meta;
 
 	/* Apply global configurations */
-	STAILQ_FOREACH(arg, &trace.args, next)
+	STAILQ_FOREACH(arg, &trace.args, next) {
 		trace_args_apply(arg->val);
+		trace_pmu_args_apply(arg->val);
+	}
 
 	rte_trace_mode_set(trace.mode);
 
@@ -89,6 +91,7 @@  eal_trace_init(void)
 void
 eal_trace_fini(void)
 {
+	trace_pmu_args_free();
 	trace_mem_free();
 	trace_metadata_destroy();
 	eal_trace_args_free();
diff --git a/lib/eal/common/eal_common_trace_pmu.c b/lib/eal/common/eal_common_trace_pmu.c
new file mode 100644
index 0000000000..b3ab41e8a2
--- /dev/null
+++ b/lib/eal/common/eal_common_trace_pmu.c
@@ -0,0 +1,38 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2024 Marvell International Ltd.
+ */
+
+#include <rte_common.h>
+
+#include "eal_trace.h"
+
+#ifdef RTE_LIB_PMU
+
+#include <rte_pmu.h>
+
+void
+trace_pmu_args_apply(const char *arg)
+{
+	static bool once;
+
+	if (!once) {
+		if (rte_pmu_init())
+			return;
+		once = true;
+	}
+
+	rte_pmu_add_events_by_pattern(arg);
+}
+
+void
+trace_pmu_args_free(void)
+{
+	rte_pmu_fini();
+}
+
+#else /* !RTE_LIB_PMU */
+
+void trace_pmu_args_apply(const char *arg __rte_unused) { return; }
+void trace_pmu_args_free(void) { return; }
+
+#endif /* RTE_LIB_PMU */
diff --git a/lib/eal/common/eal_common_trace_points.c b/lib/eal/common/eal_common_trace_points.c
index 0f1240ea3a..c99eab92f4 100644
--- a/lib/eal/common/eal_common_trace_points.c
+++ b/lib/eal/common/eal_common_trace_points.c
@@ -100,3 +100,8 @@  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_enable,
 	lib.eal.intr.enable)
 RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_disable,
 	lib.eal.intr.disable)
+
+#ifdef RTE_LIB_PMU
+RTE_TRACE_POINT_REGISTER(rte_eal_trace_pmu_read,
+	lib.eal.pmu.read)
+#endif
diff --git a/lib/eal/common/eal_trace.h b/lib/eal/common/eal_trace.h
index 55262677e0..58fa43472a 100644
--- a/lib/eal/common/eal_trace.h
+++ b/lib/eal/common/eal_trace.h
@@ -104,6 +104,10 @@  int trace_epoch_time_save(void);
 void trace_mem_free(void);
 void trace_mem_per_thread_free(void);
 
+/* PMU wrappers */
+void trace_pmu_args_apply(const char *arg);
+void trace_pmu_args_free(void);
+
 /* EAL interface */
 int eal_trace_init(void);
 void eal_trace_fini(void);
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index c1bbf26654..59d5b15708 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -27,6 +27,7 @@  sources += files(
         'eal_common_tailqs.c',
         'eal_common_thread.c',
         'eal_common_timer.c',
+        'eal_common_trace_pmu.c',
         'eal_common_trace_points.c',
         'eal_common_uuid.c',
         'malloc_elem.c',
diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h
index 9ad2112801..9c78f63ff5 100644
--- a/lib/eal/include/rte_eal_trace.h
+++ b/lib/eal/include/rte_eal_trace.h
@@ -127,6 +127,17 @@  RTE_TRACE_POINT(
 
 #define RTE_EAL_TRACE_GENERIC_FUNC rte_eal_trace_generic_func(__func__)
 
+#ifdef RTE_LIB_PMU
+#include <rte_pmu.h>
+
+RTE_TRACE_POINT_FP(
+	rte_eal_trace_pmu_read,
+	RTE_TRACE_POINT_ARGS(unsigned int index),
+	uint64_t val = rte_pmu_read(index);
+	rte_trace_point_emit_u64(val);
+)
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/eal/version.map b/lib/eal/version.map
index f493cd1ca7..3dc147f848 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -399,6 +399,7 @@  EXPERIMENTAL {
 
 	# added in 24.11
 	rte_bitset_to_str;
+	__rte_eal_trace_pmu_read; # WINDOWS_NO_EXPORT
 };
 
 INTERNAL {
diff --git a/lib/pmu/rte_pmu.c b/lib/pmu/rte_pmu.c
index dd57961627..b65f08c75b 100644
--- a/lib/pmu/rte_pmu.c
+++ b/lib/pmu/rte_pmu.c
@@ -412,12 +412,75 @@  rte_pmu_add_event(const char *name)
 	return event->index;
 }
 
+static int
+add_events(const char *pattern)
+{
+	char *token, *copy;
+	int ret = 0;
+
+	copy = strdup(pattern);
+	if (copy == NULL)
+		return -ENOMEM;
+
+	token = strtok(copy, ",");
+	while (token) {
+		ret = rte_pmu_add_event(token);
+		if (ret < 0)
+			break;
+
+		token = strtok(NULL, ",");
+	}
+
+	free(copy);
+
+	return ret >= 0 ? 0 : ret;
+}
+
+int
+rte_pmu_add_events_by_pattern(const char *pattern)
+{
+	regmatch_t rmatch;
+	char buf[BUFSIZ];
+	unsigned int num;
+	regex_t reg;
+	int ret;
+
+	/* events are matched against occurrences of e=ev1[,ev2,..] pattern */
+	ret = regcomp(&reg, "e=([_[:alnum:]-],?)+", REG_EXTENDED);
+	if (ret) {
+		PMU_LOG(ERR, "Failed to compile event matching regexp");
+		return -EINVAL;
+	}
+
+	for (;;) {
+		if (regexec(&reg, pattern, 1, &rmatch, 0))
+			break;
+
+		num = rmatch.rm_eo - rmatch.rm_so;
+		if (num > sizeof(buf))
+			num = sizeof(buf);
+
+		/* skip e= pattern prefix */
+		memcpy(buf, pattern + rmatch.rm_so + 2, num - 2);
+		buf[num - 2] = '\0';
+		ret = add_events(buf);
+		if (ret)
+			break;
+
+		pattern += rmatch.rm_eo;
+	}
+
+	regfree(&reg);
+
+	return ret;
+}
+
 int
 rte_pmu_init(void)
 {
 	int ret;
 
-	if (rte_pmu.initialized)
+	if (rte_pmu.initialized && ++rte_pmu.initialized)
 		return 0;
 
 	ret = scan_pmus();
@@ -450,7 +513,7 @@  rte_pmu_fini(void)
 	struct rte_pmu_event_group *group;
 	unsigned int i;
 
-	if (!rte_pmu.initialized)
+	if (!rte_pmu.initialized || --rte_pmu.initialized)
 		return;
 
 	RTE_TAILQ_FOREACH_SAFE(event, &rte_pmu.event_list, next, tmp_event) {
diff --git a/lib/pmu/rte_pmu.h b/lib/pmu/rte_pmu.h
index 85f9127911..df571f0a2f 100644
--- a/lib/pmu/rte_pmu.h
+++ b/lib/pmu/rte_pmu.h
@@ -135,7 +135,7 @@  __rte_pmu_enable_group(struct rte_pmu_event_group *group);
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice
  *
- * Initialize PMU library.
+ * Initialize PMU library. It's safe to call it multiple times.
  *
  * @return
  *   0 in case of success, negative value otherwise.
@@ -148,7 +148,9 @@  rte_pmu_init(void);
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice
  *
- * Finalize PMU library.
+ * Finalize PMU library. Number of calls must match number
+ * of times rte_pmu_init() was called. Otherwise memory
+ * won't be freed properly.
  */
 __rte_experimental
 void
@@ -173,7 +175,23 @@  rte_pmu_add_event(const char *name);
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice
  *
- * Read hardware counter configured to count occurrences of an event.
+ * Add events matching pattern to the group of enabled events.
+ *
+ * @param pattern
+ *   Pattern e=ev1[,ev2,...] matching events, where evX is a placeholder for an event listed under
+ *   /sys/bus/event_source/devices/<pmu>/events.
+ */
+__rte_experimental
+int
+rte_pmu_add_events_by_pattern(const char *pattern);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Read hardware counter configured to count occurrences of an event. This is called by an lcore
+ * binded exclusively to particular cpu and may not work as expected if gets migrated elsewhere.
+ * Reason being event group is pinned hence not supposed to be multiplexed with any other events.
  *
  * @param index
  *   Index of an event to be read.
diff --git a/lib/pmu/version.map b/lib/pmu/version.map
index d0f907d13d..f14d498b54 100644
--- a/lib/pmu/version.map
+++ b/lib/pmu/version.map
@@ -5,6 +5,7 @@  EXPERIMENTAL {
 	__rte_pmu_enable_group;
 	rte_pmu;
 	rte_pmu_add_event;
+	rte_pmu_add_events_by_pattern;
 	rte_pmu_fini;
 	rte_pmu_init;
 	rte_pmu_read;