mbox series

[v17,0/4] add support for self monitoring

Message ID 20250117090033.2807073-1-tduszynski@marvell.com (mailing list archive)
Headers
Series add support for self monitoring |

Message

Tomasz Duszynski Jan. 17, 2025, 9 a.m. UTC
This series adds self monitoring support i.e allows to configure and
read performance measurement unit (PMU) counters in runtime without
using perf utility. This has certain advantages when application runs on
isolated cores running dedicated tasks.

Events can be read directly using rte_pmu_read() or using dedicated
tracepoint rte_eal_trace_pmu_read(). The latter will cause events to be
stored inside CTF file.

By design, all enabled events are grouped together and the same group
is attached to lcores that use self monitoring funtionality.

Events are enabled by names, which need to be read from standard
location under sysfs i.e

/sys/bus/event_source/devices/PMU/events

where PMU is a core pmu i.e one measuring cpu events. As of today
raw events are not supported.

v17:
- improve docs
- use safer strtok_r()
v16:
- fix build issue related to ALLOW_EXPERIMENTAL_API not always defined
- rename tracepoints upfront to make it easier to decouple for tracing
- minor doc improvements
- make sure non EAL threads can't read PMU
v15:
- add some basic logs related to API failures
- get rid of thread-local-storage
- do not support MT-safety (which was buggy anyway) in management APIs (rte_pmu_init(),
  rte_pmu_fini(), rte_pmu_add_{event,events_by_pattern}() as it impacts rte_pmu_read()
  performance because more logic needs to be incorporated to handle all corner cases
- improve documentation slightly
- various other improvements here and there
v14:
- replace __atomic_xxx with rte_atomic_xxx
- rebase to dpdk/main since that's a new feature
v13:
- allow init/fini calling from multiple contexts
- get rid of conditional compilation and return erors in case APIs are
  used on unsupported OS
v12:
- rebase old series
- slightly refresh existing documentation
- make sure compiler won't optimize accesses to certain variables during
  event readout
- drop previous Acked-by to respin a fresh review cycle
v11:
- skip fast test in case init fails
v10:
- check permissions before using counters
- do not use internal symbols in exported functions
- address review comments
v9:
- fix 'maybe-uninitialized' warning reported by CI
v8:
- just rebase series
v7:
- use per-lcore event group instead of global table index by lcore-id
- don't add pmu_autotest to fast tests because due to lack of suported
  on every arch
v6:
- move codebase to the separate library
- address review comments
v5:
- address review comments
- fix sign extension while reading pmu on x86
- fix regex mentioned in doc
- various minor changes/improvements here and there
v4:
- fix freeing mem detected by debug_autotest
v3:
- fix shared build
v2:
- fix problems reported by test build infra

Tomasz Duszynski (4):
  lib: add generic support for reading PMU events
  pmu: support reading ARM PMU events in runtime
  pmu: support reading Intel x86_64 PMU events in runtime
  eal: add PMU support to tracing library

 MAINTAINERS                              |   5 +
 app/test/meson.build                     |   1 +
 app/test/test_pmu.c                      |  55 +++
 app/test/test_trace_perf.c               |  10 +
 doc/api/doxy-api-index.md                |   3 +-
 doc/api/doxy-api.conf.in                 |   1 +
 doc/guides/prog_guide/glossary.rst       |   3 +
 doc/guides/prog_guide/profile_app.rst    |  38 ++
 doc/guides/prog_guide/trace_lib.rst      |  32 ++
 doc/guides/rel_notes/release_25_03.rst   |   7 +
 lib/eal/common/eal_common_trace.c        |   5 +-
 lib/eal/common/eal_common_trace_pmu.c    |  38 ++
 lib/eal/common/eal_common_trace_points.c |   5 +
 lib/eal/common/eal_trace.h               |   4 +
 lib/eal/common/meson.build               |   1 +
 lib/eal/include/rte_eal_trace.h          |  11 +
 lib/eal/meson.build                      |   3 +
 lib/eal/version.map                      |   1 +
 lib/meson.build                          |   1 +
 lib/pmu/meson.build                      |  22 +
 lib/pmu/pmu_arm64.c                      |  94 ++++
 lib/pmu/pmu_private.h                    |  32 ++
 lib/pmu/rte_pmu.c                        | 537 +++++++++++++++++++++++
 lib/pmu/rte_pmu.h                        | 251 +++++++++++
 lib/pmu/rte_pmu_pmc_arm64.h              |  30 ++
 lib/pmu/rte_pmu_pmc_x86_64.h             |  24 +
 lib/pmu/version.map                      |  15 +
 27 files changed, 1227 insertions(+), 2 deletions(-)
 create mode 100644 app/test/test_pmu.c
 create mode 100644 lib/eal/common/eal_common_trace_pmu.c
 create mode 100644 lib/pmu/meson.build
 create mode 100644 lib/pmu/pmu_arm64.c
 create mode 100644 lib/pmu/pmu_private.h
 create mode 100644 lib/pmu/rte_pmu.c
 create mode 100644 lib/pmu/rte_pmu.h
 create mode 100644 lib/pmu/rte_pmu_pmc_arm64.h
 create mode 100644 lib/pmu/rte_pmu_pmc_x86_64.h
 create mode 100644 lib/pmu/version.map

--
2.34.1
  

Comments

Morten Brørup June 3, 2025, 12:50 p.m. UTC | #1
+CC David Christensen (IBM POWER maintainer) and Stanislaw Kardach (RISC-V maintainer)
+CC Stanislaw Kardach (RISC-V maintainer)
+CC Wathsala Vithanage (ARM maintainer)
+CC Vipin Varghese (AMD x86 maintainer)

> From: Tomasz Duszynski [mailto:tduszynski@marvell.com]
> Sent: Friday, 17 January 2025 10.00
> 
> This series adds self monitoring support i.e allows to configure and
> read performance measurement unit (PMU) counters in runtime without
> using perf utility. This has certain advantages when application runs
> on
> isolated cores running dedicated tasks.
> 
> Events can be read directly using rte_pmu_read() or using dedicated
> tracepoint rte_eal_trace_pmu_read(). The latter will cause events to be
> stored inside CTF file.
> 
> By design, all enabled events are grouped together and the same group
> is attached to lcores that use self monitoring funtionality.
> 
> Events are enabled by names, which need to be read from standard
> location under sysfs i.e
> 
> /sys/bus/event_source/devices/PMU/events
> 
> where PMU is a core pmu i.e one measuring cpu events. As of today
> raw events are not supported.
> 
> v17:
> - improve docs
> - use safer strtok_r()
> v16:
> - fix build issue related to ALLOW_EXPERIMENTAL_API not always defined
> - rename tracepoints upfront to make it easier to decouple for tracing
> - minor doc improvements
> - make sure non EAL threads can't read PMU
> v15:
> - add some basic logs related to API failures
> - get rid of thread-local-storage
> - do not support MT-safety (which was buggy anyway) in management APIs
> (rte_pmu_init(),
>   rte_pmu_fini(), rte_pmu_add_{event,events_by_pattern}() as it impacts
> rte_pmu_read()
>   performance because more logic needs to be incorporated to handle all
> corner cases
> - improve documentation slightly
> - various other improvements here and there
> v14:
> - replace __atomic_xxx with rte_atomic_xxx
> - rebase to dpdk/main since that's a new feature
> v13:
> - allow init/fini calling from multiple contexts
> - get rid of conditional compilation and return erors in case APIs are
>   used on unsupported OS
> v12:
> - rebase old series
> - slightly refresh existing documentation
> - make sure compiler won't optimize accesses to certain variables
> during
>   event readout
> - drop previous Acked-by to respin a fresh review cycle
> v11:
> - skip fast test in case init fails
> v10:
> - check permissions before using counters
> - do not use internal symbols in exported functions
> - address review comments
> v9:
> - fix 'maybe-uninitialized' warning reported by CI
> v8:
> - just rebase series
> v7:
> - use per-lcore event group instead of global table index by lcore-id
> - don't add pmu_autotest to fast tests because due to lack of suported
>   on every arch
> v6:
> - move codebase to the separate library
> - address review comments
> v5:
> - address review comments
> - fix sign extension while reading pmu on x86
> - fix regex mentioned in doc
> - various minor changes/improvements here and there
> v4:
> - fix freeing mem detected by debug_autotest
> v3:
> - fix shared build
> v2:
> - fix problems reported by test build infra
> 
> Tomasz Duszynski (4):
>   lib: add generic support for reading PMU events
>   pmu: support reading ARM PMU events in runtime
>   pmu: support reading Intel x86_64 PMU events in runtime
>   eal: add PMU support to tracing library
> 
>  MAINTAINERS                              |   5 +
>  app/test/meson.build                     |   1 +
>  app/test/test_pmu.c                      |  55 +++
>  app/test/test_trace_perf.c               |  10 +
>  doc/api/doxy-api-index.md                |   3 +-
>  doc/api/doxy-api.conf.in                 |   1 +
>  doc/guides/prog_guide/glossary.rst       |   3 +
>  doc/guides/prog_guide/profile_app.rst    |  38 ++
>  doc/guides/prog_guide/trace_lib.rst      |  32 ++
>  doc/guides/rel_notes/release_25_03.rst   |   7 +
>  lib/eal/common/eal_common_trace.c        |   5 +-
>  lib/eal/common/eal_common_trace_pmu.c    |  38 ++
>  lib/eal/common/eal_common_trace_points.c |   5 +
>  lib/eal/common/eal_trace.h               |   4 +
>  lib/eal/common/meson.build               |   1 +
>  lib/eal/include/rte_eal_trace.h          |  11 +
>  lib/eal/meson.build                      |   3 +
>  lib/eal/version.map                      |   1 +
>  lib/meson.build                          |   1 +
>  lib/pmu/meson.build                      |  22 +
>  lib/pmu/pmu_arm64.c                      |  94 ++++
>  lib/pmu/pmu_private.h                    |  32 ++
>  lib/pmu/rte_pmu.c                        | 537 +++++++++++++++++++++++
>  lib/pmu/rte_pmu.h                        | 251 +++++++++++
>  lib/pmu/rte_pmu_pmc_arm64.h              |  30 ++
>  lib/pmu/rte_pmu_pmc_x86_64.h             |  24 +
>  lib/pmu/version.map                      |  15 +
>  27 files changed, 1227 insertions(+), 2 deletions(-)
>  create mode 100644 app/test/test_pmu.c
>  create mode 100644 lib/eal/common/eal_common_trace_pmu.c
>  create mode 100644 lib/pmu/meson.build
>  create mode 100644 lib/pmu/pmu_arm64.c
>  create mode 100644 lib/pmu/pmu_private.h
>  create mode 100644 lib/pmu/rte_pmu.c
>  create mode 100644 lib/pmu/rte_pmu.h
>  create mode 100644 lib/pmu/rte_pmu_pmc_arm64.h
>  create mode 100644 lib/pmu/rte_pmu_pmc_x86_64.h
>  create mode 100644 lib/pmu/version.map
> 
> --
> 2.34.1

Series-acked-by: Morten Brørup <mb@smartsharesystems.com>

I think the PMU library will be useful, and hope it gets into the 25.11 LTS release.
Let's see if it can go into 25.07 to begin with.

NB: Although this is Linux only, consider using rte_ffs64() instead of __builtin_ffsll() in the FIELD_PREP() macro.
  
Thomas Monjalon June 3, 2025, 1:28 p.m. UTC | #2
Hello,

17/01/2025 10:00, Tomasz Duszynski:
>  lib/pmu/meson.build                      |  22 +
>  lib/pmu/pmu_arm64.c                      |  94 ++++
>  lib/pmu/pmu_private.h                    |  32 ++
>  lib/pmu/rte_pmu.c                        | 537 +++++++++++++++++++++++
>  lib/pmu/rte_pmu.h                        | 251 +++++++++++
>  lib/pmu/rte_pmu_pmc_arm64.h              |  30 ++
>  lib/pmu/rte_pmu_pmc_x86_64.h             |  24 +

Reading it again, I wonder why it is a separate library.
In general we give access to the hardware in EAL.
Why not having PMU in EAL?
  
Morten Brørup June 3, 2025, 2:17 p.m. UTC | #3
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Tuesday, 3 June 2025 15.28
> 
> Hello,
> 
> 17/01/2025 10:00, Tomasz Duszynski:
> >  lib/pmu/meson.build                      |  22 +
> >  lib/pmu/pmu_arm64.c                      |  94 ++++
> >  lib/pmu/pmu_private.h                    |  32 ++
> >  lib/pmu/rte_pmu.c                        | 537
> +++++++++++++++++++++++
> >  lib/pmu/rte_pmu.h                        | 251 +++++++++++
> >  lib/pmu/rte_pmu_pmc_arm64.h              |  30 ++
> >  lib/pmu/rte_pmu_pmc_x86_64.h             |  24 +
> 
> Reading it again, I wonder why it is a separate library.
> In general we give access to the hardware in EAL.
> Why not having PMU in EAL?
> 

IMO, the EAL should be a thin shim layer, providing abstractions to only the "must have" parts of any underlying hardware and O/S running the DPDK application, not auxiliary hardware or O/S features. The EAL is already bloated with stuff that is not absolutely required to run simple DPDK applications.

As an advocate for a slimmer EAL, having PMU as a separate library makes sense to me.

Not all hardware access is provided through the EAL.
E.g., access to hardware power management controls is provided through the Power library/driver combo.
The PMU feature could be considered a type of "library/driver" combo, like the Power library/driver combo; but that might come with unnecessary complexity and performance overhead from additional abstractions and indirect function calling, so I don't like this alternative.

-Morten
  
Tomasz Duszynski June 6, 2025, 4:27 p.m. UTC | #4
Hi Thomas,

>Hello,
>
>17/01/2025 10:00, Tomasz Duszynski:
>>  lib/pmu/meson.build                      |  22 +
>>  lib/pmu/pmu_arm64.c                      |  94 ++++
>>  lib/pmu/pmu_private.h                    |  32 ++
>>  lib/pmu/rte_pmu.c                        | 537 +++++++++++++++++++++++
>>  lib/pmu/rte_pmu.h                        | 251 +++++++++++
>>  lib/pmu/rte_pmu_pmc_arm64.h              |  30 ++
>>  lib/pmu/rte_pmu_pmc_x86_64.h             |  24 +
>
>Reading it again, I wonder why it is a separate library.
>In general we give access to the hardware in EAL.
>Why not having PMU in EAL?

I initially started PMU as something tightly coupled with EAL, but after
feedback from other developers I decided to decouple it completely [1].

The Technical Board also discussed this and supported the idea of having
PMU as a separate library [2].

The main goal is to keep EAL lean and focused only on core essentials,
making the PMU code easier to manage separately. This separation aligns
nicely with the broader effort to simplify EAL by moving optional
features out.

[1] https://lore.kernel.org/dpdk-dev/20230111210547.GA2134@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
[2] https://mails.dpdk.org/archives/dev/2023-January/260035.html#:~:text=PMU%20Library%20,principle%20for%20inclusion%20in%20DPDK
  
Thomas Monjalon June 9, 2025, 4:34 p.m. UTC | #5
06/06/2025 18:27, Tomasz Duszynski:
> Hi Thomas,
> 
> >Hello,
> >
> >17/01/2025 10:00, Tomasz Duszynski:
> >>  lib/pmu/meson.build                      |  22 +
> >>  lib/pmu/pmu_arm64.c                      |  94 ++++
> >>  lib/pmu/pmu_private.h                    |  32 ++
> >>  lib/pmu/rte_pmu.c                        | 537 +++++++++++++++++++++++
> >>  lib/pmu/rte_pmu.h                        | 251 +++++++++++
> >>  lib/pmu/rte_pmu_pmc_arm64.h              |  30 ++
> >>  lib/pmu/rte_pmu_pmc_x86_64.h             |  24 +
> >
> >Reading it again, I wonder why it is a separate library.
> >In general we give access to the hardware in EAL.
> >Why not having PMU in EAL?
> 
> I initially started PMU as something tightly coupled with EAL, but after
> feedback from other developers I decided to decouple it completely [1].
> 
> The Technical Board also discussed this and supported the idea of having
> PMU as a separate library [2].
> 
> The main goal is to keep EAL lean and focused only on core essentials,
> making the PMU code easier to manage separately. This separation aligns
> nicely with the broader effort to simplify EAL by moving optional
> features out.
> 
> [1] https://lore.kernel.org/dpdk-dev/20230111210547.GA2134@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
> [2] https://mails.dpdk.org/archives/dev/2023-January/260035.html#:~:text=PMU%20Library%20,principle%20for%20inclusion%20in%20DPDK

OK I forgot this history.
Let's go with a separate library.

I'm trying to merge it.
I'm moving to a new release notes.
I did a bit of cleanup.
I've replaced __builtin_ffsll with rte_ffs64.
I've exported symbols with the new macro RTE_EXPORT_EXPERIMENTAL_SYMBOL.
Now I hit another kind of issue:
	error: "__rte_weak" is deprecated
I fix this one with using flags like RTE_PMU_SUPPORTED.
(note: it replaces RTE_LIB_PMU which was not defined)

And at the end I get this error for tracing:

rte_eal_trace.h:136:37: error: passing argument 1 of 'rte_pmu_read' makes integer from pointer without a cast [-Wint-conversion]
  136 |         uint64_t val = rte_pmu_read(index);
      |                                     ^~~~~
      |                                     |
      |                                     char * (*)(const char *, int)

I think integration in tracing is too much risky at this stage anyway,
so I'll postpone it.

I'll post my rebased version of this last patch for follow-up.
  
Tomasz Duszynski June 10, 2025, 9:55 a.m. UTC | #6
Hi Thomas,

Thanks for taking the time to merge the code and for the detailed update.
Cleanup and fixes you've done - appreciated.

I'm working on resolving remaining issues.