mbox series

[v6,00/33] DPDK Trace support

Message ID 20200419100133.3232316-1-jerinj@marvell.com (mailing list archive)
Headers
Series DPDK Trace support |

Message

Jerin Jacob April 19, 2020, 10:01 a.m. UTC
From: Jerin Jacob <jerinj@marvell.com>

v6
~~

Following changes are based on David's feedback.

1) Changed rte_trace_t to rte_trace_point_t
2) Reworked the header file to have following API name and filename changes
rte_trace.h has following trace control APIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
a) rte_trace_is_enabled()
b) rte_trace_mode_set()
c) rte_trace_mode_get()
d) rte_trace_pattern()
e) rte_trace_regexp()
f) rte_trace_save()
g) rte_trace_metadata_dump()
h) rte_trace_dump()

rte_trace_point.h has following tracepoint related APIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
a) rte_trace_point_enable()
b) rte_trace_point_disable()
c) rte_trace_point_is_enabled()
d) rte_trace_point_lookup()
e) rte_trace_point_emit_*()
f) RTE_TRACE_POINT_* macros

3) made rte_trace_fp_enabled() as internal.

4) Added common macros to have rte_trace_point_emit_* definition at
single place a.k.a more common code now.
 
5) git grep -l rte_trace_ctf |xargs sed -i -e
's/rte_trace_ctf_/rte_tracepoint_emit_/g'.

6) Added the reason new section for tracepoint in git commit log
7) Sorted symbols alphabetic order in rte_eal_version.map file.
8) Fix snprint return handling bug in __rte_trace_point_emit_field() function.
9) Doc: added newline for section titles.
10) ctf_field and ctf_count made it as static.

v5
~~

Following API rework based David and Thomas feedback.

1) Rename - "Shell pattern" to "Globbing pattern" 
2) Remove the log "level" notion from trace library.
3) Remove rte_trace_global_[level/mode]* API from trace library. 
4) EAL command line arg to --trace=regex/globing from --trace-level=regex/globing:level
5) Remove "@b EXPERIMENTAL: this API may change without prior notice" from each functions.
6) Use FP instead of DP for fastpath reference.
7) Updated documentation to reflect  the above rework.
8) Updated UT to to reflect  the above rework.
8) Updated UT for a FP trace point emit.
9) Updated performance test case for a FP trace point emission.
10) Updated git commit comments to reflect  the above rework.

v4:
~~

1) Rebased to master.

2) Adapted to latest EAL directory structure change.

3) Fix possible build issue with out of tree application wherein
it does not define -DALLOW_EXPERIMENTAL_API. Fixed by making
fast path trace functions as NOP as it was getting included
in the inline functions of ethdev,mempool, cryptodev, etc(David)
 
4) Removed DALLOW_EXPERIMENTAL_API definition from individual driver files.
Now it set as global using http://patches.dpdk.org/patch/67758/ patch (David)

5) Added new meson option(-Denable_trace_dp=true)for enabling the datapath trace point (David, Bruce)

6) Changed the authorship and Rewrote the programmer's guide  based on the below feedback(Thomas)
http://patches.dpdk.org/patch/67352/

v3:
~~

1) Fix the following build issues reported by CI
http://mails.dpdk.org/archives/test-report/2020-March/122060.html

a) clang + i686 meson build issue(Fixed in the the patch meson: add libatomic as a global dependency for i686 clang)
b) fixed build issue with FreeBSD with top of tree change.
c) fixed missing experimental API for iavf and ice with meson build in avx512 files.

v2:
~~
Addressed the following review comments from Mattias Rönnblom:

1) Changed 

from:
	typedef uint64_t* rte_trace_t;
to
	typedef uint64_t rte_trace_t;

	Initially thought to make the handle as
	struct rte_trace {
	    uint64_t val;
	}

	but changed to uint64_t for the following reasons
	a) It is opaque to the application and it will fix the compile-time type
	check as well.
	b) The handle has an index that will point to an internal slow-path
	structure so no ABI change required in the future.
	c) RTE_TRACE_POINT_DEFINE need to expose trace object. So it is better
	to keep as uint64_t and avoid one more indirection for no use.

2)
Changed:
from:
	enum rte_trace_mode_e {
to:
	enum rte_trace_mode {

3) removed [out] "found" param from rte_trace_pattern() and
rte_trace_regexp()


4) Changed rte_trace_from_name to rte_trace_by_name

5) rte_trace_is_dp_enabled() return bool now


6) in __rte_trace_point_register() the argument fn change to register_fn

7) removed !! from rte_trace_is_enabled()

8) Remove uninitialized "rc warning" from rte_trace_pattern() and
rte_trace_regexp()

9) fixup bool return type for trace_entry_compare()

10) fixup calloc casting in trace_mkdir()

11) check fclose() return in trace_meta_save() and trace_mem_save()

12) rte_trace_ctf_* macro cleanup

13) added release notes

14) fix build issues reported by CI
http://mails.dpdk.org/archives/test-report/2020-March/121235.html


This patch set contains 
~~~~~~~~~~~~~~~~~~~~~~~~

# The native implementation of common trace format(CTF)[1] based tracer
# Public API to create the trace points.
# Add tracepoints to eal, ethdev, mempool, eventdev and cryptodev 
library for tracing support
# A unit test case
# Performance test case to measure the trace overhead. (See eal/trace:
# add trace performance test cases, patch)
# Programmers guide for Trace support(See doc: add trace library guide,
# patch)

# Tested OS:
~~~~~~~~~~~
- Linux
- FreeBSD

# Tested open source CTF trace viewers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Babeltrace
- Tracecompass

# Trace overhead comparison with LTTng
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

trace overhead data on x86:[2]
# 236 cycles with LTTng(>100ns)
# 18 cycles(7ns) with Native DPDK CTF emitter.(See eal/trace: add trace
# performance test cases patch)

trace overhead data on arm64:
#  312  cycles to  1100 cycles with LTTng based on the class of arm64  CPU.
#  11 cycles to 13 cycles with Native DPDK CTF emitter based on the
class of arm64 CPU.

18 cycles(on x86) vs 11 cycles(on arm64) is due to rdtsc() overhead in
x86. It seems  rdtsc takes around 15cycles in x86.

More details:
~~~~~~~~~~~~~

# The Native DPDK CTF trace support does not have any dependency on
third-party library.
The generated output file is compatible with LTTng as both are using
CTF trace format.

The performance gain comes from:
1) exploit dpdk worker thread usage model to avoid atomics and use per
core variables
2) use hugepage,
3) avoid a lot function pointers in fast-path etc
4) avoid unaligned store for arm64 etc

Features:
~~~~~~~~~
- No specific limit on the events. A string-based event like rte_log
for pattern matching
- Dynamic enable/disable support.
- Instructmention overhead is ~1 cycle. i.e cost of adding the code
wth out using trace feature.
- Timestamp support for all the events using DPDK rte_rtdsc
- No dependency on another library. Clean room native implementation of CTF.

Functional test case:
a) echo "trace_autotest" | sudo ./build/app/test/dpdk-test  -c 0x3 --trace=.*

The above command emits the following trace events
<code>
        uint8_t i;

        rte_trace_lib_eal_generic_void();
        rte_trace_lib_eal_generic_u64(0x10000000000000);
        rte_trace_lib_eal_generic_u32(0x10000000);
        rte_trace_lib_eal_generic_u16(0xffee);
        rte_trace_lib_eal_generic_u8(0xc);
        rte_trace_lib_eal_generic_i64(-1234);
        rte_trace_lib_eal_generic_i32(-1234567);
        rte_trace_lib_eal_generic_i16(12);
        rte_trace_lib_eal_generic_i8(-3);
        rte_trace_lib_eal_generic_string("my string");
        rte_trace_lib_eal_generic_function(__func__);

</code>

Install babeltrace package in Linux and point the generated trace file
to babel trace. By default trace file created under
<user>/dpdk-traces/time_stamp/

example:
# babeltrace /root/dpdk-traces/rte-2020-02-15-PM-02-56-51 | more

[13:27:36.138468807] (+?.?????????) lib.eal.generic.void: { cpu_id =0, name = "dpdk-test" }, { }
[13:27:36.138468851] (+0.000000044) lib.eal.generic.u64: { cpu_id = 0, name = "dpdk-test" }, { in = 4503599627370496 }
[13:27:36.138468860] (+0.000000009) lib.eal.generic.u32: { cpu_id = 0, name = "dpdk-test" }, { in = 268435456 }
[13:27:36.138468934] (+0.000000074) lib.eal.generic.u16: { cpu_id = 0, name = "dpdk-test" }, { in = 65518 }
[13:27:36.138468949] (+0.000000015) lib.eal.generic.u8: { cpu_id = 0, name = "dpdk-test" }, { in = 12 }
[13:27:36.138468956] (+0.000000007) lib.eal.generic.i64: { cpu_id = 0, name = "dpdk-test" }, { in = -1234 }
[13:27:36.138468963] (+0.000000007) lib.eal.generic.i32: { cpu_id = 0, name = "dpdk-test" }, { in = -1234567 }
[13:27:36.138469024] (+0.000000061) lib.eal.generic.i16: { cpu_id = 0, name = "dpdk-test" }, { in = 12 }
[13:27:36.138469044] (+0.000000020) lib.eal.generic.i8: { cpu_id = 0, name = "dpdk-test" }, { in = -3 }
[13:27:36.138469051] (+0.000000007) lib.eal.generic.string: { cpu_id = 0, name = "dpdk-test" }, { str = "my string" }
[13:27:36.138469203] (+0.000000152) lib.eal.generic.func: { cpu_id = 0, name = "dpdk-test" }, { func = "test_trace_points" }

# There is a  GUI based trace viewer available in Windows, Linux and  Mac.
It is called as tracecompass.(https://www.eclipse.org/tracecompass/)

The example screenshot and Histogram of above DPDK trace using
Tracecompass.

https://github.com/jerinjacobk/share/blob/master/dpdk_trace.JPG


File walk through:
~~~~~~~~~~~~~~~~~~

lib/librte_eal/common/include/rte_trace.h - Public API for Trace
provider and Trace control
lib/librte_eal/common/eal_common_trace.c - main trace implementation
lib/librte_eal/common/eal_common_trace_ctf.c - CTF metadata spec
implementation
lib/librte_eal/common/eal_common_trace_utils.c - command line utils
and filesystem operations.
lib/librte_eal/common/eal_common_trace_points.c -  trace points for EAL
library
lib/librte_eal/common/include/rte_trace_eal.h - EAL tracepoint public
API.
lib/librte_eal/common/eal_trace.h - Private trace header file.


[1] https://diamon.org/ctf/

[2] The above test is ported to LTTng for finding the LTTng trace
overhead. It available at
https://github.com/jerinjacobk/lttng-overhead
https://github.com/jerinjacobk/lttng-overhead/blob/master/README


Jerin Jacob (22):
  eal: introduce API for getting thread name
  eal/trace: define the public API for trace support
  eal/trace: implement trace register API
  eal/trace: implement trace operation APIs
  eal/trace: add internal trace init and fini interface
  eal/trace: get bootup timestamp for trace
  eal/trace: create CTF TDSL metadata in memory
  eal/trace: implement trace memory allocation
  eal/trace: implement debug dump function
  eal/trace: implement trace save
  eal/trace: implement registration payload
  eal/trace: implement provider payload
  eal/trace: hook internal trace APIs to Linux
  eal/trace: hook internal trace APIs to FreeBSD
  eal/trace: add generic tracepoints
  eal/trace: add alarm tracepoints
  eal/trace: add memory tracepoints
  eal/trace: add memzone tracepoints
  eal/trace: add thread tracepoints
  eal/trace: add interrupt tracepoints
  eal/trace: add trace performance test cases
  doc: add trace library guide

Pavan Nikhilesh (1):
  meson: add libatomic as a global dependency for i686 clang

Sunil Kumar Kori (10):
  eal/trace: handle CTF keyword collision
  eal/trace: add trace configuration parameter
  eal/trace: add trace dir configuration parameter
  eal/trace: add trace bufsize configuration parameter
  eal/trace: add trace mode configuration parameter
  eal/trace: add unit test cases
  ethdev: add tracepoints
  eventdev: add tracepoints
  cryptodev: add tracepoints
  mempool: add tracepoints

 MAINTAINERS                                   |   8 +
 app/test/Makefile                             |   4 +-
 app/test/meson.build                          |   3 +
 app/test/test_trace.c                         | 223 ++++++++
 app/test/test_trace.h                         |  15 +
 app/test/test_trace_perf.c                    | 183 +++++++
 app/test/test_trace_register.c                |  17 +
 config/common_base                            |   1 +
 config/meson.build                            |   9 +
 doc/api/doxy-api-index.md                     |   4 +-
 doc/guides/linux_gsg/eal_args.include.rst     |  52 ++
 doc/guides/prog_guide/build-sdk-meson.rst     |   5 +
 doc/guides/prog_guide/index.rst               |   1 +
 doc/guides/prog_guide/trace_lib.rst           | 346 ++++++++++++
 doc/guides/rel_notes/release_20_05.rst        |   9 +
 drivers/event/octeontx/meson.build            |   5 -
 drivers/event/octeontx2/meson.build           |   5 -
 drivers/event/opdl/meson.build                |   5 -
 examples/cmdline/Makefile                     |   1 +
 examples/cmdline/meson.build                  |   1 +
 examples/distributor/Makefile                 |   1 +
 examples/distributor/meson.build              |   1 +
 examples/ethtool/ethtool-app/Makefile         |   1 +
 examples/eventdev_pipeline/meson.build        |   1 +
 examples/flow_filtering/Makefile              |   1 +
 examples/flow_filtering/meson.build           |   1 +
 examples/helloworld/Makefile                  |   1 +
 examples/helloworld/meson.build               |   1 +
 examples/ioat/Makefile                        |   1 +
 examples/ioat/meson.build                     |   1 +
 examples/ip_fragmentation/Makefile            |   2 +
 examples/ip_fragmentation/meson.build         |   1 +
 examples/ip_reassembly/Makefile               |   1 +
 examples/ip_reassembly/meson.build            |   1 +
 examples/ipv4_multicast/Makefile              |   1 +
 examples/ipv4_multicast/meson.build           |   1 +
 examples/l2fwd-cat/Makefile                   |   1 +
 examples/l2fwd-cat/meson.build                |   1 +
 examples/l2fwd-event/Makefile                 |   1 +
 examples/l2fwd-event/meson.build              |   6 +-
 examples/l2fwd-jobstats/Makefile              |   1 +
 examples/l2fwd-jobstats/meson.build           |   1 +
 examples/l2fwd-keepalive/Makefile             |   1 +
 examples/l2fwd-keepalive/ka-agent/Makefile    |   1 +
 examples/l2fwd-keepalive/meson.build          |   1 +
 examples/l3fwd-acl/Makefile                   |   1 +
 examples/l3fwd-acl/meson.build                |   1 +
 examples/l3fwd/Makefile                       |   1 +
 examples/l3fwd/meson.build                    |   1 +
 examples/link_status_interrupt/Makefile       |   1 +
 examples/link_status_interrupt/meson.build    |   1 +
 .../client_server_mp/mp_client/Makefile       |   1 +
 .../client_server_mp/mp_client/meson.build    |   1 +
 .../client_server_mp/mp_server/meson.build    |   1 +
 examples/multi_process/hotplug_mp/Makefile    |   1 +
 examples/multi_process/hotplug_mp/meson.build |   1 +
 examples/multi_process/simple_mp/Makefile     |   1 +
 examples/multi_process/simple_mp/meson.build  |   1 +
 examples/multi_process/symmetric_mp/Makefile  |   1 +
 .../multi_process/symmetric_mp/meson.build    |   1 +
 examples/ntb/Makefile                         |   1 +
 examples/ntb/meson.build                      |   1 +
 examples/packet_ordering/Makefile             |   1 +
 examples/packet_ordering/meson.build          |   1 +
 .../performance-thread/l3fwd-thread/Makefile  |   1 +
 .../l3fwd-thread/meson.build                  |   1 +
 .../performance-thread/pthread_shim/Makefile  |   1 +
 .../pthread_shim/meson.build                  |   1 +
 examples/ptpclient/Makefile                   |   1 +
 examples/ptpclient/meson.build                |   1 +
 examples/qos_meter/Makefile                   |   1 +
 examples/qos_meter/meson.build                |   1 +
 examples/qos_sched/Makefile                   |   1 +
 examples/qos_sched/meson.build                |   1 +
 examples/server_node_efd/node/Makefile        |   1 +
 examples/server_node_efd/node/meson.build     |   1 +
 examples/server_node_efd/server/Makefile      |   1 +
 examples/server_node_efd/server/meson.build   |   1 +
 examples/service_cores/Makefile               |   1 +
 examples/service_cores/meson.build            |   1 +
 examples/skeleton/Makefile                    |   1 +
 examples/skeleton/meson.build                 |   1 +
 examples/timer/Makefile                       |   1 +
 examples/timer/meson.build                    |   1 +
 examples/vm_power_manager/Makefile            |   1 +
 examples/vm_power_manager/meson.build         |   1 +
 examples/vmdq/Makefile                        |   1 +
 examples/vmdq/meson.build                     |   1 +
 examples/vmdq_dcb/Makefile                    |   1 +
 examples/vmdq_dcb/meson.build                 |   1 +
 lib/librte_cryptodev/Makefile                 |   4 +-
 lib/librte_cryptodev/cryptodev_trace_points.c |  70 +++
 lib/librte_cryptodev/meson.build              |   6 +-
 lib/librte_cryptodev/rte_cryptodev.c          |  18 +
 lib/librte_cryptodev/rte_cryptodev.h          |   6 +
 .../rte_cryptodev_version.map                 |  18 +
 lib/librte_cryptodev/rte_trace_cryptodev.h    | 148 ++++++
 lib/librte_cryptodev/rte_trace_cryptodev_fp.h |  38 ++
 lib/librte_distributor/meson.build            |   5 -
 lib/librte_eal/common/eal_common_memzone.c    |   9 +
 lib/librte_eal/common/eal_common_options.c    |  65 +++
 lib/librte_eal/common/eal_common_thread.c     |   4 +-
 lib/librte_eal/common/eal_common_trace.c      | 500 ++++++++++++++++++
 lib/librte_eal/common/eal_common_trace_ctf.c  | 488 +++++++++++++++++
 .../common/eal_common_trace_points.c          | 115 ++++
 .../common/eal_common_trace_utils.c           | 478 +++++++++++++++++
 lib/librte_eal/common/eal_options.h           |   8 +
 lib/librte_eal/common/eal_private.h           |   5 +
 lib/librte_eal/common/eal_trace.h             | 122 +++++
 lib/librte_eal/common/meson.build             |   4 +
 lib/librte_eal/common/rte_malloc.c            |  60 ++-
 lib/librte_eal/freebsd/Makefile               |   4 +
 lib/librte_eal/freebsd/eal.c                  |  10 +
 lib/librte_eal/freebsd/eal_alarm.c            |   3 +
 lib/librte_eal/freebsd/eal_interrupts.c       |  54 +-
 lib/librte_eal/freebsd/eal_thread.c           |  21 +-
 lib/librte_eal/include/meson.build            |   5 +
 lib/librte_eal/include/rte_lcore.h            |  17 +
 lib/librte_eal/include/rte_trace.h            | 141 +++++
 lib/librte_eal/include/rte_trace_eal.h        | 278 ++++++++++
 lib/librte_eal/include/rte_trace_point.h      | 306 +++++++++++
 .../include/rte_trace_point_provider.h        | 131 +++++
 .../include/rte_trace_point_register.h        |  39 ++
 lib/librte_eal/linux/Makefile                 |   4 +
 lib/librte_eal/linux/eal.c                    |   9 +
 lib/librte_eal/linux/eal_alarm.c              |   4 +
 lib/librte_eal/linux/eal_interrupts.c         |  84 +--
 lib/librte_eal/linux/eal_thread.c             |  27 +-
 lib/librte_eal/rte_eal_version.map            |  50 ++
 lib/librte_ethdev/Makefile                    |   3 +
 lib/librte_ethdev/ethdev_trace_points.c       |  43 ++
 lib/librte_ethdev/meson.build                 |   5 +-
 lib/librte_ethdev/rte_ethdev.c                |  12 +
 lib/librte_ethdev/rte_ethdev.h                |   5 +
 lib/librte_ethdev/rte_ethdev_version.map      |  10 +
 lib/librte_ethdev/rte_trace_ethdev.h          |  97 ++++
 lib/librte_ethdev/rte_trace_ethdev_fp.h       |  44 ++
 lib/librte_eventdev/Makefile                  |   3 +
 lib/librte_eventdev/eventdev_trace_points.c   | 173 ++++++
 lib/librte_eventdev/meson.build               |   3 +
 .../rte_event_crypto_adapter.c                |  10 +
 .../rte_event_eth_rx_adapter.c                |  11 +
 .../rte_event_eth_tx_adapter.c                |  13 +-
 .../rte_event_eth_tx_adapter.h                |   2 +
 lib/librte_eventdev/rte_event_timer_adapter.c |   8 +-
 lib/librte_eventdev/rte_event_timer_adapter.h |   8 +
 lib/librte_eventdev/rte_eventdev.c            |   9 +
 lib/librte_eventdev/rte_eventdev.h            |   5 +-
 lib/librte_eventdev/rte_eventdev_version.map  |  42 ++
 lib/librte_eventdev/rte_trace_eventdev.h      | 311 +++++++++++
 lib/librte_eventdev/rte_trace_eventdev_fp.h   |  85 +++
 lib/librte_mempool/Makefile                   |   3 +
 lib/librte_mempool/mempool_trace_points.c     | 108 ++++
 lib/librte_mempool/meson.build                |   5 +-
 lib/librte_mempool/rte_mempool.c              |  16 +
 lib/librte_mempool/rte_mempool.h              |  13 +
 lib/librte_mempool/rte_mempool_ops.c          |   7 +
 lib/librte_mempool/rte_mempool_version.map    |  26 +
 lib/librte_mempool/rte_trace_mempool.h        | 178 +++++++
 lib/librte_mempool/rte_trace_mempool_fp.h     | 116 ++++
 lib/librte_rcu/meson.build                    |   5 -
 meson_options.txt                             |   2 +
 162 files changed, 5607 insertions(+), 105 deletions(-)
 create mode 100644 app/test/test_trace.c
 create mode 100644 app/test/test_trace.h
 create mode 100644 app/test/test_trace_perf.c
 create mode 100644 app/test/test_trace_register.c
 create mode 100644 doc/guides/prog_guide/trace_lib.rst
 create mode 100644 lib/librte_cryptodev/cryptodev_trace_points.c
 create mode 100644 lib/librte_cryptodev/rte_trace_cryptodev.h
 create mode 100644 lib/librte_cryptodev/rte_trace_cryptodev_fp.h
 create mode 100644 lib/librte_eal/common/eal_common_trace.c
 create mode 100644 lib/librte_eal/common/eal_common_trace_ctf.c
 create mode 100644 lib/librte_eal/common/eal_common_trace_points.c
 create mode 100644 lib/librte_eal/common/eal_common_trace_utils.c
 create mode 100644 lib/librte_eal/common/eal_trace.h
 create mode 100644 lib/librte_eal/include/rte_trace.h
 create mode 100644 lib/librte_eal/include/rte_trace_eal.h
 create mode 100644 lib/librte_eal/include/rte_trace_point.h
 create mode 100644 lib/librte_eal/include/rte_trace_point_provider.h
 create mode 100644 lib/librte_eal/include/rte_trace_point_register.h
 create mode 100644 lib/librte_ethdev/ethdev_trace_points.c
 create mode 100644 lib/librte_ethdev/rte_trace_ethdev.h
 create mode 100644 lib/librte_ethdev/rte_trace_ethdev_fp.h
 create mode 100644 lib/librte_eventdev/eventdev_trace_points.c
 create mode 100644 lib/librte_eventdev/rte_trace_eventdev.h
 create mode 100644 lib/librte_eventdev/rte_trace_eventdev_fp.h
 create mode 100644 lib/librte_mempool/mempool_trace_points.c
 create mode 100644 lib/librte_mempool/rte_trace_mempool.h
 create mode 100644 lib/librte_mempool/rte_trace_mempool_fp.h