[v3,0/4] net/mlx5: introduce Tx datapath tracing

Message ID 20230628110958.1403-1-viacheslavo@nvidia.com (mailing list archive)

Slava Ovsiienko June 28, 2023, 11:09 a.m. UTC
  The mlx5 driver provides send scheduling at a specific moment of time,
and for applications relying on it, it would be extremely useful
to have extra debug information: when and how packets were scheduled
and when the actual sending was completed by the NIC hardware (this helps
the application to track internal delay issues).

Because the DPDK Tx datapath API does not provide for any feedback
from the driver, and the feature appears to be mlx5-specific, it seems
reasonable to engage the existing DPDK datapath tracing capability.

The work cycle is supposed to be:
  - compile the application with tracing enabled
  - run the application with EAL parameters configuring the tracing in the
    mlx5 Tx datapath
  - store the dump file with the gathered tracing information
  - run the analyzing script (in Python) to combine related events (packet
    firing and completion) and see the data in a human-readable view
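
The last step, combining related events, can be pictured with a minimal
Python sketch. The event names ("fire", "done"), the record layout and the
matching key (port, queue, index) are hypothetical placeholders chosen for
illustration only; they are not the actual mlx5 tracepoint names, nor the
logic of the script from this series.

```python
from collections import namedtuple

# Hypothetical, simplified trace record; real tracepoints carry other fields.
Event = namedtuple("Event", "name ts_ns port queue index")


def pair_events(events):
    """Match each "fire" event with its "done" event and compute the delay.

    Events are matched on the (port, queue, index) key, which is an
    assumption made for this sketch only.
    """
    pending = {}  # key -> timestamp of the firing event
    delays = []   # list of (key, delay in nanoseconds)
    for ev in sorted(events, key=lambda e: e.ts_ns):
        key = (ev.port, ev.queue, ev.index)
        if ev.name == "fire":
            pending[key] = ev.ts_ns
        elif ev.name == "done" and key in pending:
            delays.append((key, ev.ts_ns - pending.pop(key)))
    return delays


sample = [
    Event("fire", 1000, 0, 0, 1),
    Event("fire", 1200, 0, 0, 2),
    Event("done", 1900, 0, 0, 1),
    Event("done", 2500, 0, 0, 2),
]
for key, delay in pair_events(sample):
    print(key, delay, "ns")
```

Completion events without a matching firing event are silently skipped in
this sketch; a real tool would probably want to report them.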

Below are detailed instructions on how to gather all the debug data,
including the full timing information, with an mlx5 NIC.


1. Build DPDK application with enabled datapath tracing

The meson option should be specified:
   -Denable_trace_fp=true

The c_args should be specified:
   -DALLOW_EXPERIMENTAL_API

The DPDK configuration examples:

  meson configure --buildtype=debug -Denable_trace_fp=true \
        -Dc_args='-DRTE_LIBRTE_MLX5_DEBUG -DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build

  meson configure --buildtype=debug -Denable_trace_fp=true \
        -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build

  meson configure --buildtype=release -Denable_trace_fp=true \
        -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build

  meson configure --buildtype=release -Denable_trace_fp=true \
        -Dc_args='-DALLOW_EXPERIMENTAL_API' build


2. Configuring the NIC

If the sending completion timings are important, the NIC should be configured
to provide real-time timestamps: the REAL_TIME_CLOCK_ENABLE NV settings
parameter should be set to TRUE, for example with the command below (followed
by a FW/driver reset):

  sudo mlxconfig -d /dev/mst/mt4125_pciconf0 s REAL_TIME_CLOCK_ENABLE=1


3. Run DPDK application to gather the traces

EAL parameters controlling the trace capability at runtime:

  --trace=pmd.net.mlx5.tx - the regular expression enabling the tracepoints
                            with matching names. At least "pmd.net.mlx5.tx"
                            must be enabled to gather all events needed
                            to analyze the mlx5 Tx datapath and its timings.
                            By default all tracepoints are disabled.

  --trace-dir=/var/log - trace storing directory

  --trace-bufsz=<val>B|<val>K|<val>M - optional, trace data buffer size
                                       per thread. The default is 1MB.

  --trace-mode=overwrite|discard  - optional, selects trace data buffer mode.
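
As a rough illustration of how these parameters fit together, the sketch
below assembles them into an argument list and validates the buffer size
format. The helper names are hypothetical, and treating K and M as powers
of 1024 is an assumption of this sketch, not something stated above.

```python
import re

# Assumption: K and M are binary multiples (1024 and 1024 * 1024).
_UNITS = {"B": 1, "K": 1024, "M": 1024 * 1024}


def parse_trace_bufsz(text):
    """Convert '<val>B', '<val>K' or '<val>M' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([BKM])", text)
    if match is None:
        raise ValueError("bad trace buffer size: %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2)]


def trace_eal_args(trace_dir, bufsz=None, mode=None,
                   pattern="pmd.net.mlx5.tx"):
    """Build the trace-related EAL arguments described above."""
    args = ["--trace=" + pattern, "--trace-dir=" + trace_dir]
    if bufsz is not None:
        parse_trace_bufsz(bufsz)  # raises ValueError on a malformed size
        args.append("--trace-bufsz=" + bufsz)
    if mode is not None:
        if mode not in ("overwrite", "discard"):
            raise ValueError("bad trace mode: %r" % mode)
        args.append("--trace-mode=" + mode)
    return args


print(" ".join(trace_eal_args("/var/log", bufsz="2M", mode="overwrite")))
```

The resulting list would be placed among the EAL arguments of the
application command line, before the "--" separator.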


4. Installing or Building Babeltrace2 Package

The gathered trace data can be analyzed with the provided Python script.
To parse the trace data, the script uses the Babeltrace2 library.
The package should be either installed or built from source code as
shown below:

  git clone https://github.com/efficios/babeltrace.git
  cd babeltrace
  ./bootstrap
  ./configure --help
  ./configure --disable-api-doc --disable-man-pages \
              --disable-python-bindings-doc --enable-python-plugins \
              --enable-python-bindings

5. Running the Analyzing Script

The analyzing script is located in the folder: ./drivers/net/mlx5/tools
It requires Python 3.6 and the Babeltrace2 package, and it takes a single
parameter, the trace data file. For example:

   ./mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39
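
When several traces accumulate in the trace directory, picking the most
recent one by hand gets tedious. The sketch below selects it by parsing the
timestamp embedded in the name. It assumes every trace follows the
"rte-YYYY-MM-DD-AM|PM-HH-MM-SS" pattern seen in the example above and that
the process runs in an English or C locale; the helper names are
hypothetical and not part of the series.

```python
import os
import tempfile
from datetime import datetime


def trace_dir_time(name):
    """Parse the timestamp from a name like 'rte-2023-01-23-AM-11-52-39'."""
    return datetime.strptime(name, "rte-%Y-%m-%d-%p-%I-%M-%S")


def latest_trace_dir(root):
    """Return the path of the newest 'rte-*' entry under root, or None."""
    best = None
    for name in os.listdir(root):
        try:
            stamp = trace_dir_time(name)
        except ValueError:
            continue  # not a trace entry, skip it
        if best is None or stamp > best[0]:
            best = (stamp, os.path.join(root, name))
    return best[1] if best else None


# Self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as root:
    for name in ("rte-2023-01-23-AM-11-52-39",
                 "rte-2023-01-23-PM-01-05-00",
                 "unrelated"):
        os.mkdir(os.path.join(root, name))
    print(os.path.basename(latest_trace_dir(root)))
```

The returned path could then be passed to mlx5_trace.py as its parameter.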


6. Interpreting the Script Output Data

All the timings are given in nanoseconds.
The output presents the list of Tx bursts (and, in the future, Rx bursts)
per port/queue. Each list element contains the list of built WQEs with
specific opcodes, and each WQE contains the list of the encompassed packets
to send.
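
The nesting described above (burst, then WQEs, then packets) can be pictured
with the following sketch. The field names, the opcode strings and the
printed layout are illustrative assumptions; they do not reproduce the
actual output format of mlx5_trace.py.

```python
def format_burst(burst):
    """Render one burst as indented text lines: burst -> WQEs -> packets."""
    lines = ["burst port=%d queue=%d ts=%d ns"
             % (burst["port"], burst["queue"], burst["ts_ns"])]
    for wqe in burst["wqes"]:
        lines.append("  WQE opcode=%s" % wqe["opcode"])
        for pkt in wqe["packets"]:
            lines.append("    packet len=%d" % pkt["len"])
    return lines


# Hypothetical data: one burst with two WQEs carrying three packets in total.
burst = {
    "port": 0,
    "queue": 1,
    "ts_ns": 1000,
    "wqes": [
        {"opcode": "SEND", "packets": [{"len": 64}]},
        {"opcode": "ENHANCED_MPSW", "packets": [{"len": 128}, {"len": 256}]},
    ],
}
print("\n".join(format_burst(burst)))
```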

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

--
v2: - comment addressed: "dump_trace" command is replaced with "save_trace"
    - Windows build failure addressed, Windows does not support tracing

v3: - tracepoint routines are moved to the net folder, no need to export
    - documentation added
    - testpmd patches moved out from series to the dedicated patches

Viacheslav Ovsiienko (4):
  net/mlx5: introduce tracepoints for mlx5 drivers
  net/mlx5: add comprehensive send completion trace
  net/mlx5: add Tx datapath trace analyzing script
  doc: add mlx5 datapath tracing feature description

 doc/guides/nics/mlx5.rst             |  77 ++++++++
 drivers/net/mlx5/linux/mlx5_verbs.c  |   8 +-
 drivers/net/mlx5/mlx5_devx.c         |   8 +-
 drivers/net/mlx5/mlx5_rx.h           |  19 --
 drivers/net/mlx5/mlx5_rxtx.h         |  19 ++
 drivers/net/mlx5/mlx5_tx.c           |  29 +++
 drivers/net/mlx5/mlx5_tx.h           | 135 ++++++++++++-
 drivers/net/mlx5/tools/mlx5_trace.py | 271 +++++++++++++++++++++++++++
 8 files changed, 537 insertions(+), 29 deletions(-)
 create mode 100755 drivers/net/mlx5/tools/mlx5_trace.py