[PATCH,v3,0/4] pdump HW Rx timestamps for mlx5

Message ID 20200724202315.19533-1-patrick.keroulas@radio-canada.ca (mailing list archive)
Message

Patrick Keroulas July 24, 2020, 8:23 p.m. UTC
  The intention is to produce a pcap with nanosecond precision when
Rx timestamp offloading is activated on mlx5 NIC.

The packets forwarded by testpmd hold the raw counter but a pcap
requires a time unit. Assuming that the NIC clock is already synced
with external master clock, this patchset simply integrates the
nanosecond converter that derives from device frequency and start time.
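
As a rough illustration of the conversion described above (a sketch under
stated assumptions, not the code from this patchset): given the device
frequency and a reference point pairing a raw counter value with a wall-clock
time, a raw Rx timestamp can be turned into nanoseconds as follows. The names
hw_clock_ref and hw_ticks_to_ns are hypothetical and exist only for this
example.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical reference point captured once when the capture starts. */
struct hw_clock_ref {
	uint64_t start_ticks; /* raw device counter at the reference point */
	uint64_t start_ns;    /* wall-clock time (ns) at the reference point */
	uint64_t freq_hz;     /* device clock frequency in Hz */
};

/* Convert a raw device timestamp into wall-clock nanoseconds. */
static uint64_t
hw_ticks_to_ns(const struct hw_clock_ref *ref, uint64_t ticks)
{
	assert(ref->freq_hz != 0);

	/* Unsigned subtraction stays correct across a counter wrap. */
	uint64_t delta = ticks - ref->start_ticks;

	/*
	 * Split into whole seconds and a remainder so that the
	 * multiplication by 1e9 cannot overflow 64 bits (this holds for
	 * frequencies below roughly 18 GHz).
	 */
	uint64_t secs = delta / ref->freq_hz;
	uint64_t rem = delta % ref->freq_hz;

	return ref->start_ns + secs * NSEC_PER_SEC +
	       rem * NSEC_PER_SEC / ref->freq_hz;
}
```

This form uses division for clarity; it assumes the reference point was taken
while the NIC clock was already synced, as stated above.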

v2 -> v3:
    - replace ib_verbs nanosecond converter with more generic method
      based on device frequency and start time.

Patrick Keroulas (3):
  net/mlx5: query device frequency
  ethdev: add API to query device frequency
  pdump: convert timestamp to nanoseconds on Rx path

Vivien Didelot (1):
  net/pcap: support hardware Tx timestamps

 doc/guides/rel_notes/release_20_08.rst   |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.c     |  2 ++
 drivers/common/mlx5/mlx5_devx_cmds.h     |  1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c  | 22 ++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c         |  1 +
 drivers/net/mlx5/mlx5.h                  |  1 +
 drivers/net/pcap/rte_eth_pcap.c          | 32 +++++++++++++-----------
 lib/librte_ethdev/rte_ethdev.c           | 12 +++++++++
 lib/librte_ethdev/rte_ethdev.h           | 17 +++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h      |  5 ++++
 lib/librte_ethdev/rte_ethdev_version.map |  2 ++
 lib/librte_pdump/rte_pdump.c             | 27 ++++++++++++++++++++
 12 files changed, 109 insertions(+), 14 deletions(-)
  

Comments

Ferruh Yigit Oct. 6, 2020, 4:25 p.m. UTC | #1
On 7/24/2020 9:23 PM, Patrick Keroulas wrote:
> The intention is to produce a pcap with nanosecond precision when
> Rx timestamp offloading is activated on mlx5 NIC.
> 
> The packets forwarded by testpmd hold the raw counter but a pcap
> requires a time unit. Assuming that the NIC clock is already synced
> with external master clock, this patchset simply integrates the
> nanosecond converter that derives from device frequency and start time.
> 
> v2 -> v3:
>      - replace ib_verbs nanosecond converter with more generic method
>        based on device frequency and start time.
> 
> Patrick Keroulas (3):
>    net/mlx5: query device frequency
>    ethdev: add API to query device frequency
>    pdump: convert timestamp to nanoseconds on Rx path
> 
> Vivien Didelot (1):
>    net/pcap: support hardware Tx timestamps
> 

We have three patches/patchsets for the same issue:

1) Current one, https://patches.dpdk.org/user/todo/dpdk/?series=11294
2) Vivien's series: https://patches.dpdk.org/user/todo/dpdk/?series=10678
3) Vivien's pcap patch: https://patches.dpdk.org/user/todo/dpdk/?series=10403

And a related one from Slava:
4) https://patchwork.dpdk.org/project/dpdk/list/?series=10948&state=%2A&archive=both

I am replying to this one since it is the latest, but first can we clarify
whether all of them are still valid, and can we combine the effort?


Second, the problems to solve:
1) The device-provided timestamp value has no unit; it needs to be converted to
nanoseconds.
There are different approaches in the above patches:
- One adds a '.convert_ts_to_ns' dev_ops to make the PMD convert the timestamp.
- Another adds an '.eth_get_clock_freq' dev_ops to get the frequency of the
   device clock, so that the conversion can be done within the app.
- I wonder why the existing 'rte_eth_read_clock()' is not enough for the
   conversion, as described in its documentation:
   https://doc.dpdk.org/api/rte__ethdev_8h.html#a4346bf07a0d302c9ba4fe06baffd3196
     uint64_t start, end;
     rte_eth_read_clock(port, &start);
     rte_delay_ms(100);
     rte_eth_read_clock(port, &end);
     double freq = (end - start) * 10; /* ticks per 100 ms, scaled to Hz */

2) Where to put the timestamp data: in 'mbuf->timestamp' or in the dynamic
    field 'RTE_MBUF_DYNFIELD_TIMESTAMP_NAME'? Using the dynamic field requires
    more work on registering and looking up the field.

3) The calculation in the datapath should be done in a performance-optimized
way, avoiding division.

4) Should the timestamp value provided by the Rx device be used, or should the
time when the packet is transmitted be used? The current use case seems to be
the first one, but can there be cases where we would like to record exactly
when the packet was sent?

5) Should we create a 'PKT_TX_TIMESTAMP' flag, instead of re-using the Rx one,
to let the application explicitly define which packets to record a timestamp
for?

Please add anything I am missing.
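
To make point 3 concrete, one common way to keep division out of the datapath
is to precompute a fixed-point multiplier once the frequency is known, then
convert each tick delta with a multiply and a shift. The sketch below is only
an illustration of that idea; ts_scale, ts_scale_init, ts_delta_to_ns and the
TS_SHIFT value are hypothetical names and choices, not part of any of the
patches listed above.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define TS_SHIFT 24 /* hypothetical fixed-point precision, in bits */

struct ts_scale {
	uint64_t mult; /* nanoseconds per tick, scaled by 2^TS_SHIFT */
};

/* Control path: the only division, done once per frequency update. */
static struct ts_scale
ts_scale_init(uint64_t freq_hz)
{
	struct ts_scale s;

	assert(freq_hz != 0);
	/* Round to nearest to reduce the per-tick conversion error. */
	s.mult = ((NSEC_PER_SEC << TS_SHIFT) + freq_hz / 2) / freq_hz;
	return s;
}

/*
 * Datapath: multiply and shift, no division.
 * delta_ticks * mult must fit in 64 bits, so the caller is assumed to
 * rebase its reference point often enough to keep deltas small.
 */
static inline uint64_t
ts_delta_to_ns(const struct ts_scale *s, uint64_t delta_ticks)
{
	return (delta_ticks * s->mult) >> TS_SHIFT;
}
```

The result is added to the nanosecond value of the reference point. The
rounding of the multiplier introduces a small error that grows with the
delta, which is another reason to rebase the reference point periodically.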
  
Morten Brørup Oct. 7, 2020, 6:59 a.m. UTC | #2
> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
> Sent: Tuesday, October 6, 2020 6:26 PM
> 
> On 7/24/2020 9:23 PM, Patrick Keroulas wrote:
> > The intention is to produce a pcap with nanosecond precision when
> > Rx timestamp offloading is activated on mlx5 NIC.
> >
> > The packets forwarded by testpmd hold the raw counter but a pcap
> > requires a time unit. Assuming that the NIC clock is already synced
> > with external master clock, this patchset simply integrates the
> > nanosecond converter that derives from device frequency and start
> time.
> >
> > v2 -> v3:
> >      - replace ib_verbs nanosecond converter with more generic method
> >        based on device frequency and start time.
> >
> > Patrick Keroulas (3):
> >    net/mlx5: query device frequency
> >    ethdev: add API to query device frequency
> >    pdump: convert timestamp to nanoseconds on Rx path
> >
> > Vivien Didelot (1):
> >    net/pcap: support hardware Tx timestamps
> >
> 
> We have three patch/patchset for same issue,
> 
> 1) Current one, https://patches.dpdk.org/user/todo/dpdk/?series=11294
> 2) Vivien's series:
> https://patches.dpdk.org/user/todo/dpdk/?series=10678
> 3) Vivien's pcap patch:
> https://patches.dpdk.org/user/todo/dpdk/?series=10403
> 
> And one related one from Slava,
> 4) https://patchwork.dpdk.org/project/dpdk/list/?series=10948&state=%2A&archive=both
> 
> I am replying to this one since this is the latest, but first can we
> clarify if
> all are still valid and can we combine the effort?
> 
> 
> Second, the problems to solve,
> 1) Device provided timestamp value has no unit, it needs to be
> converted to
> nanosecond.
> There are different approaches in above patches,
> - One adds '.convert_ts_to_ns' dev_ops to make PMD convert the
> timestamp
> - Other adds '.eth_get_clock_freq' dev_ops, to get frequency from
> device clock
>    so that conversion can be done within the app.
> - I wonder why existing 'rte_eth_read_clock()' can't be enough for
> conversion,
>    as described in its documentation:
>    https://doc.dpdk.org/api/rte__ethdev_8h.html#a4346bf07a0d302c9ba4fe06baffd3196
>      rte_eth_read_clock(port, start);
>      rte_delay_ms(100);
>      rte_eth_read_clock(port, end);
>      double freq = (end - start) * 10;
> 
> 2) Where to put the timestamps data, is it to the 'mbuf->timestamp' or
> dynamic
>     filed 'RTE_MBUF_DYNFIELD_TIMESTAMP_NAME'? Using dynamic field of
> requires
>     more work on registering and looking up the fields.
> 
> 3) Calculation in the datapath should be done in a performance
> optimized way,
> instead of using division.
> 
> 4) Should the timestamp value provided by the Rx device used, or should
> the time
> when the packet transmitted used. I can see current use case seems
> first one,
> but can there be cases we would like to record when packet exactly
> sent.
> 
> 5) Should we create a 'PKT_TX_TIMESTAMP' flag, instead of re-using the
> Rx one,
> to let the application explicitly define which packets to record
> timestamp.
> 
> Please add if I missing more.

Regarding RX timestamps:

I believe that it is on the roadmap to remove the timestamp field from the mbuf and make it a dynamic field instead.

Furthermore, I understand that some Mellanox NICs have accurate "wall clock" timestamping capability, possibly even PTP synchronized. So I propose to make two variants of the dynamic timestamp field, one with an accurate "wall clock" timestamp for NICs with that capability, and one with an NIC-specific unitless timestamp (like the current timestamp).

Regarding TX timestamps:

Do any NICs currently have a TX pipeline that puts a TX timestamp in the descriptor on transmission to the wire instead of just marking the descriptor as free? If not, then there is probably no need for TX timestamps at that level of accuracy for captured packets. The application can set the timestamp in the captured/mirrored packets while cloning the originals and passing them on to the NIC driver.

And if the dynamic mbuf field that instructs the NIC to transmit the packet at a specific time is present, the application might even use that timestamp, knowing that the NIC will transmit the packet at that time.


Med venlig hilsen / kind regards
- Morten Brørup