diff mbox series

[1/2] mbuf: introduce accurate packet Tx scheduling

Message ID 1593617787-3252-1-git-send-email-viacheslavo@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers show
Series [1/2] mbuf: introduce accurate packet Tx scheduling | expand

Checks

Context Check Description
ci/Intel-compilation success Compilation OK
ci/iol-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/checkpatch success coding style OK

Commit Message

Slava Ovsiienko July 1, 2020, 3:36 p.m. UTC
There is the requirement on some networks for precise traffic timing
management. The ability to send (and, generally speaking, receive)
the packets at the very precisely specified moment of time provides
the opportunity to support the connections with Time Division
Multiplexing using the contemporary general purpose NIC without involving
an auxiliary hardware. For example, the supporting of O-RAN Fronthaul
interface is one of the promising features for potentially usage of the
precise time management for the egress packets.

The main objective of this RFC is to specify the way how applications
can provide the moment of time at what the packet transmission must be
started and to describe in preliminary the supporting this feature from
mlx5 PMD side.

The new dynamic timestamp field is proposed, it provides some timing
information, the units and time references (initial phase) are not
explicitly defined but are maintained always the same for a given port.
Some devices allow to query rte_eth_read_clock() that will return
the current device timestamp. The dynamic timestamp flag tells whether
the field contains actual timestamp value. For the packets being sent
this value can be used by PMD to schedule packet sending.

After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
and obsoleting, these dynamic flag and field will be used to manage
the timestamps on receiving datapath as well.

When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
it tries to synchronize the time of packet appearing on the wire with
the specified packet timestamp. It the specified one is in the past it
should be ignored, if one is in the distant future it should be capped
with some reasonable value (in range of seconds). These specific cases
("too late" and "distant future") can be optionally reported via
device xstats to assist applications to detect the time-related
problems.

There is no any packet reordering according timestamps is supposed,
neither within packet burst, nor between packets, it is an entirely
application responsibility to generate packets and its timestamps
in desired order. The timestamps can be put only in the first packet
in the burst providing the entire burst scheduling.

PMD reports the ability to synchronize packet sending on timestamp
with new offload flag:

This is palliative and is going to be replaced with new eth_dev API
about reporting/managing the supported dynamic flags and its related
features. This API would break ABI compatibility and can't be introduced
at the moment, so is postponed to 20.11.

For testing purposes it is proposed to update testpmd "txonly"
forwarding mode routine. With this update testpmd application generates
the packets and sets the dynamic timestamps according to specified time
pattern if it sees the "rte_dynfield_timestamp" is registered.

The new testpmd command is proposed to configure sending pattern:

set tx_times <burst_gap>,<intra_gap>

<intra_gap> - the delay between the packets within the burst
              specified in the device clock units. The number
              of packets in the burst is defined by txburst parameter

<burst_gap> - the delay between the bursts in the device clock units

As the result the bursts of packet will be transmitted with specific
delays between the packets within the burst and specific delay between
the bursts. The rte_eth_get_clock is supposed to be engaged to get the
current device clock value and provide the reference for the timestamps.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 lib/librte_ethdev/rte_ethdev.c |  1 +
 lib/librte_ethdev/rte_ethdev.h |  4 ++++
 lib/librte_mbuf/rte_mbuf_dyn.h | 16 ++++++++++++++++
 3 files changed, 21 insertions(+)

Comments

Olivier Matz July 7, 2020, 11:50 a.m. UTC | #1
Hi Slava,

Few question/comments below.

On Wed, Jul 01, 2020 at 03:36:26PM +0000, Viacheslav Ovsiienko wrote:
> There is the requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive)
> the packets at the very precisely specified moment of time provides
> the opportunity to support the connections with Time Division
> Multiplexing using the contemporary general purpose NIC without involving
> an auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> interface is one of the promising features for potentially usage of the
> precise time management for the egress packets.
> 
> The main objective of this RFC is to specify the way how applications
> can provide the moment of time at what the packet transmission must be
> started and to describe in preliminary the supporting this feature from
> mlx5 PMD side.
> 
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not
> explicitly defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return
> the current device timestamp. The dynamic timestamp flag tells whether
> the field contains actual timestamp value. For the packets being sent
> this value can be used by PMD to schedule packet sending.
> 
> After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and obsoleting, these dynamic flag and field will be used to manage
> the timestamps on receiving datapath as well.

Do you mean the same flag will be used for both Rx and Tx?  I wonder if
it's a good idea: if you enable the timestamp on Rx, the packets will be
flagged and it will impact Tx, except if the application explicitly
resets the flag in all mbufs. Wouldn't it be safer to have an Rx flag
and a Tx flag?

> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
> it tries to synchronize the time of packet appearing on the wire with
> the specified packet timestamp. It the specified one is in the past it
> should be ignored, if one is in the distant future it should be capped
> with some reasonable value (in range of seconds). These specific cases
> ("too late" and "distant future") can be optionally reported via
> device xstats to assist applications to detect the time-related
> problems.

I think what to do with packets to be send in the "past" could be
configurable through an ethdev API in the future (drop or send).

> There is no any packet reordering according timestamps is supposed,
> neither within packet burst, nor between packets, it is an entirely
> application responsibility to generate packets and its timestamps
> in desired order. The timestamps can be put only in the first packet
> in the burst providing the entire burst scheduling.

This constraint makes sense. At first glance, it looks it is imposed by
a PMD or hw limitation, but thinking more about it, I think it is the
correct behavior to have. Packets are ordered within a PMD queue, and
the ability to set the timestamp for one packet to delay the subsequent
ones looks useful.

Should this behavior be documented somewhere? Maybe in the API comment
documenting the dynamic flag?

> PMD reports the ability to synchronize packet sending on timestamp
> with new offload flag:
> 
> This is palliative and is going to be replaced with new eth_dev API
> about reporting/managing the supported dynamic flags and its related
> features. This API would break ABI compatibility and can't be introduced
> at the moment, so is postponed to 20.11.
> 
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
> 
> The new testpmd command is proposed to configure sending pattern:
> 
> set tx_times <burst_gap>,<intra_gap>
> 
> <intra_gap> - the delay between the packets within the burst
>               specified in the device clock units. The number
>               of packets in the burst is defined by txburst parameter
> 
> <burst_gap> - the delay between the bursts in the device clock units
> 
> As the result the bursts of packet will be transmitted with specific
> delays between the packets within the burst and specific delay between
> the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> current device clock value and provide the reference for the timestamps.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  lib/librte_ethdev/rte_ethdev.c |  1 +
>  lib/librte_ethdev/rte_ethdev.h |  4 ++++
>  lib/librte_mbuf/rte_mbuf_dyn.h | 16 ++++++++++++++++
>  3 files changed, 21 insertions(+)
> 
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 8e10a6f..02157d5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
>  	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
>  	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
>  	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> +	RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
>  };
>  
>  #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a49242b..6f6454c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
>  /** Device supports outer UDP checksum */
>  #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
>  
> +/** Device supports send on timestamp */
> +#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> +
> +
>  #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
>  /**< Device supports Rx queue setup after device started*/
>  #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 96c3631..fb5477c 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -250,4 +250,20 @@ int rte_mbuf_dynflag_lookup(const char *name,
>  #define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
>  #define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
>  
> +/*
> + * The timestamp dynamic field provides some timing information, the
> + * units and time references (initial phase) are not explicitly defined
> + * but are maintained always the same for a given port. Some devices allow
> + * to query rte_eth_read_clock() that will return the current device
> + * timestamp. The dynamic timestamp flag tells whether the field contains
> + * actual timestamp value. For the packets being sent this value can be
> + * used by PMD to schedule packet sending.
> + *
> + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> + * and obsoleting, these dynamic flag and field will be used to manage
> + * the timestamps on receiving datapath as well.
> + */
> +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
> +#define RTE_MBUF_DYNFLAG_TIMESTAMP_NAME "rte_dynflag_timestamp"
> +

I realize that's not the case for rte_flow_dynfield_metadata, but
I think it would be good to have a doxygen-like comment.



Regards,
Olivier
Slava Ovsiienko July 7, 2020, 12:46 p.m. UTC | #2
Hi, Olivier

Thanks a lot for the review.

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, July 7, 2020 14:51
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; bernard.iremonger@intel.com;
> thomas@mellanox.net
> Subject: Re: [PATCH 1/2] mbuf: introduce accurate packet Tx scheduling
> 
> Hi Slava,
> 
> Few question/comments below.
> 
> On Wed, Jul 01, 2020 at 03:36:26PM +0000, Viacheslav Ovsiienko wrote:
> > There is the requirement on some networks for precise traffic timing
> > management. The ability to send (and, generally speaking, receive) the
> > packets at the very precisely specified moment of time provides the
> > opportunity to support the connections with Time Division Multiplexing
> > using the contemporary general purpose NIC without involving an
> > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > interface is one of the promising features for potentially usage of
> > the precise time management for the egress packets.
> >
> > The main objective of this RFC is to specify the way how applications
> > can provide the moment of time at what the packet transmission must be
> > started and to describe in preliminary the supporting this feature
> > from
> > mlx5 PMD side.
> >
> > The new dynamic timestamp field is proposed, it provides some timing
> > information, the units and time references (initial phase) are not
> > explicitly defined but are maintained always the same for a given port.
> > Some devices allow to query rte_eth_read_clock() that will return the
> > current device timestamp. The dynamic timestamp flag tells whether the
> > field contains actual timestamp value. For the packets being sent this
> > value can be used by PMD to schedule packet sending.
> >
> > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation and
> > obsoleting, these dynamic flag and field will be used to manage the
> > timestamps on receiving datapath as well.
> 
> Do you mean the same flag will be used for both Rx and Tx?  I wonder if it's a
> good idea: if you enable the timestamp on Rx, the packets will be flagged
> and it will impact Tx, except if the application explicitly resets the flag in all
> mbufs. Wouldn't it be safer to have an Rx flag and a Tx flag?

A little bit difficult to say ambiguously, I thought about and did not make strong decision.
We have the flag sharing for the Rx/Tx metadata and just follow the same approach.
OK, I will promote ones to:
RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME
RTE_MBUF_DYNFIELD_TX_TIMESTAMP_NAME

And, possible, we should reconsider metadata dynamic flags.
> 
> > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > sent it tries to synchronize the time of packet appearing on the wire
> > with the specified packet timestamp. It the specified one is in the
> > past it should be ignored, if one is in the distant future it should
> > be capped with some reasonable value (in range of seconds). These
> > specific cases ("too late" and "distant future") can be optionally
> > reported via device xstats to assist applications to detect the
> > time-related problems.
> 
> I think what to do with packets to be send in the "past" could be configurable
> through an ethdev API in the future (drop or send).
Yes, currently there is no complete understanding how to handle packets out of the time slot.
In 20.11 we are going to introduce the time-based rte flow API to manage out-of-window packets.
 
> 
> > There is no any packet reordering according timestamps is supposed,
> > neither within packet burst, nor between packets, it is an entirely
> > application responsibility to generate packets and its timestamps in
> > desired order. The timestamps can be put only in the first packet in
> > the burst providing the entire burst scheduling.

"can" should be replaced with "might". Current mlx5 implementation
checks each packet in the burst for the timestamp presence.

> 
> This constraint makes sense. At first glance, it looks it is imposed by a PMD or
> hw limitation, but thinking more about it, I think it is the correct behavior to
> have. Packets are ordered within a PMD queue, and the ability to set the
> timestamp for one packet to delay the subsequent ones looks useful.
> 
> Should this behavior be documented somewhere? Maybe in the API
> comment documenting the dynamic flag?

It is documented in mlx5.rst (coming soon), do you think it should be
more common? OK, will update.

> 
> > PMD reports the ability to synchronize packet sending on timestamp
> > with new offload flag:
> >
> > This is palliative and is going to be replaced with new eth_dev API
> > about reporting/managing the supported dynamic flags and its related
> > features. This API would break ABI compatibility and can't be
> > introduced at the moment, so is postponed to 20.11.
> >
> > For testing purposes it is proposed to update testpmd "txonly"
> > forwarding mode routine. With this update testpmd application
> > generates the packets and sets the dynamic timestamps according to
> > specified time pattern if it sees the "rte_dynfield_timestamp" is registered.
> >
> > The new testpmd command is proposed to configure sending pattern:
> >
> > set tx_times <burst_gap>,<intra_gap>
> >
> > <intra_gap> - the delay between the packets within the burst
> >               specified in the device clock units. The number
> >               of packets in the burst is defined by txburst parameter
> >
> > <burst_gap> - the delay between the bursts in the device clock units
> >
> > As the result the bursts of packet will be transmitted with specific
> > delays between the packets within the burst and specific delay between
> > the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> > current device clock value and provide the reference for the timestamps.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > ---
> >  lib/librte_ethdev/rte_ethdev.c |  1 +  lib/librte_ethdev/rte_ethdev.h
> > |  4 ++++  lib/librte_mbuf/rte_mbuf_dyn.h | 16 ++++++++++++++++
> >  3 files changed, 21 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > b/lib/librte_ethdev/rte_ethdev.c index 8e10a6f..02157d5 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> >  	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> >  	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> >  	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> > +	RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> >  };
> >
> >  #undef RTE_TX_OFFLOAD_BIT2STR
> > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > b/lib/librte_ethdev/rte_ethdev.h index a49242b..6f6454c 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> >  /** Device supports outer UDP checksum */  #define
> > DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
> >
> > +/** Device supports send on timestamp */ #define
> > +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> > +
> > +
> >  #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> /**<
> > Device supports Rx queue setup after device started*/  #define
> > RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002 diff --git
> > a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > index 96c3631..fb5477c 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -250,4 +250,20 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > #define RTE_MBUF_DYNFIELD_METADATA_NAME
> "rte_flow_dynfield_metadata"
> >  #define RTE_MBUF_DYNFLAG_METADATA_NAME
> "rte_flow_dynflag_metadata"
> >
> > +/*
> > + * The timestamp dynamic field provides some timing information, the
> > + * units and time references (initial phase) are not explicitly
> > +defined
> > + * but are maintained always the same for a given port. Some devices
> > +allow
> > + * to query rte_eth_read_clock() that will return the current device
> > + * timestamp. The dynamic timestamp flag tells whether the field
> > +contains
> > + * actual timestamp value. For the packets being sent this value can
> > +be
> > + * used by PMD to schedule packet sending.
> > + *
> > + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> > + * and obsoleting, these dynamic flag and field will be used to
> > +manage
> > + * the timestamps on receiving datapath as well.
> > + */
> > +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> "rte_dynfield_timestamp"
> > +#define RTE_MBUF_DYNFLAG_TIMESTAMP_NAME
> "rte_dynflag_timestamp"
> > +
> 
> I realize that's not the case for rte_flow_dynfield_metadata, but I think it
> would be good to have a doxygen-like comment.
OK, will extend comment  with expected PMD behavior and replace "/*" with "/**"
> 
> 
> 
> Regards,
> Olivier

With best regards, Slava
diff mbox series

Patch

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 8e10a6f..02157d5 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -162,6 +162,7 @@  struct rte_eth_xstats_name_off {
 	RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
 	RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
+	RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
 };
 
 #undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a49242b..6f6454c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1178,6 +1178,10 @@  struct rte_eth_conf {
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
 
+/** Device supports send on timestamp */
+#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
+
+
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
 #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 96c3631..fb5477c 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -250,4 +250,20 @@  int rte_mbuf_dynflag_lookup(const char *name,
 #define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
 #define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
 
+/*
+ * The timestamp dynamic field provides some timing information, the
+ * units and time references (initial phase) are not explicitly defined
+ * but are maintained always the same for a given port. Some devices allow
+ * to query rte_eth_read_clock() that will return the current device
+ * timestamp. The dynamic timestamp flag tells whether the field contains
+ * actual timestamp value. For the packets being sent this value can be
+ * used by PMD to schedule packet sending.
+ *
+ * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
+ * and obsoleting, these dynamic flag and field will be used to manage
+ * the timestamps on receiving datapath as well.
+ */
+#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+#define RTE_MBUF_DYNFLAG_TIMESTAMP_NAME "rte_dynflag_timestamp"
+
 #endif