[1/3] mbuf: add Tx offloads for packet marking

Message ID 20200417072254.11455-1-nithind1988@gmail.com
State New
Delegated to: Jerin Jacob
Series
  • [1/3] mbuf: add Tx offloads for packet marking

Checks

Context               Check    Description
ci/Intel-compilation  fail     apply issues
ci/checkpatch         success  coding style OK

Commit Message

Nithin Kumar D April 17, 2020, 7:22 a.m. UTC
From: Nithin Dabilpuram <ndabilpuram@marvell.com>

Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
packet marking.

When the packet marking feature in the Traffic Manager is enabled,
the application has to use the three new flags to tell the PMD
whether packet marking needs to be enabled on a specific mbuf.
When the application sets these flags, the PMD assumes that the
application has already verified that marking is applicable to
that specific packet, so the PMD need not perform further checks
as per RFC.

Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
---
 doc/guides/nics/features.rst    | 14 ++++++++++++++
 lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
 lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
 3 files changed, 54 insertions(+), 2 deletions(-)
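
As an illustration of the contract described in the commit message, here is
a minimal sketch of the application side for an IPv4 packet. It is not taken
from the patch: struct toy_mbuf and request_ipv4_marking() are hypothetical
stand-ins, the PKT_TX_MARK_* values are the ones proposed in this patch, and
the PKT_TX_IP_CKSUM / PKT_TX_IPV4 values are assumed to match
rte_mbuf_core.h. The ECN check reflects the patch comment that only packets
with ECN 2'b01 or 2'b10 and a TCP or SCTP payload are expected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as proposed in this patch. */
#define PKT_TX_MARK_VLAN_DEI (1ULL << 38)
#define PKT_TX_MARK_IP_DSCP  (1ULL << 39)
#define PKT_TX_MARK_IP_ECN   (1ULL << 40)

/* Stand-ins for existing Tx flags named in the patch; the values are
 * assumed to match rte_mbuf_core.h and are only illustrative here. */
#define PKT_TX_IP_CKSUM      (1ULL << 54)
#define PKT_TX_IPV4          (1ULL << 55)

/* Hypothetical stand-in for the two struct rte_mbuf fields involved. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint16_t l2_len;
};

/*
 * Request DSCP marking, and ECN marking when applicable, on an IPv4 packet.
 * tos is the IPv4 TOS byte; its low two bits are the ECN field (RFC 3168).
 */
static void
request_ipv4_marking(struct toy_mbuf *m, uint16_t l2_len, uint8_t tos,
		     int l4_is_tcp_or_sctp)
{
	uint8_t ecn = tos & 0x3;

	assert(m != NULL);
	m->l2_len = l2_len; /* start offset of the L3 header */
	m->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_MARK_IP_DSCP;

	/* Only ECT(1) = 01 and ECT(0) = 10 packets are eligible for ECN marking. */
	if (l4_is_tcp_or_sctp && (ecn == 0x1 || ecn == 0x2))
		m->ol_flags |= PKT_TX_MARK_IP_ECN;
}
```

The point of the sketch is that every applicability check happens in the
application before the flag is set, since the PMD is expected to trust the
flag.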

Comments

Jerin Jacob May 1, 2020, 11:18 a.m. UTC | #1
On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
<nithind1988@gmail.com> wrote:
>
> From: Nithin Dabilpuram <ndabilpuram@marvell.com>
>
> Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> packet marking.
>
> When packet marking feature in Traffic manager is enabled,
> application has to the use the three new flags to indicate
> to PMD on whether packet marking needs to be enabled on the
> specific mbuf or not. By setting the three flags, it is
> assumed by PMD that application has already verified the
> applicability of marking on that specific packet and
> PMD need not perform further checks as per RFC.
>
> Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>

None of the ethdev TM driver implementations has supported packet
marking so far.
rte_tm and rte_mbuf maintainers (Christian, Olivier), could you review this patch?


> ---
>  doc/guides/nics/features.rst    | 14 ++++++++++++++
>  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
>  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
>  3 files changed, 54 insertions(+), 2 deletions(-)
>
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index edd21c4..bc978fb 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
>  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
>  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
>
> +.. _nic_features_traffic_manager_packet_marking_offload:
> +
> +Traffic Manager Packet marking offload
> +--------------------------------------
> +
> +Supports enabling a packet marking offload specific mbuf.
> +
> +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> +* **[uses]     mbuf**: ``mbuf.l2_len``.
> +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> +  ``rte_tm_mark_vlan_dei()``.
> +
>  .. _nic_features_other:
>
>  Other dev ops not represented by a Feature
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index cd5794d..5c6896d 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
>         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
>         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
>         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
>         default: return NULL;
>         }
>  }
> @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
>                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
>                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
>                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
>         };
>         const char *name;
>         unsigned int i;
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index b9a59c8..d9f1290 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -187,11 +187,40 @@ extern "C" {
>  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
>
>  #define PKT_FIRST_FREE (1ULL << 23)
> -#define PKT_LAST_FREE (1ULL << 40)
> +#define PKT_LAST_FREE (1ULL << 37)
>
>  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
>
>  /**
> + * Packet marking offload flags. These flags indicated what kind
> + * of packet marking needs to be applied on a given mbuf when
> + * appropriate Traffic Manager configuration is in place.
> + * When user set's these flags on a mbuf, below assumptions are made
> + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> + * a) PMD assumes pkt to be a 802.1q packet.
> + * b) Application should also set mbuf.l2_len where 802.1Q header is
> + *    at (mbuf.l2_len - 6) offset.
> + * 2) When PKT_TX_MARK_IP_DSCP is set,
> + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> + *    to indicate whether if it is IPv4 packet or IPv6 packet
> + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> + *    IPv4 pkt.
> + * b) Application should also set mbuf.l2_len that indicates
> + *    start offset of L3 header.
> + * 3) When PKT_TX_MARK_IP_ECN is set,
> + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> + *    can mark the packet for a configured color.
> + * c) Application should also set mbuf.l2_len that indicates
> + *    start offset of L3 header.
> + */
> +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> +
> +/**
>   * Outer UDP checksum offload flag. This flag is used for enabling
>   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
>   * 1) Enable the following in mbuf,
> @@ -384,7 +413,10 @@ extern "C" {
>                 PKT_TX_MACSEC |          \
>                 PKT_TX_SEC_OFFLOAD |     \
>                 PKT_TX_UDP_SEG |         \
> -               PKT_TX_OUTER_UDP_CKSUM)
> +               PKT_TX_OUTER_UDP_CKSUM | \
> +               PKT_TX_MARK_VLAN_DEI |   \
> +               PKT_TX_MARK_IP_DSCP |    \
> +               PKT_TX_MARK_IP_ECN)
>
>  /**
>   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> --
> 2.8.4
>
Olivier Matz May 4, 2020, 8:06 a.m. UTC | #2
Hi,

On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> <nithind1988@gmail.com> wrote:
> >
> > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> >
> > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > packet marking.
> >
> > When packet marking feature in Traffic manager is enabled,
> > application has to the use the three new flags to indicate
> > to PMD on whether packet marking needs to be enabled on the
> > specific mbuf or not. By setting the three flags, it is
> > assumed by PMD that application has already verified the
> > applicability of marking on that specific packet and
> > PMD need not perform further checks as per RFC.
> >
> > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> 
> None of the ethdev TM driver implementations has supported packet
> marking support.
> rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?

As you know, the number of mbuf flags is limited (only 18 bits are
remaining), so I think we should use them with care, i.e. for features
that are generic enough.
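
For reference, the figure of 18 is consistent with the PKT_FIRST_FREE and
PKT_LAST_FREE definitions quoted in the patch, assuming every bit in that
inclusive range is unallocated. A small sketch of the arithmetic follows;
the macro and function names are made up for illustration.

```c
#include <assert.h>

/* Bit positions of PKT_FIRST_FREE / PKT_LAST_FREE as quoted in the patch. */
#define PKT_FIRST_FREE_BIT       23
#define PKT_LAST_FREE_BIT_BEFORE 40 /* before the patch */
#define PKT_LAST_FREE_BIT_AFTER  37 /* after the patch takes bits 38..40 */

/* Number of ol_flags bit positions in the inclusive range. */
static unsigned int
free_flag_bits(unsigned int first_free, unsigned int last_free)
{
	assert(last_free >= first_free);
	return last_free - first_free + 1;
}
```

With these values the range holds 18 bits before the patch and 15 after it.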

From what I understand, this feature is bound to octeontx2, so using an
mbuf dynamic flag would make more sense here. There are some examples in
the dpdk repository, just grep for "dynflag".

Also, I think that the feature availability should be advertised through
an ethdev offload, so an application can know at initialization time
that these flags can be used.

Regards,
Olivier
Nithin Dabilpuram May 4, 2020, 8:27 a.m. UTC | #3
Hi Olivier,

On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> External Email
> 
> ----------------------------------------------------------------------
> Hi,
> 
> On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > <nithind1988@gmail.com> wrote:
> > >
> > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > >
> > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > packet marking.
> > >
> > > When packet marking feature in Traffic manager is enabled,
> > > application has to the use the three new flags to indicate
> > > to PMD on whether packet marking needs to be enabled on the
> > > specific mbuf or not. By setting the three flags, it is
> > > assumed by PMD that application has already verified the
> > > applicability of marking on that specific packet and
> > > PMD need not perform further checks as per RFC.
> > >
> > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > 
> > None of the ethdev TM driver implementations has supported packet
> > marking support.
> > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> 
> As you know, the number of mbuf flags is limited (only 18 bits are
> remaining), so I think we should use them with care, i.e. for features
> that are generic enough.

I agree, but I believe these are basic flags, like the other Tx offload
flags (PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc.), and they are needed to
identify the packets on which HW should/can apply packet marking.

> 
> From what I understand, this feature is bound to octeontx2, so using a
> mbuf dynamic flag would make more sense here. There are some examples in
> dpdk repository, just grep for "dynflag".

These are not octeontx2-specific flags; any PMD that supports the packet
marking feature would need them to identify the packets on which marking
needs to be done. This is the first PMD that supports the packet marking
feature, and hence the flags were not exposed earlier.


For example, to mark VLAN DEI, the PMD cannot always assume that there is a
preexisting VLAN header at byte 12, as there is no guarantee that the
Ethernet header always starts at byte 0 (custom headers before the
Ethernet header).
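
To make the offset argument concrete, here is a minimal software sketch of
locating the tag from l2_len. It assumes a single 802.1Q tag and an l2_len
that covers everything up to the L3 header, so the tag (2-byte TPID, 2-byte
TCI) starts 6 bytes before L3; this is one reading of the
"(mbuf.l2_len - 6)" wording in the patch comment, not something the patch
states. The function names are hypothetical, and the TPID check is only for
the illustration, since the PMD is said to perform no such check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TPID (2) + TCI (2) + inner EtherType (2) sit between the tag start and L3. */
#define VLAN_TAG_TO_L3_LEN 6
/* DEI is bit 12 of the 16-bit TCI, i.e. bit 4 of the first TCI byte. */
#define TCI_DEI_MASK_HI    0x10

/* Offset of the 802.1Q tag for a single-tagged frame. */
static size_t
vlan_tag_offset(size_t l2_len)
{
	return l2_len - VLAN_TAG_TO_L3_LEN;
}

/* Set the DEI bit in place; returns 0 on success, -1 if no tag is found. */
static int
set_vlan_dei(uint8_t *pkt, size_t pkt_len, size_t l2_len)
{
	size_t tag;

	if (l2_len < VLAN_TAG_TO_L3_LEN || l2_len > pkt_len)
		return -1;
	tag = vlan_tag_offset(l2_len);
	if (pkt[tag] != 0x81 || pkt[tag + 1] != 0x00) /* TPID 0x8100 */
		return -1;
	pkt[tag + 2] |= TCI_DEI_MASK_HI;
	return 0;
}
```

With a 4-byte custom header in front of a tagged Ethernet header, l2_len is
22 and the tag is found at byte 16 rather than byte 12.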

> 
> Also, I think that the feature availability should be advertised through
> an ethdev offload, so an application can know at initialization time
> that these flags can be used.

Feature availability is already part of the TM spec in rte_tm.h:
struct rte_tm_capabilities:mark_vlan_dei_supported
struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
struct rte_tm_capabilities:mark_ip_dscp_supported
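
A rough sketch of how an application might consult these fields before
setting a marking flag is given below. struct toy_tm_capabilities is a
hypothetical stand-in that mirrors only the fields listed above, which are
assumed here to be per-color arrays; in a real application the structure
would be filled by rte_tm_capabilities_get().

```c
#include <assert.h>

#define TOY_COLORS 3 /* green, yellow, red; stand-in for RTE_COLORS */

/* Hypothetical stand-in for the marking fields of struct rte_tm_capabilities. */
struct toy_tm_capabilities {
	int mark_vlan_dei_supported[TOY_COLORS];
	int mark_ip_ecn_tcp_supported[TOY_COLORS];
	int mark_ip_ecn_sctp_supported[TOY_COLORS];
	int mark_ip_dscp_supported[TOY_COLORS];
};

/* Return 1 if marking is supported for at least one color, else 0. */
static int
any_color_supported(const int supported[TOY_COLORS])
{
	int i;

	for (i = 0; i < TOY_COLORS; i++)
		if (supported[i])
			return 1;
	return 0;
}
```

An application could skip setting a marking flag on its mbufs whenever the
corresponding capability is not reported for any color.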

Thanks
Nithin
Olivier Matz May 4, 2020, 9:16 a.m. UTC | #4
On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> Hi Olivier,
> 
> On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > External Email
> > 
> > ----------------------------------------------------------------------
> > Hi,
> > 
> > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > <nithind1988@gmail.com> wrote:
> > > >
> > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > >
> > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > packet marking.
> > > >
> > > > When packet marking feature in Traffic manager is enabled,
> > > > application has to the use the three new flags to indicate
> > > > to PMD on whether packet marking needs to be enabled on the
> > > > specific mbuf or not. By setting the three flags, it is
> > > > assumed by PMD that application has already verified the
> > > > applicability of marking on that specific packet and
> > > > PMD need not perform further checks as per RFC.
> > > >
> > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > 
> > > None of the ethdev TM driver implementations has supported packet
> > > marking support.
> > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > 
> > As you know, the number of mbuf flags is limited (only 18 bits are
> > remaining), so I think we should use them with care, i.e. for features
> > that are generic enough.
> 
> I agree, but I believe this is one of the basic flags needed like other 
> Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which 
> are needed to identify on which packets HW should/can apply packet marking.

PKT_TX_IP_CKSUM tells the hardware to offload the checksum
calculation. This is pretty straightforward and there is no other
dependency than the offload feature advertised by the PMD.

I'm sorry, I don't have a lot of experience with rte_tm.h, so it's
difficult for me to get a global view of what is done, for instance, when
PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.

Can you confirm that my understanding below is correct? (or correct me
where I'm wrong)

Before your patch:
- the application enables the port and traffic manager on it
- the application calls rte_tm_mark_vlan_dei() to select which traffic
  class must be marked
- when a packet is transmitted, the traffic class is determined by the
  hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
  bit is set depending on traffic class

The problem is for packets that cannot be recognized by the hardware,
correct?

So your patch is a way to force the hardware to set the VLAN DEI on
packets that are not recognized as VLAN packets?

How is the traffic class of the packet determined?


> > From what I understand, this feature is bound to octeontx2, so using a
> > mbuf dynamic flag would make more sense here. There are some examples in
> > dpdk repository, just grep for "dynflag".
> 
> This is not octeontx2 specific flag but any "packet marking feature" enabled
> PMD would need these flags to identify on which packets marking needs to be 
> done. This is the first PMD that supports packet marking feature and
> hence it was not exposed earlier.
> 
> For example to mark VLAN DEI, PMD cannot always assume that there is preexisting
> VLAN header from Byte 12 as there is no gaurantee that ethernet header
> always starts at Byte 0 (Custom headers before ethernet hdr).
> 
> > 
> > Also, I think that the feature availability should be advertised through
> > an ethdev offload, so an application can know at initialization time
> > that these flags can be used.
> 
> Feature availablity is already part of TM spec in rte_tm.h 
> struct rte_tm_capabilities:mark_vlan_dei_supported
> struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> struct rte_tm_capabilities:mark_ip_dscp_supported

Does this mean that any driver advertising this existing feature flag
has to support the new mbuf flags too? Shouldn't we have a specific
feature for it?

Please also see a few comments below.

> > > > ---
> > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > index edd21c4..bc978fb 100644
> > > > --- a/doc/guides/nics/features.rst
> > > > +++ b/doc/guides/nics/features.rst
> > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > >
> > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > +
> > > > +Traffic Manager Packet marking offload
> > > > +--------------------------------------
> > > > +
> > > > +Supports enabling a packet marking offload specific mbuf.
> > > > +
> > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > +
> > > >  .. _nic_features_other:
> > > >
> > > >  Other dev ops not represented by a Feature
> > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > index cd5794d..5c6896d 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > >         default: return NULL;
> > > >         }
> > > >  }
> > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > >         };
> > > >         const char *name;
> > > >         unsigned int i;
> > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > index b9a59c8..d9f1290 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > @@ -187,11 +187,40 @@ extern "C" {
> > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > >
> > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > >
> > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > >
> > > >  /**
> > > > + * Packet marking offload flags. These flags indicated what kind
> > > > + * of packet marking needs to be applied on a given mbuf when
> > > > + * appropriate Traffic Manager configuration is in place.
> > > > + * When user set's these flags on a mbuf, below assumptions are made
> > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > + * a) PMD assumes pkt to be a 802.1q packet.

What does that imply?

> > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > + *    at (mbuf.l2_len - 6) offset.

Why mbuf.l2_len - 6?

> > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > + *    to indicate whether if it is IPv4 packet or IPv6 packet
> > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > + *    IPv4 pkt.
> > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > + *    start offset of L3 header.
> > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > + *    can mark the packet for a configured color.
> > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > + *    start offset of L3 header.
> > > > + */
> > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)

We should have one comment per define.


> > > > +
> > > > +/**
> > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > >   * 1) Enable the following in mbuf,
> > > > @@ -384,7 +413,10 @@ extern "C" {
> > > >                 PKT_TX_MACSEC |          \
> > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > >                 PKT_TX_UDP_SEG |         \
> > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > +               PKT_TX_MARK_IP_ECN)
> > > >
> > > >  /**
> > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > --
> > > > 2.8.4
> > > >
Nithin Dabilpuram May 4, 2020, 10:04 a.m. UTC | #5
On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > Hi Olivier,
> > 
> > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > Hi,
> > > 
> > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > <nithind1988@gmail.com> wrote:
> > > > >
> > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > >
> > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > packet marking.
> > > > >
> > > > > When packet marking feature in Traffic manager is enabled,
> > > > > application has to the use the three new flags to indicate
> > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > specific mbuf or not. By setting the three flags, it is
> > > > > assumed by PMD that application has already verified the
> > > > > applicability of marking on that specific packet and
> > > > > PMD need not perform further checks as per RFC.
> > > > >
> > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > 
> > > > None of the ethdev TM driver implementations has supported packet
> > > > marking support.
> > > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > > 
> > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > remaining), so I think we should use them with care, i.e. for features
> > > that are generic enough.
> > 
> > I agree, but I believe this is one of the basic flags needed like other 
> > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which 
> > are needed to identify on which packets HW should/can apply packet marking.
> 
> PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> calculation. This is pretty straightforward and there is no other
> dependency than the offload feature advertised by the PMD.
> 
> I'm sorry, I have not a lot of experience with rte_tm.h, so it's
> difficult for me to have a global view of what is done for instance when
> PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> 
> Can you confirm that my understanding below is correct? (or correct me
> where I'm wrong)
> 
> Before your patch:
> - the application enables the port and traffic manager on it
> - the application calls rte_tm_mark_vlan_dei() to select which traffic
>   class must be marked
> - when a packet is transmitted, the traffic class is determined by the
>   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
>   bit is set depending on traffic class
> 
> The problem is for packets that cannot be recognized by the hardware,
> correct?

Yes. For performance reasons, Octeontx2 HW always relies on application
knowledge instead of walking through all the layers of packet data on Tx to
identify the packet type and where the L2, L3 and L4 headers start.

I believe other hardware has the same expectation, which is why flags
such as PKT_TX_IPV4 and PKT_TX_IPV6 are needed.

Hence we want to use the mbuf:tx_offload field and the PKT_TX_* flags
to identify the packet and to know its L2, L3 and L4 offsets.
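
As an illustration only (a sketch, not code from this series), this is
roughly how an application could describe a packet for DSCP marking and
sanity-check the preconditions listed in the patch's doc comment. The
struct pkt_meta below is a hypothetical stand-in for the relevant rte_mbuf
fields, and both helper names are invented for this example. The
PKT_TX_MARK_* values are the ones proposed in this patch; the
PKT_TX_IP_CKSUM, PKT_TX_IPV4 and PKT_TX_IPV6 bit positions are assumed to
match rte_mbuf_core.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions proposed by this patch (not an accepted DPDK API). */
#define PKT_TX_MARK_VLAN_DEI (1ULL << 38)
#define PKT_TX_MARK_IP_DSCP  (1ULL << 39)
#define PKT_TX_MARK_IP_ECN   (1ULL << 40)
/* Assumed to match the existing definitions in rte_mbuf_core.h. */
#define PKT_TX_IP_CKSUM      (1ULL << 54)
#define PKT_TX_IPV4          (1ULL << 55)
#define PKT_TX_IPV6          (1ULL << 56)

/* Hypothetical stand-in for the rte_mbuf fields used here. */
struct pkt_meta {
	uint64_t ol_flags;
	uint16_t l2_len; /* number of bytes before the L3 header */
};

/* Application side: request DSCP marking and describe the packet. */
static inline void
request_dscp_mark(struct pkt_meta *m, bool is_ipv4, uint16_t l2_len)
{
	m->l2_len = l2_len;
	m->ol_flags |= PKT_TX_MARK_IP_DSCP;
	if (is_ipv4)
		m->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM;
	else
		m->ol_flags |= PKT_TX_IPV6;
}

/* Debug-time check of the preconditions from the patch's doc comment. */
static inline bool
mark_flags_consistent(const struct pkt_meta *m)
{
	const uint64_t ip_marks = PKT_TX_MARK_IP_DSCP | PKT_TX_MARK_IP_ECN;
	bool v4 = (m->ol_flags & PKT_TX_IPV4) != 0;
	bool v6 = (m->ol_flags & PKT_TX_IPV6) != 0;

	if (m->ol_flags & ip_marks) {
		/* Exactly one of IPv4/IPv6 must be indicated. */
		if (v4 == v6)
			return false;
		/* IPv4 needs the header checksum recomputed after marking. */
		if (v4 && !(m->ol_flags & PKT_TX_IP_CKSUM))
			return false;
	}
	if (m->ol_flags & PKT_TX_MARK_VLAN_DEI) {
		/* 12B MAC addresses + 4B VLAN tag + 2B EtherType. */
		if (m->l2_len < 18)
			return false;
	}
	return true;
}
```

A check like mark_flags_consistent() would only make sense in a debug
build; the point of the proposal is that the fast path trusts the flags.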

> 
> So your patch is a way to force the hardware to recognize mark set the
> VLAN DEI on packets that are not recognized as VLAN packets?
> 
> How the is traffic class of the packet determined?

The packet is colored based on the Single Rate[1] or Dual Rate[2] shaping
result, and the packet color determines the traffic class. The exact mapping
of packet color to traffic class is described in the TM spec, based on a
few other RFCs.

[1] https://tools.ietf.org/html/rfc2697
[2] https://tools.ietf.org/html/rfc2698
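
For readers unfamiliar with [1], here is a minimal software sketch of the
single rate three color marker in color-blind mode, as I read RFC 2697.
It is not how the Octeontx2 HW implements shaping; the struct and
function names are made up for this example, and the caller is assumed
to convert elapsed time into a token count (CIR * elapsed time).

```c
#include <stdint.h>

enum pkt_color { COLOR_GREEN, COLOR_YELLOW, COLOR_RED };

struct srtcm {
	uint64_t cbs; /* committed burst size, bytes */
	uint64_t ebs; /* excess burst size, bytes */
	uint64_t tc;  /* tokens currently in bucket C */
	uint64_t te;  /* tokens currently in bucket E */
};

/* Both buckets start full. */
static inline void
srtcm_init(struct srtcm *m, uint64_t cbs, uint64_t ebs)
{
	m->cbs = cbs;
	m->ebs = ebs;
	m->tc = cbs;
	m->te = ebs;
}

/* Add new tokens: bucket C fills first, any overflow goes to bucket E. */
static inline void
srtcm_refill(struct srtcm *m, uint64_t tokens)
{
	uint64_t room_c = m->cbs - m->tc;
	uint64_t to_c = tokens < room_c ? tokens : room_c;
	uint64_t room_e;

	m->tc += to_c;
	tokens -= to_c;
	room_e = m->ebs - m->te;
	m->te += tokens < room_e ? tokens : room_e;
}

/* Color-blind metering of one packet of 'len' bytes. */
static inline enum pkt_color
srtcm_color(struct srtcm *m, uint64_t len)
{
	if (m->tc >= len) {
		m->tc -= len;
		return COLOR_GREEN;
	}
	if (m->te >= len) {
		m->te -= len;
		return COLOR_YELLOW;
	}
	return COLOR_RED;
}
```

With CBS = 1000 and EBS = 500, packets of 800, 400 and 300 bytes sent
back to back come out green, yellow and red respectively.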

> 
> 
> > > From what I understand, this feature is bound to octeontx2, so using a
> > > mbuf dynamic flag would make more sense here. There are some examples in
> > > dpdk repository, just grep for "dynflag".
> > 
> > This is not octeontx2 specific flag but any "packet marking feature" enabled
> > PMD would need these flags to identify on which packets marking needs to be 
> > done. This is the first PMD that supports packet marking feature and
> > hence it was not exposed earlier.
> > 
> > For example to mark VLAN DEI, PMD cannot always assume that there is preexisting
> > VLAN header from Byte 12 as there is no gaurantee that ethernet header
> > always starts at Byte 0 (Custom headers before ethernet hdr).
> > 
> > > 
> > > Also, I think that the feature availability should be advertised through
> > > an ethdev offload, so an application can know at initialization time
> > > that these flags can be used.
> > 
> > Feature availablity is already part of TM spec in rte_tm.h 
> > struct rte_tm_capabilities:mark_vlan_dei_supported
> > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > struct rte_tm_capabilities:mark_ip_dscp_supported
> 
> Does this mean that any driver advertising this existing feature flag
> has to support the new mbuf flags too? Shouldn't we have a specific
> feature for it?

Yes, I thought PMDs need to support both.
I'm fine with adding a specific feature flag for the offload flags alone
if you insist, or if there are other PMDs that don't need the offload flags
for packet marking. I was not able to find out about other PMDs, as
none of the existing PMDs support packet marking.

> 
> Please also see few comments below.
> 
> > > > > ---
> > > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > > index edd21c4..bc978fb 100644
> > > > > --- a/doc/guides/nics/features.rst
> > > > > +++ b/doc/guides/nics/features.rst
> > > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > > >
> > > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > > +
> > > > > +Traffic Manager Packet marking offload
> > > > > +--------------------------------------
> > > > > +
> > > > > +Supports enabling a packet marking offload specific mbuf.
> > > > > +
> > > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > > +
> > > > >  .. _nic_features_other:
> > > > >
> > > > >  Other dev ops not represented by a Feature
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > > index cd5794d..5c6896d 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > > >         default: return NULL;
> > > > >         }
> > > > >  }
> > > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > > >         };
> > > > >         const char *name;
> > > > >         unsigned int i;
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > index b9a59c8..d9f1290 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > @@ -187,11 +187,40 @@ extern "C" {
> > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > >
> > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > > >
> > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > >
> > > > >  /**
> > > > > + * Packet marking offload flags. These flags indicated what kind
> > > > > + * of packet marking needs to be applied on a given mbuf when
> > > > > + * appropriate Traffic Manager configuration is in place.
> > > > > + * When user set's these flags on a mbuf, below assumptions are made
> > > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > > + * a) PMD assumes pkt to be a 802.1q packet.
> 
> What does that imply?

I meant that by setting the flag, the application indicates that the packet
has a VLAN header adhering to the IEEE 802.1Q spec.

> 
> > > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > > + *    at (mbuf.l2_len - 6) offset.
> 
> Why mbuf.l2_len - 6 ?
When a VLAN header is present, the L2 header will be
{custom header 'X' Bytes}:{Ethernet SRC+DST (12B)}:{VLAN Header (4B)}:{Ether Type (2B)}
l2_len = X + 12 + 4 + 2
So, the VLAN header starts at (l2_len - 6) bytes.
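
The arithmetic above can be sketched in C as follows. This is only an
illustration of the offset calculation, with the DEI update done in
software to show which bit is affected; in the proposal it is the HW
that marks the packet. The helper names are hypothetical. The layout
assumed is the standard 802.1Q tag: 2 bytes of TPID followed by 2 bytes
of TCI (PCP: 3 bits, DEI: 1 bit, VID: 12 bits).

```c
#include <stdint.h>

#define ETHER_ADDRS_LEN 12 /* destination + source MAC */
#define VLAN_TAG_LEN     4 /* TPID (2B) + TCI (2B) */
#define ETHER_TYPE_LEN   2

/*
 * Offset of the 802.1Q tag from the start of the packet, given l2_len.
 * Returns -1 if l2_len is too short to contain a tagged Ethernet header.
 */
static inline int
vlan_tag_offset(uint16_t l2_len)
{
	if (l2_len < ETHER_ADDRS_LEN + VLAN_TAG_LEN + ETHER_TYPE_LEN)
		return -1;
	return l2_len - (VLAN_TAG_LEN + ETHER_TYPE_LEN);
}

/* Set the DEI bit in the TCI; returns 0 on success, -1 on bad l2_len. */
static inline int
set_vlan_dei(uint8_t *pkt, uint16_t l2_len)
{
	int off = vlan_tag_offset(l2_len);

	if (off < 0)
		return -1;
	/* First TCI byte is PCP(3) | DEI(1) | VID[11:8], so DEI is 0x10. */
	pkt[off + 2] |= 0x10;
	return 0;
}
```

For a plain tagged frame (X = 0), l2_len is 18 and the tag is at offset
12; with an 8-byte custom header, l2_len is 26 and the tag is at offset 20.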

> 
> > > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > > + *    to indicate whether if it is IPv4 packet or IPv6 packet
> > > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > > + *    IPv4 pkt.
> > > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > > + *    start offset of L3 header.
> > > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > > + *    can mark the packet for a configured color.
> > > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > > + *    start offset of L3 header.
> > > > > + */
> > > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> 
> We should have one comment per define.
Ack, will fix in V2.

> 
> 
> > > > > +
> > > > > +/**
> > > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > > >   * 1) Enable the following in mbuf,
> > > > > @@ -384,7 +413,10 @@ extern "C" {
> > > > >                 PKT_TX_MACSEC |          \
> > > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > > >                 PKT_TX_UDP_SEG |         \
> > > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > > +               PKT_TX_MARK_IP_ECN)
> > > > >
> > > > >  /**
> > > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > > --
> > > > > 2.8.4
> > > > >
Olivier Matz May 4, 2020, 12:27 p.m. UTC | #6
On Mon, May 04, 2020 at 03:34:57PM +0530, Nithin Dabilpuram wrote:
> On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> > On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > > Hi Olivier,
> > > 
> > > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > > Hi,
> > > > 
> > > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > > <nithind1988@gmail.com> wrote:
> > > > > >
> > > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > >
> > > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > > packet marking.
> > > > > >
> > > > > > When packet marking feature in Traffic manager is enabled,
> > > > > > application has to the use the three new flags to indicate
> > > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > > specific mbuf or not. By setting the three flags, it is
> > > > > > assumed by PMD that application has already verified the
> > > > > > applicability of marking on that specific packet and
> > > > > > PMD need not perform further checks as per RFC.
> > > > > >
> > > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > 
> > > > > None of the ethdev TM driver implementations has supported packet
> > > > > marking support.
> > > > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > > > 
> > > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > > remaining), so I think we should use them with care, i.e. for features
> > > > that are generic enough.
> > > 
> > > I agree, but I believe this is one of the basic flags needed like other 
> > > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which 
> > > are needed to identify on which packets HW should/can apply packet marking.
> > 
> > PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> > calculation. This is pretty straightforward and there is no other
> > dependency than the offload feature advertised by the PMD.
> > 
> > I'm sorry, I have not a lot of experience with rte_tm.h, so it's
> > difficult for me to have a global view of what is done for instance when
> > PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> > 
> > Can you confirm that my understanding below is correct? (or correct me
> > where I'm wrong)
> > 
> > Before your patch:
> > - the application enables the port and traffic manager on it
> > - the application calls rte_tm_mark_vlan_dei() to select which traffic
> >   class must be marked
> > - when a packet is transmitted, the traffic class is determined by the
> >   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
> >   bit is set depending on traffic class
> > 
> > The problem is for packets that cannot be recognized by the hardware,
> > correct?
> 
> Yes. Octeontx2 HW always depends on application knowledge instead of walking 
> through all the layers of packet data in Tx to identify what packet it is 
> and where the l2, l3, l4 headers start for performance reasons. 
> 
> I believe there are other hardware too that have the same expectation
> and hence we have a need for PKT_TX_IPv4, PKT_TX_IPv6 kind of flags.
> 
> Hence we want to make use of mbuf:tx_offload field and PKT_TX_* flags 
> for identifying the packet and knowing what are its l2,l3,l4 offsets.

The objective is to give an indication to the hardware that the packet has:
- an 802.1q header at offset X for PKT_TX_MARK_VLAN_DEI
- an IP/IPv6 header at offset X for PKT_TX_MARK_IP_DSCP
- an IP/IPv6 header at offset X and a TCP/SCTP header at offset Y for
  PKT_TX_MARK_IP_ECN

Just to be sure I'm getting the point, would it also work with flags
like these:

- an 802.1q header at offset X for PKT_TX_HAS_VLAN
- an IP/IPv6 header at offset X for PKT_TX_IPV4 or PKT_TX_IPV6
- a TCP/SCTP header at offset Y for PKT_TX_TCP/PKT_TX_SCTP (implies
  PKT_TX_IPV4 or PKT_TX_IPV6)

The underlying question is: do we need the flags only to describe the
content of the packet, or do the flags also indicate that an action has to
be done?

> > So your patch is a way to force the hardware to recognize mark set the
> > VLAN DEI on packets that are not recognized as VLAN packets?
> > 
> > How the is traffic class of the packet determined?
> 
> Packet is coloured based on Single Rate[1] or Dual Rate[2] Shaping result
> and packet color determines traffic class. The exact behavior of 
> packet color to traffic class mapping is mentioned in TM spec based on
> few other RFC's.
> 
> [1] https://tools.ietf.org/html/rfc2697
> [2] https://tools.ietf.org/html/rfc2698

OK, so the traffic class does not depend on the packet type?


> > > > From what I understand, this feature is bound to octeontx2, so using a
> > > > mbuf dynamic flag would make more sense here. There are some examples in
> > > > dpdk repository, just grep for "dynflag".
> > > 
> > > This is not octeontx2 specific flag but any "packet marking feature" enabled
> > > PMD would need these flags to identify on which packets marking needs to be 
> > > done. This is the first PMD that supports packet marking feature and
> > > hence it was not exposed earlier.
> > > 
> > > For example to mark VLAN DEI, PMD cannot always assume that there is preexisting
> > > VLAN header from Byte 12 as there is no gaurantee that ethernet header
> > > always starts at Byte 0 (Custom headers before ethernet hdr).
> > > 
> > > > 
> > > > Also, I think that the feature availability should be advertised through
> > > > an ethdev offload, so an application can know at initialization time
> > > > that these flags can be used.
> > > 
> > > Feature availablity is already part of TM spec in rte_tm.h 
> > > struct rte_tm_capabilities:mark_vlan_dei_supported
> > > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > > struct rte_tm_capabilities:mark_ip_dscp_supported
> > 
> > Does this mean that any driver advertising this existing feature flag
> > has to support the new mbuf flags too? Shouldn't we have a specific
> > feature for it?
> 
> Yes, I thought PMD's need to support both.
> I'm fine adding specific feature flag for the offload flags alone
> if you insist or if there are other PMD's which don't need the offload flags
> for packet marking. I was not able to find out about other PMD's as
> none of the existing PMD's support packet marking.

Do you suggest that the behavior of the traffic manager marking should
be:

a- the hardware tries to recognize tx packets, and marks them
   accordingly. Which packets are recognized depends on the hardware.
b- if the mbuf has a specific flag, it helps the PMD and hardware to
   recognize packets, so it can mark packets.

For an application, a- is difficult to reason about, as it depends
on the hardware.

Or do you suggest that packets should only be marked if there is a mbuf
flag? (only b-)

Can you confirm that there is no support at all for this feature today?
I mean, what was the usage of rte_tm_mark_vlan_dei() these last 3 years?

Thanks,
Olivier


Nithin Dabilpuram May 5, 2020, 6:19 a.m. UTC | #7
On Mon, May 04, 2020 at 02:27:35PM +0200, Olivier Matz wrote:
> On Mon, May 04, 2020 at 03:34:57PM +0530, Nithin Dabilpuram wrote:
> > On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> > > On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > > > Hi Olivier,
> > > > 
> > > > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > > > Hi,
> > > > > 
> > > > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > > > <nithind1988@gmail.com> wrote:
> > > > > > >
> > > > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > >
> > > > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > > > packet marking.
> > > > > > >
> > > > > > > When packet marking feature in Traffic manager is enabled,
> > > > > > > application has to the use the three new flags to indicate
> > > > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > > > specific mbuf or not. By setting the three flags, it is
> > > > > > > assumed by PMD that application has already verified the
> > > > > > > applicability of marking on that specific packet and
> > > > > > > PMD need not perform further checks as per RFC.
> > > > > > >
> > > > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > 
> > > > > > None of the ethdev TM driver implementations has supported packet
> > > > > > marking support.
> > > > > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > > > > 
> > > > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > > > remaining), so I think we should use them with care, i.e. for features
> > > > > that are generic enough.
> > > > 
> > > > I agree, but I believe this is one of the basic flags needed like other 
> > > > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which 
> > > > are needed to identify on which packets HW should/can apply packet marking.
> > > 
> > > PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> > > calculation. This is pretty straightforward and there is no other
> > > dependency than the offload feature advertised by the PMD.
> > > 
> > > I'm sorry, I have not a lot of experience with rte_tm.h, so it's
> > > difficult for me to have a global view of what is done for instance when
> > > PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> > > 
> > > Can you confirm that my understanding below is correct? (or correct me
> > > where I'm wrong)
> > > 
> > > Before your patch:
> > > - the application enables the port and traffic manager on it
> > > - the application calls rte_tm_mark_vlan_dei() to select which traffic
> > >   class must be marked
> > > - when a packet is transmitted, the traffic class is determined by the
> > >   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
> > >   bit is set depending on traffic class
> > > 
> > > The problem is for packets that cannot be recognized by the hardware,
> > > correct?
> > 
> > Yes. Octeontx2 HW always depends on application knowledge instead of walking 
> > through all the layers of packet data in Tx to identify what packet it is 
> > and where the l2, l3, l4 headers start for performance reasons. 
> > 
> > I believe there are other hardware too that have the same expectation
> > and hence we have a need for PKT_TX_IPv4, PKT_TX_IPv6 kind of flags.
> > 
> > Hence we want to make use of mbuf:tx_offload field and PKT_TX_* flags 
> > for identifying the packet and knowing what are its l2,l3,l4 offsets.
> 
> The objective is to give an indication to the hardware that the packet has:
> - an 802.1q header at offset X for PKT_TX_MARK_VLAN_DEI
> - an IP/IPv6 header at offset X for PKT_TX_MARK_IP_DSCP
> - an IP/IPv6 header at offset X and a TCP/SCTP header at offset Y for
>   PKT_TX_MARK_IP_ECN
> 
> Just to be sure I'm getting the point, would it also work with flags
> like this:
> 
> - an 802.1q header at offset X for PKT_TX_HAS_VLAN
> - an IP/IPv6 header at offset X for PKT_TX_IPv4 or PKT_TX_IPv6
> - a TCP/SCTP header at offset Y for PKT_TX_TCP/PKT_TX_SCTP (implies
>   PKT_TX_IPv4 or PKT_TX_IPv6)
> 
> The underlying question is: do we need the flags only to describe the
> content of the packet, or does a flag also indicate that an action has to
> be done?

If we don't have a specific action-based flag, then in the future it might
collide with other functionality and we will not be able to select that
specific offload. All the existing features, like TSO and CSUM, have
specific flags.

RFC-wise, even when marking is enabled and the packet is coloured, not all
packets can be marked.
For example, when IP DSCP marking (RFC 2597) is enabled, marking is defined
only for the 12 code points below, out of 64 code points (6 bits of DSCP).

                  Class 1    Class 2    Class 3    Class 4    
                 +----------+----------+----------+----------+
Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
                 +----------+----------+----------+----------+

All other DSCP values can be used for other purposes, and hence packets
with those values shouldn't be marked.
The case is similar for IP ECN marking for TCP/SCTP (RFC 3168).

Having the PMD or HW check whether the packet falls in the said class before
marking will impact performance. Since the application actually fills those
values in the packet, it is easier for the application to indicate this.
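
To illustrate the kind of check the application would do before setting
PKT_TX_MARK_IP_DSCP, here is a minimal sketch in C. The helper name
is_af_dscp() is hypothetical and not part of DPDK; it only encodes the table
above, assuming the usual AF layout where the upper 3 bits are the class
(1 to 4), the next 2 bits are the drop precedence (1 to 3) and the last bit
is 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a DPDK API: returns true if the 6-bit DSCP
 * value is one of the 12 Assured Forwarding code points of RFC 2597.
 */
static inline bool
is_af_dscp(uint8_t dscp)
{
	uint8_t af_class = dscp >> 3;         /* upper 3 bits: AF class */
	uint8_t drop_prec = (dscp >> 1) & 0x3; /* next 2 bits: drop precedence */

	/* DSCP is only 6 bits wide and AF code points have bit 0 clear. */
	if (dscp > 0x3f || (dscp & 0x1))
		return false;

	return af_class >= 1 && af_class <= 4 &&
	       drop_prec >= 1 && drop_prec <= 3;
}
```

An application could call such a helper once when it builds the IP header and
set the marking flag only when it returns true, so that neither the PMD nor
the HW has to repeat the check on the Tx path.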

> 
> > > So your patch is a way to force the hardware to set the
> > > VLAN DEI on packets that are not recognized as VLAN packets?
> > > 
> > > How is the traffic class of the packet determined?
> > 
> > The packet is coloured based on the Single Rate[1] or Dual Rate[2] shaping
> > result, and the packet color determines the traffic class. The exact
> > packet color to traffic class mapping is described in the TM spec, based on
> > a few other RFCs.
> > 
> > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__tools.ietf.org_html_rfc2697&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=FZ_tPCbgFOh18zwRPO9H0yDx8VW38vuapifdDfc8SFQ&m=pJDciSXpMy6TawycjvpYj_Jq5M5j_ywqhU8-keRI_ac&s=05emGNkz3Qat3dtZIbEsmQDC5y9-tU9yItHX0x1aaJU&e= 
> > [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__tools.ietf.org_html_rfc2698&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=FZ_tPCbgFOh18zwRPO9H0yDx8VW38vuapifdDfc8SFQ&m=pJDciSXpMy6TawycjvpYj_Jq5M5j_ywqhU8-keRI_ac&s=3VN2dIGSDt4vWM-FpPOOf-8SeVShl_t7QpXRU6Zw460&e= 
> 
> OK, so the traffic class does not depend on the packet type?
Correct, it doesn't. But where the marking is applied is specific to the packet
type: the DEI bit in VLAN, or the ECN or DSCP field in IPv4/IPv6.
Also, ECN marking is only valid for TCP/SCTP packets.
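
To make the field locations concrete, here is a rough sketch in C of where
each marking lands. All names below are hypothetical and not taken from DPDK
or the PMD; the helpers work on a host-order 802.1Q TCI value and on the
8-bit IPv4 TOS / IPv6 traffic class value, and the DSCP helper only shows the
field write, not the exact remarking policy of the traffic manager.

```c
#include <stdint.h>

#define SK_VLAN_TCI_DEI  0x1000 /* bit 12 of the 802.1Q TCI (host order) */
#define SK_IP_ECN_MASK   0x03   /* low 2 bits of TOS / traffic class */
#define SK_IP_ECN_CE     0x03   /* Congestion Experienced (RFC 3168) */
#define SK_IP_DSCP_SHIFT 2      /* DSCP is the upper 6 bits */

/* VLAN DEI marking: set the DEI bit, keep PCP and VLAN ID. */
static inline uint16_t
sk_mark_vlan_dei(uint16_t tci)
{
	return tci | SK_VLAN_TCI_DEI;
}

/* ECN marking: only ECT(1) = 01 and ECT(0) = 10 may be changed to CE = 11. */
static inline uint8_t
sk_mark_ip_ecn(uint8_t tos)
{
	uint8_t ecn = tos & SK_IP_ECN_MASK;

	if (ecn == 0x01 || ecn == 0x02)
		return tos | SK_IP_ECN_CE;
	return tos;
}

/* DSCP marking: replace the 6-bit DSCP, keep the 2 ECN bits. */
static inline uint8_t
sk_mark_ip_dscp(uint8_t tos, uint8_t dscp)
{
	return (uint8_t)(((dscp & 0x3f) << SK_IP_DSCP_SHIFT) |
			 (tos & SK_IP_ECN_MASK));
}
```

Note that for IPv4 any change to the TOS byte also requires the header
checksum to be updated, which is why the patch asks for PKT_TX_IP_CKSUM along
with the DSCP and ECN flags.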

> 
> 
> > > > > From what I understand, this feature is bound to octeontx2, so using a
> > > > > mbuf dynamic flag would make more sense here. There are some examples in
> > > > > dpdk repository, just grep for "dynflag".
> > > > 
> > > > This is not an octeontx2-specific flag; any PMD with the "packet marking
> > > > feature" enabled would need these flags to identify on which packets
> > > > marking needs to be done. This is the first PMD that supports the packet
> > > > marking feature, and hence it was not exposed earlier.
> > > > 
> > > > For example, to mark VLAN DEI, the PMD cannot always assume that there is a
> > > > preexisting VLAN header at Byte 12, as there is no guarantee that the
> > > > Ethernet header always starts at Byte 0 (custom headers before the Ethernet hdr).
> > > > 
> > > > > 
> > > > > Also, I think that the feature availability should be advertised through
> > > > > an ethdev offload, so an application can know at initialization time
> > > > > that these flags can be used.
> > > > 
> > > > Feature availability is already part of the TM spec in rte_tm.h
> > > > struct rte_tm_capabilities:mark_vlan_dei_supported
> > > > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > > > struct rte_tm_capabilities:mark_ip_dscp_supported
> > > 
> > > Does this mean that any driver advertising this existing feature flag
> > > has to support the new mbuf flags too? Shouldn't we have a specific
> > > feature for it?
> > 
> > Yes, I thought PMDs need to support both.
> > I'm fine with adding a specific feature flag for the offload flags alone
> > if you insist, or if there are other PMDs which don't need the offload flags
> > for packet marking. I was not able to find out about other PMDs, as
> > none of the existing PMDs support packet marking.
> 
> Do you suggest that the behavior of the traffic manager marking should
> be:
> 
> a- the hardware tries to recognize tx packets, and mark them
>    accordingly. What packets are recognized depends on hardware.
> b- if the mbuf has a specific flag, it helps the PMD and hardware to
>    recognize packets, so it can mark packets.
> 
> For an application, a- is difficult to apprehend as it will be dependent
> on hardware.
> 
> Or do you suggest that packets should only be marked if there is a mbuf
> flag? (only b-)
Yes, I believe b- is the right thing.
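
As a rough sketch of what "only b-" would mean on the Tx path: the PMD looks
at nothing but ol_flags and l2_len and never parses the packet. The three
flag values are the ones proposed in this patch; the two structs and the
function are hypothetical stand-ins, not the real rte_mbuf or any real
descriptor format.

```c
#include <stdint.h>

/* Flag values as proposed in this patch. */
#define PKT_TX_MARK_VLAN_DEI (1ULL << 38)
#define PKT_TX_MARK_IP_DSCP  (1ULL << 39)
#define PKT_TX_MARK_IP_ECN   (1ULL << 40)

/* Hypothetical, reduced view of the mbuf fields that matter here. */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint16_t l2_len;
};

/* Hypothetical marking hints a PMD could pass on to the HW. */
struct sketch_tx_mark {
	uint8_t mark_dei;
	uint8_t mark_dscp;
	uint8_t mark_ecn;
	uint16_t vlan_off; /* start of the 802.1Q header */
	uint16_t l3_off;   /* start of the IPv4/IPv6 header */
};

/* No flag set means no marking; no packet parsing is done. */
static inline struct sketch_tx_mark
sketch_tx_mark_prepare(const struct sketch_mbuf *m)
{
	struct sketch_tx_mark d = {0};

	if (m->ol_flags & PKT_TX_MARK_VLAN_DEI) {
		d.mark_dei = 1;
		d.vlan_off = (uint16_t)(m->l2_len - 6);
	}
	if (m->ol_flags & PKT_TX_MARK_IP_DSCP) {
		d.mark_dscp = 1;
		d.l3_off = m->l2_len;
	}
	if (m->ol_flags & PKT_TX_MARK_IP_ECN) {
		d.mark_ecn = 1;
		d.l3_off = m->l2_len;
	}
	return d;
}
```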

> 
> Do you confirm that there is no support at all for this feature today?
> I mean, what was the usage of rte_tm_mark_vlan_dei() these last 3 years?

Yes, it was not implemented/used. For such reasons, rte_tm.h was
supposed to be experimental but was mistakenly marked stable.
You can see the related discussion in the threads below about marking rte_tm.h
experimental again in v20.11.
https://mails.dpdk.org/archives/dev/2020-April/164970.html
https://mails.dpdk.org/archives/dev/2020-May/166221.html

Thanks
Nithin

> 
> Thanks,
> Olivier
> 
> 
> > 
> > > 
> > > Please also see few comments below.
> > > 
> > > > > > > ---
> > > > > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > > > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > > > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > > > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > > > > index edd21c4..bc978fb 100644
> > > > > > > --- a/doc/guides/nics/features.rst
> > > > > > > +++ b/doc/guides/nics/features.rst
> > > > > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > > > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > > > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > > > > >
> > > > > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > > > > +
> > > > > > > +Traffic Manager Packet marking offload
> > > > > > > +--------------------------------------
> > > > > > > +
> > > > > > > +Supports enabling a packet marking offload specific mbuf.
> > > > > > > +
> > > > > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > > > > +
> > > > > > >  .. _nic_features_other:
> > > > > > >
> > > > > > >  Other dev ops not represented by a Feature
> > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > index cd5794d..5c6896d 100644
> > > > > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > > > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > > > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > > > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > > > > >         default: return NULL;
> > > > > > >         }
> > > > > > >  }
> > > > > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > > > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > > > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > > > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > > > > >         };
> > > > > > >         const char *name;
> > > > > > >         unsigned int i;
> > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > index b9a59c8..d9f1290 100644
> > > > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > @@ -187,11 +187,40 @@ extern "C" {
> > > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > > >
> > > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > > > > >
> > > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > >
> > > > > > >  /**
> > > > > > > + * Packet marking offload flags. These flags indicated what kind
> > > > > > > + * of packet marking needs to be applied on a given mbuf when
> > > > > > > + * appropriate Traffic Manager configuration is in place.
> > > > > > > + * When user set's these flags on a mbuf, below assumptions are made
> > > > > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > > > > + * a) PMD assumes pkt to be a 802.1q packet.
> > > 
> > > What does that imply?
> > 
> > I meant that by setting the flag, the application states that the packet
> > has a VLAN header adhering to the IEEE 802.1Q spec.
> > 
> > > 
> > > > > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > > > > + *    at (mbuf.l2_len - 6) offset.
> > > 
> > > Why mbuf.l2_len - 6 ?
> > The L2 header when a VLAN header is present will be
> > {custom header 'X' Bytes}:{Ethernet SRC+DST (12B)}:{VLAN Header (4B)}:{Ether Type (2B)}
> > l2_len = X + 12 + 4 + 2
> > So, VLAN header starts at (l2_len - 6) bytes.
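
The offset arithmetic above can be written down as a small C sketch. The
helper and macro names are hypothetical, and a single VLAN tag is assumed, as
in the layout quoted above.

```c
#include <stdint.h>

#define SK_ETH_ADDRS_LEN 12 /* Ethernet DST + SRC */
#define SK_VLAN_HDR_LEN  4  /* TPID + TCI */
#define SK_ETH_TYPE_LEN  2  /* Ether Type following the VLAN header */

/* l2_len for a frame with 'custom_hdr_len' bytes before the Ethernet hdr. */
static inline uint16_t
sk_l2_len_with_vlan(uint16_t custom_hdr_len)
{
	return (uint16_t)(custom_hdr_len + SK_ETH_ADDRS_LEN +
			  SK_VLAN_HDR_LEN + SK_ETH_TYPE_LEN);
}

/* Start of the 802.1Q header, counted from the start of the packet. */
static inline uint16_t
sk_vlan_hdr_offset(uint16_t l2_len)
{
	return (uint16_t)(l2_len - (SK_VLAN_HDR_LEN + SK_ETH_TYPE_LEN));
}
```

With no custom header this gives l2_len = 18 and a VLAN header at byte 12;
with an 8-byte custom header it gives l2_len = 26 and a VLAN header at
byte 20.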
> > 
> > > 
> > > > > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > > > > + *    to indicate whether if it is IPv4 packet or IPv6 packet
> > > > > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > > > > + *    IPv4 pkt.
> > > > > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > > > > + *    start offset of L3 header.
> > > > > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > > > > + *    can mark the packet for a configured color.
> > > > > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > > > > + *    start offset of L3 header.
> > > > > > > + */
> > > > > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> > > 
> > > We should have one comment per define.
> > Ack, will fix in V2.
> > 
> > > 
> > > 
> > > > > > > +
> > > > > > > +/**
> > > > > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > > > > >   * 1) Enable the following in mbuf,
> > > > > > > @@ -384,7 +413,10 @@ extern "C" {
> > > > > > >                 PKT_TX_MACSEC |          \
> > > > > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > > > > >                 PKT_TX_UDP_SEG |         \
> > > > > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > > > > +               PKT_TX_MARK_IP_ECN)
> > > > > > >
> > > > > > >  /**
> > > > > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > > > > --
> > > > > > > 2.8.4
> > > > > > >
Nithin Dabilpuram May 13, 2020, 12:28 p.m. UTC | #8
Hi Olivier,

Any thoughts on this?

Olivier Matz May 14, 2020, 8:29 p.m. UTC | #9
Hi Nithin,

On Tue, May 05, 2020 at 11:49:20AM +0530, Nithin Dabilpuram wrote:
> On Mon, May 04, 2020 at 02:27:35PM +0200, Olivier Matz wrote:
> > On Mon, May 04, 2020 at 03:34:57PM +0530, Nithin Dabilpuram wrote:
> > > On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> > > > On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > > > > Hi Olivier,
> > > > > 
> > > > > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > > > > <nithind1988@gmail.com> wrote:
> > > > > > > >
> > > > > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > >
> > > > > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > > > > packet marking.
> > > > > > > >
> > > > > > > > When packet marking feature in Traffic manager is enabled,
> > > > > > > > > application has to use the three new flags to indicate
> > > > > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > > > > specific mbuf or not. By setting the three flags, it is
> > > > > > > > assumed by PMD that application has already verified the
> > > > > > > > applicability of marking on that specific packet and
> > > > > > > > PMD need not perform further checks as per RFC.
> > > > > > > >
> > > > > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > 
> > > > > > > None of the ethdev TM driver implementations has supported packet
> > > > > > > marking support.
> > > > > > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > > > > > 
> > > > > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > > > > remaining), so I think we should use them with care, i.e. for features
> > > > > > that are generic enough.
> > > > > 
> > > > > I agree, but I believe this is one of the basic flags needed like other 
> > > > > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which 
> > > > > are needed to identify on which packets HW should/can apply packet marking.
> > > > 
> > > > PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> > > > calculation. This is pretty straightforward and there is no other
> > > > dependency than the offload feature advertised by the PMD.
> > > > 
> > > > I'm sorry, I have not a lot of experience with rte_tm.h, so it's
> > > > difficult for me to have a global view of what is done for instance when
> > > > PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> > > > 
> > > > Can you confirm that my understanding below is correct? (or correct me
> > > > where I'm wrong)
> > > > 
> > > > Before your patch:
> > > > - the application enables the port and traffic manager on it
> > > > - the application calls rte_tm_mark_vlan_dei() to select which traffic
> > > >   class must be marked
> > > > - when a packet is transmitted, the traffic class is determined by the
> > > >   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
> > > >   bit is set depending on traffic class
> > > > 
> > > > The problem is for packets that cannot be recognized by the hardware,
> > > > correct?
> > > 
> > > Yes. Octeontx2 HW always depends on application knowledge instead of walking 
> > > through all the layers of packet data in Tx to identify what packet it is 
> > > and where the l2, l3, l4 headers start for performance reasons. 
> > > 
> > > I believe there are other hardware too that have the same expectation
> > > and hence we have a need for PKT_TX_IPv4, PKT_TX_IPv6 kind of flags.
> > > 
> > > Hence we want to make use of mbuf:tx_offload field and PKT_TX_* flags 
> > > for identifying the packet and knowing what are its l2,l3,l4 offsets.
> > 
> > The objective is to give an indication to the hardware that the packet has:
> > - an 802.1q header at offset X for PKT_TX_MARK_VLAN_DEI
> > - an IP/IPv6 header at offset X for PKT_TX_MARK_IP_DSCP
> > - an IP/IPv6 header at offset X and a TCP/SCTP header at offset Y for
> >   PKT_TX_MARK_IP_ECN
> > 
> > Just to be sure I'm getting the point, would it also work if with flags
> > like this:
> > 
> > - an 802.1q header at offset X for PKT_TX_HAS_VLAN
> > - an IP/IPv6 header at offset X for PKT_TX_IPv4 or PKT_TX_IPv6
> > - a TCP/SCTP header at offset Y for PKT_TX_TCP/PKT_TX_SCTP (implies
> >   PKT_TX_IPv4 or PKT_TX_IPv6)
> > 
> > The underlying question is: do we need the flags to only describe the
> > content of the packet or do the flag also indicate that an action has to
> > be done?
> 
> If we don't have a specific action based flag, then in future it might collide
> with other functionality and we will not be able to choose that specific
> offload. All the existing features are having specific flags, like TSO,
> CSUM.
> 
> RFC wise, even when marking is enabled and packet is coloured, not all packets
> can be marked. 
> For example when IP DSCP marking(RFC 2597) is enabled, marking is defined
> only with below 12 code points out of 64 code points (6 bits of DSCP).
> 
>                   Class 1    Class 2    Class 3    Class 4    
>                  +----------+----------+----------+----------+
> Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
> Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
> High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
>                  +----------+----------+----------+----------+
> 
> All other combinations of DSCP value can be used for some other purposes
> and hence packets with those values shouldn't be marked.
> Similar is the case with IP ECN marking for TCP/SCTP(RFC 3168).
> 
> Having PMD or HW to check if the packet falls in the said class and then do
> marking will impact performance. Since application actually fills those values
> in packet, it will be more easy for them to say.
> 
> > 
> > > > So your patch is a way to force the hardware to set the VLAN DEI on
> > > > packets that are not recognized as VLAN packets?
> > > > 
> > > > How is the traffic class of the packet determined?
> > > 
> > > Packet is coloured based on Single Rate[1] or Dual Rate[2] Shaping result
> > > and packet color determines traffic class. The exact behavior of 
> > > packet color to traffic class mapping is mentioned in TM spec based on
> > > few other RFC's.
> > > 
> > > [1] https://tools.ietf.org/html/rfc2697
> > > [2] https://tools.ietf.org/html/rfc2698
> > 
> > OK, so the traffic class does not depend on the packet type?
> Yes it doesn't. But where to update the traffic class is specific to packet
> type like DEI bit in VLAN or ECN field in IPv4/IPv6 or DSCP field in IPv4/IPv6.
> Also ECN marking is only valid for TCP/SCTP packets.
> 
> > 
> > 
> > > > > > From what I understand, this feature is bound to octeontx2, so using a
> > > > > > mbuf dynamic flag would make more sense here. There are some examples in
> > > > > > dpdk repository, just grep for "dynflag".
> > > > > 
> > > > > This is not octeontx2 specific flag but any "packet marking feature" enabled
> > > > > PMD would need these flags to identify on which packets marking needs to be 
> > > > > done. This is the first PMD that supports packet marking feature and
> > > > > hence it was not exposed earlier.
> > > > > 
> > > > > For example to mark VLAN DEI, PMD cannot always assume that there is preexisting
> > > > > VLAN header from Byte 12 as there is no guarantee that ethernet header
> > > > > always starts at Byte 0 (Custom headers before ethernet hdr).
> > > > > 
> > > > > > 
> > > > > > Also, I think that the feature availability should be advertised through
> > > > > > an ethdev offload, so an application can know at initialization time
> > > > > > that these flags can be used.
> > > > > 
> > > > > Feature availability is already part of TM spec in rte_tm.h 
> > > > > struct rte_tm_capabilities:mark_vlan_dei_supported
> > > > > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > > > > struct rte_tm_capabilities:mark_ip_dscp_supported
> > > > 
> > > > Does this mean that any driver advertising this existing feature flag
> > > > has to support the new mbuf flags too? Shouldn't we have a specific
> > > > feature for it?
> > > 
> > > Yes, I thought PMD's need to support both.
> > > I'm fine adding specific feature flag for the offload flags alone
> > > if you insist or if there are other PMD's which don't need the offload flags
> > > for packet marking. I was not able to find out about other PMD's as
> > > none of the existing PMD's support packet marking.
> > 
> > Do you suggest that the behavior of the traffic manager marking should
> > be:
> > 
> > a- the hardware tries to recognize tx packets, and mark them
> >    accordingly. What packets are recognized depend on hardware.
> > b- if the mbuf has a specific flag, it helps the PMD and hardware to
> >    recognize packets, so it can mark packets.
> > 
> > For an application, a- is difficult to apprehend as it will be dependent
> > on hardware.
> > 
> > Or do you suggest that packets should only be marked if there is a mbuf
> > flag? (only b-)
> Yes, I believe b- is the right thing.
> 
> > 
> > Do you confirm that there is no support at all for this feature today?
> > I mean, what was the usage of rte_tm_mark_vlan_dei() these last 3 years?
> 
> Yes, it was not implemented/used. Because of such reasons, rte_tm.h is
> supposed to be experimental but was mistakenly marked stable. 
> You can see related discussion in below threads about marking rte_tm.h 
> experimental again in v20.11.
> https://mails.dpdk.org/archives/dev/2020-April/164970.html
> https://mails.dpdk.org/archives/dev/2020-May/166221.html

Thank you for the explanations. I also think b- is a better choice.
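
For reference, the application-side check that option b- relies on for
ECN could look roughly like the sketch below. It follows the patch
comment (ECN must be 2'b01 or 2'b10, L4 must be TCP or SCTP). The helper
name and the ECN_*/L4_PROTO_* macros are hypothetical and not part of
DPDK; only the codepoint and protocol values come from RFC 3168 and the
IANA protocol numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* IANA protocol numbers for the two L4 protocols that use ECN here. */
#define L4_PROTO_TCP  6
#define L4_PROTO_SCTP 132

/* ECN codepoints: low two bits of IPv4 TOS / IPv6 Traffic Class. */
#define ECN_NOT_ECT 0x0 /* sender is not ECN capable */
#define ECN_ECT1    0x1 /* ECN capable transport, codepoint 01 */
#define ECN_ECT0    0x2 /* ECN capable transport, codepoint 10 */
#define ECN_CE      0x3 /* congestion already experienced */

/*
 * Hypothetical helper (not a DPDK API): returns true when the
 * application may request ECN marking for this packet, i.e. the L4
 * protocol is TCP or SCTP and the ECN field is ECT(0) or ECT(1).
 */
static inline bool
app_ecn_marking_applicable(uint8_t tos, uint8_t l4_proto)
{
	uint8_t ecn = tos & 0x3;

	if (l4_proto != L4_PROTO_TCP && l4_proto != L4_PROTO_SCTP)
		return false;

	return ecn == ECN_ECT0 || ecn == ECN_ECT1;
}
```

An application would run such a check once while building the packet
and set the marking flag only when it returns true, so that neither the
PMD nor the hardware has to parse the headers again.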

I don't see any better approach than having a mbuf flag. However, I'm
still not fully convinced that a dynamic flag won't do the job. Taking
3 additional flags (among the 18 remaining) for this feature also means
that we have 3 fewer flags available as dynamic flags for all
applications, even for those that will not use this feature.
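
The bit budget behind this concern can be checked with a small
standalone sketch. The bit positions (23, 40, 37) are taken from the
PKT_FIRST_FREE / PKT_LAST_FREE values in the quoted patch; the macro and
function names below are illustrative only and do not exist in DPDK.

```c
#include <stdint.h>

/* Bit positions taken from the quoted patch (names are illustrative). */
#define FIRST_FREE_BIT        23 /* PKT_FIRST_FREE = 1ULL << 23 */
#define LAST_FREE_BIT_BEFORE  40 /* PKT_LAST_FREE before the patch */
#define LAST_FREE_BIT_AFTER   37 /* PKT_LAST_FREE after the patch */

/* Number of flag bits in the inclusive range [first, last]. */
static inline unsigned int
free_flag_bits(unsigned int first, unsigned int last)
{
	return last - first + 1;
}

/* Mask with every bit of the inclusive range [first, last] set. */
static inline uint64_t
free_flag_mask(unsigned int first, unsigned int last)
{
	uint64_t mask = 0;
	unsigned int bit;

	for (bit = first; bit <= last; bit++)
		mask |= 1ULL << bit;

	return mask;
}
```

With these numbers, free_flag_bits(23, 40) is 18 and
free_flag_bits(23, 37) is 15: bits 38, 39 and 40 leave the pool that
dynamic flags can be allocated from.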

Would it be a problem to use a dynamic flag in this case?

Thanks,
Olivier


> 
> Thanks
> Nithin
> 
> > 
> > Thanks,
> > Olivier
> > 
> > 
> > > 
> > > > 
> > > > Please also see few comments below.
> > > > 
> > > > > > > > ---
> > > > > > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > > > > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > > > > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > > > > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > > > > > index edd21c4..bc978fb 100644
> > > > > > > > --- a/doc/guides/nics/features.rst
> > > > > > > > +++ b/doc/guides/nics/features.rst
> > > > > > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > > > > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > > > > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > > > > > >
> > > > > > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > > > > > +
> > > > > > > > +Traffic Manager Packet marking offload
> > > > > > > > +--------------------------------------
> > > > > > > > +
> > > > > > > > +Supports enabling a packet marking offload specific mbuf.
> > > > > > > > +
> > > > > > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > > > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > > > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > > > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > > > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > > > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > > > > > +
> > > > > > > >  .. _nic_features_other:
> > > > > > > >
> > > > > > > >  Other dev ops not represented by a Feature
> > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > index cd5794d..5c6896d 100644
> > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > > > > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > > > > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > > > > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > > > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > > > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > > > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > > > > > >         default: return NULL;
> > > > > > > >         }
> > > > > > > >  }
> > > > > > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > > > > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > > > > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > > > > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > > > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > > > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > > > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > > > > > >         };
> > > > > > > >         const char *name;
> > > > > > > >         unsigned int i;
> > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > index b9a59c8..d9f1290 100644
> > > > > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > @@ -187,11 +187,40 @@ extern "C" {
> > > > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > > > >
> > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > > > > > >
> > > > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > > >
> > > > > > > >  /**
> > > > > > > > + * Packet marking offload flags. These flags indicated what kind
> > > > > > > > + * of packet marking needs to be applied on a given mbuf when
> > > > > > > > + * appropriate Traffic Manager configuration is in place.
> > > > > > > > + * When user set's these flags on a mbuf, below assumptions are made
> > > > > > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > > > > > + * a) PMD assumes pkt to be a 802.1q packet.
> > > > 
> > > > What does that imply?
> > > 
> > > I meant by setting the flag, a packet has VLAN header adhering to IEEE 802.1Q spec.
> > > 
> > > > 
> > > > > > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > > > > > + *    at (mbuf.l2_len - 6) offset.
> > > > 
> > > > Why mbuf.l2_len - 6 ?
> > > L2 header when VLAN header is preset will be 
> > > {custom header 'X' Bytes}:{Ethernet SRC+DST (12B)}:{VLAN Header (4B)}:{Ether Type (2B)}
> > > l2_len = X + 12 + 4 + 2
> > > So, VLAN header starts at (l2_len - 6) bytes.
> > > 
> > > > 
> > > > > > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > > > > > + *    to indicate whether if it is IPv4 packet or IPv6 packet
> > > > > > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > > > > > + *    IPv4 pkt.
> > > > > > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > > > > > + *    start offset of L3 header.
> > > > > > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > > > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > > > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > > > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > > > > > + *    can mark the packet for a configured color.
> > > > > > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > > > > > + *    start offset of L3 header.
> > > > > > > > + */
> > > > > > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > > > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > > > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> > > > 
> > > > We should have one comment per define.
> > > Ack, will fix in V2.
> > > 
> > > > 
> > > > 
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > > > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > > > > > >   * 1) Enable the following in mbuf,
> > > > > > > > @@ -384,7 +413,10 @@ extern "C" {
> > > > > > > >                 PKT_TX_MACSEC |          \
> > > > > > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > > > > > >                 PKT_TX_UDP_SEG |         \
> > > > > > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > > > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > > > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > > > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > > > > > +               PKT_TX_MARK_IP_ECN)
> > > > > > > >
> > > > > > > >  /**
> > > > > > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > > > > > --
> > > > > > > > 2.8.4
> > > > > > > >
Nithin Dabilpuram May 15, 2020, 10:08 a.m. UTC | #10
On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> Hi Nithin,
> 
> On Tue, May 05, 2020 at 11:49:20AM +0530, Nithin Dabilpuram wrote:
> > On Mon, May 04, 2020 at 02:27:35PM +0200, Olivier Matz wrote:
> > > On Mon, May 04, 2020 at 03:34:57PM +0530, Nithin Dabilpuram wrote:
> > > > On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> > > > > On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > > > > > Hi Olivier,
> > > > > > 
> > > > > > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > > > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > > > > > <nithind1988@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > >
> > > > > > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > > > > > packet marking.
> > > > > > > > >
> > > > > > > > > When packet marking feature in Traffic manager is enabled,
> > > > > > > > > > application has to use the three new flags to indicate
> > > > > > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > > > > > specific mbuf or not. By setting the three flags, it is
> > > > > > > > > assumed by PMD that application has already verified the
> > > > > > > > > applicability of marking on that specific packet and
> > > > > > > > > PMD need not perform further checks as per RFC.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > 
> > > > > > > > None of the ethdev TM driver implementations has supported packet
> > > > > > > > marking support.
> > > > > > > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > > > > > > 
> > > > > > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > > > > > remaining), so I think we should use them with care, i.e. for features
> > > > > > > that are generic enough.
> > > > > > 
> > > > > > I agree, but I believe this is one of the basic flags needed like other 
> > > > > > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which 
> > > > > > are needed to identify on which packets HW should/can apply packet marking.
> > > > > 
> > > > > PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> > > > > calculation. This is pretty straightforward and there is no other
> > > > > dependency than the offload feature advertised by the PMD.
> > > > > 
> > > > > I'm sorry, I have not a lot of experience with rte_tm.h, so it's
> > > > > difficult for me to have a global view of what is done for instance when
> > > > > PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> > > > > 
> > > > > Can you confirm that my understanding below is correct? (or correct me
> > > > > where I'm wrong)
> > > > > 
> > > > > Before your patch:
> > > > > - the application enables the port and traffic manager on it
> > > > > - the application calls rte_tm_mark_vlan_dei() to select which traffic
> > > > >   class must be marked
> > > > > - when a packet is transmitted, the traffic class is determined by the
> > > > >   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
> > > > >   bit is set depending on traffic class
> > > > > 
> > > > > The problem is for packets that cannot be recognized by the hardware,
> > > > > correct?
> > > > 
> > > > Yes. Octeontx2 HW always depends on application knowledge instead of walking 
> > > > through all the layers of packet data in Tx to identify what packet it is 
> > > > and where the l2, l3, l4 headers start for performance reasons. 
> > > > 
> > > > I believe there are other hardware too that have the same expectation
> > > > and hence we have a need for PKT_TX_IPv4, PKT_TX_IPv6 kind of flags.
> > > > 
> > > > Hence we want to make use of mbuf:tx_offload field and PKT_TX_* flags 
> > > > for identifying the packet and knowing what are its l2,l3,l4 offsets.
> > > 
> > > The objective is to give an indication to the hardware that the packet has:
> > > - an 802.1q header at offset X for PKT_TX_MARK_VLAN_DEI
> > > - an IP/IPv6 header at offset X for PKT_TX_MARK_IP_DSCP
> > > - an IP/IPv6 header at offset X and a TCP/SCTP header at offset Y for
> > >   PKT_TX_MARK_IP_ECN
> > > 
> > > Just to be sure I'm getting the point, would it also work if with flags
> > > like this:
> > > 
> > > - an 802.1q header at offset X for PKT_TX_HAS_VLAN
> > > - an IP/IPv6 header at offset X for PKT_TX_IPv4 or PKT_TX_IPv6
> > > - a TCP/SCTP header at offset Y for PKT_TX_TCP/PKT_TX_SCTP (implies
> > >   PKT_TX_IPv4 or PKT_TX_IPv6)
> > > 
> > > The underlying question is: do we need the flags to only describe the
> > > content of the packet or do the flag also indicate that an action has to
> > > be done?
> > 
> > If we don't have a specific action based flag, then in future it might collide
> > with other functionality and we will not be able to choose that specific
> > offload. All the existing features are having specific flags, like TSO,
> > CSUM.
> > 
> > RFC wise, even when marking is enabled and packet is coloured, not all packets
> > can be marked. 
> > For example when IP DSCP marking(RFC 2597) is enabled, marking is defined
> > only with below 12 code points out of 64 code points (6 bits of DSCP).
> > 
> >                   Class 1    Class 2    Class 3    Class 4    
> >                  +----------+----------+----------+----------+
> > Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
> > Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
> > High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
> >                  +----------+----------+----------+----------+
> > 
> > All other combinations of DSCP value can be used for some other purposes
> > and hence packets with those values shouldn't be marked.
> > Similar is the case with IP ECN marking for TCP/SCTP(RFC 3168).
> > 
> > Having PMD or HW to check if the packet falls in the said class and then do
> > marking will impact performance. Since application actually fills those values
> > in packet, it will be more easy for them to say.
> > 
> > > 
> > > > > So your patch is a way to force the hardware to set the VLAN DEI on
> > > > > packets that are not recognized as VLAN packets?
> > > > > 
> > > > > How is the traffic class of the packet determined?
> > > > 
> > > > Packet is coloured based on Single Rate[1] or Dual Rate[2] Shaping result
> > > > and packet color determines traffic class. The exact behavior of 
> > > > packet color to traffic class mapping is mentioned in TM spec based on
> > > > few other RFC's.
> > > > 
> > > > [1] https://tools.ietf.org/html/rfc2697
> > > > [2] https://tools.ietf.org/html/rfc2698
> > > 
> > > OK, so the traffic class does not depend on the packet type?
> > Yes it doesn't. But where to update the traffic class is specific to packet
> > type like DEI bit in VLAN or ECN field in IPv4/IPv6 or DSCP field in IPv4/IPv6.
> > Also ECN marking is only valid for TCP/SCTP packets.
> > 
> > > 
> > > 
> > > > > > > From what I understand, this feature is bound to octeontx2, so using a
> > > > > > > mbuf dynamic flag would make more sense here. There are some examples in
> > > > > > > dpdk repository, just grep for "dynflag".
> > > > > > 
> > > > > > This is not octeontx2 specific flag but any "packet marking feature" enabled
> > > > > > PMD would need these flags to identify on which packets marking needs to be 
> > > > > > done. This is the first PMD that supports packet marking feature and
> > > > > > hence it was not exposed earlier.
> > > > > > 
> > > > > > For example to mark VLAN DEI, PMD cannot always assume that there is preexisting
> > > > > > VLAN header from Byte 12 as there is no guarantee that ethernet header
> > > > > > always starts at Byte 0 (Custom headers before ethernet hdr).
> > > > > > 
> > > > > > > 
> > > > > > > Also, I think that the feature availability should be advertised through
> > > > > > > an ethdev offload, so an application can know at initialization time
> > > > > > > that these flags can be used.
> > > > > > 
> > > > > > Feature availability is already part of TM spec in rte_tm.h 
> > > > > > struct rte_tm_capabilities:mark_vlan_dei_supported
> > > > > > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > > > > > struct rte_tm_capabilities:mark_ip_dscp_supported
> > > > > 
> > > > > Does this mean that any driver advertising this existing feature flag
> > > > > has to support the new mbuf flags too? Shouldn't we have a specific
> > > > > feature for it?
> > > > 
> > > > Yes, I thought PMD's need to support both.
> > > > I'm fine adding specific feature flag for the offload flags alone
> > > > if you insist or if there are other PMD's which don't need the offload flags
> > > > for packet marking. I was not able to find out about other PMD's as
> > > > none of the existing PMD's support packet marking.
> > > 
> > > Do you suggest that the behavior of the traffic manager marking should
> > > be:
> > > 
> > > a- the hardware tries to recognize tx packets, and mark them
> > >    accordingly. What packets are recognized depend on hardware.
> > > b- if the mbuf has a specific flag, it helps the PMD and hardware to
> > >    recognize packets, so it can mark packets.
> > > 
> > > For an application, a- is difficult to apprehend as it will be dependent
> > > on hardware.
> > > 
> > > Or do you suggest that packets should only be marked if there is a mbuf
> > > flag? (only b-)
> > Yes, I believe b- is the right thing.
> > 
> > > 
> > > Do you confirm that there is no support at all for this feature today?
> > > I mean, what was the usage of rte_tm_mark_vlan_dei() these last 3 years?
> > 
> > Yes, it was not implemented/used. Because of such reasons, rte_tm.h is
> > supposed to be experimental but was mistakenly marked stable. 
> > You can see related discussion in below threads about marking rte_tm.h 
> > experimental again in v20.11.
> > https://mails.dpdk.org/archives/dev/2020-April/164970.html
> > https://mails.dpdk.org/archives/dev/2020-May/166221.html
> 
> Thank you for the explanations. I also think b- is a better choice.
> 
> I don't see any better approach than having a mbuf flag. However, I'm
> still not fully convinced that a dynamic flag won't do the job. Taking
> 3 additional flags (among 18 remaining) for this feature also means that
> we have 3 flags less for dynamic flags for all applications, even for
> applications that will not use this feature.
> 
> Would it be a problem to use a dynamic flag in this case?
Since the packet marking feature itself is already part of the TM spec,
moving the flags to a PMD-specific dynamic flag would create confusion.

This is not a custom feature supported by one specific PMD.
I believe the same flags will suffice when other PMDs implement packet
marking.
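
To illustrate why the decision is left to the application regardless of
which PMD is used, here is a minimal sketch of the RFC 2597 check
mentioned earlier in the thread: only the 12 Assured Forwarding code
points (classes 1 to 4, drop precedence low/medium/high) are eligible
for DSCP marking. The function names are hypothetical and not part of
DPDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* DSCP is the upper six bits of the IPv4 TOS / IPv6 Traffic Class byte. */
static inline uint8_t
dscp_from_tos(uint8_t tos)
{
	return tos >> 2;
}

/*
 * Hypothetical helper (not a DPDK API): true when the 6-bit DSCP value
 * is one of the 12 AF code points from RFC 2597, laid out as
 * cccdd0 with class ccc in 1..4 and drop precedence dd in 1..3.
 */
static inline bool
app_dscp_is_af(uint8_t dscp)
{
	uint8_t cls = (dscp >> 3) & 0x7;
	uint8_t drop = (dscp >> 1) & 0x3;

	/* Reject values wider than 6 bits and code points with bit 0 set. */
	if (dscp > 0x3f || (dscp & 0x1))
		return false;

	return cls >= 1 && cls <= 4 && drop >= 1 && drop <= 3;
}
```

For example, 001010 (AF11) and 100110 (AF43) pass the check, while
101110 (EF) and 000000 (default) do not, so an application would not
request DSCP marking for them.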
> 
> Thanks,
> Olivier
> 
> 
> > 
> > Thanks
> > Nithin
> > 
> > > 
> > > Thanks,
> > > Olivier
> > > 
> > > 
> > > > 
> > > > > 
> > > > > Please also see few comments below.
> > > > > 
> > > > > > > > > ---
> > > > > > > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > > > > > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > > > > > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > > > > > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > > > > > > index edd21c4..bc978fb 100644
> > > > > > > > > --- a/doc/guides/nics/features.rst
> > > > > > > > > +++ b/doc/guides/nics/features.rst
> > > > > > > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > > > > > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > > > > > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > > > > > > >
> > > > > > > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > > > > > > +
> > > > > > > > > +Traffic Manager Packet marking offload
> > > > > > > > > +--------------------------------------
> > > > > > > > > +
> > > > > > > > > +Supports enabling a packet marking offload specific mbuf.
> > > > > > > > > +
> > > > > > > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > > > > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > > > > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > > > > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > > > > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > > > > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > > > > > > +
> > > > > > > > >  .. _nic_features_other:
> > > > > > > > >
> > > > > > > > >  Other dev ops not represented by a Feature
> > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > index cd5794d..5c6896d 100644
> > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > > > > > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > > > > > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > > > > > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > > > > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > > > > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > > > > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > > > > > > >         default: return NULL;
> > > > > > > > >         }
> > > > > > > > >  }
> > > > > > > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > > > > > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > > > > > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > > > > > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > > > > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > > > > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > > > > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > > > > > > >         };
> > > > > > > > >         const char *name;
> > > > > > > > >         unsigned int i;
> > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > index b9a59c8..d9f1290 100644
> > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > @@ -187,11 +187,40 @@ extern "C" {
> > > > > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > > > > >
> > > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > > > > > > >
> > > > > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > > > >
> > > > > > > > >  /**
> > > > > > > > > + * Packet marking offload flags. These flags indicated what kind
> > > > > > > > > + * of packet marking needs to be applied on a given mbuf when
> > > > > > > > > + * appropriate Traffic Manager configuration is in place.
> > > > > > > > > + * When user set's these flags on a mbuf, below assumptions are made
> > > > > > > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > > > > > > + * a) PMD assumes pkt to be a 802.1q packet.
> > > > > 
> > > > > What does that imply?
> > > > 
> > > > I meant by setting the flag, a packet has VLAN header adhering to IEEE 802.1Q spec.
> > > > 
> > > > > 
> > > > > > > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > > > > > > + *    at (mbuf.l2_len - 6) offset.
> > > > > 
> > > > > Why mbuf.l2_len - 6 ?
> > > > L2 header when VLAN header is preset will be 
> > > > {custom header 'X' Bytes}:{Ethernet SRC+DST (12B)}:{VLAN Header (4B)}:{Ether Type (2B)}
> > > > l2_len = X + 12 + 4 + 2
> > > > So, VLAN header starts at (l2_len - 6) bytes.
> > > > 
> > > > > 
> > > > > > > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > > > > > > + *    to indicate whether if it is IPv4 packet or IPv6 packet
> > > > > > > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > > > > > > + *    IPv4 pkt.
> > > > > > > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > > > > > > + *    start offset of L3 header.
> > > > > > > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > > > > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > > > > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > > > > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > > > > > > + *    can mark the packet for a configured color.
> > > > > > > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > > > > > > + *    start offset of L3 header.
> > > > > > > > > + */
> > > > > > > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > > > > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > > > > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> > > > > 
> > > > > We should have one comment per define.
> > > > Ack, will fix in V2.
> > > > 
> > > > > 
> > > > > 
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > > > > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > > > > > > >   * 1) Enable the following in mbuf,
> > > > > > > > > @@ -384,7 +413,10 @@ extern "C" {
> > > > > > > > >                 PKT_TX_MACSEC |          \
> > > > > > > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > > > > > > >                 PKT_TX_UDP_SEG |         \
> > > > > > > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > > > > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > > > > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > > > > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > > > > > > +               PKT_TX_MARK_IP_ECN)
> > > > > > > > >
> > > > > > > > >  /**
> > > > > > > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > > > > > > --
> > > > > > > > > 2.8.4
> > > > > > > > >
Ananyev, Konstantin May 15, 2020, 10:30 a.m. UTC | #11
> On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > Hi Nithin,
> >
> > On Tue, May 05, 2020 at 11:49:20AM +0530, Nithin Dabilpuram wrote:
> > > On Mon, May 04, 2020 at 02:27:35PM +0200, Olivier Matz wrote:
> > > > On Mon, May 04, 2020 at 03:34:57PM +0530, Nithin Dabilpuram wrote:
> > > > > On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> > > > > > On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > > > > > > Hi Olivier,
> > > > > > >
> > > > > > > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > > > > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > > > > > > <nithind1988@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > > >
> > > > > > > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > > > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > > > > > > packet marking.
> > > > > > > > > >
> > > > > > > > > > When packet marking feature in Traffic manager is enabled,
> > > > > > > > > > > application has to use the three new flags to indicate
> > > > > > > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > > > > > > specific mbuf or not. By setting the three flags, it is
> > > > > > > > > > assumed by PMD that application has already verified the
> > > > > > > > > > applicability of marking on that specific packet and
> > > > > > > > > > PMD need not perform further checks as per RFC.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > > > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > >
> > > > > > > > > > None of the ethdev TM driver implementations has supported packet
> > > > > > > > > > marking so far.
> > > > > > > > > > rte_tm and rte_mbuf maintainers (Christian, Oliver), could you review this patch?
> > > > > > > >
> > > > > > > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > > > > > > remaining), so I think we should use them with care, i.e. for features
> > > > > > > > that are generic enough.
> > > > > > >
> > > > > > > I agree, but I believe this is one of the basic flags needed like other
> > > > > > > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which
> > > > > > > are needed to identify on which packets HW should/can apply packet marking.
> > > > > >
> > > > > > PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> > > > > > calculation. This is pretty straightforward and there is no other
> > > > > > dependency than the offload feature advertised by the PMD.
> > > > > >
> > > > > > > I'm sorry, I don't have a lot of experience with rte_tm.h, so it's
> > > > > > > difficult for me to have a global view of what is done, for instance, when
> > > > > > > PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> > > > > >
> > > > > > Can you confirm that my understanding below is correct? (or correct me
> > > > > > where I'm wrong)
> > > > > >
> > > > > > Before your patch:
> > > > > > - the application enables the port and traffic manager on it
> > > > > > - the application calls rte_tm_mark_vlan_dei() to select which traffic
> > > > > >   class must be marked
> > > > > > - when a packet is transmitted, the traffic class is determined by the
> > > > > >   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
> > > > > >   bit is set depending on traffic class
> > > > > >
> > > > > > The problem is for packets that cannot be recognized by the hardware,
> > > > > > correct?
> > > > >
> > > > > Yes. Octeontx2 HW always depends on application knowledge instead of walking
> > > > > through all the layers of packet data in Tx to identify what packet it is
> > > > > and where the l2, l3, l4 headers start for performance reasons.
> > > > >
> > > > > I believe there are other hardware too that have the same expectation
> > > > > and hence we have a need for PKT_TX_IPv4, PKT_TX_IPv6 kind of flags.
> > > > >
> > > > > Hence we want to make use of mbuf:tx_offload field and PKT_TX_* flags
> > > > > for identifying the packet and knowing what are its l2,l3,l4 offsets.
> > > >
> > > > The objective is to give an indication to the hardware that the packet has:
> > > > - an 802.1q header at offset X for PKT_TX_MARK_VLAN_DEI
> > > > - an IP/IPv6 header at offset X for PKT_TX_MARK_IP_DSCP
> > > > - an IP/IPv6 header at offset X and a TCP/SCTP header at offset Y for
> > > >   PKT_TX_MARK_IP_ECN
> > > >
> > > > Just to be sure I'm getting the point, would it also work if with flags
> > > > like this:
> > > >
> > > > - an 802.1q header at offset X for PKT_TX_HAS_VLAN
> > > > - an IP/IPv6 header at offset X for PKT_TX_IPv4 or PKT_TX_IPv6
> > > > - a TCP/SCTP header at offset Y for PKT_TX_TCP/PKT_TX_SCTP (implies
> > > >   PKT_TX_IPv4 or PKT_TX_IPv6)
> > > >
> > > > The underlying question is: do we need the flags to only describe the
> > > > content of the packet or do the flag also indicate that an action has to
> > > > be done?
> > >
> > > If we don't have a specific action based flag, then in future it might collide
> > > with other functionality and we will not be able to choose that specific
> > > offload. All the existing features are having specific flags, like TSO,
> > > CSUM.
> > >
> > > RFC-wise, even when marking is enabled and the packet is coloured, not all
> > > packets can be marked.
> > > For example, when IP DSCP marking (RFC 2597) is enabled, marking is defined
> > > only for the 12 code points below, out of 64 code points (6 bits of DSCP).
> > >
> > >                   Class 1    Class 2    Class 3    Class 4
> > >                  +----------+----------+----------+----------+
> > > Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
> > > Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
> > > High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
> > >                  +----------+----------+----------+----------+
> > >
> > > All other combinations of DSCP value can be used for some other purposes
> > > and hence packets with those values shouldn't be marked.
> > > Similar is the case with IP ECN marking for TCP/SCTP(RFC 3168).
> > >
> > > Having the PMD or HW check whether the packet falls in the said class and then
> > > do the marking will impact performance. Since the application actually fills
> > > those values in the packet, it is easier for the application to say.
> > >
> > > >
> > > > > > So your patch is a way to force the hardware to set the
> > > > > > VLAN DEI on packets that are not recognized as VLAN packets?
> > > > > >
> > > > > > How is the traffic class of the packet determined?
> > > > >
> > > > > Packet is coloured based on Single Rate[1] or Dual Rate[2] Shaping result
> > > > > and packet color determines traffic class. The exact behavior of
> > > > > packet color to traffic class mapping is mentioned in TM spec based on
> > > > > few other RFC's.
> > > > >
> > > > > > [1] https://tools.ietf.org/html/rfc2697
> > > > > > [2] https://tools.ietf.org/html/rfc2698
> > > >
> > > > OK, so the traffic class does not depend on the packet type?
> > > Yes it doesn't. But where to update the traffic class is specific to packet
> > > type like DEI bit in VLAN or ECN field in IPv4/IPv6 or DSCP field in IPv4/IPv6.
> > > Also ECN marking is only valid for TCP/SCTP packets.
> > >
> > > >
> > > >
> > > > > > > > From what I understand, this feature is bound to octeontx2, so using a
> > > > > > > > mbuf dynamic flag would make more sense here. There are some examples in
> > > > > > > > dpdk repository, just grep for "dynflag".
> > > > > > >
> > > > > > > This is not octeontx2 specific flag but any "packet marking feature" enabled
> > > > > > > PMD would need these flags to identify on which packets marking needs to be
> > > > > > > done. This is the first PMD that supports packet marking feature and
> > > > > > > hence it was not exposed earlier.
> > > > > > >
> > > > > > > > For example, to mark VLAN DEI, the PMD cannot always assume that there is
> > > > > > > > a preexisting VLAN header at byte 12, as there is no guarantee that the
> > > > > > > > Ethernet header always starts at byte 0 (custom headers may precede it).
> > > > > > >
> > > > > > > >
> > > > > > > > Also, I think that the feature availability should be advertised through
> > > > > > > > an ethdev offload, so an application can know at initialization time
> > > > > > > > that these flags can be used.
> > > > > > >
> > > > > > > > Feature availability is already part of the TM spec in rte_tm.h:
> > > > > > > struct rte_tm_capabilities:mark_vlan_dei_supported
> > > > > > > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > > > > > > struct rte_tm_capabilities:mark_ip_dscp_supported
> > > > > >
> > > > > > Does this mean that any driver advertising this existing feature flag
> > > > > > has to support the new mbuf flags too? Shouldn't we have a specific
> > > > > > feature for it?
> > > > >
> > > > > Yes, I thought PMD's need to support both.
> > > > > I'm fine adding specific feature flag for the offload flags alone
> > > > > if you insist or if there are other PMD's which don't need the offload flags
> > > > > for packet marking. I was not able to find out about other PMD's as
> > > > > none of the existing PMD's support packet marking.
> > > >
> > > > Do you suggest that the behavior of the traffic manager marking should
> > > > be:
> > > >
> > > > a- the hardware tries to recognize tx packets, and mark them
> > > >    accordingly. What packets are recognized depend on hardware.
> > > > b- if the mbuf has a specific flag, it helps the PMD and hardware to
> > > >    recognize packets, so it can mark packets.
> > > >
> > > > For an application, a- is difficult to apprehend as it will be dependent
> > > > on hardware.
> > > >
> > > > Or do you suggest that packets should only be marked if there is a mbuf
> > > > flag? (only b-)
> > > Yes, I believe b- is the right thing.
> > >
> > > >
> > > > Do you confirm that there is no support at all for this feature today?
> > > > I mean, what was the usage of rte_tm_mark_vlan_dei() these last 3 years?
> > >
> > > Yes, it was not implemented/used. Because of such reasons, rte_tm.h is
> > > supposed to be experimental but was mistakenly marked stable.
> > > You can see related discussion in below threads about marking rte_tm.h
> > > experimental again in v20.11.
> > > https://mails.dpdk.org/archives/dev/2020-April/164970.html
> > > https://mails.dpdk.org/archives/dev/2020-May/166221.html
> >
> > Thank you for the explanations. I also think b- is a better choice.
> >
> > I don't see any better approach than having a mbuf flag. However, I'm
> > still not fully convinced that a dynamic flag won't do the job. Taking
> > 3 additional flags (among 18 remaining) for this feature also means that
> > we have 3 flags less for dynamic flags for all applications, even for
> > applications that will not use this feature.

I also share Olivier's concern about consuming 3 bits in ol_flags for this feature.
Could it be squeezed somehow?
Let's say we reserve one flag to indicate whether this information is present, and
re-use one of the Rx-only fields (packet_type or similar) to store the additional information.
Or there might be some other approach.
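
One way to picture the single-flag idea is the standalone sketch below. It is only an illustration under stated assumptions: struct toy_mbuf stands in for the two relevant mbuf fields, and TOY_TX_MARK_INFO, its bit position, the toy_mark_kind encoding, and both helpers are hypothetical names, not part of DPDK or of the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two mbuf fields involved. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint32_t packet_type; /* Rx-only field, re-used on Tx in this sketch */
};

/* Single reserved Tx flag: "marking information is present" (assumed bit). */
#define TOY_TX_MARK_INFO (1ULL << 40)

/* Marking kinds carried in the re-used field instead of in ol_flags. */
enum toy_mark_kind {
	TOY_MARK_VLAN_DEI = 1u << 0,
	TOY_MARK_IP_DSCP  = 1u << 1,
	TOY_MARK_IP_ECN   = 1u << 2,
};

/* Application side: request one or more kinds of marking on this mbuf. */
static void toy_request_marking(struct toy_mbuf *m, uint32_t kinds)
{
	m->ol_flags |= TOY_TX_MARK_INFO;
	m->packet_type = kinds;
}

/* PMD side: the re-used field is only trusted when the flag is set. */
static int toy_wants_marking(const struct toy_mbuf *m, uint32_t kind)
{
	return (m->ol_flags & TOY_TX_MARK_INFO) && (m->packet_type & kind);
}
```

This costs one ol_flags bit instead of three, at the price of overloading a field whose Rx meaning must not be needed on the Tx path.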

> >
> > Would it be a problem to use a dynamic flag in this case?
> Since the packet marking feature itself is already part of the spec,
> moving the flags to a PMD-specific dynamic flag would create confusion.
>
> This is not a custom feature supported by one specific PMD.
> I believe that when other PMDs implement packet marking, the same flags
> will suffice.
Thomas Monjalon May 15, 2020, 1:12 p.m. UTC | #12
15/05/2020 12:08, Nithin Dabilpuram:
> On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > I don't see any better approach than having a mbuf flag. However, I'm
> > still not fully convinced that a dynamic flag won't do the job. Taking
> > 3 additional flags (among 18 remaining) for this feature also means that
> > we have 3 flags less for dynamic flags for all applications, even for
> > applications that will not use this feature.
> > 
> > Would it be a problem to use a dynamic flag in this case?
> Since the packet marking feature itself is already part of the spec,
> moving the flags to a PMD-specific dynamic flag would create confusion.
>
> This is not a custom feature supported by one specific PMD.
> I believe that when other PMDs implement packet marking, the same flags
> will suffice.

A dynamic flag is not necessarily PMD-specific.
It just avoids consuming bits when the feature is not used by the application.
We must move more existing flags and fields to be dynamic.

In general, all new flags and fields in mbuf should be dynamic.
And work must be done to move existing stuff to free more space
for more dynamic features.
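
To illustrate why a dynamic flag only consumes a bit when somebody asks for it, here is a toy, standalone model of name-based bit allocation. It is not the DPDK implementation (the real interface lives in rte_mbuf_dyn.h and is not used here); toy_dynflag_register(), the slot table, and the flag names are hypothetical, and the bit range simply mirrors PKT_FIRST_FREE and the pre-patch PKT_LAST_FREE from the diff.

```c
#include <stddef.h>
#include <string.h>

#define TOY_FIRST_FREE_BIT 23 /* mirrors PKT_FIRST_FREE (1ULL << 23) */
#define TOY_LAST_FREE_BIT  40 /* mirrors PKT_LAST_FREE (1ULL << 40) before the patch */
#define TOY_NAME_LEN       32

struct toy_dynflag_slot {
	char name[TOY_NAME_LEN];
	int used;
};

/* One slot per free ol_flags bit: 18 in total. */
static struct toy_dynflag_slot
toy_slots[TOY_LAST_FREE_BIT - TOY_FIRST_FREE_BIT + 1];

/*
 * Return the bit number bound to 'name', allocating a free bit on first
 * use. Returns -1 when no bit is left or the name does not fit.
 */
static int toy_dynflag_register(const char *name)
{
	const size_t n = sizeof(toy_slots) / sizeof(toy_slots[0]);
	size_t i;

	if (strlen(name) >= TOY_NAME_LEN)
		return -1;

	/* Same name registered twice (e.g. by app and PMD): same bit. */
	for (i = 0; i < n; i++)
		if (toy_slots[i].used && strcmp(toy_slots[i].name, name) == 0)
			return TOY_FIRST_FREE_BIT + (int)i;

	for (i = 0; i < n; i++) {
		if (!toy_slots[i].used) {
			strcpy(toy_slots[i].name, name);
			toy_slots[i].used = 1;
			return TOY_FIRST_FREE_BIT + (int)i;
		}
	}
	return -1;
}
```

An application that never registers the marking flags leaves all 18 bits available for other features, which is the point being made above.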
Nithin Dabilpuram May 15, 2020, 1:44 p.m. UTC | #13
On Fri, May 15, 2020 at 03:12:59PM +0200, Thomas Monjalon wrote:
> 15/05/2020 12:08, Nithin Dabilpuram:
> > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > I don't see any better approach than having a mbuf flag. However, I'm
> > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > 3 additional flags (among 18 remaining) for this feature also means that
> > > we have 3 flags less for dynamic flags for all applications, even for
> > > applications that will not use this feature.
> > > 
> > > Would it be a problem to use a dynamic flag in this case?
> > Since the packet marking feature itself is already part of the spec,
> > moving the flags to a PMD-specific dynamic flag would create confusion.
> >
> > This is not a custom feature supported by one specific PMD.
> > I believe that when other PMDs implement packet marking, the same flags
> > will suffice.
> 
> A dynamic flag is not necessarily PMD-specific.
> It just avoids consuming bits when the feature is not used by the application.
> We must move more existing flags and fields to be dynamic.
>
> In general, all new flags and fields in mbuf should be dynamic.
> And work must be done to move existing stuff to free more space
> for more dynamic features.

My bad, I thought dynamic flags could only be used for PMD-specific things.

There is, however, a cost to using a dynamic flag, which I think should be
avoided for offloads defined by the DPDK spec, though it's fine for
PMD-specific things.

Dynamic offload flags force the application and PMD to use a non-constant
offset or shift that is looked up at init, instead of a constant shift or
offset. This indirection costs some cycles due to extra loads in the fast path.
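
The difference can be sketched in a few lines of standalone C. This is an illustration only, with hypothetical names (TOY_STATIC_MARK_FLAG, toy_dyn_mark_mask, and the helpers); it makes no claim about the actual cycle cost on any platform.

```c
#include <stdint.h>

/* Static flag: the mask is a compile-time constant folded into the test. */
#define TOY_STATIC_MARK_FLAG (1ULL << 38)

/* Dynamic flag: the mask is only known after an init-time lookup. */
static uint64_t toy_dyn_mark_mask;

/* Called once at init with the bit number handed out by the registry. */
static void toy_dyn_init(int bitnum)
{
	toy_dyn_mark_mask = 1ULL << bitnum;
}

/* Fast path, static variant: a single AND against an immediate. */
static inline int toy_is_marked_static(uint64_t ol_flags)
{
	return (ol_flags & TOY_STATIC_MARK_FLAG) != 0;
}

/* Fast path, dynamic variant: the mask must first be loaded from memory. */
static inline int toy_is_marked_dynamic(uint64_t ol_flags)
{
	return (ol_flags & toy_dyn_mark_mask) != 0;
}
```

Both helpers give the same answer once toy_dyn_init(38) has run; the only difference is the extra load of toy_dyn_mark_mask in the dynamic variant.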


Nithin Dabilpuram May 15, 2020, 1:57 p.m. UTC | #14
On Fri, May 15, 2020 at 10:30:30AM +0000, Ananyev, Konstantin wrote:
> 
> > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > Hi Nithin,
> > >
> > > On Tue, May 05, 2020 at 11:49:20AM +0530, Nithin Dabilpuram wrote:
> > > > On Mon, May 04, 2020 at 02:27:35PM +0200, Olivier Matz wrote:
> > > > > On Mon, May 04, 2020 at 03:34:57PM +0530, Nithin Dabilpuram wrote:
> > > > > > On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> > > > > > > On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > > > > > > > Hi Olivier,
> > > > > > > >
> > > > > > > > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > > > > > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > > > > > > > <nithind1988@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > > > >
> > > > > > > > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > > > > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > > > > > > > packet marking.
> > > > > > > > > > >
> > > > > > > > > > > When packet marking feature in Traffic manager is enabled,
> > > > > > > > > > > > application has to use the three new flags to indicate
> > > > > > > > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > > > > > > > specific mbuf or not. By setting the three flags, it is
> > > > > > > > > > > assumed by PMD that application has already verified the
> > > > > > > > > > > applicability of marking on that specific packet and
> > > > > > > > > > > PMD need not perform further checks as per RFC.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > > > > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > > >
> > > > > > > > > > None of the ethdev TM driver implementations has supported packet
> > > > > > > > > > marking support.
> > > > > > > > > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > > > > > > > >
> > > > > > > > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > > > > > > > remaining), so I think we should use them with care, i.e. for features
> > > > > > > > > that are generic enough.
> > > > > > > >
> > > > > > > > I agree, but I believe this is one of the basic flags needed like other
> > > > > > > > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which
> > > > > > > > are needed to identify on which packets HW should/can apply packet marking.
> > > > > > >
> > > > > > > PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> > > > > > > calculation. This is pretty straightforward and there is no other
> > > > > > > dependency than the offload feature advertised by the PMD.
> > > > > > >
> > > > > > > I'm sorry, I have not a lot of experience with rte_tm.h, so it's
> > > > > > > difficult for me to have a global view of what is done for instance when
> > > > > > > PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> > > > > > >
> > > > > > > Can you confirm that my understanding below is correct? (or correct me
> > > > > > > where I'm wrong)
> > > > > > >
> > > > > > > Before your patch:
> > > > > > > - the application enables the port and traffic manager on it
> > > > > > > - the application calls rte_tm_mark_vlan_dei() to select which traffic
> > > > > > >   class must be marked
> > > > > > > - when a packet is transmitted, the traffic class is determined by the
> > > > > > >   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
> > > > > > >   bit is set depending on traffic class
> > > > > > >
> > > > > > > The problem is for packets that cannot be recognized by the hardware,
> > > > > > > correct?
> > > > > >
> > > > > > Yes. Octeontx2 HW always depends on application knowledge instead of walking
> > > > > > through all the layers of packet data in Tx to identify what packet it is
> > > > > > and where the l2, l3, l4 headers start for performance reasons.
> > > > > >
> > > > > > I believe there are other hardware too that have the same expectation
> > > > > > and hence we have a need for PKT_TX_IPv4, PKT_TX_IPv6 kind of flags.
> > > > > >
> > > > > > Hence we want to make use of mbuf:tx_offload field and PKT_TX_* flags
> > > > > > for identifying the packet and knowing what are its l2,l3,l4 offsets.
> > > > >
> > > > > The objective is to give an indication to the hardware that the packet has:
> > > > > - an 802.1q header at offset X for PKT_TX_MARK_VLAN_DEI
> > > > > - an IP/IPv6 header at offset X for PKT_TX_MARK_IP_DSCP
> > > > > - an IP/IPv6 header at offset X and a TCP/SCTP header at offset Y for
> > > > >   PKT_TX_MARK_IP_ECN
> > > > >
> > > > > Just to be sure I'm getting the point, would it also work if with flags
> > > > > like this:
> > > > >
> > > > > - an 802.1q header at offset X for PKT_TX_HAS_VLAN
> > > > > - an IP/IPv6 header at offset X for PKT_TX_IPv4 or PKT_TX_IPv6
> > > > > - a TCP/SCTP header at offset Y for PKT_TX_TCP/PKT_TX_SCTP (implies
> > > > >   PKT_TX_IPv4 or PKT_TX_IPv6)
> > > > >
> > > > > The underlying question is: do we need the flags to only describe the
> > > > > content of the packet or do the flag also indicate that an action has to
> > > > > be done?
> > > >
> > > > If we don't have a specific action based flag, then in future it might collide
> > > > with other functionality and we will not be able to choose that specific
> > > > offload. All the existing features are having specific flags, like TSO,
> > > > CSUM.
> > > >
> > > > RFC wise, even when marking is enabled and packet is coloured, not all packets
> > > > can be marked.
> > > > For example when IP DSCP marking(RFC 2597) is enabled, marking is defined
> > > > only with below 12 code points out of 64 code points (6 bits of DSCP).
> > > >
> > > >                   Class 1    Class 2    Class 3    Class 4
> > > >                  +----------+----------+----------+----------+
> > > > Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
> > > > Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
> > > > High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
> > > >                  +----------+----------+----------+----------+
> > > >
> > > > All other combinations of DSCP value can be used for some other purposes
> > > > and hence packets with those values shouldn't be marked.
> > > > Similar is the case with IP ECN marking for TCP/SCTP(RFC 3168).
> > > >
> > > > Having PMD or HW to check if the packet falls in the said class and then do
> > > > marking will impact performance. Since application actually fills those values
> > > > in packet, it will be more easy for them to say.
> > > >
> > > > >
> > > > > > > So your patch is a way to force the hardware to mark (set) the
> > > > > > > VLAN DEI on packets that are not recognized as VLAN packets?
> > > > > > >
> > > > > > > How is the traffic class of the packet determined?
> > > > > >
> > > > > > Packet is coloured based on Single Rate[1] or Dual Rate[2] Shaping result
> > > > > > and packet color determines traffic class. The exact behavior of
> > > > > > packet color to traffic class mapping is mentioned in TM spec based on
> > > > > > few other RFC's.
> > > > > >
> > > > > > > [1] https://tools.ietf.org/html/rfc2697
> > > > > > > [2] https://tools.ietf.org/html/rfc2698
> > > > >
> > > > > OK, so the traffic class does not depend on the packet type?
> > > > Yes it doesn't. But where to update the traffic class is specific to packet
> > > > type like DEI bit in VLAN or ECN field in IPv4/IPv6 or DSCP field in IPv4/IPv6.
> > > > Also ECN marking is only valid for TCP/SCTP packets.
> > > >
> > > > >
> > > > >
> > > > > > > > > From what I understand, this feature is bound to octeontx2, so using a
> > > > > > > > > mbuf dynamic flag would make more sense here. There are some examples in
> > > > > > > > > dpdk repository, just grep for "dynflag".
> > > > > > > >
> > > > > > > > This is not octeontx2 specific flag but any "packet marking feature" enabled
> > > > > > > > PMD would need these flags to identify on which packets marking needs to be
> > > > > > > > done. This is the first PMD that supports packet marking feature and
> > > > > > > > hence it was not exposed earlier.
> > > > > > > >
> > > > > > > > For example to mark VLAN DEI, PMD cannot always assume that there is preexisting
> > > > > > > > VLAN header from Byte 12 as there is no guarantee that ethernet header
> > > > > > > > always starts at Byte 0 (Custom headers before ethernet hdr).
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Also, I think that the feature availability should be advertised through
> > > > > > > > > an ethdev offload, so an application can know at initialization time
> > > > > > > > > that these flags can be used.
> > > > > > > >
> > > > > > > > Feature availability is already part of TM spec in rte_tm.h
> > > > > > > > struct rte_tm_capabilities:mark_vlan_dei_supported
> > > > > > > > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > > > > > > > struct rte_tm_capabilities:mark_ip_dscp_supported
> > > > > > >
> > > > > > > Does this mean that any driver advertising this existing feature flag
> > > > > > > has to support the new mbuf flags too? Shouldn't we have a specific
> > > > > > > feature for it?
> > > > > >
> > > > > > Yes, I thought PMD's need to support both.
> > > > > > I'm fine adding specific feature flag for the offload flags alone
> > > > > > if you insist or if there are other PMD's which don't need the offload flags
> > > > > > for packet marking. I was not able to find out about other PMD's as
> > > > > > none of the existing PMD's support packet marking.
> > > > >
> > > > > Do you suggest that the behavior of the traffic manager marking should
> > > > > be:
> > > > >
> > > > > a- the hardware tries to recognize tx packets, and mark them
> > > > >    accordingly. What packets are recognized depend on hardware.
> > > > > b- if the mbuf has a specific flag, it helps the PMD and hardware to
> > > > >    recognize packets, so it can mark packets.
> > > > >
> > > > > For an application, a- is difficult to apprehend as it will be dependent
> > > > > on hardware.
> > > > >
> > > > > Or do you suggest that packets should only be marked if there is a mbuf
> > > > > flag? (only b-)
> > > > Yes, I believe b- is the right thing.
> > > >
> > > > >
> > > > > Do you confirm that there is no support at all for this feature today?
> > > > > I mean, what was the usage of rte_tm_mark_vlan_dei() these last 3 years?
> > > >
> > > > Yes, it was not implemented/used. Because of such reasons, rte_tm.h is
> > > > supposed to be experimental but was mistakenly marked stable.
> > > > You can see related discussion in below threads about marking rte_tm.h
> > > > experimental again in v20.11.
> > > > https://mails.dpdk.org/archives/dev/2020-April/164970.html
> > > > https://mails.dpdk.org/archives/dev/2020-May/166221.html
> > >
> > > Thank you for the explanations. I also think b- is a better choice.
> > >
> > > I don't see any better approach than having a mbuf flag. However, I'm
> > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > 3 additional flags (among 18 remaining) for this feature also means that
> > > we have 3 flags less for dynamic flags for all applications, even for
> > > applications that will not use this feature.
> 
> I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> Can it probably be squeezed somehow?
> Let say we reserve one flag that this information is present or not, and
> re-use one of rx-only fields for store additional information (packet_type, or so).
> Or might be some other approach.  

We are fine with this approach, where we define one bit in Tx offloads for packet
marking and reuse 3 bits from the Rx offload flags area.

For example:

@@ -186,10 +186,16 @@ extern "C" {
 
 /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
 
+/* Reused Rx offload bits for Tx offloads */
+#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
+#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
+#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
+
 #define PKT_FIRST_FREE (1ULL << 23)
-#define PKT_LAST_FREE (1ULL << 40)
+#define PKT_LAST_FREE (1ULL << 39)
 
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
+#define PKT_TX_MARK_EN         (1ULL << 40)

Is this fine ?
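
As a rough illustration of how an application and a PMD might use such a
layout, here is a minimal sketch. The bit positions are copied from the
diff above; `struct fake_mbuf`, `mark_pkt()` and `marks_requested()` are
hypothetical stand-ins for this example only and are not DPDK API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the proposal above. */
#define PKT_X_TX_MARK_VLAN_DEI (1ULL << 0)
#define PKT_X_TX_MARK_IP_DSCP  (1ULL << 1)
#define PKT_X_TX_MARK_IP_ECN   (1ULL << 2)
#define PKT_TX_MARK_EN         (1ULL << 40)

#define PKT_X_TX_MARK_MASK \
	(PKT_X_TX_MARK_VLAN_DEI | PKT_X_TX_MARK_IP_DSCP | PKT_X_TX_MARK_IP_ECN)

/* Hypothetical stand-in for struct rte_mbuf; only ol_flags matters here. */
struct fake_mbuf {
	uint64_t ol_flags;
};

/* Application side: request one or more marking types on a Tx packet. */
static void mark_pkt(struct fake_mbuf *m, uint64_t types)
{
	assert(m != NULL);
	/* Bits 0..2 may still carry Rx meanings; clear them before reuse. */
	m->ol_flags &= ~PKT_X_TX_MARK_MASK;
	m->ol_flags |= PKT_TX_MARK_EN | (types & PKT_X_TX_MARK_MASK);
}

/* PMD side: the reused bits are only meaningful when the enable bit is set. */
static uint64_t marks_requested(const struct fake_mbuf *m)
{
	if (!(m->ol_flags & PKT_TX_MARK_EN))
		return 0;
	return m->ol_flags & PKT_X_TX_MARK_MASK;
}
```

The point of the single enable bit is that the PMD only interprets the low
bits as marking requests when PKT_TX_MARK_EN is set, so their Rx meaning is
left untouched for every other packet.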


> 
> > >
> > > Would it be a problem to use a dynamic flag in this case?
> > Since packet marking feature itself is already part of spec,
> > if we move the flags to PMD specific dynamic flag, then it creates a confusion.
> > 
> > It is not the case of a custom feature supported by a specific PMD.
> > I believe when other PMD's implement packet marking, the same flags will
> > suffice.
> > >
> > > Thanks,
> > > Olivier
> > >
> > >
> > > >
> > > > Thanks
> > > > Nithin
> > > >
> > > > >
> > > > > Thanks,
> > > > > Olivier
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > Please also see few comments below.
> > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > > > > > > > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > > > > > > > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > > > > > > > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > > > > > > > > index edd21c4..bc978fb 100644
> > > > > > > > > > > --- a/doc/guides/nics/features.rst
> > > > > > > > > > > +++ b/doc/guides/nics/features.rst
> > > > > > > > > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > > > > > > > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > > > > > > > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > > > > > > > > >
> > > > > > > > > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > > > > > > > > +
> > > > > > > > > > > +Traffic Manager Packet marking offload
> > > > > > > > > > > +--------------------------------------
> > > > > > > > > > > +
> > > > > > > > > > > +Supports enabling a packet marking offload specific mbuf.
> > > > > > > > > > > +
> > > > > > > > > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > > > > > > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > > > > > > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > > > > > > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > > > > > > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > > > > > > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > > > > > > > > +
> > > > > > > > > > >  .. _nic_features_other:
> > > > > > > > > > >
> > > > > > > > > > >  Other dev ops not represented by a Feature
> > > > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > > > index cd5794d..5c6896d 100644
> > > > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > > > > > > > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > > > > > > > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > > > > > > > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > > > > > > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > > > > > > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > > > > > > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > > > > > > > > >         default: return NULL;
> > > > > > > > > > >         }
> > > > > > > > > > >  }
> > > > > > > > > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > > > > > > > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > > > > > > > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > > > > > > > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > > > > > > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > > > > > > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > > > > > > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > > > > > > > > >         };
> > > > > > > > > > >         const char *name;
> > > > > > > > > > >         unsigned int i;
> > > > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > > > index b9a59c8..d9f1290 100644
> > > > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > > > @@ -187,11 +187,40 @@ extern "C" {
> > > > > > > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > > > > > > >
> > > > > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > > > > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > > > > > > > > >
> > > > > > > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > > > > > >
> > > > > > > > > > >  /**
> > > > > > > > > > > + * Packet marking offload flags. These flags indicated what kind
> > > > > > > > > > > + * of packet marking needs to be applied on a given mbuf when
> > > > > > > > > > > + * appropriate Traffic Manager configuration is in place.
> > > > > > > > > > > + * When user set's these flags on a mbuf, below assumptions are made
> > > > > > > > > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > > > > > > > > + * a) PMD assumes pkt to be a 802.1q packet.
> > > > > > >
> > > > > > > What does that imply?
> > > > > >
> > > > > > I meant by setting the flag, a packet has VLAN header adhering to IEEE 802.1Q spec.
> > > > > >
> > > > > > >
> > > > > > > > > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > > > > > > > > + *    at (mbuf.l2_len - 6) offset.
> > > > > > >
> > > > > > > Why mbuf.l2_len - 6 ?
> > > > > > L2 header when VLAN header is present will be
> > > > > > {custom header 'X' Bytes}:{Ethernet SRC+DST (12B)}:{VLAN Header (4B)}:{Ether Type (2B)}
> > > > > > l2_len = X + 12 + 4 + 2
> > > > > > So, VLAN header starts at (l2_len - 6) bytes.
> > > > > >
> > > > > > >
> > > > > > > > > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > > > > > > > > + *    to indicate whether if it is IPv4 packet or IPv6 packet
> > > > > > > > > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > > > > > > > > + *    IPv4 pkt.
> > > > > > > > > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > > > > > > > > + *    start offset of L3 header.
> > > > > > > > > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > > > > > > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > > > > > > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > > > > > > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > > > > > > > > + *    can mark the packet for a configured color.
> > > > > > > > > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > > > > > > > > + *    start offset of L3 header.
> > > > > > > > > > > + */
> > > > > > > > > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > > > > > > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > > > > > > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> > > > > > >
> > > > > > > We should have one comment per define.
> > > > > > Ack, will fix in V2.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > > > +
> > > > > > > > > > > +/**
> > > > > > > > > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > > > > > > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > > > > > > > > >   * 1) Enable the following in mbuf,
> > > > > > > > > > > @@ -384,7 +413,10 @@ extern "C" {
> > > > > > > > > > >                 PKT_TX_MACSEC |          \
> > > > > > > > > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > > > > > > > > >                 PKT_TX_UDP_SEG |         \
> > > > > > > > > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > > > > > > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > > > > > > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > > > > > > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > > > > > > > > +               PKT_TX_MARK_IP_ECN)
> > > > > > > > > > >
> > > > > > > > > > >  /**
> > > > > > > > > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > > > > > > > > --
> > > > > > > > > > > 2.8.4
> > > > > > > > > > >
Thomas Monjalon May 15, 2020, 3:10 p.m. UTC | #15
15/05/2020 15:44, Nithin Dabilpuram:
> On Fri, May 15, 2020 at 03:12:59PM +0200, Thomas Monjalon wrote:
> > 15/05/2020 12:08, Nithin Dabilpuram:
> > > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > > I don't see any better approach than having a mbuf flag. However, I'm
> > > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > > 3 additional flags (among 18 remaining) for this feature also means that
> > > > we have 3 flags less for dynamic flags for all applications, even for
> > > > applications that will not use this feature.
> > > > 
> > > > Would it be a problem to use a dynamic flag in this case?
> > > Since packet marking feature itself is already part of spec,
> > > if we move the flags to PMD specific dynamic flag, then it creates a confusion.
> > > 
> > > It is not the case of a custom feature supported by a specific PMD.
> > > I believe when other PMD's implement packet marking, the same flags will
> > > suffice.
> > 
> > A dynamic flag is not necessarily PMD-specific.
> > It is just avoiding consuming bits if the feature is not used by the application.
> > We must move more existing flags and fields to be dynamic.
> > 
> > In general, all new flags and fields in mbuf should be dynamic.
> > And a work must be done to move existing stuff to free more space
> > for more dynamic features.
> 
> My bad, I thought dynamic flags can only be used for PMD specific thing.
> 
> There is however a cost of using dynamic flag which I think should be avoided
> for DPDK spec defined offloads, though it's fine for PMD specific things.
> 
> Dynamic offload flags causes application and PMD to use non constant offset 
> or shift which are looked up at init, instead of having a constant shift or
> offset. This indirection costs some cycles due to extra loads in fast path.

Yes there is a cost. We described it quite clearly last year.
The default rule is now to add new flags and fields as dynamic.
In case the rule was not clear, I will send a patch to insert some
notes in the code and the doc.

If you disagree with this new rule, you will have to give very good arguments.
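
For readers unfamiliar with the cost being discussed, here is a minimal
sketch of the difference, written without any DPDK dependency. In DPDK the
bit would be obtained from rte_mbuf_dynflag_register(), which returns the
reserved bit number; `register_mark_flag()` below is a hypothetical stand-in
that simply takes a bit number, and the static bit value is the one from the
original patch.

```c
#include <assert.h>
#include <stdint.h>

/* Static flag: the mask is a compile-time constant. */
#define PKT_TX_MARK_VLAN_DEI (1ULL << 38)

/* Dynamic flag: the mask is only known after registration at init. */
static uint64_t mark_vlan_dei_dynmask; /* 0 until registered */

/* Hypothetical stand-in for registration; returns the bit or -1. */
static int register_mark_flag(unsigned int free_bit)
{
	if (free_bit >= 64)
		return -1;
	mark_vlan_dei_dynmask = 1ULL << free_bit;
	return (int)free_bit;
}

/* Fast path with a static flag: the mask is an immediate in the code. */
static int wants_mark_static(uint64_t ol_flags)
{
	return (ol_flags & PKT_TX_MARK_VLAN_DEI) != 0;
}

/* Fast path with a dynamic flag: the mask must first be loaded from memory. */
static int wants_mark_dynamic(uint64_t ol_flags)
{
	return (ol_flags & mark_vlan_dei_dynmask) != 0;
}
```

The extra load of the mask variable in wants_mark_dynamic() is the
indirection referred to above; how much it costs in practice depends on
the platform and on whether the value stays in cache.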
Jerin Jacob May 15, 2020, 4:26 p.m. UTC | #16
On Fri, May 15, 2020 at 8:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 15/05/2020 15:44, Nithin Dabilpuram:
> > On Fri, May 15, 2020 at 03:12:59PM +0200, Thomas Monjalon wrote:
> > > 15/05/2020 12:08, Nithin Dabilpuram:
> > > > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > > > I don't see any better approach than having a mbuf flag. However, I'm
> > > > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > > > 3 additional flags (among 18 remaining) for this feature also means that
> > > > > we have 3 flags less for dynamic flags for all applications, even for
> > > > > applications that will not use this feature.
> > > > >
> > > > > Would it be a problem to use a dynamic flag in this case?
> > > > Since packet marking feature itself is already part of spec,
> > > > if we move the flags to PMD specific dynamic flag, then it creates a confusion.
> > > >
> > > > It is not the case of a custom feature supported by a specific PMD.
> > > > I believe when other PMD's implement packet marking, the same flags will
> > > > suffice.
> > >
> > > A dynamic flag is not necessarily PMD-specific.
> > > It is just avoiding consuming bits if the feature is not used by the application.
> > > We must move more existing flags and fields to be dynamic.
> > >
> > > In general, all new flags and fields in mbuf should be dynamic.
> > > And a work must be done to move existing stuff to free more space
> > > for more dynamic features.
> >
> > My bad, I thought dynamic flags can only be used for PMD specific thing.
> >
> > There is however a cost of using dynamic flag which I think should be avoided
> > for DPDK spec defined offloads, though it's fine for PMD specific things.
> >
> > Dynamic offload flags causes application and PMD to use non constant offset
> > or shift which are looked up at init, instead of having a constant shift or
> > offset. This indirection costs some cycles due to extra loads in fast path.
>
> Yes there is a cost. We described it quite clearly last year.
> The default rule is now to add new flags and fields as dynamic.
> In case the rule was not clear, I will send a patch to insert some
> notes in the code and the doc.

Yes. Please send a patch to document the rule. That makes life easy
for everyone to make a boolean decision.

Here is the comment from the "mbuf: support dynamic fields and flags" commit,
as it read when that patch was accepted.

"    The typical use case is a PMD that registers space for an offload
    feature, when the application requests to enable this feature.  As
    the space in mbuf is limited, the space should only be reserved if it
    is going to be used (i.e when the application explicitly asks for it).
"
If you are pushing this feature to a dynamic mbuf field, then the rte_tm
subsystem needs to register the dynamic field,
not the PMD, as the feature is part of the rte_tm spec.


>
> If you disagree with this new rule, you will have to give very good arguments.

What would be the definition of a good argument? The same logic can always be
implemented with a dynamic flag instead of a static one, at the cost of the
dynamic indirection.

>
>
>
>
Thomas Monjalon May 15, 2020, 4:52 p.m. UTC | #17
15/05/2020 18:26, Jerin Jacob:
> On Fri, May 15, 2020 at 8:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 15/05/2020 15:44, Nithin Dabilpuram:
> > > On Fri, May 15, 2020 at 03:12:59PM +0200, Thomas Monjalon wrote:
> > > > 15/05/2020 12:08, Nithin Dabilpuram:
> > > > > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > > > > I don't see any better approach than having a mbuf flag. However, I'm
> > > > > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > > > > 3 additional flags (among 18 remaining) for this feature also means that
> > > > > > we have 3 flags less for dynamic flags for all applications, even for
> > > > > > applications that will not use this feature.
> > > > > >
> > > > > > Would it be a problem to use a dynamic flag in this case?
> > > > > Since packet marking feature itself is already part of spec,
> > > > > if we move the flags to PMD specific dynamic flag, then it creates a confusion.
> > > > >
> > > > > It is not the case of a custom feature supported by a specific PMD.
> > > > > I believe when other PMD's implement packet marking, the same flags will
> > > > > suffice.
> > > >
> > > > A dynamic flag is not necessarily PMD-specific.
> > > > It is just avoiding consuming bits if the feature is not used by the application.
> > > > We must move more existing flags and fields to be dynamic.
> > > >
> > > > In general, all new flags and fields in mbuf should be dynamic.
> > > > And a work must be done to move existing stuff to free more space
> > > > for more dynamic features.
> > >
> > > My bad, I thought dynamic flags can only be used for PMD specific thing.
> > >
> > > There is however a cost of using dynamic flag which I think should be avoided
> > > for DPDK spec defined offloads, though it's fine for PMD specific things.
> > >
> > > Dynamic offload flags causes application and PMD to use non constant offset
> > > or shift which are looked up at init, instead of having a constant shift or
> > > offset. This indirection costs some cycles due to extra loads in fast path.
> >
> > Yes there is a cost. We described it quite clearly last year.
> > The default rule is now to add new flags and fields as dynamic.
> > In case the rule was not clear, I will send a patch to insert some
> > notes in the code and the doc.
> 
> Yes. Please send a patch to document the rule. That makes life easy
> for everyone to make a boolean decision.

Yes, I will work on it.

> Here is the comment from mbuf: support dynamic fields and flags commit
> when accepted this patch.
> 
> "    The typical use case is a PMD that registers space for an offload
>     feature, when the application requests to enable this feature.  As
>     the space in mbuf is limited, the space should only be reserved if it
>     is going to be used (i.e when the application explicitly asks for it).
> "

OK, there is probably a documentation gap.

> If you are pushing this feature to dynamic mbuf field then rte_tm
> subsystem needs to register dynamic field
> not the PMD as the feature is part of rte_tm spec.

Is there a function in rte_tm which initializes or configures the feature?


> > If you disagree with this new rule, you will have to give very good arguments.
> 
> What would the definition of a good argument? as the same logic can be
> implemented with dynamic vs
> static at the cost of dynamic indirection.

I think the only exception allowing a static flag or field is to demonstrate
how basic the feature is.
But I think all basic features have already been integrated for years.
Jerin Jacob May 15, 2020, 5 p.m. UTC | #18
On Fri, May 15, 2020 at 10:22 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 15/05/2020 18:26, Jerin Jacob:
> > On Fri, May 15, 2020 at 8:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 15/05/2020 15:44, Nithin Dabilpuram:
> > > > On Fri, May 15, 2020 at 03:12:59PM +0200, Thomas Monjalon wrote:
> > > > > 15/05/2020 12:08, Nithin Dabilpuram:
> > > > > > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > > > > > I don't see any better approach than having a mbuf flag. However, I'm
> > > > > > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > > > > > 3 additional flags (among 18 remaining) for this feature also means that
> > > > > > > we have 3 flags less for dynamic flags for all applications, even for
> > > > > > > applications that will not use this feature.
> > > > > > >
> > > > > > > Would it be a problem to use a dynamic flag in this case?
> > > > > > Since packet marking feature itself is already part of spec,
> > > > > > if we move the flags to PMD specific dynamic flag, then it creates a confusion.
> > > > > >
> > > > > > It is not the case of a custom feature supported by a specific PMD.
> > > > > > I believe when other PMD's implement packet marking, the same flags will
> > > > > > suffice.
> > > > >
> > > > > A dynamic flag is not necessarily PMD-specific.
> > > > > It is just avoiding consuming bits if the feature is not used by the application.
> > > > > We must move more existing flags and fields to be dynamic.
> > > > >
> > > > > In general, all new flags and fields in mbuf should be dynamic.
> > > > > And a work must be done to move existing stuff to free more space
> > > > > for more dynamic features.
> > > >
> > > > My bad, I thought dynamic flags can only be used for PMD specific thing.
> > > >
> > > > There is however a cost of using dynamic flag which I think should be avoided
> > > > for DPDK spec defined offloads, though it's fine for PMD specific things.
> > > >
> > > > Dynamic offload flags cause the application and PMD to use a non-constant offset
> > > > or shift that is looked up at init, instead of a constant shift or
> > > > offset. This indirection costs some cycles due to extra loads in the fast path.
> > >
> > > Yes there is a cost. We described it quite clearly last year.
> > > The default rule is now to add new flags and fields as dynamic.
> > > In case the rule was not clear, I will send a patch to insert some
> > > notes in the code and the doc.
> >
> > Yes. Please send a patch to document the rule. That makes life easy
> > for everyone to make a boolean decision.
>
> Yes, I will work on it.

Thanks.

>
> > Here is the comment from the "mbuf: support dynamic fields and flags" commit,
> > written when that patch was accepted.
> >
> > "    The typical use case is a PMD that registers space for an offload
> >     feature, when the application requests to enable this feature.  As
> >     the space in mbuf is limited, the space should only be reserved if it
> >     is going to be used (i.e when the application explicitly asks for it).
> > "
>
> OK, there is probably a documentation gap.

Obviously :-)


>
> > If you are pushing this feature to a dynamic mbuf field, then the rte_tm
> > subsystem needs to register the dynamic field,
> > not the PMD, as the feature is part of the rte_tm spec.
>
> Is there a function in rte_tm which initializes or configure the feature?

See rte_tm_mark_*

>
>
> > > If you disagree with this new rule, you will have to give very good arguments.
> >
> > What would be the definition of a good argument? The same logic can be
> > implemented as dynamic or static, the only difference being the cost of
> > the dynamic indirection.
>
> I think the only exception for adding a static flag or field is to demonstrate
> how basic the feature is.
> But I think all basic features are already integrated for years.

Yes. If that's the path, then let's have a rule to not add any new fields
or flags to mbuf;
everything should go through dynamic fields and flags.

>
>
Nithin Dabilpuram May 15, 2020, 6:07 p.m. UTC | #19
On Fri, May 15, 2020 at 06:52:23PM +0200, Thomas Monjalon wrote:
> 15/05/2020 18:26, Jerin Jacob:
> > On Fri, May 15, 2020 at 8:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 15/05/2020 15:44, Nithin Dabilpuram:
> > > > On Fri, May 15, 2020 at 03:12:59PM +0200, Thomas Monjalon wrote:
> > > > > 15/05/2020 12:08, Nithin Dabilpuram:
> > > > > > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > > > > > I don't see any better approach than having a mbuf flag. However, I'm
> > > > > > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > > > > > 3 additional flags (among 18 remaining) for this feature also means that
> > > > > > > we have 3 flags less for dynamic flags for all applications, even for
> > > > > > > applications that will not use this feature.
> > > > > > >
> > > > > > > Would it be a problem to use a dynamic flag in this case?
> > > > > > Since packet marking feature itself is already part of spec,
> > > > > > if we move the flags to PMD specific dynamic flag, then it creates a confusion.
> > > > > >
> > > > > > It is not the case of a custom feature supported by a specific PMD.
> > > > > > I believe when other PMD's implement packet marking, the same flags will
> > > > > > suffice.
> > > > >
> > > > > A dynamic flag is not necessarily PMD-specific.
> > > > > It is just avoiding consuming bits if the feature is not used by the application.
> > > > > We must move more existing flags and fields to be dynamic.
> > > > >
> > > > > In general, all new flags and fields in mbuf should be dynamic.
> > > > > And a work must be done to move existing stuff to free more space
> > > > > for more dynamic features.
> > > >
> > > > My bad, I thought dynamic flags can only be used for PMD specific thing.
> > > >
> > > > There is however a cost of using dynamic flag which I think should be avoided
> > > > for DPDK spec defined offloads, though it's fine for PMD specific things.
> > > >
> > > > Dynamic offload flags causes application and PMD to use non constant offset
> > > > or shift which are looked up at init, instead of having a constant shift or
> > > > offset. This indirection costs some cycles due to extra loads in fast path.
> > >
> > > Yes there is a cost. We described it quite clearly last year.
> > > The default rule is now to add new flags and fields as dynamic.
> > > In case the rule was not clear, I will send a patch to insert some
> > > notes in the code and the doc.
> > 
> > Yes. Please send a patch to document the rule. That makes life easy
> > for everyone to make a boolean decision.
> 
> Yes, I will work on it.
> 
> > Here is the comment from mbuf: support dynamic fields and flags commit
> > when accepted this patch.
> > 
> > "    The typical use case is a PMD that registers space for an offload
> >     feature, when the application requests to enable this feature.  As
> >     the space in mbuf is limited, the space should only be reserved if it
> >     is going to be used (i.e when the application explicitly asks for it).
> > "
> 
> OK, there is probably a documentation gap.
> 
> > If you are pushing this feature to a dynamic mbuf field, then the rte_tm
> > subsystem needs to register the dynamic field,
> > not the PMD, as the feature is part of the rte_tm spec.
> 
> Is there a function in rte_tm which initializes or configure the feature?
> 
> 
> > > If you disagree with this new rule, you will have to give very good arguments.
> > 
> > What would the definition of a good argument? as the same logic can be
> > implemented with dynamic vs
> > static at the cost of dynamic indirection.
> 
> I think the only exception to add a static flag or field is to demonstrate
> how basic is the feature.
> But I think all basic features are already integrated for years.
>

Vector implementations will always incur a huge cost due to dynamic offload
flags. Of course, I don't have a vector implementation for packet marking, but
if in future there is such an offload flag with a vector implementation,
then hard enforcement will not work.
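
For illustration only, here is a minimal sketch of the cost being discussed. It assumes a stand-in `demo_mbuf` struct rather than the real `rte_mbuf`, and every `demo_*` / `DEMO_*` name is hypothetical. With a static flag the mask is a compile-time constant; with a dynamic flag the mask sits in a variable that is filled in at init and has to be loaded before each test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the ol_flags word of an mbuf. */
struct demo_mbuf {
	uint64_t ol_flags;
};

/* Static flag: the bit position is known at compile time. */
#define DEMO_TX_MARK_VLAN_DEI (1ULL << 38)

/* Dynamic flag: the mask is only known after init. */
static uint64_t demo_dyn_mark_mask;

/*
 * Hypothetical init step. A real application would obtain the bit
 * number from the dynamic flag registration API instead of passing it in.
 */
static void demo_dyn_init(unsigned int bitnum)
{
	assert(bitnum < 64);
	demo_dyn_mark_mask = 1ULL << bitnum;
}

/* Fast path with a static flag: the mask is an immediate operand. */
static int demo_static_marked(const struct demo_mbuf *m)
{
	return (m->ol_flags & DEMO_TX_MARK_VLAN_DEI) != 0;
}

/* Fast path with a dynamic flag: one extra load to fetch the mask. */
static int demo_dynamic_marked(const struct demo_mbuf *m)
{
	return (m->ol_flags & demo_dyn_mark_mask) != 0;
}
```

In a scalar loop the extra load is usually cheap; the concern raised above is about vector paths, where constant masks are normally baked into the shuffle and compare sequences.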

>
Nithin Dabilpuram May 28, 2020, 3:43 p.m. UTC | #20
Hi Olivier,

On Fri, May 15, 2020 at 07:27:46PM +0530, Nithin Dabilpuram wrote:
> On Fri, May 15, 2020 at 10:30:30AM +0000, Ananyev, Konstantin wrote:
> > 
> > > On Thu, May 14, 2020 at 10:29:31PM +0200, Olivier Matz wrote:
> > > > Hi Nithin,
> > > >
> > > > On Tue, May 05, 2020 at 11:49:20AM +0530, Nithin Dabilpuram wrote:
> > > > > On Mon, May 04, 2020 at 02:27:35PM +0200, Olivier Matz wrote:
> > > > > > On Mon, May 04, 2020 at 03:34:57PM +0530, Nithin Dabilpuram wrote:
> > > > > > > On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> > > > > > > > On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > > > > > > > > Hi Olivier,
> > > > > > > > >
> > > > > > > > > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > > > > > > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > > > > > > > > <nithind1988@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > > > > >
> > > > > > > > > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > > > > > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > > > > > > > > packet marking.
> > > > > > > > > > > >
> > > > > > > > > > > > When packet marking feature in Traffic manager is enabled,
> > > > > > > > > > > > application has to the use the three new flags to indicate
> > > > > > > > > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > > > > > > > > specific mbuf or not. By setting the three flags, it is
> > > > > > > > > > > > assumed by PMD that application has already verified the
> > > > > > > > > > > > applicability of marking on that specific packet and
> > > > > > > > > > > > PMD need not perform further checks as per RFC.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
> > > > > > > > > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> > > > > > > > > > >
> > > > > > > > > > > None of the ethdev TM driver implementations has supported packet
> > > > > > > > > > > marking support.
> > > > > > > > > > > rte_tm and rte_mbuf maintainers(Christian, Oliver), Could you review this patch?
> > > > > > > > > >
> > > > > > > > > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > > > > > > > > remaining), so I think we should use them with care, i.e. for features
> > > > > > > > > > that are generic enough.
> > > > > > > > >
> > > > > > > > > I agree, but I believe this is one of the basic flags needed like other
> > > > > > > > > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which
> > > > > > > > > are needed to identify on which packets HW should/can apply packet marking.
> > > > > > > >
> > > > > > > > PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> > > > > > > > calculation. This is pretty straightforward and there is no other
> > > > > > > > dependency than the offload feature advertised by the PMD.
> > > > > > > >
> > > > > > > > I'm sorry, I have not a lot of experience with rte_tm.h, so it's
> > > > > > > > difficult for me to have a global view of what is done for instance when
> > > > > > > > PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> > > > > > > >
> > > > > > > > Can you confirm that my understanding below is correct? (or correct me
> > > > > > > > where I'm wrong)
> > > > > > > >
> > > > > > > > Before your patch:
> > > > > > > > - the application enables the port and traffic manager on it
> > > > > > > > - the application calls rte_tm_mark_vlan_dei() to select which traffic
> > > > > > > >   class must be marked
> > > > > > > > - when a packet is transmitted, the traffic class is determined by the
> > > > > > > >   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
> > > > > > > >   bit is set depending on traffic class
> > > > > > > >
> > > > > > > > The problem is for packets that cannot be recognized by the hardware,
> > > > > > > > correct?
> > > > > > >
> > > > > > > Yes. Octeontx2 HW always depends on application knowledge instead of walking
> > > > > > > through all the layers of packet data in Tx to identify what packet it is
> > > > > > > and where the l2, l3, l4 headers start for performance reasons.
> > > > > > >
> > > > > > > I believe there are other hardware too that have the same expectation
> > > > > > > and hence we have a need for PKT_TX_IPv4, PKT_TX_IPv6 kind of flags.
> > > > > > >
> > > > > > > Hence we want to make use of mbuf:tx_offload field and PKT_TX_* flags
> > > > > > > for identifying the packet and knowing what are its l2,l3,l4 offsets.
> > > > > >
> > > > > > The objective is to give an indication to the hardware that the packet has:
> > > > > > - an 802.1q header at offset X for PKT_TX_MARK_VLAN_DEI
> > > > > > - an IP/IPv6 header at offset X for PKT_TX_MARK_IP_DSCP
> > > > > > - an IP/IPv6 header at offset X and a TCP/SCTP header at offset Y for
> > > > > >   PKT_TX_MARK_IP_ECN
> > > > > >
> > > > > > Just to be sure I'm getting the point, would it also work if with flags
> > > > > > like this:
> > > > > >
> > > > > > - an 802.1q header at offset X for PKT_TX_HAS_VLAN
> > > > > > - an IP/IPv6 header at offset X for PKT_TX_IPv4 or PKT_TX_IPv6
> > > > > > - a TCP/SCTP header at offset Y for PKT_TX_TCP/PKT_TX_SCTP (implies
> > > > > >   PKT_TX_IPv4 or PKT_TX_IPv6)
> > > > > >
> > > > > > The underlying question is: do we need the flags to only describe the
> > > > > > content of the packet or do the flag also indicate that an action has to
> > > > > > be done?
> > > > >
> > > > > If we don't have a specific action based flag, then in future it might collide
> > > > > with other functionality and we will not be able to choose that specific
> > > > > offload. All the existing features are having specific flags, like TSO,
> > > > > CSUM.
> > > > >
> > > > > > RFC wise, even when marking is enabled and the packet is coloured, not all packets
> > > > > > can be marked.
> > > > > For example when IP DSCP marking(RFC 2597) is enabled, marking is defined
> > > > > only with below 12 code points out of 64 code points (6 bits of DSCP).
> > > > >
> > > > >                   Class 1    Class 2    Class 3    Class 4
> > > > >                  +----------+----------+----------+----------+
> > > > > Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
> > > > > Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
> > > > > High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
> > > > >                  +----------+----------+----------+----------+
> > > > >
> > > > > All other combinations of DSCP value can be used for some other purposes
> > > > > and hence packets with those values shouldn't be marked.
> > > > > Similar is the case with IP ECN marking for TCP/SCTP(RFC 3168).
> > > > >
> > > > > Having PMD or HW to check if the packet falls in the said class and then do
> > > > > marking will impact performance. Since application actually fills those values
> > > > > in packet, it will be more easy for them to say.
> > > > >
> > > > > >
> > > > > > > > > So your patch is a way to force the hardware to recognize and mark (i.e. set the
> > > > > > > > > VLAN DEI on) packets that are not recognized as VLAN packets?
> > > > > > > > >
> > > > > > > > > How is the traffic class of the packet determined?
> > > > > > >
> > > > > > > Packet is coloured based on Single Rate[1] or Dual Rate[2] Shaping result
> > > > > > > and packet color determines traffic class. The exact behavior of
> > > > > > > packet color to traffic class mapping is mentioned in TM spec based on
> > > > > > > few other RFC's.
> > > > > > >
> > > > > > > > [1] https://tools.ietf.org/html/rfc2697
> > > > > > > > [2] https://tools.ietf.org/html/rfc2698
> > > > > >
> > > > > > OK, so the traffic class does not depend on the packet type?
> > > > > Yes it doesn't. But where to update the traffic class is specific to packet
> > > > > type like DEI bit in VLAN or ECN field in IPv4/IPv6 or DSCP field in IPv4/IPv6.
> > > > > Also ECN marking is only valid for TCP/SCTP packets.
> > > > >
> > > > > >
> > > > > >
> > > > > > > > > > From what I understand, this feature is bound to octeontx2, so using a
> > > > > > > > > > mbuf dynamic flag would make more sense here. There are some examples in
> > > > > > > > > > dpdk repository, just grep for "dynflag".
> > > > > > > > >
> > > > > > > > > This is not octeontx2 specific flag but any "packet marking feature" enabled
> > > > > > > > > PMD would need these flags to identify on which packets marking needs to be
> > > > > > > > > done. This is the first PMD that supports packet marking feature and
> > > > > > > > > hence it was not exposed earlier.
> > > > > > > > >
> > > > > > > > > For example to mark VLAN DEI, PMD cannot always assume that there is preexisting
> > > > > > > > > > VLAN header from Byte 12 as there is no guarantee that ethernet header
> > > > > > > > > always starts at Byte 0 (Custom headers before ethernet hdr).
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Also, I think that the feature availability should be advertised through
> > > > > > > > > > an ethdev offload, so an application can know at initialization time
> > > > > > > > > > that these flags can be used.
> > > > > > > > >
> > > > > > > > > > Feature availability is already part of TM spec in rte_tm.h
> > > > > > > > > struct rte_tm_capabilities:mark_vlan_dei_supported
> > > > > > > > > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > > > > > > > > struct rte_tm_capabilities:mark_ip_dscp_supported
> > > > > > > >
> > > > > > > > Does this mean that any driver advertising this existing feature flag
> > > > > > > > has to support the new mbuf flags too? Shouldn't we have a specific
> > > > > > > > feature for it?
> > > > > > >
> > > > > > > Yes, I thought PMD's need to support both.
> > > > > > > I'm fine adding specific feature flag for the offload flags alone
> > > > > > > if you insist or if there are other PMD's which don't need the offload flags
> > > > > > > for packet marking. I was not able to find out about other PMD's as
> > > > > > > none of the existing PMD's support packet marking.
> > > > > >
> > > > > > Do you suggest that the behavior of the traffic manager marking should
> > > > > > be:
> > > > > >
> > > > > > a- the hardware tries to recognize tx packets, and mark them
> > > > > >    accordingly. What packets are recognized depend on hardware.
> > > > > > b- if the mbuf has a specific flag, it helps the PMD and hardware to
> > > > > >    recognize packets, so it can mark packets.
> > > > > >
> > > > > > For an application, a- is difficult to apprehend as it will be dependent
> > > > > > on hardware.
> > > > > >
> > > > > > Or do you suggest that packets should only be marked if there is a mbuf
> > > > > > flag? (only b-)
> > > > > Yes, I believe b- is the right thing.
> > > > >
> > > > > >
> > > > > > Do you confirm that there is no support at all for this feature today?
> > > > > > I mean, what was the usage of rte_tm_mark_vlan_dei() these last 3 years?
> > > > >
> > > > > Yes, it was not implemented/used. Because of such reasons, rte_tm.h is
> > > > > supposed to be experimental but was mistakenly marked stable.
> > > > > You can see related discussion in below threads about marking rte_tm.h
> > > > > experimental again in v20.11.
> > > > > > https://mails.dpdk.org/archives/dev/2020-April/164970.html
> > > > > > https://mails.dpdk.org/archives/dev/2020-May/166221.html
> > > >
> > > > Thank you for the explanations. I also think b- is a better choice.
> > > >
> > > > I don't see any better approach than having a mbuf flag. However, I'm
> > > > still not fully convinced that a dynamic flag won't do the job. Taking
> > > > > 3 additional flags (among 18 remaining) for this feature also means that
> > > > we have 3 flags less for dynamic flags for all applications, even for
> > > > applications that will not use this feature.
> > 
> > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > Can it probably be squeezed somehow?
> > Let's say we reserve one flag to indicate whether this information is present or not, and
> > re-use one of the rx-only fields to store additional information (packet_type, or so).
> > Or might be some other approach.  
> 
> We are fine with this approach where we define one bit in Tx offloads for pkt
> marking and 3 bits reused from Rx offload flags area.
> 
> For example:
> 
> @@ -186,10 +186,16 @@ extern "C" {
>  
>  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
>  
> +/* Reused Rx offload bits for Tx offloads */
> +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> +
>  #define PKT_FIRST_FREE (1ULL << 23)
> -#define PKT_LAST_FREE (1ULL << 40)
> +#define PKT_LAST_FREE (1ULL << 39)
>  
>  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> +#define PKT_TX_MARK_EN         (1ULL << 40)
> 
> Is this fine ?

Any thoughts on this approach, which uses only 1 of the 18 remaining Tx flag bits
and reuses unused Rx flag bits?
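
To make the proposal concrete, here is a rough sketch of how the four bits from the example diff could be set and decoded. The macro values are copied from the diff above; the helpers `tx_mark_set()` and `tx_mark_requests()` are hypothetical and only illustrate that the low three bits would be read as marking selectors solely when `PKT_TX_MARK_EN` is set.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the example diff in this mail. */
#define PKT_X_TX_MARK_VLAN_DEI (1ULL << 0)
#define PKT_X_TX_MARK_IP_DSCP  (1ULL << 1)
#define PKT_X_TX_MARK_IP_ECN   (1ULL << 2)
#define PKT_TX_MARK_EN         (1ULL << 40)

#define PKT_X_TX_MARK_ALL \
	(PKT_X_TX_MARK_VLAN_DEI | PKT_X_TX_MARK_IP_DSCP | PKT_X_TX_MARK_IP_ECN)

/* Hypothetical application-side helper: request marking on a Tx mbuf. */
static uint64_t tx_mark_set(uint64_t ol_flags, uint64_t sel)
{
	/* Only the three selector bits may be passed in. */
	assert((sel & ~PKT_X_TX_MARK_ALL) == 0);
	return ol_flags | PKT_TX_MARK_EN | sel;
}

/*
 * Hypothetical PMD-side helper: return the requested marking selectors,
 * or 0 when the enable bit is clear and the low bits mean something else.
 */
static uint64_t tx_mark_requests(uint64_t ol_flags)
{
	if (!(ol_flags & PKT_TX_MARK_EN))
		return 0;
	return ol_flags & PKT_X_TX_MARK_ALL;
}
```

The PMD then pays for one test of the enable bit on packets that do not ask for marking, and only looks at the selector bits on the ones that do.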


> 
> 
> > 
> > > >
> > > > Would it be a problem to use a dynamic flag in this case?
> > > Since packet marking feature itself is already part of spec,
> > > if we move the flags to PMD specific dynamic flag, then it creates a confusion.
> > > 
> > > It is not the case of a custom feature supported by a specific PMD.
> > > I believe when other PMD's implement packet marking, the same flags will
> > > suffice.
> > > >
> > > > Thanks,
> > > > Olivier
> > > >
> > > >
> > > > >
> > > > > Thanks
> > > > > Nithin
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Olivier
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Please also see few comments below.
> > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > > > > > > > > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > > > > > > > > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > > > > > > > > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > > > > > > > > > index edd21c4..bc978fb 100644
> > > > > > > > > > > > --- a/doc/guides/nics/features.rst
> > > > > > > > > > > > +++ b/doc/guides/nics/features.rst
> > > > > > > > > > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > > > > > > > > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > > > > > > > > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > > > > > > > > > >
> > > > > > > > > > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > > > > > > > > > +
> > > > > > > > > > > > +Traffic Manager Packet marking offload
> > > > > > > > > > > > +--------------------------------------
> > > > > > > > > > > > +
> > > > > > > > > > > > +Supports enabling a packet marking offload specific mbuf.
> > > > > > > > > > > > +
> > > > > > > > > > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > > > > > > > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > > > > > > > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > > > > > > > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > > > > > > > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > > > > > > > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > > > > > > > > > +
> > > > > > > > > > > >  .. _nic_features_other:
> > > > > > > > > > > >
> > > > > > > > > > > >  Other dev ops not represented by a Feature
> > > > > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > > > > index cd5794d..5c6896d 100644
> > > > > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > > > > > > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > > > > > > > > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > > > > > > > > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > > > > > > > > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > > > > > > > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > > > > > > > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > > > > > > > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > > > > > > > > > >         default: return NULL;
> > > > > > > > > > > >         }
> > > > > > > > > > > >  }
> > > > > > > > > > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > > > > > > > > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > > > > > > > > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > > > > > > > > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > > > > > > > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > > > > > > > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > > > > > > > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > > > > > > > > > >         };
> > > > > > > > > > > >         const char *name;
> > > > > > > > > > > >         unsigned int i;
> > > > > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > > > > index b9a59c8..d9f1290 100644
> > > > > > > > > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > > > > > > > > @@ -187,11 +187,40 @@ extern "C" {
> > > > > > > > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > > > > > > > >
> > > > > > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > > > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > > > > > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > > > > > > > > > >
> > > > > > > > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > > > > > > >
> > > > > > > > > > > >  /**
> > > > > > > > > > > > + * Packet marking offload flags. These flags indicated what kind
> > > > > > > > > > > > + * of packet marking needs to be applied on a given mbuf when
> > > > > > > > > > > > + * appropriate Traffic Manager configuration is in place.
> > > > > > > > > > > > + * When user set's these flags on a mbuf, below assumptions are made
> > > > > > > > > > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > > > > > > > > > + * a) PMD assumes pkt to be a 802.1q packet.
> > > > > > > >
> > > > > > > > What does that imply?
> > > > > > >
> > > > > > > I meant by setting the flag, a packet has VLAN header adhering to IEEE 802.1Q spec.
> > > > > > >
> > > > > > > >
> > > > > > > > > > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > > > > > > > > > + *    at (mbuf.l2_len - 6) offset.
> > > > > > > >
> > > > > > > > Why mbuf.l2_len - 6 ?
> > > > > > > > L2 header when VLAN header is present will be
> > > > > > > {custom header 'X' Bytes}:{Ethernet SRC+DST (12B)}:{VLAN Header (4B)}:{Ether Type (2B)}
> > > > > > > l2_len = X + 12 + 4 + 2
> > > > > > > So, VLAN header starts at (l2_len - 6) bytes.
> > > > > > >
> > > > > > > >
> > > > > > > > > > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > > > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > > > > > > > > > + *    to indicate whether if it is IPv4 packet or IPv6 packet
> > > > > > > > > > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > > > > > > > > > + *    IPv4 pkt.
> > > > > > > > > > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > > > > > > > > > + *    start offset of L3 header.
> > > > > > > > > > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > > > > > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > > > > > > > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > > > > > > > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > > > > > > > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > > > > > > > > > + *    can mark the packet for a configured color.
> > > > > > > > > > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > > > > > > > > > + *    start offset of L3 header.
> > > > > > > > > > > > + */
> > > > > > > > > > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > > > > > > > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > > > > > > > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> > > > > > > >
> > > > > > > > We should have one comment per define.
> > > > > > > Ack, will fix in V2.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > > > > +
> > > > > > > > > > > > +/**
> > > > > > > > > > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > > > > > > > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > > > > > > > > > >   * 1) Enable the following in mbuf,
> > > > > > > > > > > > @@ -384,7 +413,10 @@ extern "C" {
> > > > > > > > > > > >                 PKT_TX_MACSEC |          \
> > > > > > > > > > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > > > > > > > > > >                 PKT_TX_UDP_SEG |         \
> > > > > > > > > > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > > > > > > > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > > > > > > > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > > > > > > > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > > > > > > > > > +               PKT_TX_MARK_IP_ECN)
> > > > > > > > > > > >
> > > > > > > > > > > >  /**
> > > > > > > > > > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.8.4
> > > > > > > > > > > >
Jerin Jacob May 30, 2020, 3:12 p.m. UTC | #21
> > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > Can it probably be squeezed somehow?
> > > Let say we reserve one flag that this information is present or not, and
> > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > Or might be some other approach.
> >
> > We are fine with this approach where we define one bit in Tx offloads for pkt
> > marking and 3 bits reused from Rx offload flags area.
> >
> > For example:
> >
> > @@ -186,10 +186,16 @@ extern "C" {
> >
> >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> >
> > +/* Reused Rx offload bits for Tx offloads */
> > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > +
> >  #define PKT_FIRST_FREE (1ULL << 23)
> > -#define PKT_LAST_FREE (1ULL << 40)
> > +#define PKT_LAST_FREE (1ULL << 39)
> >
> >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > +#define PKT_TX_MARK_EN         (1ULL << 40)
> >
> > Is this fine ?
>
> Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> and reuse unused Rx flag bits ?

+ Techboard

There is a related thread going on
http://mails.dpdk.org/archives/dev/2020-May/168810.html

If there is no consensus on email, then I would like to add this item
to the next TB meeting.
Ananyev, Konstantin June 2, 2020, 10:53 a.m. UTC | #22
Hi Jerin,

> > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > Can it probably be squeezed somehow?
> > > > Let say we reserve one flag that this information is present or not, and
> > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > Or might be some other approach.
> > >
> > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > marking and 3 bits reused from Rx offload flags area.
> > >
> > > For example:
> > >
> > > @@ -186,10 +186,16 @@ extern "C" {
> > >
> > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > >
> > > +/* Reused Rx offload bits for Tx offloads */
> > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > +
> > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > -#define PKT_LAST_FREE (1ULL << 40)
> > > +#define PKT_LAST_FREE (1ULL << 39)
> > >
> > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > >
> > > Is this fine ?
> >
> > Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> > and reuse unused Rx flag bits ?

My thought was not about re-defining the flags (I think it is better to keep them intact),
but about adding a union for one of the Rx-only fields (packet_type/rss/timestamp).

> 
> + Techboard
> 
> There is a related thread going on
> http://mails.dpdk.org/archives/dev/2020-May/168810.html
> 
> If there is no consensus on email, then I would like to add this item
> to the next TB meeting.

Ok, I'll add that to tomorrow's meeting agenda.
Konstantin
Nithin Dabilpuram June 2, 2020, 2:25 p.m. UTC | #23
On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin wrote:
> Hi Jerin,
> 
> > > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > > Can it probably be squeezed somehow?
> > > > > Let say we reserve one flag that this information is present or not, and
> > > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > > Or might be some other approach.
> > > >
> > > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > > marking and and 3 bits reused from Rx offload flags area.
> > > >
> > > > For example:
> > > >
> > > > @@ -186,10 +186,16 @@ extern "C" {
> > > >
> > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > >
> > > > +/* Reused Rx offload bits for Tx offloads */
> > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > +
> > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > >
> > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > >
> > > > Is this fine ?
> > >
> > > Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> > > and reuse unused Rx flag bits ?
> 
> My thought was not about re-defining the flags (I think it is better to keep them intact),
> but adding a union for one of rx-only fields (packet_type/rss/timestamp).

Ok. Adding a union field at the packet_type field is also fine, like below.

@@ -187,9 +187,10 @@ extern "C" {
 /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
 
 #define PKT_FIRST_FREE (1ULL << 23)
-#define PKT_LAST_FREE (1ULL << 40)
+#define PKT_LAST_FREE (1ULL << 39)
 
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
+#define PKT_TX_MARK_EN		(1ULL << 40)
 
 /**
  * Outer UDP checksum offload flag. This flag is used for enabling
@@ -461,6 +462,14 @@ enum {
 #endif
 };
 
+/* Tx packet marking flags in rte_mbuf::tx_mark.
+ * Valid only when PKT_TX_MARK_EN is set in
+ * rte_mbuf::ol_flags.
+ */
+#define TX_MARK_VLAN_DEI	(1ULL << 0)
+#define TX_MARK_IP_DSCP	(1ULL << 1)
+#define TX_MARK_IP_ECN		(1ULL << 2)
+
 /**
  * The generic rte_mbuf, containing a packet mbuf.
  */
@@ -543,6 +552,10 @@ struct rte_mbuf {
 			};
 			uint32_t inner_l4_type:4; /**< Inner L4 type. */
 		};
+		struct {
+			uint32_t reserved:29;
+			uint32_t tx_mark:3;
+		};
 	};



Please correct me if this is not what you mean.

> 
> > 
> > + Techboard
> > 
> > There is a related thread going on
> > http://mails.dpdk.org/archives/dev/2020-May/168810.html
> > 
> > If there is no consensus on email, then I would like to add this item
> > to the next TB meeting.
> 
> Ok, I'll add that to tomorrow meeting agenda.
> Konstantin
>
Olivier Matz June 3, 2020, 8:28 a.m. UTC | #24
Hi,

On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin wrote:
> > Hi Jerin,
> > 
> > > > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > > > Can it probably be squeezed somehow?
> > > > > > Let say we reserve one flag that this information is present or not, and
> > > > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > > > Or might be some other approach.
> > > > >
> > > > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > > > marking and and 3 bits reused from Rx offload flags area.
> > > > >
> > > > > For example:
> > > > >
> > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > >
> > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > >
> > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > +
> > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > >
> > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > >
> > > > > Is this fine ?
> > > >
> > > > Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> > > > and reuse unused Rx flag bits ?
> > 
> > My thought was not about re-defining the flags (I think it is better to keep them intact),
> > but adding a union for one of rx-only fields (packet_type/rss/timestamp).
> 
> Ok. Adding a union field at packet_type field is also fine like below. 
> 
> @@ -187,9 +187,10 @@ extern "C" {
>  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
>  
>  #define PKT_FIRST_FREE (1ULL << 23)
> -#define PKT_LAST_FREE (1ULL << 40)
> +#define PKT_LAST_FREE (1ULL << 39)
>  
>  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> +#define PKT_TX_MARK_EN		(1ULL << 40)
>  
>  /**
>   * Outer UDP checksum offload flag. This flag is used for enabling
> @@ -461,6 +462,14 @@ enum {
>  #endif
>  };
>  
> +/* Tx packet marking flags in rte_mbuf::tx_mark.
> + * Valid only when PKT_TX_MARK_EN is set in
> + * rte_mbuf::ol_flags.
> + */
> +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> +#define TX_MARK_IP_DSCP	(1ULL << 1)
> +#define TX_MARK_IP_ECN		(1ULL << 2)
> +
>  /**
>   * The generic rte_mbuf, containing a packet mbuf.
>   */
> @@ -543,6 +552,10 @@ struct rte_mbuf {
>  			};
>  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
>  		};
> +		struct {
> +			uint32_t reserved:29;
> +			uint32_t tx_mark:3;
> +		};
>  	};
> 
> 
> 
> Please correct me if this is not what you mean.

I'm not a big fan of reusing Rx fields or flags for Tx.
It's not obvious to an application that setting a tx_mark will overwrite
the packet_type. I understand that the risk is limited because packet_type
is Rx and the marks are Tx, but the risk is still there.
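
To make the overwrite concrete, here is a minimal sketch. It is not the
real struct rte_mbuf: demo_mbuf, DEMO_TX_MARK_IP_DSCP and the helper are
hypothetical, cut-down stand-ins that only mirror the proposed union
layout. Because both members share the same 32 bits, storing a mark changes
the value later read back as packet_type.

```c
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the packet_type union in rte_mbuf. */
struct demo_mbuf {
	union {
		uint32_t packet_type;        /* Rx: filled in by the PMD */
		struct {
			uint32_t reserved:29;
			uint32_t tx_mark:3;  /* Tx: filled in by the application */
		};
	};
};

#define DEMO_TX_MARK_IP_DSCP (1u << 1)

/* Returns 1 if storing a Tx mark changed the packet_type read back. */
static int tx_mark_clobbers_ptype(uint32_t rx_ptype)
{
	struct demo_mbuf m;

	m.packet_type = rx_ptype;          /* value left over from Rx */
	m.tx_mark = DEMO_TX_MARK_IP_DSCP;  /* application requests marking */
	return m.packet_type != rx_ptype;
}
```

An application that forwards a received mbuf and still reads packet_type
after setting the mark would see a different value, unless the three
overlapping bits of the received value already happened to match the mark.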

To summarize the different proposed approaches (please correct me if I'm wrong):

a- add 3 Tx mbuf flags
   (-) consumes limited resource

b- add 3 dynamic flags
   (-) slower

c- add 1 Tx flag and union with Rx field
   (-) exclusive with Rx field
   (-) still consumes one flag

My preference is still b-, for these reasons:

- There are many different DPDK use cases, and resources in mbuf are tight.
  Recent contributions (rte_flow and ice driver) already made use of dynamic
  fields/flags.

- When I implemented the dynamic fields/flags feature, I did a test which
  showed that the cost of having a dynamic offset was a few cycles (on my test
  platform, it was ~3 cycles for reading a field and ~2 cycles for writing a
  field).
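
As a rough illustration of option b-, the sketch below models a dynamic
flag with a toy registry. It is a simplified stand-in, not the DPDK
implementation: in DPDK the bit would come from rte_mbuf_dynflag_register(),
and every demo_* name here is hypothetical. The point is only that the bit
position becomes a runtime variable instead of a compile-time constant.

```c
#include <stdint.h>

/* Toy registry handing out free ol_flags bits (bits 23..40 assumed free). */
#define DEMO_FIRST_FREE_BIT 23
#define DEMO_LAST_FREE_BIT  40

static int demo_next_free_bit = DEMO_FIRST_FREE_BIT;

/* Hand out the next unused bit number, or -1 when none is left. */
static int demo_dynflag_register(void)
{
	if (demo_next_free_bit > DEMO_LAST_FREE_BIT)
		return -1;
	return demo_next_free_bit++;
}

/* Mask resolved once at init time; the fast path reads this variable. */
static uint64_t demo_tx_mark_dscp_mask;

static int demo_init(void)
{
	int bit = demo_dynflag_register();

	if (bit < 0)
		return -1;
	demo_tx_mark_dscp_mask = 1ULL << bit;
	return 0;
}

/* Application side: request DSCP marking on one packet's flags. */
static inline void demo_request_dscp_mark(uint64_t *ol_flags)
{
	*ol_flags |= demo_tx_mark_dscp_mask;
}
```

The extra cost compared with a static flag is the load of
demo_tx_mark_dscp_mask, which is the kind of overhead the cycle counts
above refer to.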

Regards,
Olivier

> 
> > 
> > > 
> > > + Techboard
> > > 
> > > There is a related thread going on
> > > http://mails.dpdk.org/archives/dev/2020-May/168810.html
> > > 
> > > If there is no consensus on email, then I would like to add this item
> > > to the next TB meeting.
> > 
> > Ok, I'll add that to tomorrow meeting agenda.
> > Konstantin
> >
Ananyev, Konstantin June 3, 2020, 10:31 a.m. UTC | #25
> > > > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > > > Can it probably be squeezed somehow?
> > > > > > Let say we reserve one flag that this information is present or not, and
> > > > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > > > Or might be some other approach.
> > > > >
> > > > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > > > marking and and 3 bits reused from Rx offload flags area.
> > > > >
> > > > > For example:
> > > > >
> > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > >
> > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > >
> > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > +
> > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > >
> > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > >
> > > > > Is this fine ?
> > > >
> > > > Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> > > > and reuse unused Rx flag bits ?
> >
> > My thought was not about re-defining the flags (I think it is better to keep them intact),
> > but adding a union for one of rx-only fields (packet_type/rss/timestamp).
> 
> Ok. Adding a union field at packet_type field is also fine like below.
> 
> @@ -187,9 +187,10 @@ extern "C" {
>  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> 
>  #define PKT_FIRST_FREE (1ULL << 23)
> -#define PKT_LAST_FREE (1ULL << 40)
> +#define PKT_LAST_FREE (1ULL << 39)
> 
>  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> +#define PKT_TX_MARK_EN		(1ULL << 40)
> 
>  /**
>   * Outer UDP checksum offload flag. This flag is used for enabling
> @@ -461,6 +462,14 @@ enum {
>  #endif
>  };
> 
> +/* Tx packet marking flags in rte_mbuf::tx_mark.
> + * Valid only when PKT_TX_MARK_EN is set in
> + * rte_mbuf::ol_flags.
> + */
> +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> +#define TX_MARK_IP_DSCP	(1ULL << 1)
> +#define TX_MARK_IP_ECN		(1ULL << 2)
> +
>  /**
>   * The generic rte_mbuf, containing a packet mbuf.
>   */
> @@ -543,6 +552,10 @@ struct rte_mbuf {
>  			};
>  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
>  		};
> +		struct {
> +			uint32_t reserved:29;
> +			uint32_t tx_mark:3;
> +		};
>  	};
> 
> 
> 
> Please correct me if this is not what you mean.
> 

Yes, I thought about something like that.
Nithin Dabilpuram June 3, 2020, 10:44 a.m. UTC | #26
On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote:
> Hi,
> 
> On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin wrote:
> > > Hi Jerin,
> > > 
> > > > > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > > > > Can it probably be squeezed somehow?
> > > > > > > Let say we reserve one flag that this information is present or not, and
> > > > > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > > > > Or might be some other approach.
> > > > > >
> > > > > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > > > > marking and and 3 bits reused from Rx offload flags area.
> > > > > >
> > > > > > For example:
> > > > > >
> > > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > > >
> > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > >
> > > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > > +
> > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > >
> > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > > >
> > > > > > Is this fine ?
> > > > >
> > > > > Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> > > > > and reuse unused Rx flag bits ?
> > > 
> > > My thought was not about re-defining the flags (I think it is better to keep them intact),
> > > but adding a union for one of rx-only fields (packet_type/rss/timestamp).
> > 
> > Ok. Adding a union field at packet_type field is also fine like below. 
> > 
> > @@ -187,9 +187,10 @@ extern "C" {
> >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> >  
> >  #define PKT_FIRST_FREE (1ULL << 23)
> > -#define PKT_LAST_FREE (1ULL << 40)
> > +#define PKT_LAST_FREE (1ULL << 39)
> >  
> >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > +#define PKT_TX_MARK_EN		(1ULL << 40)
> >  
> >  /**
> >   * Outer UDP checksum offload flag. This flag is used for enabling
> > @@ -461,6 +462,14 @@ enum {
> >  #endif
> >  };
> >  
> > +/* Tx packet marking flags in rte_mbuf::tx_mark.
> > + * Valid only when PKT_TX_MARK_EN is set in
> > + * rte_mbuf::ol_flags.
> > + */
> > +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> > +#define TX_MARK_IP_DSCP	(1ULL << 1)
> > +#define TX_MARK_IP_ECN		(1ULL << 2)
> > +
> >  /**
> >   * The generic rte_mbuf, containing a packet mbuf.
> >   */
> > @@ -543,6 +552,10 @@ struct rte_mbuf {
> >  			};
> >  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
> >  		};
> > +		struct {
> > +			uint32_t reserved:29;
> > +			uint32_t tx_mark:3;
> > +		};
> >  	};
> > 
> > 
> > 
> > Please correct me if this is not what you mean.
> 
> I'm not a big fan of reusing Rx fields or flags for Tx.
> It's not obvious for an application than adding a tx_mark will overwrite
> the packet_type. I understand that the risk is limited because packet_type
> is Rx and the marks are Tx, but there is still one.

I'm also not a big fan, but I wanted to take this approach so that
it can both conserve space and help the fast path.
Reusing the Rx area is not a new thing, however, as it is already done for
the mbuf->txadapter field.

Apart from the documentation issue, is there any other issue or future
ramification with using Rx fields for Tx?
If it is only about documentation, then we can add more documentation to make things clear.

> 
> To summarize the different proposed approaches (please correct me if I'm wrong):
> 
> a- add 3 Tx mbuf flags
>    (-) consumes limited resource
> 
> b- add 3 dynamic flags
>    (-) slower

- A vector Tx burst implementation can't be done for this Tx offload, as
  the offset keeps changing.

> 
> c- add 1 Tx flag and union with Rx field
>    (-) exclusive with Rx field
>    (-) still consumes one flag
> 
> My preference is still b-, for these reasons:
> 
> - There are many different DPDK use cases, and resources in mbuf is tight.
>   Recent contributions (rte_flow and ice driver) already made use of dynamic
>   fields/flags.
- Since RTE_FLOW metadata is a 32-bit field, it is a clear candidate for
dynamic flags.
- ICE PMD's dynamic field is, however, a vendor-specific field and only for
ICE PMD users.

In this case, it is just 1 bit out of 18 free bits available in ol_flags.
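
For reference, the count of 18 follows from the PKT_FIRST_FREE and
PKT_LAST_FREE values shown in the diff above (bits 23 and 40, inclusive).
A small sketch of that arithmetic, with hypothetical demo_* names:

```c
/* Bit positions taken from PKT_FIRST_FREE / PKT_LAST_FREE in the diff. */
#define DEMO_FIRST_FREE_BIT 23
#define DEMO_LAST_FREE_BIT  40

/* Number of free ol_flags bits in the inclusive range [first, last]. */
static int demo_free_bits(int first, int last)
{
	return last - first + 1;
}
```

With PKT_LAST_FREE moved from bit 40 to bit 39, as in the proposal, 17 free
bits would remain.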

> 
> - When I implemented the dynamic fields/flags feature, I did a test which
>   showed that the cost of having a dynamic offset was few cycles (on my test
>   platform, it was~3 cycles for reading a field and ~2 cycles for writing a
>   field).

I think this cost applies to the case where the location storing dyn_offset
is already in cache, since the offset needs to be read first.


> 
> Regards,
> Olivier
> 
> > 
> > > 
> > > > 
> > > > + Techboard
> > > > 
> > > > There is a related thread going on
> > > > http://mails.dpdk.org/archives/dev/2020-May/168810.html
> > > > 
> > > > If there is no consensus on email, then I would like to add this item
> > > > to the next TB meeting.
> > > 
> > > Ok, I'll add that to tomorrow meeting agenda.
> > > Konstantin
> > > 
> 
>
Olivier Matz June 3, 2020, 11:38 a.m. UTC | #27
Hi Nithin,

On Wed, Jun 03, 2020 at 04:14:14PM +0530, Nithin Dabilpuram wrote:
> On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote:
> > Hi,
> > 
> > On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> > > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin wrote:
> > > > Hi Jerin,
> > > > 
> > > > > > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > > > > > Can it probably be squeezed somehow?
> > > > > > > > Let say we reserve one flag that this information is present or not, and
> > > > > > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > > > > > Or might be some other approach.
> > > > > > >
> > > > > > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > > > > > marking and and 3 bits reused from Rx offload flags area.
> > > > > > >
> > > > > > > For example:
> > > > > > >
> > > > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > > > >
> > > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > > >
> > > > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > > > +
> > > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > > >
> > > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > > > >
> > > > > > > Is this fine ?
> > > > > >
> > > > > > Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> > > > > > and reuse unused Rx flag bits ?
> > > > 
> > > > My thought was not about re-defining the flags (I think it is better to keep them intact),
> > > > but adding a union for one of rx-only fields (packet_type/rss/timestamp).
> > > 
> > > Ok. Adding a union field at packet_type field is also fine like below. 
> > > 
> > > @@ -187,9 +187,10 @@ extern "C" {
> > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > >  
> > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > -#define PKT_LAST_FREE (1ULL << 40)
> > > +#define PKT_LAST_FREE (1ULL << 39)
> > >  
> > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > +#define PKT_TX_MARK_EN		(1ULL << 40)
> > >  
> > >  /**
> > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > @@ -461,6 +462,14 @@ enum {
> > >  #endif
> > >  };
> > >  
> > > +/* Tx packet marking flags in rte_mbuf::tx_mark.
> > > + * Valid only when PKT_TX_MARK_EN is set in
> > > + * rte_mbuf::ol_flags.
> > > + */
> > > +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> > > +#define TX_MARK_IP_DSCP	(1ULL << 1)
> > > +#define TX_MARK_IP_ECN		(1ULL << 2)
> > > +
> > >  /**
> > >   * The generic rte_mbuf, containing a packet mbuf.
> > >   */
> > > @@ -543,6 +552,10 @@ struct rte_mbuf {
> > >  			};
> > >  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
> > >  		};
> > > +		struct {
> > > +			uint32_t reserved:29;
> > > +			uint32_t tx_mark:3;
> > > +		};
> > >  	};
> > > 
> > > 
> > > 
> > > Please correct me if this is not what you mean.
> > 
> > I'm not a big fan of reusing Rx fields or flags for Tx.
> > It's not obvious for an application than adding a tx_mark will overwrite
> > the packet_type. I understand that the risk is limited because packet_type
> > is Rx and the marks are Tx, but there is still one.
> 
> I'm also not a big fan but just wanted to take this approach so that,
> it can both conserve space and also help fast path.
> Reusing Rx area is however not a new thing as is already followed for
> mbuf->txadapter field.

Yes, and in my opinion this is something we should avoid when possible,
because it makes some features exclusive (ex: the big union with
sched/rss/adapter/usr/...).

> Apart from documentation issue, Is there any other issue or future 
> ramification with using Rx field's for Tx ?

No, I don't see any other issue except the ones we already mentioned (doc, code clarity).

> If it is only about documentation, then we can add more documentation to make things clear.
> > 
> > To summarize the different proposed approaches (please correct me if I'm wrong):
> > 
> > a- add 3 Tx mbuf flags
> >    (-) consumes limited resource
> > 
> > b- add 3 dynamic flags
> >    (-) slower
> 
> - Tx burst Vector implementation can't be done for this tx offload as
>   offset keeps changing.

A vector implementation can be done. But yes, it would be slower than
with a static flag.

> > 
> > c- add 1 Tx flag and union with Rx field
> >    (-) exclusive with Rx field
> >    (-) still consumes one flag
> > 
> > My preference is still b-, for these reasons:
> > 
> > - There are many different DPDK use cases, and resources in mbuf is tight.
> >   Recent contributions (rte_flow and ice driver) already made use of dynamic
> >   fields/flags.
> - Since RTE_FLOW metadata is 32-bit field, it is a clear candidate for
> dynamic flags.

I'm not sure I understand why it is a better candidate than packet marking.
Do you mean because it requires more room in the mbuf?

> - ICE PMD's dynamic field is however a vendor specific field and only for
> ICE PMD users.

Yes, but ICE PMD users may be as important as packet marking users.

> In this case, it is just 1 bit out of 18 free bits available in ol_flags.
> 
> > 
> > - When I implemented the dynamic fields/flags feature, I did a test which
> >   showed that the cost of having a dynamic offset was few cycles (on my test
> >   platform, it was~3 cycles for reading a field and ~2 cycles for writing a
> >   field).
> 
> I think this cost is of the case where the address where the dyn_offset is
> stored is already in cache as it needs to be read first.

This fetch of the value (in case it is not in cache) can be done once per bulk,
so I'm not sure the impact would be high.
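
A minimal sketch of what "once per bulk" could look like, assuming the
dynamic flag has been resolved into a mask variable at init time
(demo_pkt, demo_mark_mask and demo_count_marked are hypothetical names,
not DPDK APIs):

```c
#include <stddef.h>
#include <stdint.h>

struct demo_pkt {
	uint64_t ol_flags;
};

/* Written once at init, e.g. after a dynamic flag has been registered. */
static uint64_t demo_mark_mask;

/* Count packets requesting marking; the mask is loaded once per burst. */
static size_t demo_count_marked(const struct demo_pkt *pkts, size_t n)
{
	const uint64_t mask = demo_mark_mask; /* single load for the burst */
	size_t i, marked = 0;

	for (i = 0; i < n; i++)
		marked += (pkts[i].ol_flags & mask) != 0;
	return marked;
}
```

The possible cache miss on demo_mark_mask is then paid once per burst
rather than once per packet.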


Regards,
Olivier


> 
> 
> > 
> > Regards,
> > Olivier
> > 
> > > 
> > > > 
> > > > > 
> > > > > + Techboard
> > > > > 
> > > > > There is a related thread going on
> > > > > http://mails.dpdk.org/archives/dev/2020-May/168810.html
> > > > > 
> > > > > If there is no consensus on email, then I would like to add this item
> > > > > to the next TB meeting.
> > > > 
> > > > Ok, I'll add that to tomorrow meeting agenda.
> > > > Konstantin
> > > > 
> > 
> >
Nithin Dabilpuram June 3, 2020, 12:52 p.m. UTC | #28
On Wed, Jun 03, 2020 at 01:38:22PM +0200, Olivier Matz wrote:
> Hi Nithin,
> 
> On Wed, Jun 03, 2020 at 04:14:14PM +0530, Nithin Dabilpuram wrote:
> > On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote:
> > > Hi,
> > > 
> > > On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> > > > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin wrote:
> > > > > Hi Jerin,
> > > > > 
> > > > > > > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > > > > > > Can it probably be squeezed somehow?
> > > > > > > > > Let say we reserve one flag that this information is present or not, and
> > > > > > > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > > > > > > Or might be some other approach.
> > > > > > > >
> > > > > > > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > > > > > > marking and and 3 bits reused from Rx offload flags area.
> > > > > > > >
> > > > > > > > For example:
> > > > > > > >
> > > > > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > > > > >
> > > > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > > > > >
> > > > > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > > > > +
> > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > > > >
> > > > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > > > > >
> > > > > > > > Is this fine ?
> > > > > > >
> > > > > > > Any thoughts on this approach which uses only 1 bit in Tx flags out of 18
> > > > > > > and reuse unused Rx flag bits ?
> > > > > 
> > > > > My thought was not about re-defining the flags (I think it is better to keep them intact),
> > > > > but adding a union for one of rx-only fields (packet_type/rss/timestamp).
> > > > 
> > > > Ok. Adding a union field at packet_type field is also fine like below. 
> > > > 
> > > > @@ -187,9 +187,10 @@ extern "C" {
> > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > >  
> > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > >  
> > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > +#define PKT_TX_MARK_EN		(1ULL << 40)
> > > >  
> > > >  /**
> > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > @@ -461,6 +462,14 @@ enum {
> > > >  #endif
> > > >  };
> > > >  
> > > > +/* Tx packet marking flags in rte_mbuf::tx_mark.
> > > > + * Valid only when PKT_TX_MARK_EN is set in
> > > > + * rte_mbuf::ol_flags.
> > > > + */
> > > > +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> > > > +#define TX_MARK_IP_DSCP	(1ULL << 1)
> > > > +#define TX_MARK_IP_ECN		(1ULL << 2)
> > > > +
> > > >  /**
> > > >   * The generic rte_mbuf, containing a packet mbuf.
> > > >   */
> > > > @@ -543,6 +552,10 @@ struct rte_mbuf {
> > > >  			};
> > > >  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
> > > >  		};
> > > > +		struct {
> > > > +			uint32_t reserved:29;
> > > > +			uint32_t tx_mark:3;
> > > > +		};
> > > >  	};
> > > > 
> > > > 
> > > > 
> > > > Please correct me if this is not what you mean.
> > > 
> > > I'm not a big fan of reusing Rx fields or flags for Tx.
> > > It's not obvious for an application than adding a tx_mark will overwrite
> > > the packet_type. I understand that the risk is limited because packet_type
> > > is Rx and the marks are Tx, but there is still one.
> > 
> > I'm also not a big fan but just wanted to take this approach so that,
> > it can both conserve space and also help fast path.
> > Reusing Rx area is however not a new thing as is already followed for
> > mbuf->txadapter field.
> 
> Yes, and in my opinion this is something we should avoid when possible,
> because it makes some features exclusive (ex: the big union with
> sched/rss/adapter/usr/...).
> 
> > Apart from documentation issue, Is there any other issue or future 
> > ramification with using Rx field's for Tx ?
> 
> No, I don't see any other issue except the ones we already mentioned (doc, code clarity, ).
> 
> > If it is only about documentation, then we can add more documentation to make things clear.
> > > 
> > > To summarize the different proposed approaches (please correct me if I'm wrong):
> > > 
> > > a- add 3 Tx mbuf flags
> > >    (-) consumes limited resource
> > > 
> > > b- add 3 dynamic flags
> > >    (-) slower
> > 
> > - Tx burst Vector implementation can't be done for this tx offload as
> >   offset keeps changing.
> 
> A vector implementation can be done. But yes, it would be slower than
> with a static flag.

Very slow, at least on our HW, as we would have to translate ol_flags to
HW descriptor flags in addition to extra operations such as offset
calculations.

So if we have fixed offsets, then it is easy to have static constant
128/256-bit words with offsets and use things like shuffle/table lookup
to reorganize multiple mbuf flags into descriptor fields in a single instruction.
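
As a scalar analogue of that idea (a sketch only: the bit positions, the
descriptor encodings and all demo_* names are made up for illustration and
do not describe any real HW), fixed flag positions allow a constant table
indexed directly by the flag bits. A vector PMD would apply the same
constant table to several packets at once with a shuffle/table-lookup
instruction.

```c
#include <stdint.h>

/* Hypothetical fixed position of three marking bits in ol_flags. */
#define DEMO_MARK_SHIFT 41
#define DEMO_MARK_MASK  0x7ULL

/* Hypothetical descriptor encodings, one per combination of the 3 bits. */
static const uint8_t demo_desc_lut[8] = {
	0x00, 0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70,
};

/* Translate the marking bits of ol_flags into a descriptor field. */
static inline uint8_t demo_flags_to_desc(uint64_t ol_flags)
{
	return demo_desc_lut[(ol_flags >> DEMO_MARK_SHIFT) & DEMO_MARK_MASK];
}
```

With a dynamic flag, DEMO_MARK_SHIFT would be a runtime value, so the shift
and any shuffle mask derived from it could no longer be compile-time
constants.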

> 
> > > 
> > > c- add 1 Tx flag and union with Rx field
> > >    (-) exclusive with Rx field
> > >    (-) still consumes one flag
> > > 
> > > My preference is still b-, for these reasons:
> > > 
> > > - There are many different DPDK use cases, and resources in mbuf is tight.
> > >   Recent contributions (rte_flow and ice driver) already made use of dynamic
> > >   fields/flags.
> > - Since RTE_FLOW metadata is 32-bit field, it is a clear candidate for
> > dynamic flags.
> 
> I'm not sure to get why it is a better candidate than packet marking.
> You mean because it requires more room in mbuf?

Yes, I feel space consumption is one way to decide whether it should be
a dynfield or a static field.

IMO, another criterion could be whether the field definition/usage itself
is a well-known standard and part of the RTE spec, or whether its definition is vendor-specific.

> 
> > - ICE PMD's dynamic field is however a vendor specific field and only for
> > ICE PMD users.
> 
> Yes, but ICE PMD users may be as important as packet marking users.

Agreed. I only meant that the flag the ICE PMD registered cannot be used by other PMDs,
so by using a dynamic field we avoid wasting a static field that is needed
by only one specific PMD, irrespective of whether that PMD is probed or not.

> 
> > In this case, it is just 1 bit out of 18 free bits available in ol_flags.
> > 
> > > 
> > > - When I implemented the dynamic fields/flags feature, I did a test which
> > >   showed that the cost of having a dynamic offset was few cycles (on my test
> > >   platform, it was~3 cycles for reading a field and ~2 cycles for writing a
> > >   field).
> > 
> > I think this cost is of the case where the address where the dyn_offset is
> > stored is already in cache as it needs to be read first.
> 
> This fetch of the value (in case it is not in cache) can be done once per bulk,
> so I'm not sure the impact would be high.

Agreed, for the bulk case the offset load should have less impact.

> 
> 
> Regards,
> Olivier
> 
> 
> > 
> > 
> > > 
> > > Regards,
> > > Olivier
> > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > + Techboard
> > > > > > 
> > > > > > There is a related thread going on
> > > > > > http://mails.dpdk.org/archives/dev/2020-May/168810.html
> > > > > > 
> > > > > > If there is no consensus on email, then I would like to add this item
> > > > > > to the next TB meeting.
> > > > > 
> > > > > Ok, I'll add that to tomorrow meeting agenda.
> > > > > Konstantin
> > > > > 
> > > 
> > >
Thomas Monjalon June 3, 2020, 2:56 p.m. UTC | #29
03/06/2020 13:38, Olivier Matz:
> On Wed, Jun 03, 2020 at 04:14:14PM +0530, Nithin Dabilpuram wrote:
> > On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote:
> > > On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> > > > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin wrote:
> > > > > > > > > I also share Olivier's concern about consuming 3 bits in ol_flags for that feature.
> > > > > > > > > Can it probably be squeezed somehow?
> > > > > > > > > Let say we reserve one flag that this information is present or not, and
> > > > > > > > > re-use one of rx-only fields for store additional information (packet_type, or so).
> > > > > > > > > Or might be some other approach.
> > > > > > > >
> > > > > > > > We are fine with this approach where we define one bit in Tx offloads for pkt
> > > > > > > > marking and and 3 bits reused from Rx offload flags area.
[...]
> > > I'm not a big fan of reusing Rx fields or flags for Tx.
> > > It's not obvious for an application than adding a tx_mark will overwrite
> > > the packet_type. I understand that the risk is limited because packet_type
> > > is Rx and the marks are Tx, but there is still one.

Mixing Rx and Tx info in the same field is a bad design pattern
which will create a lot of difficult bugs.
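
To make the hazard concrete, here is a minimal sketch of the union proposed
earlier in the thread. The type and function names are hypothetical and this
is not the real rte_mbuf; the exact bit-field placement is also
implementation-defined, so only the fact that the two members alias each other
should be relied on:

```c
#include <stdint.h>

/* Hypothetical, reduced stand-in for the proposed layout; not rte_mbuf. */
struct demo_mbuf {
	union {
		uint32_t packet_type;       /* Rx: filled in by the driver */
		struct {
			uint32_t reserved:29;
			uint32_t tx_mark:3; /* Tx: set by the application */
		};
	};
};

/* Returns 1 if setting a Tx mark altered the Rx packet_type value. */
static int demo_mark_clobbers_ptype(uint32_t rx_ptype, uint32_t mark)
{
	struct demo_mbuf m;

	m.packet_type = rx_ptype; /* value left behind by the Rx path */
	m.tx_mark = mark & 0x7;   /* application requests marking on Tx */
	return m.packet_type != rx_ptype;
}
```

A forwarding application that reads packet_type after setting tx_mark on the
same mbuf would therefore see a value the Rx driver never wrote.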


> > I'm also not a big fan but just wanted to take this approach so that,
> > it can both conserve space and also help fast path.
> > Reusing Rx area is however not a new thing as is already followed for
> > mbuf->txadapter field.

Yes there is a txadapter field union'ed with flow director and QoS.
This is a bad pattern that I highlighted in this presentation (slide 8):
https://www.dpdk.org/wp-content/uploads/sites/35/2019/10/DynamicMbuf.pdf

> Yes, and in my opinion this is something we should avoid when possible,
> because it makes some features exclusive (ex: the big union with
> sched/rss/adapter/usr/...).

Yes, the "RSS union" must be cleaned-up, as some other mbuf parts.


> > Apart from documentation issue, Is there any other issue or future 
> > ramification with using Rx field's for Tx ?
> 
> No, I don't see any other issue except the ones we already mentioned
> (doc, code clarity, ).

"doc clarity" should be understood as the opposite of
"design leading inevitably to bugs".

> > If it is only about documentation, then we can add more documentation to make things clear.

More documentation won't make a bad design better, unfortunately.


> > > To summarize the different proposed approaches (please correct me if I'm wrong):
> > > 
> > > a- add 3 Tx mbuf flags
> > >    (-) consumes limited resource
> > > 
> > > b- add 3 dynamic flags
> > >    (-) slower
> > 
> > - Tx burst Vector implementation can't be done for this tx offload as
> >   offset keeps changing.
> 
> A vector implementation can be done. But yes, it would be slower than
> with a static flag.
> 
> > > c- add 1 Tx flag and union with Rx field
> > >    (-) exclusive with Rx field
> > >    (-) still consumes one flag
> > > 
> > > My preference is still b-, for these reasons:

Me too, my preference is (b).
Slava Ovsiienko June 3, 2020, 4:14 p.m. UTC | #30
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nithin Dabilpuram
> Sent: Wednesday, June 3, 2020 15:52
> To: Olivier Matz <olivier.matz@6wind.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Jerin Jacob
> <jerinjacobk@gmail.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; Nithin Dabilpuram
> <nithind1988@gmail.com>; Thomas Monjalon <thomas@monjalon.net>;
> Yigit, Ferruh <ferruh.yigit@intel.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; Ori Kam <orika@mellanox.com>; Burakov,
> Anatoly <anatoly.burakov@intel.com>; Mcnamara, John
> <john.mcnamara@intel.com>; Kovacevic, Marko
> <marko.kovacevic@intel.com>; dpdk-dev <dev@dpdk.org>; Jerin Jacob
> <jerinj@marvell.com>; Krzysztof Kanas <kkanas@marvell.com>;
> techboard@dpdk.org
> Subject: Re: [dpdk-dev] [EXT] Re: [PATCH 1/3] mbuf: add Tx offloads for
> packet marking
> 
> On Wed, Jun 03, 2020 at 01:38:22PM +0200, Olivier Matz wrote:
> > Hi Nithin,
> >
> > On Wed, Jun 03, 2020 at 04:14:14PM +0530, Nithin Dabilpuram wrote:
> > > On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote:
> > > > Hi,
> > > >
> > > > On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> > > > > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin
> wrote:
> > > > > > Hi Jerin,
> > > > > >
> > > > > > > > > > I also share Olivier's concern about consuming 3 bits in
> ol_flags for that feature.
> > > > > > > > > > Can it probably be squeezed somehow?
> > > > > > > > > > Let say we reserve one flag that this information is
> > > > > > > > > > present or not, and re-use one of rx-only fields for store
> additional information (packet_type, or so).
> > > > > > > > > > Or might be some other approach.
> > > > > > > > >
> > > > > > > > > We are fine with this approach where we define one bit
> > > > > > > > > in Tx offloads for pkt marking and and 3 bits reused from Rx
> offload flags area.
> > > > > > > > >
> > > > > > > > > For example:
> > > > > > > > >
> > > > > > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > > > > > >
> > > > > > > > >  /* add new RX flags here, don't forget to update
> > > > > > > > > PKT_FIRST_FREE */
> > > > > > > > >
> > > > > > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > > > > > +
> > > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23) -#define
> > > > > > > > > PKT_LAST_FREE (1ULL << 40)
> > > > > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > > > > >
> > > > > > > > >  /* add new TX flags here, don't forget to update
> > > > > > > > > PKT_LAST_FREE  */
> > > > > > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > > > > > >
> > > > > > > > > Is this fine ?
> > > > > > > >
> > > > > > > > Any thoughts on this approach which uses only 1 bit in Tx
> > > > > > > > flags out of 18 and reuse unused Rx flag bits ?
> > > > > >
> > > > > > My thought was not about re-defining the flags (I think it is
> > > > > > better to keep them intact), but adding a union for one of rx-only
> fields (packet_type/rss/timestamp).
> > > > >
> > > > > Ok. Adding a union field at packet_type field is also fine like below.
> > > > >
> > > > > @@ -187,9 +187,10 @@ extern "C" {
> > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE
> > > > > */
> > > > >
> > > > >  #define PKT_FIRST_FREE (1ULL << 23) -#define PKT_LAST_FREE
> > > > > (1ULL << 40)
> > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > >
> > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE
> > > > > */
> > > > > +#define PKT_TX_MARK_EN		(1ULL << 40)
> > > > >
> > > > >  /**
> > > > >   * Outer UDP checksum offload flag. This flag is used for
> > > > > enabling @@ -461,6 +462,14 @@ enum {  #endif  };
> > > > >
> > > > > +/* Tx packet marking flags in rte_mbuf::tx_mark.
> > > > > + * Valid only when PKT_TX_MARK_EN is set in
> > > > > + * rte_mbuf::ol_flags.
> > > > > + */
> > > > > +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> > > > > +#define TX_MARK_IP_DSCP	(1ULL << 1)
> > > > > +#define TX_MARK_IP_ECN		(1ULL << 2)
> > > > > +
> > > > >  /**
> > > > >   * The generic rte_mbuf, containing a packet mbuf.
> > > > >   */
> > > > > @@ -543,6 +552,10 @@ struct rte_mbuf {
> > > > >  			};
> > > > >  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
> > > > >  		};
> > > > > +		struct {
> > > > > +			uint32_t reserved:29;
> > > > > +			uint32_t tx_mark:3;
> > > > > +		};
> > > > >  	};
> > > > >
> > > > >
> > > > >
> > > > > Please correct me if this is not what you mean.
> > > >
> > > > I'm not a big fan of reusing Rx fields or flags for Tx.
> > > > It's not obvious for an application than adding a tx_mark will
> > > > overwrite the packet_type. I understand that the risk is limited
> > > > because packet_type is Rx and the marks are Tx, but there is still one.
> > >
> > > I'm also not a big fan but just wanted to take this approach so
> > > that, it can both conserve space and also help fast path.
> > > Reusing Rx area is however not a new thing as is already followed
> > > for
> > > mbuf->txadapter field.
> >
> > Yes, and in my opinion this is something we should avoid when
> > possible, because it makes some features exclusive (ex: the big union
> > with sched/rss/adapter/usr/...).
> >
> > > Apart from documentation issue, Is there any other issue or future
> > > ramification with using Rx field's for Tx ?
> >
> > No, I don't see any other issue except the ones we already mentioned (doc,
> code clarity, ).
> >
> > > If it is only about documentation, then we can add more documentation
> to make things clear.
> > > >
> > > > To summarize the different proposed approaches (please correct me if
> I'm wrong):
> > > >
> > > > a- add 3 Tx mbuf flags
> > > >    (-) consumes limited resource
> > > >
> > > > b- add 3 dynamic flags
> > > >    (-) slower
> > >
> > > - Tx burst Vector implementation can't be done for this tx offload as
> > >   offset keeps changing.
> >
> > A vector implementation can be done. But yes, it would be slower than
> > with a static flag.
> 
> Very slow atleast in our HW as, we try to translate ol_flags to HW descriptor
> flags in addition to extra operations to be done like offset calculations etc.

The dynamic flag offset does not change after registration.
So the flag offset can be converted once into an appropriate mask and stored locally by the PMD for further use in vector instructions.
The only difference, a loaded variable mask instead of a constant one, should not affect performance too much.
With cmpeq/shifts, the AND results can be converted to any desired predefined mask (like the static one).
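
A scalar sketch of that idea, under stated assumptions: demo_mbuf, demo_txq
and both functions are hypothetical stand-ins, not DPDK or PMD code. In a real
driver the bit number would be the value returned by
rte_mbuf_dynflag_register(); here it is simply passed in.

```c
#include <stdint.h>

/* Hypothetical stand-ins; not the real rte_mbuf or any PMD queue struct. */
struct demo_mbuf {
	uint64_t ol_flags;
};

struct demo_txq {
	uint64_t mark_dscp_mask; /* cached once, local to the queue */
};

/* Convert the registered bit number into a mask once, at queue setup. */
static void demo_txq_setup(struct demo_txq *q, int dscp_bitnum)
{
	q->mark_dscp_mask = 1ULL << dscp_bitnum;
}

/* Count the packets in a burst that request DSCP marking. */
static unsigned int demo_count_marked(const struct demo_txq *q,
				      struct demo_mbuf *const *pkts,
				      unsigned int n)
{
	const uint64_t mask = q->mark_dscp_mask; /* one load per burst */
	unsigned int marked = 0;

	for (unsigned int i = 0; i < n; i++)
		marked += (pkts[i]->ol_flags & mask) != 0;
	return marked;
}
```

The per-packet work is the same AND as with a static flag; the only extra cost
in this sketch is the single load of the cached mask at the start of the burst.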

> 
> So if we have fixed offsets, then it is easy to have static constant

BTW, how is this constant loaded?
I suppose on x86 it would be something like mm_load?_si128(*constant array), so there is no
difference between dynamic and static masks at all. One hint: make a local copy of the
mask/offset to avoid cache-line contention on the global variable storage.

> 128/256 bit words with offsets and use things like shuffle/table lookup to
> reorganize multiple mbuf flags to descriptor fields in a single instruction.

Offsets in the HW descriptor remain fixed, so the shuffle would still work OK.
Do you already have a vectorized implementation? I would be curious to have a look at it.

I strongly share the concern about defining the static mbuf flags,
we should consider all ways to avoid doing this.

WBR, Slava
> 
> >
> > > >
> > > > c- add 1 Tx flag and union with Rx field
> > > >    (-) exclusive with Rx field
> > > >    (-) still consumes one flag
> > > >
> > > > My preference is still b-, for these reasons:
> > > >
> > > > - There are many different DPDK use cases, and resources in mbuf is
> tight.
> > > >   Recent contributions (rte_flow and ice driver) already made use of
> dynamic
> > > >   fields/flags.
> > > - Since RTE_FLOW metadata is 32-bit field, it is a clear candidate
> > > for dynamic flags.
> >
> > I'm not sure to get why it is a better candidate than packet marking.
> > You mean because it requires more room in mbuf?
> 
> Yes, I feel space consumption is one way to decide whether it should be a
> dynfield or static field.
> 
> IMO, other parameter to judge could be whether the field definition/usage
> itself is well know standard and is a part of RTE spec or its definition is
> vendor specific.
> 
> >
> > > - ICE PMD's dynamic field is however a vendor specific field and
> > > only for ICE PMD users.
> >
> > Yes, but ICE PMD users may be as important as packet marking users.
> 
> Agree, I only meant that the flag ICE PMD registered cannot be used for
> other PMD's so by using dynamic field, we are avoiding wastage of a static
> field that is needed only by one specific PMD irrespective of whether that
> PMD is probed or not.
> 
> >
> > > In this case, it is just 1 bit out of 18 free bits available in ol_flags.
> > >
> > > >
> > > > - When I implemented the dynamic fields/flags feature, I did a test
> which
> > > >   showed that the cost of having a dynamic offset was few cycles (on
> my test
> > > >   platform, it was~3 cycles for reading a field and ~2 cycles for writing a
> > > >   field).
> > >
> > > I think this cost is of the case where the address where the
> > > dyn_offset is stored is already in cache as it needs to be read first.
> >
> > This fetch of the value (in case it is not in cache) can be done once
> > per bulk, so I'm not sure the impact would be high.
> 
> Agreed, for bulk case offset loading should have less impact.
> 
> >
> >
> > Regards,
> > Olivier
> >
> >
> > >
> > >
> > > >
> > > > Regards,
> > > > Olivier
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > + Techboard
> > > > > > >
> > > > > > > There is a related thread going on
> > > > > > > > http://mails.dpdk.org/archives/dev/2020-May/168810.html
> > > > > > >
> > > > > > > If there is no consensus on email, then I would like to add
> > > > > > > this item to the next TB meeting.
> > > > > >
> > > > > > Ok, I'll add that to tomorrow meeting agenda.
> > > > > > Konstantin
> > > > > >
> > > >
> > > >
Nithin Dabilpuram June 8, 2020, 9:39 a.m. UTC | #31
On Wed, Jun 03, 2020 at 04:14:13PM +0000, Slava Ovsiienko wrote:
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Nithin Dabilpuram
> > Sent: Wednesday, June 3, 2020 15:52
> > To: Olivier Matz <olivier.matz@6wind.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Jerin Jacob
> > <jerinjacobk@gmail.com>; Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com>; Nithin Dabilpuram
> > <nithind1988@gmail.com>; Thomas Monjalon <thomas@monjalon.net>;
> > Yigit, Ferruh <ferruh.yigit@intel.com>; Andrew Rybchenko
> > <arybchenko@solarflare.com>; Ori Kam <orika@mellanox.com>; Burakov,
> > Anatoly <anatoly.burakov@intel.com>; Mcnamara, John
> > <john.mcnamara@intel.com>; Kovacevic, Marko
> > <marko.kovacevic@intel.com>; dpdk-dev <dev@dpdk.org>; Jerin Jacob
> > <jerinj@marvell.com>; Krzysztof Kanas <kkanas@marvell.com>;
> > techboard@dpdk.org
> > Subject: Re: [dpdk-dev] [EXT] Re: [PATCH 1/3] mbuf: add Tx offloads for
> > packet marking
> > 
> > On Wed, Jun 03, 2020 at 01:38:22PM +0200, Olivier Matz wrote:
> > > Hi Nithin,
> > >
> > > On Wed, Jun 03, 2020 at 04:14:14PM +0530, Nithin Dabilpuram wrote:
> > > > On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote:
> > > > > Hi,
> > > > >
> > > > > On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> > > > > > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin
> > wrote:
> > > > > > > Hi Jerin,
> > > > > > >
> > > > > > > > > > > I also share Olivier's concern about consuming 3 bits in
> > ol_flags for that feature.
> > > > > > > > > > > Can it probably be squeezed somehow?
> > > > > > > > > > > Let say we reserve one flag that this information is
> > > > > > > > > > > present or not, and re-use one of rx-only fields for store
> > additional information (packet_type, or so).
> > > > > > > > > > > Or might be some other approach.
> > > > > > > > > >
> > > > > > > > > > We are fine with this approach where we define one bit
> > > > > > > > > > in Tx offloads for pkt marking and and 3 bits reused from Rx
> > offload flags area.
> > > > > > > > > >
> > > > > > > > > > For example:
> > > > > > > > > >
> > > > > > > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > > > > > > >
> > > > > > > > > >  /* add new RX flags here, don't forget to update
> > > > > > > > > > PKT_FIRST_FREE */
> > > > > > > > > >
> > > > > > > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > > > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > > > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > > > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > > > > > > +
> > > > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23) -#define
> > > > > > > > > > PKT_LAST_FREE (1ULL << 40)
> > > > > > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > > > > > >
> > > > > > > > > >  /* add new TX flags here, don't forget to update
> > > > > > > > > > PKT_LAST_FREE  */
> > > > > > > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > > > > > > >
> > > > > > > > > > Is this fine ?
> > > > > > > > >
> > > > > > > > > Any thoughts on this approach which uses only 1 bit in Tx
> > > > > > > > > flags out of 18 and reuse unused Rx flag bits ?
> > > > > > >
> > > > > > > My thought was not about re-defining the flags (I think it is
> > > > > > > better to keep them intact), but adding a union for one of rx-only
> > fields (packet_type/rss/timestamp).
> > > > > >
> > > > > > Ok. Adding a union field at packet_type field is also fine like below.
> > > > > >
> > > > > > @@ -187,9 +187,10 @@ extern "C" {
> > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE
> > > > > > */
> > > > > >
> > > > > >  #define PKT_FIRST_FREE (1ULL << 23) -#define PKT_LAST_FREE
> > > > > > (1ULL << 40)
> > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > >
> > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE
> > > > > > */
> > > > > > +#define PKT_TX_MARK_EN		(1ULL << 40)
> > > > > >
> > > > > >  /**
> > > > > >   * Outer UDP checksum offload flag. This flag is used for
> > > > > > enabling @@ -461,6 +462,14 @@ enum {  #endif  };
> > > > > >
> > > > > > +/* Tx packet marking flags in rte_mbuf::tx_mark.
> > > > > > + * Valid only when PKT_TX_MARK_EN is set in
> > > > > > + * rte_mbuf::ol_flags.
> > > > > > + */
> > > > > > +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> > > > > > +#define TX_MARK_IP_DSCP	(1ULL << 1)
> > > > > > +#define TX_MARK_IP_ECN		(1ULL << 2)
> > > > > > +
> > > > > >  /**
> > > > > >   * The generic rte_mbuf, containing a packet mbuf.
> > > > > >   */
> > > > > > @@ -543,6 +552,10 @@ struct rte_mbuf {
> > > > > >  			};
> > > > > >  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
> > > > > >  		};
> > > > > > +		struct {
> > > > > > +			uint32_t reserved:29;
> > > > > > +			uint32_t tx_mark:3;
> > > > > > +		};
> > > > > >  	};
> > > > > >
> > > > > >
> > > > > >
> > > > > > Please correct me if this is not what you mean.
> > > > >
> > > > > I'm not a big fan of reusing Rx fields or flags for Tx.
> > > > > It's not obvious for an application than adding a tx_mark will
> > > > > overwrite the packet_type. I understand that the risk is limited
> > > > > because packet_type is Rx and the marks are Tx, but there is still one.
> > > >
> > > > I'm also not a big fan but just wanted to take this approach so
> > > > that, it can both conserve space and also help fast path.
> > > > Reusing Rx area is however not a new thing as is already followed
> > > > for
> > > > mbuf->txadapter field.
> > >
> > > Yes, and in my opinion this is something we should avoid when
> > > possible, because it makes some features exclusive (ex: the big union
> > > with sched/rss/adapter/usr/...).
> > >
> > > > Apart from documentation issue, Is there any other issue or future
> > > > ramification with using Rx field's for Tx ?
> > >
> > > No, I don't see any other issue except the ones we already mentioned (doc,
> > code clarity, ).
> > >
> > > > If it is only about documentation, then we can add more documentation
> > to make things clear.
> > > > >
> > > > > To summarize the different proposed approaches (please correct me if
> > I'm wrong):
> > > > >
> > > > > a- add 3 Tx mbuf flags
> > > > >    (-) consumes limited resource
> > > > >
> > > > > b- add 3 dynamic flags
> > > > >    (-) slower
> > > >
> > > > - Tx burst Vector implementation can't be done for this tx offload as
> > > >   offset keeps changing.
> > >
> > > A vector implementation can be done. But yes, it would be slower than
> > > with a static flag.
> > 
> > Very slow atleast in our HW as, we try to translate ol_flags to HW descriptor
> > flags in addition to extra operations to be done like offset calculations etc.
> 
> The dynamic flag offset is not subject to be changed after registration.
> So, flag offset can be converted once into appropriate mask and stored locally by PMD  for further using in vector instructions.
> The only difference - loaded variable mask instead of constant one, should not affect performance too much.
> With the cmpeq/shifts the and results can be converted to any desired predefined mask (like static one).
> 
> > 
> > So if we have fixed offsets, then it is easy to have static constant
> 
> BTW, how this constant is loaded?
> I suppose on x86 it would be rather mm_load?_si128(*constant array) - no difference 
> between dynamic and static masks at all. BTW, there is the hint - to make local copy of the
> mask/offset in order to avoid cache-line concurrency in global variable storage.
> 
> > 128/256 bit words with offsets and use things like shuffle/table lookup to
> > reorganize multiple mbuf flags to descriptor fields in a single instruction.
> 
> Offsets in the HW descriptor remain the fixed ones, so shuffle would still work OK.
> Do you already have some vectorized implementation? It would be curious to have a look at.

Agree that the constants need to be loaded from memory, but there is also logic
built around transforming that data into the descriptor, which is not always straightforward.

For example, see otx2_tx.c, 

// mbuf has tx offload field L2_LEN of 7 bits, L3_LEN of 9 bits which are not on byte boundary. 
// Below are the transformations done to get them to byte boundary.

/* Get tx_offload for ol2, ol3, l2, l3 lengths from mbuf*/
/*
 * Operation result:
 * E(8):OL2_LEN(7):OL3_LEN(9):E(24):L3_LEN(9):L2_LEN(7)
 * E(8):OL2_LEN(7):OL3_LEN(9):E(24):L3_LEN(9):L2_LEN(7)
 */

  asm volatile ("LD1 {%[a].D}[0],[%[in]]\n\t" :
                [a]"+w"(senddesc01_w1) :
                [in]"r"(mbuf0 + 2) : "memory");


 /*
  * Operation Result:
  * E(47):L3_LEN(9):L2_LEN(7+z)
  * E(47):L3_LEN(9):L2_LEN(7+z)
  */
 senddesc01_w1 = vshlq_n_u64(senddesc01_w1, 1);
 senddesc23_w1 = vshlq_n_u64(senddesc23_w1, 1);

/*
 * Result: 
 * E(48):L3_LEN(8):L2_LEN(z+7)
 * E(48):L3_LEN(8):L2_LEN(z+7)
 */
const int8x16_t tshft3 = {
        -1, 0, 8, 8, 8, 8, 8, 8,
        -1, 0, 8, 8, 8, 8, 8, 8,
};

senddesc01_w1 = vshlq_u8(senddesc01_w1, tshft3);
senddesc23_w1 = vshlq_u8(senddesc23_w1, tshft3);


We cannot do all of the above logic at runtime for every possible position of mbuf->l2_len if
the field position keeps changing on every application run.
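
As a scalar illustration of the two shifts above, here is a rough sketch under
stated assumptions: the function name is hypothetical, only the low 16 bits
(L3_LEN(9):L2_LEN(7)) are modelled, and the real code does this with NEON on
two 64-bit lanes at once. The shift amounts can be hard-coded only because the
bit positions are known at compile time:

```c
#include <stdint.h>

/*
 * Input:  tx_offload with L2_LEN in bits 0..6 and L3_LEN in bits 7..15.
 * Output: L2_LEN in byte 0 and the low 8 bits of L3_LEN in byte 1.
 */
static uint64_t demo_split_l2_l3(uint64_t tx_offload)
{
	/* Step 1: shift left by 1 so that L3_LEN starts on a byte boundary. */
	uint64_t w = tx_offload << 1;

	/* Step 2: per-byte fixup, mirroring the {-1, 0, 8, ...} shift table. */
	uint8_t l2 = (uint8_t)(w & 0xff) >> 1;   /* byte 0: undo the shift */
	uint8_t l3 = (uint8_t)((w >> 8) & 0xff); /* byte 1: kept as is */

	/* The remaining bytes are cleared, as with the shift-by-8 lanes. */
	return ((uint64_t)l3 << 8) | l2;
}
```

If the field positions were only known after registration, both the scalar
shift count and the per-byte shift table would have to be computed at runtime
rather than encoded as constants.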


> 
> I strongly share the concern about defining the static mbuf flags,
> we should consider all ways to avoid doing this.
> 
> WBR, Slava
> > 
> > >
> > > > >
> > > > > c- add 1 Tx flag and union with Rx field
> > > > >    (-) exclusive with Rx field
> > > > >    (-) still consumes one flag
> > > > >
> > > > > My preference is still b-, for these reasons:
> > > > >
> > > > > - There are many different DPDK use cases, and resources in mbuf is
> > tight.
> > > > >   Recent contributions (rte_flow and ice driver) already made use of
> > dynamic
> > > > >   fields/flags.
> > > > - Since RTE_FLOW metadata is 32-bit field, it is a clear candidate
> > > > for dynamic flags.
> > >
> > > I'm not sure to get why it is a better candidate than packet marking.
> > > You mean because it requires more room in mbuf?
> > 
> > Yes, I feel space consumption is one way to decide whether it should be a
> > dynfield or static field.
> > 
> > IMO, other parameter to judge could be whether the field definition/usage
> > itself is well know standard and is a part of RTE spec or its definition is
> > vendor specific.
> > 
> > >
> > > > - ICE PMD's dynamic field is however a vendor specific field and
> > > > only for ICE PMD users.
> > >
> > > Yes, but ICE PMD users may be as important as packet marking users.
> > 
> > Agree, I only meant that the flag ICE PMD registered cannot be used for
> > other PMD's so by using dynamic field, we are avoiding wastage of a static
> > field that is needed only by one specific PMD irrespective of whether that
> > PMD is probed or not.
> > 
> > >
> > > > In this case, it is just 1 bit out of 18 free bits available in ol_flags.
> > > >
> > > > >
> > > > > - When I implemented the dynamic fields/flags feature, I did a test
> > which
> > > > >   showed that the cost of having a dynamic offset was few cycles (on
> > my test
> > > > >   platform, it was~3 cycles for reading a field and ~2 cycles for writing a
> > > > >   field).
> > > >
> > > > I think this cost is of the case where the address where the
> > > > dyn_offset is stored is already in cache as it needs to be read first.
> > >
> > > This fetch of the value (in case it is not in cache) can be done once
> > > per bulk, so I'm not sure the impact would be high.
> > 
> > Agreed, for bulk case offset loading should have less impact.
> > 
> > >
> > >
> > > Regards,
> > > Olivier
> > >
> > >
> > > >
> > > >
> > > > >
> > > > > Regards,
> > > > > Olivier
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > + Techboard
> > > > > > > >
> > > > > > > > There is a related thread going on
> > > > > > > > > http://mails.dpdk.org/archives/dev/2020-May/168810.html
> > > > > > > >
> > > > > > > > If there is no consensus on email, then I would like to add
> > > > > > > > this item to the next TB meeting.
> > > > > > >
> > > > > > > Ok, I'll add that to tomorrow meeting agenda.
> > > > > > > Konstantin
> > > > > > >
> > > > >
> > > > >

Patch
diff mbox series

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index edd21c4..bc978fb 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -913,6 +913,20 @@  Supports to get Rx/Tx packet burst mode information.
 * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
 * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
 
+.. _nic_features_traffic_manager_packet_marking_offload:
+
+Traffic Manager Packet marking offload
+--------------------------------------
+
+Supports enabling packet marking offload on a specific mbuf.
+
+* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
+  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
+  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
+* **[uses]     mbuf**: ``mbuf.l2_len``.
+* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
+  ``rte_tm_mark_vlan_dei()``.
+
 .. _nic_features_other:
 
 Other dev ops not represented by a Feature
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index cd5794d..5c6896d 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -880,6 +880,9 @@  const char *rte_get_tx_ol_flag_name(uint64_t mask)
 	case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
 	case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
 	case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
+	case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
+	case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
+	case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
 	default: return NULL;
 	}
 }
@@ -916,6 +919,9 @@  rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
 		{ PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
 		{ PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
 		{ PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
+		{ PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
+		{ PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
+		{ PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
 	};
 	const char *name;
 	unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index b9a59c8..d9f1290 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -187,11 +187,40 @@  extern "C" {
 /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
 
 #define PKT_FIRST_FREE (1ULL << 23)
-#define PKT_LAST_FREE (1ULL << 40)
+#define PKT_LAST_FREE (1ULL << 37)
 
 /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
 
 /**
+ * Packet marking offload flags. These flags indicate what kind
+ * of packet marking needs to be applied on a given mbuf when
+ * appropriate Traffic Manager configuration is in place.
+ * When the user sets these flags on an mbuf, the assumptions below are made
+ * 1) When PKT_TX_MARK_VLAN_DEI is set,
+ * a) PMD assumes pkt to be a 802.1q packet.
+ * b) Application should also set mbuf.l2_len where 802.1Q header is
+ *    at (mbuf.l2_len - 6) offset.
+ * 2) When PKT_TX_MARK_IP_DSCP is set,
+ * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
+ *    to indicate whether if it is IPv4 packet or IPv6 packet
+ *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
+ *    IPv4 pkt.
+ * b) Application should also set mbuf.l2_len that indicates
+ *    start offset of L3 header.
+ * 3) When PKT_TX_MARK_IP_ECN is set,
+ * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
+ *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
+ * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
+ *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
+ *    can mark the packet for a configured color.
+ * c) Application should also set mbuf.l2_len that indicates
+ *    start offset of L3 header.
+ */
+#define PKT_TX_MARK_VLAN_DEI		(1ULL << 38)
+#define PKT_TX_MARK_IP_DSCP		(1ULL << 39)
+#define PKT_TX_MARK_IP_ECN		(1ULL << 40)
+
+/**
  * Outer UDP checksum offload flag. This flag is used for enabling
  * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
  * 1) Enable the following in mbuf,
@@ -384,7 +413,10 @@  extern "C" {
 		PKT_TX_MACSEC |		 \
 		PKT_TX_SEC_OFFLOAD |	 \
 		PKT_TX_UDP_SEG |	 \
-		PKT_TX_OUTER_UDP_CKSUM)
+		PKT_TX_OUTER_UDP_CKSUM | \
+		PKT_TX_MARK_VLAN_DEI |	 \
+		PKT_TX_MARK_IP_DSCP |	 \
+		PKT_TX_MARK_IP_ECN)
 
 /**
  * Mbuf having an external buffer attached. shinfo in mbuf must be filled.