mbuf: outer offsets must be zero for non-tunnel packets

Message ID 20190621103419.11626-1-ivan.malov@oktetlabs.ru (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series mbuf: outer offsets must be zero for non-tunnel packets

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/Intel-compilation             fail     Compilation issues
ci/mellanox-Performance-Testing  success  Performance Testing PASS
ci/intel-Performance-Testing     success  Performance Testing PASS

Commit Message

Ivan Malov June 21, 2019, 10:34 a.m. UTC
  Make sure that outer L2 and L3 header length fields are
equal to zero for non-tunnel packets in order to ensure
consistent and predictable behaviour in network drivers.
Explain this expectation in comments to help developers.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
---

Notes:
    At the time of writing a couple of network drivers rely on
    the statement (i40e, ice) whilst more drivers have runtime
    conditional checks to guard all references to these fields.
    This patch is likely to relieve datapath checks in drivers.

 lib/librte_mbuf/rte_mbuf.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
  

Comments

Ananyev, Konstantin June 21, 2019, 11:10 a.m. UTC | #1
Hi Ivan,

> 
> Make sure that outer L2 and L3 header length fields are
> equal to zero for non-tunnel packets in order to ensure
> consistent and predictable behaviour in network drivers.
> Explain this expectation in comments to help developers.
> 
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
> ---

Not sure it is a good idea:
1) it is a change in public API behavior (requirements).
2) why these 2 particular tx_offload fields only?
If we follow that logic we should enforce the same rule for the other
tx_offload fields (tso, l4_len, l3_len, etc.)

Personally I don't think there will be much gain from it.
Might be better and easier just to fix offending drivers that make wrong assumptions.

If we still decide to go that way, then I think at least it needs
to be explained in the release notes, and probably the deprecation process has to be followed here.
 
Konstantin

  
Andrew Rybchenko June 21, 2019, 12:35 p.m. UTC | #2
Hi Konstantin,

On 6/21/19 2:10 PM, Ananyev, Konstantin wrote:
> Hi Ivan,
>> Make sure that outer L2 and L3 header length fields are
>> equal to zero for non-tunnel packets in order to ensure
>> consistent and predictable behaviour in network drivers.
>> Explain this expectation in comments to help developers.
>>
>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
>> ---
> Not sure it is a good idea:
> 1) it is a change in public API behavior (requirements).

I would say that it is a clarification. Yes, in terms of
rte_validate_tx_offload() behaviour it is a change. The area looks grey
and we just want to make it either black or white. What is the alternative?
Say that outer_l2_len and outer_l3_len content is undefined if the packet is
not tunnelled and drivers must check (ol_flags & PKT_TX_TUNNEL_MASK) != 0
before using these fields?

bnxt, fm10k, i40e, ixgbe (depends on PKT_TX_OUTER_IP_CKSUM in fact, but
not PKT_TX_TUNNEL_MASK) and ice use these fields w/o tunnel checks (if
I read code correctly).

enic, mlx4, mlx5, qede and sfc use them in the case of tunnel packet only.

I.e. 5 vs 5.
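
The two driver styles counted above can be illustrated with a minimal standalone sketch. It does not include any DPDK headers: `struct demo_mbuf`, `DEMO_TX_TUNNEL_MASK` (including its bit position) and both helper functions are hypothetical stand-ins for `struct rte_mbuf`, `PKT_TX_TUNNEL_MASK` and a PMD's header-length computation, and are not taken from any of the drivers listed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the tunnel-type bits of ol_flags. */
#define DEMO_TX_TUNNEL_MASK (0xFULL << 45)

/* Hypothetical, reduced model of the mbuf Tx offload fields. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
};

/* Unguarded style: trusts the outer lengths to be zero for non-tunnel packets. */
static inline uint32_t
demo_hdr_len_unguarded(const struct demo_mbuf *m)
{
	return m->outer_l2_len + m->outer_l3_len + m->l2_len + m->l3_len;
}

/* Guarded style: reads the outer lengths only when a tunnel type is set. */
static inline uint32_t
demo_hdr_len_guarded(const struct demo_mbuf *m)
{
	uint32_t len = m->l2_len + m->l3_len;

	if ((m->ol_flags & DEMO_TX_TUNNEL_MASK) != 0)
		len += m->outer_l2_len + m->outer_l3_len;
	return len;
}
```

With stale outer lengths left in a non-tunnel mbuf, the two helpers disagree, which is the inconsistency under discussion; the guarded variant pays for its robustness with one extra branch per packet.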

> 2) why these 2 particular tx_offload fields only?
> If we follow that logic we should enforce the same rule for the other
> tx_offload fields (tso, l4_len, l3_len, etc.)

Because it is about tunnel packets and outer_l2_len and outer_l3_len
should be either undefined or 0 for non-tunnel packets.

> Personally I don't think there will be much gain from it.
> Might be better and easier just to fix offending drivers that make wrong assumptions.

We would prefer to define it as the patch suggests since that allows
us to avoid conditions. The other option is to add a comment saying that
the content of these fields is undefined for non-tunnel packets.
Of course, the patch makes it necessary to take care of outer_l2/3_len
when an mbuf is reused and Tx offloads are requested. So maybe,
from the application's point of view, it is better to have them undefined for
non-tunnel packets.
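
As a rough illustration of the application-side cost mentioned here, the following standalone sketch shows what preparing a reused mbuf for a plain send might look like if the rule were enforced. `struct demo_mbuf`, `DEMO_TX_TUNNEL_MASK` and `demo_prepare_plain_tx()` are hypothetical names modelling the real mbuf fields, not DPDK API.

```c
#include <stdint.h>

/* Hypothetical stand-in for the tunnel-type bits of ol_flags. */
#define DEMO_TX_TUNNEL_MASK (0xFULL << 45)

/* Hypothetical, reduced model of the mbuf Tx offload fields. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
};

/* Prepare a reused mbuf for a non-tunnel send under the proposed rule. */
static inline void
demo_prepare_plain_tx(struct demo_mbuf *m, uint16_t l2_len, uint16_t l3_len)
{
	m->ol_flags &= ~DEMO_TX_TUNNEL_MASK;
	m->l2_len = l2_len;
	m->l3_len = l3_len;
	/* Extra work the proposal would require: clear stale tunnel lengths. */
	m->outer_l2_len = 0;
	m->outer_l3_len = 0;
}
```

The last two stores are what an application that previously sent a tunnel packet from the same mbuf would have to add; without the rule they could be skipped.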

> If we still decide to go that way, then I think at least it needs
> to be explained in the release notes, and probably the deprecation process has to be followed here.

Yes, I agree and would like to understand which way is right
(just highlight in release notes or deprecation process).

BTW, may I ask you to take a look at two more small patches:
[1] https://patches.dpdk.org/patch/53691/
[2] https://patches.dpdk.org/patch/53857/

Many thanks,
Andrew.

(As Keith said some time ago, it looks like almost nobody looks at RFC
patches. Sad. The main goal of RFC patches is to get feedback earlier.
The RFC for this one was in April and we could have started the deprecation
process in the previous release cycle if it is required. Luckily it is not
critical in this case.)

  
Ananyev, Konstantin June 24, 2019, 12:59 p.m. UTC | #3
Hi Andrew,

> Hi Konstantin,
> 
> On 6/21/19 2:10 PM, Ananyev, Konstantin wrote:
>> Hi Ivan,
>>
>>> Make sure that outer L2 and L3 header length fields are
>>> equal to zero for non-tunnel packets in order to ensure
>>> consistent and predictable behaviour in network drivers.
>>> Explain this expectation in comments to help developers.
>>>
>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
>>> ---
> 
>> Not sure it is a good idea:
>> 1) it is a change in public API behavior (requirements).
> 
> I would say that it is a clarification. Yes, in terms of rte_validate_tx_offload()
> behaviour it is a change. The area looks grey and we just want to make
> it either black or white. What is the alternative? Say that outer_l2_len and
> outer_l3_len content is undefined if the packet is not tunnelled and drivers
> must check (ol_flags & PKT_TX_TUNNEL_MASK) != 0 before using these fields?

Yes, that was my thought.
As I understand it, that is what is implied right now.
Otherwise, any app that sets up tx_offload fields for rte_eth_tx_burst()
would need to be changed?

> 
> bnxt, fm10k, i40e, ixgbe (depends on PKT_TX_OUTER_IP_CKSUM in fact, but
> not PKT_TX_TUNNEL_MASK) and ice use these fields w/o tunnel checks (if
> I read code correctly).
> 
> enic, mlx4, mlx5, qede and sfc use them in the case of tunnel packet only.
> 
> I.e. 5 vs 5.
> 
> 
>> 2) why these 2 particular tx_offload fields only?
>> If we follow that logic we should enforce the same rule for the other
>> tx_offload fields (tso, l4_len, l3_len, etc.)
> 
> Because it is about tunnel packets and outer_l2_len and outer_l3_len
> should be either undefined or 0 for non-tunnel packets.

I understand that, but I think the rules for setting/treating tx_offload fields
should be the same for all fields.
We either allow any tx_offload field to be undefined when the related
bit(s) in ol_flags are not set, or we need to force people to set up the
whole 64-bit tx_offload value if any of the related TX flags are set.
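
The second alternative, setting up the whole 64-bit value at once, could look roughly like the sketch below. The union is a hypothetical model of the tx_offload word: the field names follow the mbuf fields discussed in this thread, but the bit widths, the `f` member name and `demo_tx_offload_plain()` are illustrative assumptions rather than the real `rte_mbuf` layout or API. It also relies on 64-bit bit-fields, which GCC and Clang accept as an extension.

```c
#include <stdint.h>

/* Hypothetical model of the 64-bit tx_offload word; widths are illustrative. */
union demo_tx_offload {
	uint64_t raw;
	struct {
		uint64_t l2_len:7;
		uint64_t l3_len:9;
		uint64_t l4_len:8;
		uint64_t tso_segsz:16;
		uint64_t outer_l3_len:9;
		uint64_t outer_l2_len:7;
	} f;
};

/* Build the whole word in one go so that unused fields are zero, never stale. */
static inline uint64_t
demo_tx_offload_plain(uint16_t l2_len, uint16_t l3_len, uint16_t l4_len)
{
	union demo_tx_offload o = { .raw = 0 };

	o.f.l2_len = l2_len;
	o.f.l3_len = l3_len;
	o.f.l4_len = l4_len;
	return o.raw;
}
```

An application following this style would store the returned value into the mbuf's tx_offload word for every packet, so every field, not only the outer lengths, is well defined regardless of which ol_flags bits are set.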

> 
> 
>> Personally I don't think there will be much gain from it.
>> Might be better and easier just to fix offending drivers that make wrong assumptions.
> 
> We would prefer to define it as the patch suggests since that allows
> us to avoid conditions.

It does, and it might simplify things for PMDs...
But as I said above, it would need changes in the apps that
do use tx_offload fields for TX, right?

> The other option is to add a comment saying that
> the content of these fields is undefined for non-tunnel packets.
> Of course, the patch makes it necessary to take care of outer_l2/3_len
> when an mbuf is reused and Tx offloads are requested. So maybe,
> from the application's point of view, it is better to have them undefined for
> non-tunnel packets.
> 
> 
>> If we still decide to go that way, then I think at least it needs
>> to be explained in the release notes, and probably the deprecation process has to be followed here.
> 
> Yes, I agree and would like to understand which way is right
> (just highlight in release notes or deprecation process).

From my understanding: if changes inside app code might be necessary,
then we do need a deprecation note. 

> 
> BTW, may I ask you to take a look at two more small patches:
> [1] https://patches.dpdk.org/patch/53691/
> [2] https://patches.dpdk.org/patch/53857/

Will do

> 
> Many thanks,
> Andrew.
> 
> (As Keith said some time ago, it looks like almost nobody looks at RFC
> patches. Sad. The main goal of RFC patches is to get feedback earlier.
> The RFC for this one was in April and we could have started the deprecation
> process in the previous release cycle if it is required. Luckily it is not
> critical in this case.)
  
Andrew Rybchenko June 24, 2019, 1:33 p.m. UTC | #4
On 6/24/19 3:59 PM, Ananyev, Konstantin wrote:
> Hi Andrew,
>
>> Hi Konstantin,
>>
>> On 6/21/19 2:10 PM, Ananyev, Konstantin wrote:
>>> Hi Ivan,
>>>
>>>> Make sure that outer L2 and L3 header length fields are
>>>> equal to zero for non-tunnel packets in order to ensure
>>>> consistent and predictable behaviour in network drivers.
>>>> Explain this expectation in comments to help developers.
>>>>
>>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
>>>> ---
>>
>>> Not sure it is a good idea:
>>> 1) it is a change in public API behavior (requirements).
>> I would say that it is a clarification. Yes, in terms of rte_validate_tx_offload()
>> behaviour it is a change. The area looks grey and we just want to make
>> it either black or white. What is the alternative? Say that outer_l2_len and
>> outer_l3_len content is undefined if the packet is not tunnelled and drivers
>> must check (ol_flags & PKT_TX_TUNNEL_MASK) != 0 before using these fields?
> Yes, that was my thought.
> As I understand it, that is what is implied right now.
> Otherwise, any app that sets up tx_offload fields for rte_eth_tx_burst()
> would need to be changed?
>
>> bnxt, fm10k, i40e, ixgbe (depends on PKT_TX_OUTER_IP_CKSUM in fact, but
>> not PKT_TX_TUNNEL_MASK) and ice use these fields w/o tunnel checks (if
>> I read code correctly).
>>
>> enic, mlx4, mlx5, qede and sfc use them in the case of tunnel packet only.
>>
>> I.e. 5 vs 5.
>>
>>
>>> 2) why these 2 particular tx_offload fields only?
>>> If we follow that logic we should enforce the same rule for the other
>>> tx_offload fields (tso, l4_len, l3_len, etc.)
>> Because it is about tunnel packets and outer_l2_len and outer_l3_len
>> should be either undefined or 0 for non-tunnel packets.
> I understand that, but I think the rules for setting/treating tx_offload fields
> should be the same for all fields.
> We either allow any tx_offload field to be undefined when the related
> bit(s) in ol_flags are not set, or we need to force people to set up the
> whole 64-bit tx_offload value if any of the related TX flags are set.

Yes, I agree.

>>> Personally I don't think there will be much gain from it.
>>> Might be better and easier just to fix offending drivers that make wrong assumptions.
>> We would prefer to define it as the patch suggests since that allows
>> us to avoid conditions.
> It does, and it might simplify things for PMDs...
> But as I said above, it would need changes in the apps that
> do use tx_offload fields for TX, right?

Yes, it sounds bad. OK, we have discussed it and Ivan will send a v2
which simply clarifies the comments.

It looks like there are no simple conditions under which we should require
(in the validate function) non-zero outer_l2/l3_len.

Thanks a lot,
Andrew.
  

Patch

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 0d9fef0..cb8b34e 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -702,7 +702,12 @@  struct rte_mbuf {
 			uint64_t tso_segsz:RTE_MBUF_TSO_SEGSZ_BITS;
 			/**< TCP TSO segment size */
 
-			/* fields for TX offloading of tunnels */
+			/*
+			 * Fields for Tx offloading of tunnels.
+			 * These fields must be equal to zero in the case
+			 * when (ol_flags & PKT_TX_TUNNEL_MASK) == 0,
+			 * i.e. for all non-tunnel packets.
+			 */
 			uint64_t outer_l3_len:RTE_MBUF_OUTL3_LEN_BITS;
 			/**< Outer L3 (IP) Hdr Length. */
 			uint64_t outer_l2_len:RTE_MBUF_OUTL2_LEN_BITS;
@@ -2376,6 +2381,11 @@  static inline int rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail
 			!(ol_flags & PKT_TX_OUTER_IPV4))
 		return -EINVAL;
 
+	/* Outer L2/L3 offsets must be equal to zero for non-tunnel packets. */
+	if ((ol_flags & PKT_TX_TUNNEL_MASK) == 0 &&
+	    m->outer_l2_len + m->outer_l3_len != 0)
+		return -EINVAL;
+
 	return 0;
 }
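
For readers who want to try the added check in isolation, here is a minimal standalone model of it. It mirrors only the condition introduced by the second hunk; `struct demo_mbuf`, `DEMO_TX_TUNNEL_MASK` (including its bit position) and `demo_validate_outer_len()` are hypothetical stand-ins and not part of DPDK.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the tunnel-type bits of ol_flags. */
#define DEMO_TX_TUNNEL_MASK (0xFULL << 45)

/* Hypothetical, reduced model holding only the fields the check reads. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
};

/* Reject non-tunnel packets that carry non-zero outer header lengths. */
static inline int
demo_validate_outer_len(const struct demo_mbuf *m)
{
	if ((m->ol_flags & DEMO_TX_TUNNEL_MASK) == 0 &&
	    m->outer_l2_len + m->outer_l3_len != 0)
		return -EINVAL;

	return 0;
}
```

A non-tunnel mbuf with leftover outer lengths is rejected with -EINVAL, while the same lengths pass once any tunnel-type bit is set, which is why applications that reuse mbufs would be affected by the change.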