mbuf: move headers not fragmented check to checksum

Message ID: 1550557852-21882-1-git-send-email-arybchenko@solarflare.com (mailing list archive)
State: Superseded, archived
Delegated to: Thomas Monjalon
Series: mbuf: move headers not fragmented check to checksum

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/mellanox-Performance-Testing  success  Performance Testing PASS
ci/intel-Performance-Testing     success  Performance Testing PASS
ci/Intel-compilation             success  Compilation OK

Commit Message

Andrew Rybchenko Feb. 19, 2019, 6:30 a.m. UTC
rte_validate_tx_offload() is used in Tx prepare callbacks
(RTE_LIBRTE_ETHDEV_DEBUG only) to check Tx offload consistency.
The requirement that packet headers must not be fragmented is not
documented, and it is unclear where it comes from, apart from
rte_net_intel_cksum_prepare(), which relies on it.

It could be a NIC vendor specific driver or hardware limitation but,
if so, it should be documented and checked in the corresponding Tx
prepare callbacks.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
Looks good to me, though extra-testing would be needed.
Konstantin Ananyev <konstantin.ananyev@intel.com>

 lib/librte_mbuf/rte_mbuf.h | 12 ------------
 lib/librte_net/rte_net.h   | 17 +++++++++++++++++
 2 files changed, 17 insertions(+), 12 deletions(-)
  

Comments

Andrew Rybchenko March 28, 2019, 5:04 p.m. UTC | #1
Ping? (I have a number of net/sfc patches which depend heavily on this
one and must not be applied without it.)

Andrew.

  
Olivier Matz March 29, 2019, 1:09 p.m. UTC | #2
Hi Andrew,

To summarize, the previous code was in a generic part, only enabled if
RTE_LIBRTE_ETHDEV_DEBUG is set, and it is moved to an Intel-specific part,
where it is always enabled. Am I correct?

So it may have a performance impact on Intel NICs. Shouldn't it be under
a debug option?

Regards,
Olivier
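
The debug-option question above can be illustrated with a compile-time guard. This is only a minimal sketch under stated assumptions: RTE_LIBRTE_ETHDEV_DEBUG is the option named in the thread, while struct demo_pkt, DEMO_DEBUG_CHECKS and demo_tx_prepare() are hypothetical names that do not exist in DPDK, and hdr_len stands in for the sum of the header lengths.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified packet descriptor (not struct rte_mbuf). */
struct demo_pkt {
	uint16_t data_len;	/* bytes available in the first segment */
	uint16_t hdr_len;	/* total length of the headers to be touched */
};

/* Map the build option from the thread onto a 0/1 constant. */
#ifdef RTE_LIBRTE_ETHDEV_DEBUG
#define DEMO_DEBUG_CHECKS 1
#else
#define DEMO_DEBUG_CHECKS 0
#endif

static int
demo_tx_prepare(const struct demo_pkt *p)
{
	assert(p != NULL);

	/*
	 * With the option off the condition is a constant false, so the
	 * compiler is free to drop the branch from the fast path.
	 */
	if (DEMO_DEBUG_CHECKS && p->data_len < p->hdr_len)
		return -ENOTSUP;

	return 0;
}
```

In a build without the option, a packet with fragmented headers passes through unchecked, which is the trade-off under discussion.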
  
Andrew Rybchenko March 29, 2019, 1:30 p.m. UTC | #3
Hi Olivier,

On 3/29/19 4:09 PM, Olivier Matz wrote:
> To summarize, the previous code was in a generic part, only enabled if
> RTE_LIBRTE_ETHDEV_DEBUG is set, and it is moved to an Intel-specific part,
> where it is always enabled. Am I correct?

Yes, correct.

> So it may have a performance impact on Intel NICs. Shouldn't it be under
> a debug option?

Yes, to be 100% equivalent.

Maybe making these checks non-debug is a separate story, since IMHO
these checks should be non-debug. The code below really depends on these
checks: if the condition is violated, it will read, and could write, outside
the provided buffer (bad checksums, corrupted memory, etc.).

I'll send v2 shortly with RTE_LIBRTE_ETHDEV_DEBUG to make it easy to
pick up whichever version is finally chosen.

Thanks,
Andrew.
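
The out-of-bounds concern in the reply above can be shown in isolation. The following is a hedged sketch, not code from DPDK: demo_store_u16() is a hypothetical helper that writes a 16-bit field (such as a checksum) into the first segment and refuses to do so when the field would lie past data_len.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store 'val' at 'offset' within the first segment.
 * Returns -ENOTSUP instead of writing when the field does not fit in the
 * 'data_len' bytes that the segment actually holds.
 */
static int
demo_store_u16(uint8_t *seg, uint16_t data_len, uint32_t offset, uint16_t val)
{
	assert(seg != NULL);

	/* Widen before adding so the sum cannot wrap around. */
	if ((uint64_t)offset + sizeof(val) > data_len)
		return -ENOTSUP;

	memcpy(seg + offset, &val, sizeof(val));
	return 0;
}
```

Without the bounds test, the memcpy() would write past the end of the segment whenever the headers spill into a following segment, which is the memory corruption scenario described in the reply.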
  

Patch

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index d961cca..73daa81 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -2257,23 +2257,11 @@  static inline int rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail
 rte_validate_tx_offload(const struct rte_mbuf *m)
 {
 	uint64_t ol_flags = m->ol_flags;
-	uint64_t inner_l3_offset = m->l2_len;
 
 	/* Does packet set any of available offloads? */
 	if (!(ol_flags & PKT_TX_OFFLOAD_MASK))
 		return 0;
 
-	if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
-		/* NB: elaborating the addition like this instead of using
-		 *     += gives the result uint64_t type instead of int,
-		 *     avoiding compiler warnings on gcc 8.1 at least */
-		inner_l3_offset = inner_l3_offset + m->outer_l2_len +
-				  m->outer_l3_len;
-
-	/* Headers are fragmented */
-	if (rte_pktmbuf_data_len(m) < inner_l3_offset + m->l3_len + m->l4_len)
-		return -ENOTSUP;
-
 	/* IP checksum can be counted only for IPv4 packet */
 	if ((ol_flags & PKT_TX_IP_CKSUM) && (ol_flags & PKT_TX_IPV6))
 		return -EINVAL;
diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
index e59760a..bd75aea 100644
--- a/lib/librte_net/rte_net.h
+++ b/lib/librte_net/rte_net.h
@@ -118,10 +118,27 @@  uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
 	struct udp_hdr *udp_hdr;
 	uint64_t inner_l3_offset = m->l2_len;
 
+	/*
+	 * Does packet set any of available offloads?
+	 * Mainly it is required to avoid fragmented headers check if
+	 * no offloads are requested.
+	 */
+	if (!(ol_flags & PKT_TX_OFFLOAD_MASK))
+		return 0;
+
 	if ((ol_flags & PKT_TX_OUTER_IP_CKSUM) ||
 		(ol_flags & PKT_TX_OUTER_IPV6))
 		inner_l3_offset += m->outer_l2_len + m->outer_l3_len;
 
+	/*
+	 * Check if headers are fragmented.
+	 * The check could be less strict depending on which offloads are
+	 * requested and headers to be used, but let's keep it simple.
+	 */
+	if (unlikely(rte_pktmbuf_data_len(m) <
+		     inner_l3_offset + m->l3_len + m->l4_len))
+		return -ENOTSUP;
+
 	if (ol_flags & PKT_TX_IPV4) {
 		ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
 				inner_l3_offset);
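
For readers who want to experiment with the logic outside DPDK, the check added to rte_net.h in the diff above can be approximated as a standalone sketch. Everything here is an assumption made for illustration: struct demo_mbuf is a simplified stand-in for struct rte_mbuf, the DEMO_TX_* values are made-up bits rather than the real PKT_TX_* flags, and demo_check_headers() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few rte_mbuf fields the check reads. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint16_t data_len;	/* bytes in the first segment */
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t l4_len;
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
};

/* Illustrative flag values only; the real PKT_TX_* bits differ. */
#define DEMO_TX_IPV4		(1ULL << 0)
#define DEMO_TX_OUTER_IP_CKSUM	(1ULL << 1)
#define DEMO_TX_OUTER_IPV6	(1ULL << 2)
#define DEMO_TX_OFFLOAD_MASK \
	(DEMO_TX_IPV4 | DEMO_TX_OUTER_IP_CKSUM | DEMO_TX_OUTER_IPV6)

static int
demo_check_headers(const struct demo_mbuf *m)
{
	uint64_t inner_l3_offset;

	assert(m != NULL);
	inner_l3_offset = m->l2_len;

	/* No offloads requested: nothing to validate. */
	if (!(m->ol_flags & DEMO_TX_OFFLOAD_MASK))
		return 0;

	/* Tunnelled packet: inner headers start after the outer ones. */
	if (m->ol_flags & (DEMO_TX_OUTER_IP_CKSUM | DEMO_TX_OUTER_IPV6))
		inner_l3_offset += (uint64_t)m->outer_l2_len + m->outer_l3_len;

	/* All headers must fit in the first segment. */
	if (m->data_len < inner_l3_offset + m->l3_len + m->l4_len)
		return -ENOTSUP;

	return 0;
}
```

For example, a plain TCP/IPv4 packet with 14 + 20 + 20 header bytes passes when the first segment holds at least 54 bytes and is rejected with -ENOTSUP when it holds fewer.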