[RFC] ethdev: add sanity packet checks

Message ID 1614541699-99345-1-git-send-email-orika@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series [RFC] ethdev: add sanity packet checks

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     apply issues

Commit Message

Ori Kam Feb. 28, 2021, 7:48 p.m. UTC
  Currently, DPDK application can offload the checksum check,
and report it in the mbuf.

However, this approach doesn't work if the traffic
is offloaded and should not arrive to the application.

This commit introduces rte flow item that enables
matching on the checksum of the L3 and L4 layers,
in addition to other checks that can determine if
the packet is valid.
some of those tests can be packet len, data len,
unsupported flags, and so on.

The full check is HW dependent.

Signed-off-by: Ori Kam <orika@nvidia.com>
---
 lib/librte_ethdev/rte_flow.c |  1 +
 lib/librte_ethdev/rte_flow.h | 50 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)
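
To make the proposed usage concrete before the discussion below, here is a
minimal sketch of a rule that drops packets whose L3 checksum fails the HW
check. It is hypothetical: it assumes the item type and structure land
exactly as proposed in this RFC, and port_id is whatever port the
application configured.

#include <rte_flow.h>

/* Sketch only: drop packets whose L3 checksum failed the HW check. */
static struct rte_flow *
drop_bad_l3_csum(uint16_t port_id, struct rte_flow_error *err)
{
	struct rte_flow_item_sanity_checks spec = { .level = 0 };
	struct rte_flow_item_sanity_checks mask = { .value = 0 };
	const struct rte_flow_attr attr = { .ingress = 1 };

	spec.l3_ok_csum = 0;	/* match packets whose L3 checksum is bad */
	mask.l3_ok_csum = 1;	/* ...and consider only this bit */

	const struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_SANITY_CHECKS,
		  .spec = &spec, .mask = &mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	const struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_DROP },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, err);
}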
  

Comments

Thomas Monjalon Feb. 28, 2021, 8:14 p.m. UTC | #1
28/02/2021 20:48, Ori Kam:
> Currently, DPDK application can offload the checksum check,
> and report it in the mbuf.
> 
> However, this approach doesn't work if the traffic
> is offloaded and should not arrive to the application.
> 
> This commit introduces rte flow item that enables

s/rte flow/rte_flow/

> matching on the checksum of the L3 and L4 layers,
> in addition to other checks that can determine if
> the packet is valid.
> some of those tests can be packet len, data len,
> unsupported flags, and so on.
> 
> The full check is HW dependent.

What is the "full check"?
How much of it is HW-dependent?


> + * RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
> + *
> + * Enable matching on packet validity based on HW checks for the L3 and L4
> + * layers.
> + */
> +struct rte_flow_item_sanity_checks {
> +	uint32_t level;
> +	/**< Packet encapsulation level the item should apply to.
> +	 * @see rte_flow_action_rss
> +	 */
> +RTE_STD_C11
> +	union {
> +		struct {

Why is there no L2 check?

> +			uint32_t l3_ok:1;
> +			/**< L3 layer is valid after passing all HW checking. */
> +			uint32_t l4_ok:1;
> +			/**< L4 layer is valid after passing all HW checking. */

l3_ok and l4_ok look vague.
What do they cover exactly?

> +			uint32_t l3_ok_csum:1;
> +			/**< L3 layer checksum is valid. */
> +			uint32_t l4_ok_csum:1;
> +			/**< L4 layer checksum is valid. */
> +			uint32_t reserved:28;
> +		};
> +		uint32_t  value;
> +	};
> +};
  
Ori Kam March 4, 2021, 10 a.m. UTC | #2
Hi 

PSB

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> Sent: Sunday, February 28, 2021 10:14 PM
> 
> 28/02/2021 20:48, Ori Kam:
> > Currently, DPDK application can offload the checksum check,
> > and report it in the mbuf.
> >
> > However, this approach doesn't work if the traffic
> > is offloaded and should not arrive to the application.
> >
> > This commit introduces rte flow item that enables
> 
> s/rte flow/rte_flow/
> 

Sure

> > matching on the checksum of the L3 and L4 layers,
> > in addition to other checks that can determine if
> > the packet is valid.
> > some of those tests can be packet len, data len,
> > unsupported flags, and so on.
> >
> > The full check is HW dependent.
> 
> What is the "full check"?
> How much of it is HW-dependent?
> 

This also relates to your other comments.
Each HW may run a different set of checks on the packet;
for example, one PMD may check just the TCP flags while
a different PMD will also check the TCP options.

> 
> > + * RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
> > + *
> > + * Enable matching on packet validity based on HW checks for the L3 and L4
> > + * layers.
> > + */
> > +struct rte_flow_item_sanity_checks {
> > +	uint32_t level;
> > +	/**< Packet encapsulation level the item should apply to.
> > +	 * @see rte_flow_action_rss
> > +	 */
> > +RTE_STD_C11
> > +	union {
> > +		struct {
> 
> Why is there no L2 check?
> 
Our HW doesn't support it.
If other HW supports it, it should be added.

> > +			uint32_t l3_ok:1;
> > +			/**< L3 layer is valid after passing all HW checking. */
> > +			uint32_t l4_ok:1;
> > +			/**< L4 layer is valid after passing all HW checking. */
> 
> l3_ok and l4_ok look vague.
> What do they cover exactly?
> 
It depends on the HW in question.
In our case, for L3 it checks
the header len and the version.
For L4 it checks the len.
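
For illustration only, the checks described above could look roughly like
the SW equivalents below. The helper names are invented; the real checks
run in HW and vary by vendor.

#include <stdbool.h>
#include <rte_ip.h>
#include <rte_tcp.h>

/* Roughly the described L3 checks: header length and version. */
static bool
l3_ok_sw(const struct rte_ipv4_hdr *ip, uint16_t l3_len)
{
	uint8_t hdr_len = (ip->version_ihl & RTE_IPV4_HDR_IHL_MASK) *
			  RTE_IPV4_IHL_MULTIPLIER;

	return (ip->version_ihl >> 4) == 4 &&
	       hdr_len >= sizeof(struct rte_ipv4_hdr) &&
	       rte_be_to_cpu_16(ip->total_length) <= l3_len;
}

/* Roughly the described L4 check: the TCP header length is sane. */
static bool
l4_ok_sw(const struct rte_tcp_hdr *tcp, uint16_t l4_len)
{
	uint8_t hdr_len = (tcp->data_off >> 4) * 4;

	return hdr_len >= sizeof(struct rte_tcp_hdr) && hdr_len <= l4_len;
}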

> > +			uint32_t l3_ok_csum:1;
> > +			/**< L3 layer checksum is valid. */
> > +			uint32_t l4_ok_csum:1;
> > +			/**< L4 layer checksum is valid. */
> > +			uint32_t reserved:28;
> > +		};
> > +		uint32_t  value;
> > +	};
> > +};
> 
> 

Best,
Ori
  
Thomas Monjalon March 4, 2021, 10:46 a.m. UTC | #3
04/03/2021 11:00, Ori Kam:
> From: Thomas Monjalon
> > 28/02/2021 20:48, Ori Kam:
> > > Currently, DPDK application can offload the checksum check,
> > > and report it in the mbuf.
> > >
> > > However, this approach doesn't work if the traffic
> > > is offloaded and should not arrive to the application.
> > >
> > > This commit introduces rte flow item that enables
> > 
> > s/rte flow/rte_flow/
> > 
> 
> Sure
> 
> > > matching on the checksum of the L3 and L4 layers,
> > > in addition to other checks that can determine if
> > > the packet is valid.
> > > some of those tests can be packet len, data len,
> > > unsupported flags, and so on.
> > >
> > > The full check is HW dependent.
> > 
> > What is the "full check"?
> > How much of it is HW-dependent?
> > 
> 
> This also relates to your other comments.
> Each HW may run a different set of checks on the packet;
> for example, one PMD may check just the TCP flags while
> a different PMD will also check the TCP options.

I'm not sure how an application can rely on
such a vague definition.


> > > + * RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
> > > + *
> > > + * Enable matching on packet validity based on HW checks for the L3 and L4
> > > + * layers.
> > > + */
> > > +struct rte_flow_item_sanity_checks {
> > > +	uint32_t level;
> > > +	/**< Packet encapsulation level the item should apply to.
> > > +	 * @see rte_flow_action_rss
> > > +	 */
> > > +RTE_STD_C11
> > > +	union {
> > > +		struct {
> > 
> > Why is there no L2 check?
> > 
> Our HW doesn't support it.
> If other HW supports it, it should be added.

It would be an ABI breakage. Can we add it day one?

> > > +			uint32_t l3_ok:1;
> > > +			/**< L3 layer is valid after passing all HW checking. */
> > > +			uint32_t l4_ok:1;
> > > +			/**< L4 layer is valid after passing all HW checking. */
> > 
> > l3_ok and l4_ok look vague.
> > What do they cover exactly?
> > 
> It depends on the HW in question.
> In our case, for L3 it checks
> the header len and the version.
> For L4 it checks the len.

If we don't know exactly what is checked,
how can an application rely on it?
Is it a best effort check? What is the use case?


> > > +			uint32_t l3_ok_csum:1;
> > > +			/**< L3 layer checksum is valid. */
> > > +			uint32_t l4_ok_csum:1;
> > > +			/**< L4 layer checksum is valid. */

What worries me is that the checksum is separate but other checks
are in a common bucket.
I think we should have one field per precise check
with a way to report what is checked.
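
For concreteness, one hypothetical shape of this suggestion: one bit per
precise check, plus a capability query so the PMD can report which of the
checks it actually performs. All names below are invented, not part of the
RFC.

#include <stdint.h>

struct rte_flow_item_integrity_fine {	/* invented name */
	uint32_t level;			/* encapsulation level, as in the RFC */
	uint32_t l3_len_ok:1;		/* L3 length fields are consistent */
	uint32_t l3_version_ok:1;	/* IP version field is valid */
	uint32_t l3_csum_ok:1;		/* IP header checksum is valid */
	uint32_t l4_len_ok:1;		/* L4 length fields are consistent */
	uint32_t l4_flags_ok:1;		/* e.g. no illegal TCP flag combination */
	uint32_t l4_csum_ok:1;		/* L4 checksum is valid */
	uint32_t reserved:26;
};

/* Invented query: the PMD sets the bits it really checks. */
int rte_flow_integrity_caps_get(uint16_t port_id, uint32_t *checked_bits);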
  
Ori Kam March 7, 2021, 6:46 p.m. UTC | #4
Hi

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, March 4, 2021 12:46 PM
> Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
> 
> 04/03/2021 11:00, Ori Kam:
> > From: Thomas Monjalon
> > > 28/02/2021 20:48, Ori Kam:
> > > > Currently, DPDK application can offload the checksum check,
> > > > and report it in the mbuf.
> > > >
> > > > However, this approach doesn't work if the traffic
> > > > is offloaded and should not arrive to the application.
> > > >
> > > > This commit introduces rte flow item that enables
> > >
> > > s/rte flow/rte_flow/
> > >
> >
> > Sure
> >
> > > > matching on the checksum of the L3 and L4 layers,
> > > > in addition to other checks that can determine if
> > > > the packet is valid.
> > > > some of those tests can be packet len, data len,
> > > > unsupported flags, and so on.
> > > >
> > > > The full check is HW dependent.
> > >
> > > What is the "full check"?
> > > How much of it is HW-dependent?
> > >
> >
> > This also relates to your other comments.
> > Each HW may run a different set of checks on the packet;
> > for example, one PMD may check just the TCP flags while
> > a different PMD will also check the TCP options.
> 
> I'm not sure how an application can rely on
> such a vague definition.
> 
Even now we mark a packet in the mbuf as unknown
in case of some errors.
Would a better wording be "The HW detected errors in the packet"?
In any case, if the app needs to know what the error is, that is its
responsibility; this item is just a verification for the fast path.
If you have a better suggestion, I will be very happy to hear it.
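
The mbuf reporting referred to above is today's API; a sketch of the
SW-path test the proposed item would let an application offload (flag
names as of DPDK 21.02):

#include <rte_mbuf.h>

/* Today's SW-path check: checksum status reported by the PMD in ol_flags,
 * including the "unknown" state mentioned above. */
static inline int
csum_suspect(const struct rte_mbuf *m)
{
	uint64_t ip = m->ol_flags & PKT_RX_IP_CKSUM_MASK;
	uint64_t l4 = m->ol_flags & PKT_RX_L4_CKSUM_MASK;

	return ip == PKT_RX_IP_CKSUM_BAD || ip == PKT_RX_IP_CKSUM_UNKNOWN ||
	       l4 == PKT_RX_L4_CKSUM_BAD || l4 == PKT_RX_L4_CKSUM_UNKNOWN;
}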

> 
> > > > + * RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
> > > > + *
> > > > + * Enable matching on packet validity based on HW checks for the L3 and
> L4
> > > > + * layers.
> > > > + */
> > > > +struct rte_flow_item_sanity_checks {
> > > > +	uint32_t level;
> > > > +	/**< Packet encapsulation level the item should apply to.
> > > > +	 * @see rte_flow_action_rss
> > > > +	 */
> > > > +RTE_STD_C11
> > > > +	union {
> > > > +		struct {
> > >
> > > Why is there no L2 check?
> > >
> > Our HW doesn't support it.
> > If other HW supports it, it should be added.
> 
> It would be an ABI breakage. Can we add it day one?
> 
Will add a reserved field; since this is a bit field there shouldn't be any
ABI break.

> > > > +			uint32_t l3_ok:1;
> > > > +			/**< L3 layer is valid after passing all HW checking. */
> > > > +			uint32_t l4_ok:1;
> > > > +			/**< L4 layer is valid after passing all HW checking. */
> > >
> > > l3_ok and l4_ok look vague.
> > > What do they cover exactly?
> > >
> > It depends on the HW in question.
> > In our case, for L3 it checks
> > the header len and the version.
> > For L4 it checks the len.
> 
> If we don't know exactly what is checked,
> how can an application rely on it?
> Is it a best effort check? What is the use case?
> 
From the application's point of view the packet is invalid;
it is the app's responsibility to understand why.
You can think of it this way: in any case there might be
different counters for different errors, or different actions.
For example, the app can decide that an incorrect IPv4 version
should result in a packet drop, while an incorrect len
may pass.

Maybe we can list all possible error results, but I'm worried
that we may not cover all of them. On some HW there is just one
bit that marks whether the packet is valid or not.
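
A sketch of that differentiated handling, assuming this RFC's item is
accepted as proposed (hypothetical, not a released API): one rule drops
packets with a bad L3 checksum in HW, another only counts packets whose
generic L4 check failed.

static void
add_error_rules(uint16_t port_id)
{
	const struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_action_count cnt = { 0 };
	struct rte_flow_error err;

	/* Rule 1: bad L3 checksum -> drop. */
	struct rte_flow_item_sanity_checks bad_l3 = { .level = 0 };
	struct rte_flow_item_sanity_checks m_l3 = { .l3_ok_csum = 1 };
	const struct rte_flow_item p1[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_SANITY_CHECKS,
		  .spec = &bad_l3, .mask = &m_l3 },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	const struct rte_flow_action drop[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_DROP },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	rte_flow_create(port_id, &attr, p1, drop, &err);

	/* Rule 2: generic L4 check failed -> count only. */
	struct rte_flow_item_sanity_checks bad_l4 = { .level = 0 };
	struct rte_flow_item_sanity_checks m_l4 = { .l4_ok = 1 };
	const struct rte_flow_item p2[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_SANITY_CHECKS,
		  .spec = &bad_l4, .mask = &m_l4 },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	const struct rte_flow_action count[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_COUNT, .conf = &cnt },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	rte_flow_create(port_id, &attr, p2, count, &err);
}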

> 
> > > > +			uint32_t l3_ok_csum:1;
> > > > +			/**< L3 layer checksum is valid. */
> > > > +			uint32_t l4_ok_csum:1;
> > > > +			/**< L4 layer checksum is valid. */
> 
> What worries me is that the checksum is separate but other checks
> are in a common bucket.
> I think we should have one field per precise check
> with a way to report what is checked.
> 
Please see the above comment. Current HW doesn't support
such fine-grained detection, so adding bits will force the user
to select all of them. In addition, since the HW has some internal
checks, it is possible that the reject reason will be unrelated to the
L3; for example, some HW may not support more than two VLANs,
so in case of 3 VLANs it may report bad L3.

Best,
Ori,
  
Ajit Khaparde March 8, 2021, 11:05 p.m. UTC | #5
On Sun, Mar 7, 2021 at 10:46 AM Ori Kam <orika@nvidia.com> wrote:
>
> Hi
>
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Thursday, March 4, 2021 12:46 PM
> > Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
> >
> > 04/03/2021 11:00, Ori Kam:
> > > From: Thomas Monjalon
> > > > 28/02/2021 20:48, Ori Kam:
> > > > > Currently, DPDK application can offload the checksum check,
> > > > > and report it in the mbuf.
> > > > >
> > > > > However, this approach doesn't work if the traffic
> > > > > is offloaded and should not arrive to the application.
> > > > >
> > > > > This commit introduces rte flow item that enables
> > > >
> > > > s/rte flow/rte_flow/
> > > >
> > >
> > > Sure
> > >
> > > > > matching on the checksum of the L3 and L4 layers,
> > > > > in addition to other checks that can determine if
> > > > > the packet is valid.
> > > > > some of those tests can be packet len, data len,
> > > > > unsupported flags, and so on.
> > > > >
> > > > > The full check is HW dependent.
> > > >
> > > > What is the "full check"?
> > > > How much of it is HW-dependent?
> > > >
> > >
> > > This also relates to your other comments.
> > > Each HW may run a different set of checks on the packet;
> > > for example, one PMD may check just the TCP flags while
> > > a different PMD will also check the TCP options.
> >
> > I'm not sure how an application can rely on
> > such a vague definition.
> >
> Even now we mark a packet in the mbuf as unknown
> in case of some errors.
> Would a better wording be "The HW detected errors in the packet"?
> In any case, if the app needs to know what the error is, that is its
> responsibility; this item is just a verification for the fast path.
> If you have a better suggestion, I will be very happy to hear it.
>
> >
> > > > > + * RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
> > > > > + *
> > > > > + * Enable matching on packet validity based on HW checks for the L3 and
> > L4
> > > > > + * layers.
> > > > > + */
> > > > > +struct rte_flow_item_sanity_checks {
> > > > > +       uint32_t level;
> > > > > +       /**< Packet encapsulation level the item should apply to.
> > > > > +        * @see rte_flow_action_rss
> > > > > +        */
> > > > > +RTE_STD_C11
> > > > > +       union {
> > > > > +               struct {
> > > >
> > > > Why is there no L2 check?
> > > >
> > > Our HW doesn't support it.
> > > If other HW supports it, it should be added.
> >
> > It would be an ABI breakage. Can we add it day one?
> >
> Will add reserve, since this is bit field there shouldn't be any
> ABI break.
>
> > > > > +                       uint32_t l3_ok:1;
> > > > > +                       /**< L3 layer is valid after passing all HW checking. */
> > > > > +                       uint32_t l4_ok:1;
> > > > > +                       /**< L4 layer is valid after passing all HW checking. */
> > > >
> > > > l3_ok and l4_ok look vague.
> > > > What do they cover exactly?
> > > >
> > > It depends on the HW in question.
> > > In our case, for L3 it checks
> > > the header len and the version.
> > > For L4 it checks the len.
> >
> > If we don't know exactly what is checked,
> > how can an application rely on it?
> > Is it a best effort check? What is the use case?
> >
> From the application's point of view the packet is invalid;
> it is the app's responsibility to understand why.

And that it can determine based on the available fields in ol_flags, right?
If HW can indicate that the packet integrity is in question,
a PMD should be able to set the bits in ol_flags. After that
the application should decide what to drop and what to pass.

What is missing is the ability for the application to tell the HW/PMD to
drop any packet which fails packet integrity checks.

I believe HW will generally drop packets when the Ethernet CRC check fails.
But l3 and l4 errors are left to the application to deal with.
If an application wants to save some CPU cycles, it could ask the
hardware to drop those packets as well. So one bit to enable/disable
this for all packets should be good.

In case we still want to pursue this per flow, how about
RTE_FLOW_ITEM_TYPE_PACKET_INTEGRITY_CHECKS instead of
RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
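
The per-port variant suggested here could look like a new Rx offload flag;
purely hypothetical, since no such flag exists in DPDK:

#include <rte_ethdev.h>

/* DEV_RX_OFFLOAD_INTEGRITY_DROP is invented for this sketch. */
static int
enable_integrity_drop(uint16_t port_id, uint16_t nb_rxq, uint16_t nb_txq)
{
	struct rte_eth_conf conf = { 0 };

	conf.rxmode.offloads |= DEV_RX_OFFLOAD_INTEGRITY_DROP;
	return rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf);
}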


>
> You can think of it this way: in any case there might be
> different counters for different errors, or different actions.
> For example, the app can decide that an incorrect IPv4 version
> should result in a packet drop, while an incorrect len
> may pass.
>
> Maybe we can list all possible error results, but I'm worried
> that we may not cover all of them. On some HW there is just one
> bit that marks whether the packet is valid or not.
>
> >
> > > > > +                       uint32_t l3_ok_csum:1;
> > > > > +                       /**< L3 layer checksum is valid. */
> > > > > +                       uint32_t l4_ok_csum:1;
> > > > > +                       /**< L4 layer checksum is valid. */
> >
> > What worries me is that the checksum is separate but other checks
> > are in a common bucket.
> > I think we should have one field per precise check
> > with a way to report what is checked.
> >
> Please see the above comment. Current HW doesn't support
> such fine-grained detection, so adding bits will force the user
> to select all of them. In addition, since the HW has some internal
> checks, it is possible that the reject reason will be unrelated to the
> L3; for example, some HW may not support more than two VLANs,
> so in case of 3 VLANs it may report bad L3.
>
> Best,
> Ori,
>
  
Andrew Rybchenko March 9, 2021, 9:01 a.m. UTC | #6
On 2/28/21 10:48 PM, Ori Kam wrote:
> Currently, DPDK application can offload the checksum check,
> and report it in the mbuf.
>
> However, this approach doesn't work if the traffic
> is offloaded and should not arrive to the application.
>
> This commit introduces rte flow item that enables
> matching on the checksum of the L3 and L4 layers,
> in addition to other checks that can determine if
> the packet is valid.
> some of those tests can be packet len, data len,
> unsupported flags, and so on.
>
> The full check is HW dependent.
>
> Signed-off-by: Ori Kam <orika@nvidia.com>

In general, I strongly dislike the approach. If such checks are required,
it must be done on a per-item basis. I.e. we should add non-header boolean
flags to IPv4, TCP, UDP etc items. E.g.

struct rte_flow_item_ipv4 {
        struct rte_ipv4_hdr hdr; /**< IPv4 header definition. */
        bool hdr_checksum_valid;
};

Also it will allow filtering packets with an invalid checksum as well, and
redirecting them to a dedicated Rx path, or dropping and/or counting them, etc.
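
A sketch of how a rule would look with this per-item alternative
(hdr_checksum_valid is the proposed, not yet existing, field): "IPv4 whose
header checksum is bad".

#include <stdbool.h>
#include <rte_flow.h>

static const struct rte_flow_item_ipv4 ipv4_spec = {
	.hdr_checksum_valid = false,	/* match an invalid checksum */
};
static const struct rte_flow_item_ipv4 ipv4_mask = {
	.hdr_checksum_valid = true,	/* look only at the validity flag */
};
static const struct rte_flow_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4,
	  .spec = &ipv4_spec, .mask = &ipv4_mask },
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};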
  
Thomas Monjalon March 9, 2021, 9:11 a.m. UTC | #7
09/03/2021 10:01, Andrew Rybchenko:
> On 2/28/21 10:48 PM, Ori Kam wrote:
> > Currently, DPDK application can offload the checksum check,
> > and report it in the mbuf.
> >
> > However, this approach doesn't work if the traffic
> > is offloaded and should not arrive to the application.
> >
> > This commit introduces rte flow item that enables
> > matching on the checksum of the L3 and L4 layers,
> > in addition to other checks that can determine if
> > the packet is valid.
> > some of those tests can be packet len, data len,
> > unsupported flags, and so on.
> >
> > The full check is HW dependent.
> >
> > Signed-off-by: Ori Kam <orika@nvidia.com>
> 
> In general, I strongly dislike the approach. If such checks are required,
> it must be done on a per-item basis. I.e. we should add non-header boolean
> flags to IPv4, TCP, UDP etc items. E.g.
> 
> struct rte_flow_item_ipv4 {
>         struct rte_ipv4_hdr hdr; /**< IPv4 header definition. */
>         bool hdr_checksum_valid;
> };
> 
> Also it will allow filtering packets with an invalid checksum as well, and
> redirecting them to a dedicated Rx path, or dropping and/or counting them, etc.

+1

I think the only drawback of this solution is for HW giving a global
check status without knowing which header is valid or invalid.
  
Ori Kam March 9, 2021, 3:08 p.m. UTC | #8
Hi

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> Sent: Tuesday, March 9, 2021 11:11 AM
> Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
> 
> 09/03/2021 10:01, Andrew Rybchenko:
> > On 2/28/21 10:48 PM, Ori Kam wrote:
> > > Currently, DPDK application can offload the checksum check,
> > > and report it in the mbuf.
> > >
> > > However, this approach doesn't work if the traffic
> > > is offloaded and should not arrive to the application.
> > >
> > > This commit introduces rte flow item that enables
> > > matching on the checksum of the L3 and L4 layers,
> > > in addition to other checks that can determine if
> > > the packet is valid.
> > > some of those tests can be packet len, data len,
> > > unsupported flags, and so on.
> > >
> > > The full check is HW dependent.
> > >
> > > Signed-off-by: Ori Kam <orika@nvidia.com>
> >
> > In general, I strongly dislike the approach. If such checks are required,
> > it must be done on a per-item basis. I.e. we should add non-header boolean
> > flags to IPv4, TCP, UDP etc items. E.g.
> >
> > struct rte_flow_item_ipv4 {
> >         struct rte_ipv4_hdr hdr; /**< IPv4 header definition. */
> >         bool hdr_checksum_valid;
> > };
> >
> > Also it will allow filtering packets with an invalid checksum as well, and
> > redirecting them to a dedicated Rx path, or dropping and/or counting them, etc.
> 
> +1
> 
I'm not sure I understand your comment; also, according to the proposed
RFC we can redirect to the correct path.

> I think the only drawback of this solution is for HW giving a global
> check status without knowing which header is valid or invalid.
> 
As Thomas remarked, the main drawback of adding the valid bits to each
of the items is that it forces the application to have a dedicated rule for
each type, which will make the SW logic more complex.
Also this may have a performance impact on the packets and on the
number of flows.
  
Andrew Rybchenko March 9, 2021, 3:27 p.m. UTC | #9
On 3/9/21 6:08 PM, Ori Kam wrote:
> Hi
> 
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
>> Sent: Tuesday, March 9, 2021 11:11 AM
>> Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
>>
>> 09/03/2021 10:01, Andrew Rybchenko:
>>> On 2/28/21 10:48 PM, Ori Kam wrote:
>>>> Currently, DPDK application can offload the checksum check,
>>>> and report it in the mbuf.
>>>>
>>>> However, this approach doesn't work if the traffic
>>>> is offloaded and should not arrive to the application.
>>>>
>>>> This commit introduces rte flow item that enables
>>>> matching on the checksum of the L3 and L4 layers,
>>>> in addition to other checks that can determine if
>>>> the packet is valid.
>>>> some of those tests can be packet len, data len,
>>>> unsupported flags, and so on.
>>>>
>>>> The full check is HW dependent.
>>>>
>>>> Signed-off-by: Ori Kam <orika@nvidia.com>
>>>
>>> In general, I strongly dislike the approach. If such checks are required,
>>> it must be done on a per-item basis. I.e. we should add non-header boolean
>>> flags to IPv4, TCP, UDP etc items. E.g.
>>>
>>> struct rte_flow_item_ipv4 {
>>>         struct rte_ipv4_hdr hdr; /**< IPv4 header definition. */
>>>         bool hdr_checksum_valid;
>>> };
>>>
>>> Also it will allow filtering packets with an invalid checksum as well, and
>>> redirecting them to a dedicated Rx path, or dropping and/or counting them, etc.
>>
>> +1
>>
> I'm not sure I understand your comment; also, according to the proposed
> RFC we can redirect to the correct path.
> 
>> I think the only drawback of this solution is for HW giving a global
>> check status without knowing which header is valid or invalid.
>>
> As Thomas remarked, the main drawback of adding the valid bits to each
> of the items is that it forces the application to have a dedicated rule for
> each type, which will make the SW logic more complex.
> Also this may have a performance impact on the packets and on the
> number of flows.
> 

If we try to match on something which is a part of the packet header X,
it must be done in a flow item which corresponds to
protocol X. Otherwise, we have two items which overlap.
Also the approach suggested in the RFC breaks networking protocol layering, and
features of different layers are mixed in one
item. What will you do when you need to support tunnels? Add
inner/outer fields? Sorry, it is ugly. If valid controls are
added in protocol items, no specific efforts are required for
tunnels.

Also the approach suggested in the RFC requires a very careful definition of
what happens if a packet does not have the corresponding layer but
matching on a valid or invalid checksum is requested.

IMHO a vendor's HW limitations are out of scope of the
generic API definition. Maybe one day I will regret
these words, but I'm still sure that from an API point of view
the solution suggested by me above is better.
  
Ori Kam March 9, 2021, 7:21 p.m. UTC | #10
Hi

> -----Original Message-----
> From: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
> 
> On Sun, Mar 7, 2021 at 10:46 AM Ori Kam <orika@nvidia.com> wrote:
> >
> > Hi
> >
> > > -----Original Message-----
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > Sent: Thursday, March 4, 2021 12:46 PM
> > > Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
> > >
> > > 04/03/2021 11:00, Ori Kam:
> > > > From: Thomas Monjalon
> > > > > 28/02/2021 20:48, Ori Kam:
> > > > > > Currently, DPDK application can offload the checksum check,
> > > > > > and report it in the mbuf.
> > > > > >
> > > > > > However, this approach doesn't work if the traffic
> > > > > > is offloaded and should not arrive to the application.
> > > > > >
> > > > > > This commit introduces rte flow item that enables
> > > > >
> > > > > s/rte flow/rte_flow/
> > > > >
> > > >
> > > > Sure
> > > >
> > > > > > matching on the checksum of the L3 and L4 layers,
> > > > > > in addition to other checks that can determine if
> > > > > > the packet is valid.
> > > > > > some of those tests can be packet len, data len,
> > > > > > unsupported flags, and so on.
> > > > > >
> > > > > > The full check is HW dependent.
> > > > >
> > > > > What is the "full check"?
> > > > > How much of it is HW-dependent?
> > > > >
> > > >
> > > > This also relates to your other comments.
> > > > Each HW may run a different set of checks on the packet;
> > > > for example, one PMD may check just the TCP flags while
> > > > a different PMD will also check the TCP options.
> > >
> > > I'm not sure how an application can rely on
> > > such a vague definition.
> > >
> > Even now we mark a packet in the mbuf as unknown
> > in case of some errors.
> > Would a better wording be "The HW detected errors in the packet"?
> > In any case, if the app needs to know what the error is, that is its
> > responsibility; this item is just a verification for the fast path.
> > If you have a better suggestion, I will be very happy to hear it.
> >
> > >
> > > > > > + * RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
> > > > > > + *
> > > > > > + * Enable matching on packet validity based on HW checks for the L3
> and
> > > L4
> > > > > > + * layers.
> > > > > > + */
> > > > > > +struct rte_flow_item_sanity_checks {
> > > > > > +       uint32_t level;
> > > > > > +       /**< Packet encapsulation level the item should apply to.
> > > > > > +        * @see rte_flow_action_rss
> > > > > > +        */
> > > > > > +RTE_STD_C11
> > > > > > +       union {
> > > > > > +               struct {
> > > > >
> > > > > Why is there no L2 check?
> > > > >
> > > > Our HW doesn't support it.
> > > > If other HW supports it, it should be added.
> > >
> > > It would be an ABI breakage. Can we add it day one?
> > >
> > Will add a reserved field; since this is a bit field there shouldn't be any
> > ABI break.
> >
> > > > > > +                       uint32_t l3_ok:1;
> > > > > > +                       /**< L3 layer is valid after passing all HW checking. */
> > > > > > +                       uint32_t l4_ok:1;
> > > > > > +                       /**< L4 layer is valid after passing all HW checking. */
> > > > >
> > > > > l3_ok and l4_ok look vague.
> > > > > What do they cover exactly?
> > > > >
> > > > It depends on the HW in question.
> > > > In our case, for L3 it checks
> > > > the header len and the version.
> > > > For L4 it checks the len.
> > >
> > > If we don't know exactly what is checked,
> > > how can an application rely on it?
> > > Is it a best effort check? What is the use case?
> > >
> > From the application's point of view the packet is invalid;
> > it is the app's responsibility to understand why.
> 
> And that it can determine based on the available fields in ol_flags, right?

Right.

> If HW can indicate that the packet integrity is in question,
> a PMD should be able to set the bits in ol_flags. After that
> the application should decide what to drop and what to pass.
> 
> What is missing is the ability for the application to tell the HW/PMD to
> drop any packet which fails packet integrity checks.
> 
This is the drop action.
Or am I missing something?

> I believe HW will generally drop packets when the Ethernet CRC check fails.
> But l3 and l4 errors are left to the application to deal with.
> If an application wants to save some CPU cycles, it could ask the
> hardware to drop those packets as well. So one bit to enable/disable
> this for all packets should be good.
> 
> In case we still want to pursue this per flow, how about
> RTE_FLOW_ITEM_TYPE_PACKET_INTEGRITY_CHECKS instead of
> RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
> 
Sure, I like your name better.

Best,
Ori
  
Ori Kam March 9, 2021, 7:46 p.m. UTC | #11
Hi Andrew,
Thanks for the reply, PSB,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Tuesday, March 9, 2021 5:28 PM
> Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
> 
> On 3/9/21 6:08 PM, Ori Kam wrote:
> > Hi
> >
> >> -----Original Message-----
> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> >> Sent: Tuesday, March 9, 2021 11:11 AM
> >> Subject: Re: [dpdk-dev] [RFC] ethdev: add sanity packet checks
> >>
> >> 09/03/2021 10:01, Andrew Rybchenko:
> >>> On 2/28/21 10:48 PM, Ori Kam wrote:
> >>>> Currently, DPDK application can offload the checksum check,
> >>>> and report it in the mbuf.
> >>>>
> >>>> However, this approach doesn't work if the traffic
> >>>> is offloaded and should not arrive to the application.
> >>>>
> >>>> This commit introduces rte flow item that enables
> >>>> matching on the checksum of the L3 and L4 layers,
> >>>> in addition to other checks that can determine if
> >>>> the packet is valid.
> >>>> some of those tests can be packet len, data len,
> >>>> unsupported flags, and so on.
> >>>>
> >>>> The full check is HW dependent.
> >>>>
> >>>> Signed-off-by: Ori Kam <orika@nvidia.com>
> >>>
> >>> In general, I strongly dislike the approach. If such checks are required,
> >>> it must be done on a per-item basis. I.e. we should add non-header boolean
> >>> flags to IPv4, TCP, UDP etc items. E.g.
> >>>
> >>> struct rte_flow_item_ipv4 {
> >>>         struct rte_ipv4_hdr hdr; /**< IPv4 header definition. */
> >>>         bool hdr_checksum_valid;
> >>> };
> >>>
> >>> Also it will allow filtering packets with an invalid checksum as well, and
> >>> redirecting them to a dedicated Rx path, or dropping and/or counting them, etc.
> >>
> >> +1
> >>
> > I'm not sure I understand your comment; also, according to the proposed
> > RFC we can redirect to the correct path.
> >
> >> I think the only drawback of this solution is for HW giving a global
> >> check status without knowing which header is valid or invalid.
> >>
> > As Thomas remarked, the main drawback of adding the valid bits to each
> > of the items is that it forces the application to have a dedicated rule for
> > each type, which will make the SW logic more complex.
> > Also this may have a performance impact on the packets and on the
> > number of flows.
> >
> 
> If we try to match on something which is a part of the packet header X,
> it must be done in a flow item which corresponds to
> protocol X. Otherwise, we have two items which overlap.
> Also the approach suggested in the RFC breaks networking protocol layering, and
> features of different layers are mixed in one
> item. What will you do when you need to support tunnels? Add
> inner/outer fields? Sorry, it is ugly. If valid controls are
> added in protocol items, no specific efforts are required for
> tunnels.
>
I don't think it breaks protocol layering. It is a very common use case that an
application, before touching the packet, wants to make sure that the packet
is valid, and not waste extra hops just to understand at the end that it is a bad packet.
Also, even now we report in the mbuf the status of the L3/L4 checksum (using
the packet state).

About tunnels: we can solve this by adding a level field, just like in the RSS case.
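
A sketch of those level semantics, following the rte_flow_action_rss
convention (0 = default, 1 = outermost, 2 = inner):

/* Match "inner L3 and L4 passed the HW checks" on tunneled traffic. */
static const struct rte_flow_item_sanity_checks inner_ok = {
	.level = 2,	/* 2 = inner encapsulation level */
	.l3_ok = 1,
	.l4_ok = 1,
};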

One more thing to consider is that adding the new bits to an existing API
will break the API/ABI.


> Also the approach suggested in the RFC requires a very careful definition of
> what happens if a packet does not have the corresponding layer but
> matching on a valid or invalid checksum is requested.
> 
I would say this is an app issue; just like modifying the TTL in IPv4, if there is
no such item then it is the app's issue.

> IMHO a vendor's HW limitations are out of scope of the
> generic API definition. Maybe one day I will regret
> these words, but I'm still sure that from an API point of view
> the solution suggested by me above is better.

I fully agree (I may also regret those words) that HW limitations should be
out of scope, unless they take the form of a hint which a number of HW
can use to increase performance.

As far as I can see, the only difference between your suggestion and the RFC
is that in the RFC there is a new dedicated item for such checks, while in your
suggestion we would add those bits to each of the items. (I assume that
also in your suggestion we would add more than just the checksum.)
The main drawbacks of your suggestion, from my point of view:
1. ABI/API break.
2. The application is forced to insert more rules.


Best,
Ori
  

Patch

diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index b07dbff..a8d7c7a 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -99,6 +99,7 @@  struct rte_flow_desc_data {
 	MK_FLOW_ITEM(ECPRI, sizeof(struct rte_flow_item_ecpri)),
 	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 	MK_FLOW_ITEM(SFT, sizeof(struct rte_flow_item_sft)),
+	MK_FLOW_ITEM(SANITY_CHECKS, sizeof(struct rte_flow_item_sanity_checks)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index ad876dd..f948c5e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -559,6 +559,15 @@  enum rte_flow_item_type {
 	 * See struct rte_flow_item_sft.
 	 */
 	RTE_FLOW_ITEM_TYPE_SFT,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packet sanity checks.
+	 *
+	 * See struct rte_flow_item_sanity_checks.
+	 */
+	RTE_FLOW_ITEM_TYPE_SANITY_CHECKS,
 };
 
 /**
@@ -1709,6 +1718,47 @@  struct rte_flow_item_sft {
 };
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_SANITY_CHECKS
+ *
+ * Enable matching on packet validity based on HW checks for the L3 and L4
+ * layers.
+ */
+struct rte_flow_item_sanity_checks {
+	uint32_t level;
+	/**< Packet encapsulation level the item should apply to.
+	 * @see rte_flow_action_rss
+	 */
+RTE_STD_C11
+	union {
+		struct {
+			uint32_t l3_ok:1;
+			/**< L3 layer is valid after passing all HW checking. */
+			uint32_t l4_ok:1;
+			/**< L4 layer is valid after passing all HW checking. */
+			uint32_t l3_ok_csum:1;
+			/**< L3 layer checksum is valid. */
+			uint32_t l4_ok_csum:1;
+			/**< L4 layer checksum is valid. */
+			uint32_t reserved:28;
+		};
+		uint32_t  value;
+	};
+};
+
+#ifndef __cplusplus
+static const struct rte_flow_item_sanity_checks
+	rte_flow_item_sanity_checks_mask = {
+	.l3_ok = 1,
+	.l4_ok = 1,
+	.l3_ok_csum = 1,
+	.l4_ok_csum = 1,
+};
+#endif
+
+/**
  * Matching pattern item definition.
  *
  * A pattern is formed by stacking items starting from the lowest protocol