[v3,1/6] ethdev: add capability to keep flow rules on restart

Message ID 20211019123722.3414694-2-dkozlyuk@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series Flow entites behavior on port restart |

Checks

Context Check Description
ci/checkpatch warning coding style issues

Commit Message

Dmitry Kozlyuk Oct. 19, 2021, 12:37 p.m. UTC
  Previously, it was not specified what happens to the flow rules
when the device is stopped, possibly reconfigured, then started.
If flow rules were kept, it could be convenient for application
developers, because they wouldn't need to save and restore them.
However, due to the number of flows and possible creation rate it is
impractical to save all flow rules in DPDK layer. This means that flow
rules persistence really depends on whether PMD and HW can implement it
efficiently. It can also be limited by the rule item and action types,
and its attributes transfer bit (a combination of an item/action type
and a value of the transfer bit is called a ruel feature).

Add a device capability bit for PMDs that can keep at least some
of the flow rules across restart. Without this capability behavior
is still unspecified and it is declared that the application must
flush the rules before stopping the device.
Allow the application to test for persitence of rules using
a particular feature by attempting to create a flow rule
using that feature when the device is stopped
and checking for the specific error.
This is logical because if the PMD can to create the flow rule
when the device is not started and use it after the start happens,
it is natural that it can move its internal flow rule object
to the same state when the device is stopped and restore the state
when the device is started.

Rule persistence across a reconfigurations is not required,
because tracking all the rules and configuration-dependent resources
they use may be infeasible. In case a PMD cannot keep the rules
across reconfiguration, it is allowed just to report an error.
Application must then flush the rules before attempting it.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst | 25 +++++++++++++++++++++++++
 lib/ethdev/rte_ethdev.h            |  7 +++++++
 lib/ethdev/rte_flow.h              |  1 +
 3 files changed, 33 insertions(+)
  

Comments

Ori Kam Oct. 19, 2021, 3:22 p.m. UTC | #1
Hi Dmitry,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Dmitry Kozlyuk
> Sent: Tuesday, October 19, 2021 3:37 PM
> To: dev@dpdk.org
> Cc: Qi Zhang <qi.z.zhang@intel.com>; Ori Kam <orika@oss.nvidia.com>; NBU-Contact-Thomas
> Monjalon <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>
> Subject: [dpdk-dev] [PATCH v3 1/6] ethdev: add capability to keep flow rules on restart
> 
> Previously, it was not specified what happens to the flow rules when the device is stopped, possibly
> reconfigured, then started.
> If flow rules were kept, it could be convenient for application developers, because they wouldn't need
> to save and restore them.
> However, due to the number of flows and possible creation rate it is impractical to save all flow rules in
> DPDK layer. This means that flow rules persistence really depends on whether PMD and HW can
> implement it efficiently. It can also be limited by the rule item and action types, and its attributes
> transfer bit (a combination of an item/action type and a value of the transfer bit is called a ruel
> feature).
> 
> Add a device capability bit for PMDs that can keep at least some of the flow rules across restart.
> Without this capability behavior is still unspecified and it is declared that the application must flush the
> rules before stopping the device.
> Allow the application to test for persitence of rules using a particular feature by attempting to create a
> flow rule using that feature when the device is stopped and checking for the specific error.
> This is logical because if the PMD can to create the flow rule when the device is not started and use it
> after the start happens, it is natural that it can move its internal flow rule object to the same state
> when the device is stopped and restore the state when the device is started.
> 
> Rule persistence across a reconfigurations is not required, because tracking all the rules and
> configuration-dependent resources they use may be infeasible. In case a PMD cannot keep the rules
> across reconfiguration, it is allowed just to report an error.
> Application must then flush the rules before attempting it.
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> ---
>  doc/guides/prog_guide/rte_flow.rst | 25 +++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev.h            |  7 +++++++
>  lib/ethdev/rte_flow.h              |  1 +
>  3 files changed, 33 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 2b42d5ec8c..ff67b211e3 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -87,6 +87,31 @@ To avoid resource leaks on the PMD side, handles must be explicitly  destroyed
> by the application before releasing associated resources such as  queues and ports.
> 
> +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is not advertised, rules cannot
> +be created until the device is started for the first time and cannot be
> +kept when the device is stopped.
> +However, PMD also does not flush them automatically on stop, so the
> +application must call ``rte_flow_flush()`` or ``rte_flow_destroy()``
> +before stopping the device to ensure no rules remain.
> +
> +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is advertised, this means the
> +PMD can keep at least some rules across the device stop and start.
> +However, ``rte_eth_dev_configure()`` may fail if any rules remain, so
> +the application must flush them before attempting a reconfiguration.
> +Keeping may be unsupported for some types of rule items and actions, as
> +well as depending on the value of flow attributes transfer bit.
> +A combination of an item or action type and a value of the transfer bit
> +is called a rule feature.
> +To test if rules with a particular feature are kept, the application
> +must try to create a valid rule using this feature when the device is
> +stopped (after it has been configured or started previously).
> +If it fails with an error of type ``RTE_FLOW_ERROR_TYPE_STATE``, rules
> +using this feature are flushed when the device is stopped.
> +If it suceeds, such rules will be kept when the device is stopped,
> +provided they do not use other features that are not supported.
> +Rules that are created when the device is stopped, including the rules
> +created for the test, will be kept after the device is started.
> +
>  The following sections cover:
> 
>  - **Attributes** (represented by ``struct rte_flow_attr``): properties of a diff --git
> a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index 6d80514ba7..a0b388bb25 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -90,6 +90,11 @@
>   *     - flow director filtering mode (but not filtering rules)
>   *     - NIC queue statistics mappings
>   *
> + * The following configuration may be retained or not
> + * depending on the device capabilities:
> + *
> + *     - flow rules
> + *
>   * Any other configuration will not be stored and will need to be re-entered
>   * before a call to rte_eth_dev_start().
>   *
> @@ -1445,6 +1450,8 @@ struct rte_eth_conf {  #define
> RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
>  /** Device supports Tx queue setup after device started. */  #define
> RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
> +/** Device supports keeping flow rules across restart. */ #define
> +RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP 0x00000004
>  /**@}*/
> 
>  /*
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index a89945061a..aa0182d021 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -3344,6 +3344,7 @@ enum rte_flow_error_type {
>  	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
>  	RTE_FLOW_ERROR_TYPE_ACTION_CONF, /**< Action configuration. */
>  	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
> +	RTE_FLOW_ERROR_TYPE_STATE, /**< Current device state. */
>  };
> 
>  /**
> --
> 2.25.1

Acked-by: Ori Kam <orika@nvidia.com>
Best,
Ori
  
Ferruh Yigit Oct. 19, 2021, 4:38 p.m. UTC | #2
On 10/19/2021 1:37 PM, Dmitry Kozlyuk wrote:
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 2b42d5ec8c..ff67b211e3 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -87,6 +87,31 @@ To avoid resource leaks on the PMD side, handles must be explicitly
>   destroyed by the application before releasing associated resources such as
>   queues and ports.
>   
> +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is not advertised,
> +rules cannot be created until the device is started for the first time
> +and cannot be kept when the device is stopped.

So flag means two things:
1) rules cannot be created until the device is started for the first time
2) rules cannot be kept when the device is stopped

Can't be a case one is true but other is not? I was thinking flag is
only for (2).

> +However, PMD also does not flush them automatically on stop,
> +so the application must call ``rte_flow_flush()`` or ``rte_flow_destroy()``
> +before stopping the device to ensure no rules remain.
> +
> +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is advertised, this means
> +the PMD can keep at least some rules across the device stop and start.
> +However, ``rte_eth_dev_configure()`` may fail if any rules remain,
> +so the application must flush them before attempting a reconfiguration.

If there are any remaining rules, should we fail the ``rte_eth_dev_configure()``,
or is it allowed PMD to flush the rules itself?

As far as I know some Intel PMDs flush remaining rules in configure itself
without failing, @Qi can correct me if I am wrong.


> +Keeping may be unsupported for some types of rule items and actions,
> +as well as depending on the value of flow attributes transfer bit.
> +A combination of an item or action type and a value of the transfer bit
> +is called a rule feature.
> +To test if rules with a particular feature are kept, the application must try
> +to create a valid rule using this feature when the device is stopped
> +(after it has been configured or started previously).
> +If it fails with an error of type ``RTE_FLOW_ERROR_TYPE_STATE``,
> +rules using this feature are flushed when the device is stopped.
> +If it suceeds, such rules will be kept when the device is stopped,
> +provided they do not use other features that are not supported.
> +Rules that are created when the device is stopped, including the rules
> +created for the test, will be kept after the device is started.
> +

I understand the intention, but I don't know if this is true for all devices.
Can't there be a case that driver can't create rule when it is stopped,
but it can keep the rules after stop. Or other-way around, driver can
create rule when it is stopped, but can't keep rule after stop.

I am feeling we are missing comments from different vendors if this logic
works for them.
  
Dmitry Kozlyuk Oct. 19, 2021, 5:13 p.m. UTC | #3
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: 19 октября 2021 г. 19:39
> To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>; dev@dpdk.org; Qi Zhang
> <qi.z.zhang@intel.com>
> Cc: Ori Kam <orika@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Subject: Re: [PATCH v3 1/6] ethdev: add capability to keep flow rules on
> restart
> 
> External email: Use caution opening links or attachments
> 
> 
> On 10/19/2021 1:37 PM, Dmitry Kozlyuk wrote:
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> > index 2b42d5ec8c..ff67b211e3 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -87,6 +87,31 @@ To avoid resource leaks on the PMD side, handles must
> be explicitly
> >   destroyed by the application before releasing associated resources
> such as
> >   queues and ports.
> >
> > +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is not advertised,
> > +rules cannot be created until the device is started for the first time
> > +and cannot be kept when the device is stopped.
> 
> So flag means two things:
> 1) rules cannot be created until the device is started for the first time
> 2) rules cannot be kept when the device is stopped
> 
> Can't be a case one is true but other is not? I was thinking flag is
> only for (2).

It theoretically can, but it doesn't seem feasible
to separate these capabilities:

a) Suppose a PMD can create rules before the device is started.
They are in some special state when they are not applied to the traffic.
When the device is started, these rules begin being applied.
When the device is stopped, what would make the PMD unable to move
the rules back to the detached state they were before the start?
And then attach them back?

b) Suppose a PMD can keep the rules between stop and start.
It must be able to move them to the detached stated described above
on the device stop and attach them back when it is started again.
What would prevent the PMD to create rules in a detached state
before the device is started for the first time?

That's what I had in mind before,
and now it is just stated explicitly per Andrew's suggestion:
https://inbox.dpdk.org/dev/5ec7101f-169e-cbd0-87bb-810b7476c7d0@oktetlabs.ru

> > +However, PMD also does not flush them automatically on stop,
> > +so the application must call ``rte_flow_flush()`` or
> ``rte_flow_destroy()``
> > +before stopping the device to ensure no rules remain.
> > +
> > +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is advertised, this means
> > +the PMD can keep at least some rules across the device stop and start.
> > +However, ``rte_eth_dev_configure()`` may fail if any rules remain,
> > +so the application must flush them before attempting a reconfiguration.
> 
> If there are any remaining rules, should we fail the
> ``rte_eth_dev_configure()``,
> or is it allowed PMD to flush the rules itself?
>
> As far as I know some Intel PMDs flush remaining rules in configure itself
> without failing, @Qi can correct me if I am wrong.

Implicit flush is non-orthogonal API,
which only makes sense if it gives some performance benefit.

> > +Keeping may be unsupported for some types of rule items and actions,
> > +as well as depending on the value of flow attributes transfer bit.
> > +A combination of an item or action type and a value of the transfer bit
> > +is called a rule feature.
> > +To test if rules with a particular feature are kept, the application
> must try
> > +to create a valid rule using this feature when the device is stopped
> > +(after it has been configured or started previously).
> > +If it fails with an error of type ``RTE_FLOW_ERROR_TYPE_STATE``,
> > +rules using this feature are flushed when the device is stopped.
> > +If it suceeds, such rules will be kept when the device is stopped,
> > +provided they do not use other features that are not supported.
> > +Rules that are created when the device is stopped, including the rules
> > +created for the test, will be kept after the device is started.
> > +
> 
> I understand the intention, but I don't know if this is true for all
> devices.
> Can't there be a case that driver can't create rule when it is stopped,
> but it can keep the rules after stop. Or other-way around, driver can
> create rule when it is stopped, but can't keep rule after stop.

Isn't it the same consideration as the first comment?
If so, it's about the ability to have a rule in a detached state
given its features.

> I am feeling we are missing comments from different vendors if this logic
> works for them.
  
Andrew Rybchenko Oct. 20, 2021, 10:39 a.m. UTC | #4
On 10/19/21 3:37 PM, Dmitry Kozlyuk wrote:
> Previously, it was not specified what happens to the flow rules
> when the device is stopped, possibly reconfigured, then started.
> If flow rules were kept, it could be convenient for application
> developers, because they wouldn't need to save and restore them.
> However, due to the number of flows and possible creation rate it is
> impractical to save all flow rules in DPDK layer. This means that flow
> rules persistence really depends on whether PMD and HW can implement it
> efficiently. It can also be limited by the rule item and action types,
> and its attributes transfer bit (a combination of an item/action type
> and a value of the transfer bit is called a ruel feature).
> 
> Add a device capability bit for PMDs that can keep at least some
> of the flow rules across restart. Without this capability behavior
> is still unspecified and it is declared that the application must
> flush the rules before stopping the device.
> Allow the application to test for persitence of rules using

persitence -> persistence

> a particular feature by attempting to create a flow rule
> using that feature when the device is stopped
> and checking for the specific error.
> This is logical because if the PMD can to create the flow rule
> when the device is not started and use it after the start happens,
> it is natural that it can move its internal flow rule object
> to the same state when the device is stopped and restore the state
> when the device is started.
> 
> Rule persistence across a reconfigurations is not required,
> because tracking all the rules and configuration-dependent resources
> they use may be infeasible. In case a PMD cannot keep the rules
> across reconfiguration, it is allowed just to report an error.
> Application must then flush the rules before attempting it.
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> ---
>  doc/guides/prog_guide/rte_flow.rst | 25 +++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev.h            |  7 +++++++
>  lib/ethdev/rte_flow.h              |  1 +
>  3 files changed, 33 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 2b42d5ec8c..ff67b211e3 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -87,6 +87,31 @@ To avoid resource leaks on the PMD side, handles must be explicitly
>  destroyed by the application before releasing associated resources such as
>  queues and ports.
>  
> +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is not advertised,
> +rules cannot be created until the device is started for the first time
> +and cannot be kept when the device is stopped.
> +However, PMD also does not flush them automatically on stop,
> +so the application must call ``rte_flow_flush()`` or ``rte_flow_destroy()``
> +before stopping the device to ensure no rules remain.
> +
> +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is advertised, this means
> +the PMD can keep at least some rules across the device stop and start.
> +However, ``rte_eth_dev_configure()`` may fail if any rules remain,
> +so the application must flush them before attempting a reconfiguration.
> +Keeping may be unsupported for some types of rule items and actions,
> +as well as depending on the value of flow attributes transfer bit.
> +A combination of an item or action type and a value of the transfer bit
> +is called a rule feature.

As I said before a combination is very hard to test and
unfriendly to applications. Do we really need to make it
that complex?

Which PMDs are going to support it? Which cases will really
be distinguished and will have different support (keep or not)?

> +To test if rules with a particular feature are kept, the application must try
> +to create a valid rule using this feature when the device is stopped
> +(after it has been configured or started previously).

Sorry, it hardly makes sense. Does it suggest an application
to:
 1. configure
 2. start
 3. stop
 4. check/create flow rules
 5. start again
as a regular start sequence instead of just configure+start.
IMHO, it must be possible to check just after configure without
start. Otherwise it looks really bad.

> +If it fails with an error of type ``RTE_FLOW_ERROR_TYPE_STATE``,
> +rules using this feature are flushed when the device is stopped.

Which entity does flush it?

> +If it suceeds, such rules will be kept when the device is stopped,

suceeds -> succeeds

kept and functional? I.e. transfer rules still route traffic to
other ports which could be up and running.

> +provided they do not use other features that are not supported.
> +Rules that are created when the device is stopped, including the rules
> +created for the test, will be kept after the device is started.
> +
>  The following sections cover:
>  
>  - **Attributes** (represented by ``struct rte_flow_attr``): properties of a
  
Dmitry Kozlyuk Oct. 20, 2021, 11:40 a.m. UTC | #5
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: 20 октября 2021 г. 13:40
> To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>; dev@dpdk.org
> Cc: Qi Zhang <qi.z.zhang@intel.com>; Ori Kam <orika@nvidia.com>; NBU-
> Contact-Thomas Monjalon <thomas@monjalon.net>; Ferruh Yigit
> <ferruh.yigit@intel.com>
> Subject: Re: [PATCH v3 1/6] ethdev: add capability to keep flow rules on
> restart
> 
> External email: Use caution opening links or attachments
> 
> 
> On 10/19/21 3:37 PM, Dmitry Kozlyuk wrote:
> > Previously, it was not specified what happens to the flow rules
> > when the device is stopped, possibly reconfigured, then started.
> > If flow rules were kept, it could be convenient for application
> > developers, because they wouldn't need to save and restore them.
> > However, due to the number of flows and possible creation rate it is
> > impractical to save all flow rules in DPDK layer. This means that flow
> > rules persistence really depends on whether PMD and HW can implement it
> > efficiently. It can also be limited by the rule item and action types,
> > and its attributes transfer bit (a combination of an item/action type
> > and a value of the transfer bit is called a ruel feature).
> >
> > Add a device capability bit for PMDs that can keep at least some
> > of the flow rules across restart. Without this capability behavior
> > is still unspecified and it is declared that the application must
> > flush the rules before stopping the device.
> > Allow the application to test for persitence of rules using
> 
> persitence -> persistence
>
> > a particular feature by attempting to create a flow rule
> > using that feature when the device is stopped
> > and checking for the specific error.
> > This is logical because if the PMD can to create the flow rule
> > when the device is not started and use it after the start happens,
> > it is natural that it can move its internal flow rule object
> > to the same state when the device is stopped and restore the state
> > when the device is started.
> >
> > Rule persistence across a reconfigurations is not required,
> > because tracking all the rules and configuration-dependent resources
> > they use may be infeasible. In case a PMD cannot keep the rules
> > across reconfiguration, it is allowed just to report an error.
> > Application must then flush the rules before attempting it.
> >
> > Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> > ---
> >  doc/guides/prog_guide/rte_flow.rst | 25 +++++++++++++++++++++++++
> >  lib/ethdev/rte_ethdev.h            |  7 +++++++
> >  lib/ethdev/rte_flow.h              |  1 +
> >  3 files changed, 33 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> > index 2b42d5ec8c..ff67b211e3 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -87,6 +87,31 @@ To avoid resource leaks on the PMD side, handles must
> be explicitly
> >  destroyed by the application before releasing associated resources such
> as
> >  queues and ports.
> >
> > +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is not advertised,
> > +rules cannot be created until the device is started for the first time
> > +and cannot be kept when the device is stopped.
> > +However, PMD also does not flush them automatically on stop,
> > +so the application must call ``rte_flow_flush()`` or
> ``rte_flow_destroy()``
> > +before stopping the device to ensure no rules remain.
> > +
> > +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is advertised, this means
> > +the PMD can keep at least some rules across the device stop and start.
> > +However, ``rte_eth_dev_configure()`` may fail if any rules remain,
> > +so the application must flush them before attempting a reconfiguration.
> > +Keeping may be unsupported for some types of rule items and actions,
> > +as well as depending on the value of flow attributes transfer bit.
> > +A combination of an item or action type and a value of the transfer bit
> > +is called a rule feature.
> 
> As I said before a combination is very hard to test and
> unfriendly to applications. Do we really need to make it
> that complex?

Maybe the wording is not explicit enough,
but it exactly attempts to address your previous comment.
In v3, applications don't need to check for a full combination
of item types, actions types, and a transfer bit value.
Instead, they only need to check for a combination
of one type (of an item or an action) with a transfer bit value.
There is an example below in the text.

> Which PMDs are going to support it? Which cases will really
> be distinguished and will have different support (keep or not)?
> 
> > +To test if rules with a particular feature are kept, the application
> must try
> > +to create a valid rule using this feature when the device is stopped
> > +(after it has been configured or started previously).
> 
> Sorry, it hardly makes sense. Does it suggest an application
> to:
>  1. configure
>  2. start
>  3. stop
>  4. check/create flow rules
>  5. start again
> as a regular start sequence instead of just configure+start.
> IMHO, it must be possible to check just after configure without
> start. Otherwise it looks really bad.

Of course, the following sequence is meant:
1. Configure
2. Try to create flow rules with needed features,
   check for RTE_FLOW_ERROR_TYPE_STATE.
   If and only if the test rules are not needed, destroy them.
3. Start

The sequence you outlined is also possible, but it is not necessary.
It may even be useful, for example, if an application is switching the workload
and has a new set of rule features to check.

> 
> > +If it fails with an error of type ``RTE_FLOW_ERROR_TYPE_STATE``,
> > +rules using this feature are flushed when the device is stopped.
> 
> Which entity does flush it?

PMD does.
Overall approach is as follows:
no capability => no guarantees, the application must manage the entities itself;
have capability => PMD manages the entities, only it may be unable to keep some.

> > +If it suceeds, such rules will be kept when the device is stopped,
> 
> suceeds -> succeeds
> 
> kept and functional? I.e. transfer rules still route traffic to
> other ports which could be up and running.

Unless you or anyone object, it's kept, but non-functional,
because the semantic of port stop is to stop traffic processing.
A transfer rule can mention any port, but I think it should be controlled
via the one that created it.
In any case, this must be stated explicitly in v4.
  
Ori Kam Oct. 20, 2021, 1:40 p.m. UTC | #6
Hi Dmitry,

> -----Original Message-----
> From: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Sent: Wednesday, October 20, 2021 2:40 PM
> Subject: RE: [PATCH v3 1/6] ethdev: add capability to keep flow rules on restart
> 
> 
> 
> > -----Original Message-----
> > From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > Sent: 20 октября 2021 г. 13:40
> > To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>; dev@dpdk.org
> > Cc: Qi Zhang <qi.z.zhang@intel.com>; Ori Kam <orika@nvidia.com>; NBU-
> > Contact-Thomas Monjalon <thomas@monjalon.net>; Ferruh Yigit
> > <ferruh.yigit@intel.com>
> > Subject: Re: [PATCH v3 1/6] ethdev: add capability to keep flow rules on
> > restart
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On 10/19/21 3:37 PM, Dmitry Kozlyuk wrote:
> > > Previously, it was not specified what happens to the flow rules
> > > when the device is stopped, possibly reconfigured, then started.
> > > If flow rules were kept, it could be convenient for application
> > > developers, because they wouldn't need to save and restore them.
> > > However, due to the number of flows and possible creation rate it is
> > > impractical to save all flow rules in DPDK layer. This means that flow
> > > rules persistence really depends on whether PMD and HW can implement it
> > > efficiently. It can also be limited by the rule item and action types,
> > > and its attributes transfer bit (a combination of an item/action type
> > > and a value of the transfer bit is called a ruel feature).
> > >
> > > Add a device capability bit for PMDs that can keep at least some
> > > of the flow rules across restart. Without this capability behavior
> > > is still unspecified and it is declared that the application must
> > > flush the rules before stopping the device.
> > > Allow the application to test for persitence of rules using
> >
> > persitence -> persistence
> >
> > > a particular feature by attempting to create a flow rule
> > > using that feature when the device is stopped
> > > and checking for the specific error.
> > > This is logical because if the PMD can to create the flow rule
> > > when the device is not started and use it after the start happens,
> > > it is natural that it can move its internal flow rule object
> > > to the same state when the device is stopped and restore the state
> > > when the device is started.
> > >
> > > Rule persistence across a reconfigurations is not required,
> > > because tracking all the rules and configuration-dependent resources
> > > they use may be infeasible. In case a PMD cannot keep the rules
> > > across reconfiguration, it is allowed just to report an error.
> > > Application must then flush the rules before attempting it.
> > >
> > > Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> > > ---
> > >  doc/guides/prog_guide/rte_flow.rst | 25 +++++++++++++++++++++++++
> > >  lib/ethdev/rte_ethdev.h            |  7 +++++++
> > >  lib/ethdev/rte_flow.h              |  1 +
> > >  3 files changed, 33 insertions(+)
> > >
> > > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > > index 2b42d5ec8c..ff67b211e3 100644
> > > --- a/doc/guides/prog_guide/rte_flow.rst
> > > +++ b/doc/guides/prog_guide/rte_flow.rst
> > > @@ -87,6 +87,31 @@ To avoid resource leaks on the PMD side, handles must
> > be explicitly
> > >  destroyed by the application before releasing associated resources such
> > as
> > >  queues and ports.
> > >
> > > +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is not advertised,
> > > +rules cannot be created until the device is started for the first time
> > > +and cannot be kept when the device is stopped.
> > > +However, PMD also does not flush them automatically on stop,
> > > +so the application must call ``rte_flow_flush()`` or
> > ``rte_flow_destroy()``
> > > +before stopping the device to ensure no rules remain.
> > > +
> > > +If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is advertised, this means
> > > +the PMD can keep at least some rules across the device stop and start.
> > > +However, ``rte_eth_dev_configure()`` may fail if any rules remain,
> > > +so the application must flush them before attempting a reconfiguration.
> > > +Keeping may be unsupported for some types of rule items and actions,
> > > +as well as depending on the value of flow attributes transfer bit.
> > > +A combination of an item or action type and a value of the transfer bit
> > > +is called a rule feature.
> >
> > As I said before a combination is very hard to test and
> > unfriendly to applications. Do we really need to make it
> > that complex?
> 
> Maybe the wording is not explicit enough,
> but it exactly attempts to address your previous comment.
> In v3, applications don't need to check for a full combination
> of item types, actions types, and a transfer bit value.
> Instead, they only need to check for a combination
> of one type (of an item or an action) with a transfer bit value.
> There is an example below in the text.
> 
> > Which PMDs are going to support it? Which cases will really
> > be distinguished and will have different support (keep or not)?
> >
> > > +To test if rules with a particular feature are kept, the application
> > must try
> > > +to create a valid rule using this feature when the device is stopped
> > > +(after it has been configured or started previously).
> >
> > Sorry, it hardly makes sense. Does it suggest an application
> > to:
> >  1. configure
> >  2. start
> >  3. stop
> >  4. check/create flow rules
> >  5. start again
> > as a regular start sequence instead of just configure+start.
> > IMHO, it must be possible to check just after configure without
> > start. Otherwise it looks really bad.
> 
> Of course, the following sequence is meant:
> 1. Configure
> 2. Try to create flow rules with needed features,
>    check for RTE_FLOW_ERROR_TYPE_STATE.
>    If and only if the test rules are not needed, destroy them.
> 3. Start
> 
> The sequence you outlined is also possible, but it is not necessary.
> It may even be useful, for example, if an application is switching the workload
> and has a new set of rule features to check.
> 
> >
> > > +If it fails with an error of type ``RTE_FLOW_ERROR_TYPE_STATE``,
> > > +rules using this feature are flushed when the device is stopped.
> >
> > Which entity does flush it?
> 
> PMD does.
> Overall approach is as follows:
> no capability => no guarantees, the application must manage the entities itself;
> have capability => PMD manages the entities, only it may be unable to keep some.
> 

I don't think it should be the PMD responsibility.
it is always the application,
if PMD returns ``RTE_FLOW_ERROR_TYPE_STATE`` it means that the application
must destroy the rules before stop.
Since the PMD said it can't keep the flows it may mean that he can't track them
so it can't remove them.


> > > +If it suceeds, such rules will be kept when the device is stopped,
> >
> > suceeds -> succeeds
> >
> > kept and functional? I.e. transfer rules still route traffic to
> > other ports which could be up and running.
> 
> Unless you or anyone object, it's kept, but non-functional,
> because the semantic of port stop is to stop traffic processing.
> A transfer rule can mention any port, but I think it should be controlled
> via the one that created it.
> In any case, this must be stated explicitly in v4.

Best,
Ori
  

Patch

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b42d5ec8c..ff67b211e3 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -87,6 +87,31 @@  To avoid resource leaks on the PMD side, handles must be explicitly
 destroyed by the application before releasing associated resources such as
 queues and ports.
 
+If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is not advertised,
+rules cannot be created until the device is started for the first time
+and cannot be kept when the device is stopped.
+However, PMD also does not flush them automatically on stop,
+so the application must call ``rte_flow_flush()`` or ``rte_flow_destroy()``
+before stopping the device to ensure no rules remain.
+
+If ``RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP`` is advertised, this means
+the PMD can keep at least some rules across the device stop and start.
+However, ``rte_eth_dev_configure()`` may fail if any rules remain,
+so the application must flush them before attempting a reconfiguration.
+Keeping may be unsupported for some types of rule items and actions,
+as well as depending on the value of flow attributes transfer bit.
+A combination of an item or action type and a value of the transfer bit
+is called a rule feature.
+To test if rules with a particular feature are kept, the application must try
+to create a valid rule using this feature when the device is stopped
+(after it has been configured or started previously).
+If it fails with an error of type ``RTE_FLOW_ERROR_TYPE_STATE``,
+rules using this feature are flushed when the device is stopped.
+If it suceeds, such rules will be kept when the device is stopped,
+provided they do not use other features that are not supported.
+Rules that are created when the device is stopped, including the rules
+created for the test, will be kept after the device is started.
+
 The following sections cover:
 
 - **Attributes** (represented by ``struct rte_flow_attr``): properties of a
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 6d80514ba7..a0b388bb25 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -90,6 +90,11 @@ 
  *     - flow director filtering mode (but not filtering rules)
  *     - NIC queue statistics mappings
  *
+ * The following configuration may be retained or not
+ * depending on the device capabilities:
+ *
+ *     - flow rules
+ *
  * Any other configuration will not be stored and will need to be re-entered
  * before a call to rte_eth_dev_start().
  *
@@ -1445,6 +1450,8 @@  struct rte_eth_conf {
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /** Device supports Tx queue setup after device started. */
 #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
+/** Device supports keeping flow rules across restart. */
+#define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP 0x00000004
 /**@}*/
 
 /*
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index a89945061a..aa0182d021 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -3344,6 +3344,7 @@  enum rte_flow_error_type {
 	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
 	RTE_FLOW_ERROR_TYPE_ACTION_CONF, /**< Action configuration. */
 	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+	RTE_FLOW_ERROR_TYPE_STATE, /**< Current device state. */
 };
 
 /**