[RFC,v3,1/3] ethdev: support device reset and recovery events

Message ID 20200930071258.22457-2-kalesh-anakkur.purayil@broadcom.com (mailing list archive)
State Changes Requested, archived
Delegated to: Ferruh Yigit
Headers
Series librte_ethdev: error recovery support |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Kalesh A P Sept. 30, 2020, 7:12 a.m. UTC
  From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

Adding support for device reset and recovery events in the
rte_eth_event framework. FW error and FW reset conditions would be
managed internally by PMD without needing application intervention.
In such cases, PMD would need reset/recovery events to notify application
that PMD is undergoing a reset.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
 lib/librte_ethdev/rte_ethdev.h | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Thomas Monjalon Sept. 30, 2020, 7:50 a.m. UTC | #1
Hi,

Please use --cc-cmd devtools/get-maintainer.sh so all relevant
maintainers are Cc'ed. Adding Andrew.

> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> 
> Adding support for device reset and recovery events in the
> rte_eth_event framework. FW error and FW reset conditions would be
> managed internally by PMD without needing application intervention.
> In such cases, PMD would need reset/recovery events to notify application
> that PMD is undergoing a reset.

We already have this event:
	
    RTE_ETH_EVENT_INTR_RESET,
            /**< reset interrupt event, sent to VF on PF reset */

I don't know why "INTR" is in the name of this event,
and I think it does not need to be restricted to VF.
The application does not need to know whether the reset
is caused by the PF, the FW or the HW.
That's why I think you could share the same event.

> +       RTE_ETH_EVENT_RESET,    /**< port resetting from an error */
> +       RTE_ETH_EVENT_RECOVERED, /**< port recovered from an error */

You ignored my previous comments:
"
What the application is supposed to do when receiving such event?
How the application knows that flow rules were resetted?
Is there any other configuration resetted?
These informations must be explicit in the doxygen comments.
"
  
Kalesh A P Sept. 30, 2020, 8:35 a.m. UTC | #2
Hi Thomas,

Please see my response inline.

Regards,
Kalesh

On Wed, Sep 30, 2020 at 1:21 PM Thomas Monjalon <thomas@monjalon.net> wrote:

> Hi,
>
> Please use --cc-cmd devtools/get-maintainer.sh so all relevant
> maintainers are Cc'ed. Adding Andrew.
>
[Kalesh]: Thank you. Sure, I will follow next time.

>
> > From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >
> > Adding support for device reset and recovery events in the
> > rte_eth_event framework. FW error and FW reset conditions would be
> > managed internally by PMD without needing application intervention.
> > In such cases, PMD would need reset/recovery events to notify application
> > that PMD is undergoing a reset.
>
> We already have this event:
>
>     RTE_ETH_EVENT_INTR_RESET,
>             /**< reset interrupt event, sent to VF on PF reset */
>
> I don't know why "INTR" is in the name of this event,
> and I think it does not need to be restricted to VF.
> The application does not need to know whether the reset
> is caused by the PF, the FW or the HW.
> That's why I think you could share the same event.
>

[Kalesh]: Yes. As you mentioned, this event is used for some other purpose.
I did not want to break the existing usage/purpose of this event.
For example, upon receiving the RTE_ETH_EVENT_INTR_RESET event OVS
application invokes rte_eth_dev_reset() to reset the port.
The aim here is to recover from the device error condition without the
intervention of Applications. PMD itself will recover from the error using
the protocol with FW.

>
> > +       RTE_ETH_EVENT_RESET,    /**< port resetting from an error */
> > +       RTE_ETH_EVENT_RECOVERED, /**< port recovered from an error */
>
> You ignored my previous comments:
> "
> What the application is supposed to do when receiving such event?
> How the application knows that flow rules were resetted?
> Is there any other configuration resetted?
> These informations must be explicit in the doxygen comments.
> "
>
[Kalesh]: Sorry, I missed it.
I am not sure what you meant by "These information must be explicit in the
doxygen comments ".
Could you please elaborate a little how to/where to put these details?
  
Thomas Monjalon Sept. 30, 2020, 9:31 a.m. UTC | #3
30/09/2020 10:35, Kalesh Anakkur Purayil:
> On Wed, Sep 30, 2020 at 1:21 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > >
> > > Adding support for device reset and recovery events in the
> > > rte_eth_event framework. FW error and FW reset conditions would be
> > > managed internally by PMD without needing application intervention.
> > > In such cases, PMD would need reset/recovery events to notify application
> > > that PMD is undergoing a reset.
> >
> > We already have this event:
> >
> >     RTE_ETH_EVENT_INTR_RESET,
> >             /**< reset interrupt event, sent to VF on PF reset */
> >
> > I don't know why "INTR" is in the name of this event,
> > and I think it does not need to be restricted to VF.
> > The application does not need to know whether the reset
> > is caused by the PF, the FW or the HW.
> > That's why I think you could share the same event.
> >
> 
> [Kalesh]: Yes. As you mentioned, this event is used for some other purpose.
> I did not want to break the existing usage/purpose of this event.
> For example, upon receiving the RTE_ETH_EVENT_INTR_RESET event OVS
> application invokes rte_eth_dev_reset() to reset the port.
> The aim here is to recover from the device error condition without the
> intervention of Applications. PMD itself will recover from the error using
> the protocol with FW.
> 
> >
> > > +       RTE_ETH_EVENT_RESET,    /**< port resetting from an error */
> > > +       RTE_ETH_EVENT_RECOVERED, /**< port recovered from an error */
> >
> > You ignored my previous comments:
> > "
> > What the application is supposed to do when receiving such event?
> > How the application knows that flow rules were resetted?
> > Is there any other configuration resetted?
> > These informations must be explicit in the doxygen comments.
> > "
> >
> [Kalesh]: Sorry, I missed it.
> I am not sure what you meant by "These information must be explicit in the
> doxygen comments ".
> Could you please elaborate a little how to/where to put these details?

/** is the start of a doxygen comment.
This is the place (in the .h file) to explain to application
developer what to do with the event.
The code + the comments is what we call "the API".

You should complete the description of RTE_ETH_EVENT_INTR_RESET
as well: the need for calling rte_eth_dev_reset() was not explicit.
  

Patch

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 9759f13..b9dd14c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -3207,6 +3207,8 @@  enum rte_eth_event_type {
 	RTE_ETH_EVENT_DESTROY,  /**< port is released */
 	RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
 	RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
+	RTE_ETH_EVENT_RESET,    /**< port resetting from an error */
+	RTE_ETH_EVENT_RECOVERED, /**< port recovered from an error */
 	RTE_ETH_EVENT_MAX       /**< max value of this enum */
 };