Message ID | 20201009034832.10302-2-kalesh-anakkur.purayil@broadcom.com |
---|---|
State | New |
Delegated to: | Ferruh Yigit |
Headers | show |
Series |
|
Related | show |
Context | Check | Description |
---|---|---|
ci/checkpatch | success | coding style OK |
09/10/2020 05:48, Kalesh A P: > From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> > > Adding support for device reset and recovery events in the > rte_eth_event framework. FW error and FW reset conditions would be > managed internally by PMD without needing application intervention. > In such cases, PMD would need reset/recovery events to notify application > that PMD is undergoing a reset. > > Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> > Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> > Reviewed-by: Asaf Penso <asafp@nvidia.com> The ethdev maintainers are not Cc'ed. Please use the option --cc-cmd devtools/get-maintainer.sh > +Error recovery support > +~~~~~~~~~~~~~~~~~~~~~~ > + > +When the PMD detects a FW reset or error condition, it will try to recover > +from the error without needing the application intervention. In such cases, > +PMD would need events to notify the application that it is undergoing > +an error recovery. > + > +The PMD will trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the > +application that PMD detected a FW reset or FW error condition. PMD will > +try to recover from the error by itself. Data path will be halted and > +control path operations would fail during the recovery period. > + > +The PMD will trigger RTE_ETH_EVENT_RECOVERED event to notify the application > +that the it has recovered from the error condition. Control path and data path > +are up now. Since the device undergone a reset, flow rules offloaded prior to > +the reset will be lost and the application has to recreate the rules again. > diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h > index 9759f13..9b4b015 100644 > --- a/lib/librte_ethdev/rte_ethdev.h > +++ b/lib/librte_ethdev/rte_ethdev.h > @@ -3207,6 +3207,23 @@ enum rte_eth_event_type { > RTE_ETH_EVENT_DESTROY, /**< port is released */ > RTE_ETH_EVENT_IPSEC, /**< IPsec offload related event */ > RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */ > + RTE_ETH_EVENT_ERR_RECOVERING, > + /**< port recovering from an error > + * > + * PMD detected a FW reset or error condition. > + * PMD will try to recover from the error. > + * Data path will be halted and Control path operations > + * would fail at this time. > + */ Does it mean the application has nothing to do when receiving this event? I think the app should stop polling at least. > + RTE_ETH_EVENT_RECOVERED, > + /**< port recovered from an error > + * > + * PMD has recovered from the error condition. > + * Control path and Data path are up now. > + * Since the device undergone a reset, flow rules > + * offloaded prior to the reset will be lost and > + * the application has to recreate the rules again. > + */ Please be more precise. Should the app re-configure the port, setup the queues, start the port?
On 10/12/20 12:29 AM, Thomas Monjalon wrote: > 09/10/2020 05:48, Kalesh A P: >> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> >> >> Adding support for device reset and recovery events in the >> rte_eth_event framework. FW error and FW reset conditions would be >> managed internally by PMD without needing application intervention. >> In such cases, PMD would need reset/recovery events to notify application >> that PMD is undergoing a reset. >> >> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> >> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> >> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> >> Reviewed-by: Asaf Penso <asafp@nvidia.com> > > The ethdev maintainers are not Cc'ed. > Please use the option --cc-cmd devtools/get-maintainer.sh > > >> +Error recovery support >> +~~~~~~~~~~~~~~~~~~~~~~ >> + >> +When the PMD detects a FW reset or error condition, it will try to recover >> +from the error without needing the application intervention. In such cases, >> +PMD would need events to notify the application that it is undergoing >> +an error recovery. >> + >> +The PMD will trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the >> +application that PMD detected a FW reset or FW error condition. PMD will >> +try to recover from the error by itself. Data path will be halted and >> +control path operations would fail during the recovery period. >> + >> +The PMD will trigger RTE_ETH_EVENT_RECOVERED event to notify the application >> +that the it has recovered from the error condition. Control path and data path >> +are up now. Since the device undergone a reset, flow rules offloaded prior to >> +the reset will be lost and the application has to recreate the rules again. What should be done if the state is not recoverable? >> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h >> index 9759f13..9b4b015 100644 >> --- a/lib/librte_ethdev/rte_ethdev.h >> +++ b/lib/librte_ethdev/rte_ethdev.h >> @@ -3207,6 +3207,23 @@ enum rte_eth_event_type { >> RTE_ETH_EVENT_DESTROY, /**< port is released */ >> RTE_ETH_EVENT_IPSEC, /**< IPsec offload related event */ >> RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */ >> + RTE_ETH_EVENT_ERR_RECOVERING, >> + /**< port recovering from an error >> + * >> + * PMD detected a FW reset or error condition. >> + * PMD will try to recover from the error. >> + * Data path will be halted and Control path operations >> + * would fail at this time. >> + */ > > Does it mean the application has nothing to do when receiving this event? > I think the app should stop polling at least. > >> + RTE_ETH_EVENT_RECOVERED, >> + /**< port recovered from an error >> + * >> + * PMD has recovered from the error condition. >> + * Control path and Data path are up now. >> + * Since the device undergone a reset, flow rules >> + * offloaded prior to the reset will be lost and >> + * the application has to recreate the rules again. >> + */ > > Please be more precise. > Should the app re-configure the port, setup the queues, start the port? > >
diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst index 86e0a14..c03f0ef 100644 --- a/doc/guides/prog_guide/poll_mode_drv.rst +++ b/doc/guides/prog_guide/poll_mode_drv.rst @@ -615,3 +615,21 @@ by application. The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger the application to handle reset event. It is duty of application to handle all synchronization before it calls rte_eth_dev_reset(). + +Error recovery support +~~~~~~~~~~~~~~~~~~~~~~ + +When the PMD detects a FW reset or error condition, it will try to recover +from the error without needing the application intervention. In such cases, +PMD would need events to notify the application that it is undergoing +an error recovery. + +The PMD will trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the +application that PMD detected a FW reset or FW error condition. PMD will +try to recover from the error by itself. Data path will be halted and +control path operations would fail during the recovery period. + +The PMD will trigger RTE_ETH_EVENT_RECOVERED event to notify the application +that the it has recovered from the error condition. Control path and data path +are up now. Since the device undergone a reset, flow rules offloaded prior to +the reset will be lost and the application has to recreate the rules again. diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h index 9759f13..9b4b015 100644 --- a/lib/librte_ethdev/rte_ethdev.h +++ b/lib/librte_ethdev/rte_ethdev.h @@ -3207,6 +3207,23 @@ enum rte_eth_event_type { RTE_ETH_EVENT_DESTROY, /**< port is released */ RTE_ETH_EVENT_IPSEC, /**< IPsec offload related event */ RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */ + RTE_ETH_EVENT_ERR_RECOVERING, + /**< port recovering from an error + * + * PMD detected a FW reset or error condition. + * PMD will try to recover from the error. + * Data path will be halted and Control path operations + * would fail at this time. + */ + RTE_ETH_EVENT_RECOVERED, + /**< port recovered from an error + * + * PMD has recovered from the error condition. + * Control path and Data path are up now. + * Since the device undergone a reset, flow rules + * offloaded prior to the reset will be lost and + * the application has to recreate the rules again. + */ RTE_ETH_EVENT_MAX /**< max value of this enum */ };