[v5,1/3] ethdev: support device reset and recovery events

Message ID 20201007164915.14375-2-kalesh-anakkur.purayil@broadcom.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series librte_ethdev: error recovery support |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Kalesh A P Oct. 7, 2020, 4:49 p.m. UTC
  From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

Adding support for device reset and recovery events in the
rte_eth_event framework. FW error and FW reset conditions would be
managed internally by PMD without needing application intervention.
In such cases, PMD would need reset/recovery events to notify application
that PMD is undergoing a reset.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst | 18 ++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h          | 17 +++++++++++++++++
 2 files changed, 35 insertions(+)
  

Comments

Asaf Penso Oct. 8, 2020, 10:49 a.m. UTC | #1
>-----Original Message-----
>From: dev <dev-bounces@dpdk.org> On Behalf Of Kalesh A P
>Sent: Wednesday, October 7, 2020 7:49 PM
>To: dev@dpdk.org
>Subject: [dpdk-dev] [PATCH v5 1/3] ethdev: support device reset and
>recovery events
>
>From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>
>Adding support for device reset and recovery events in the rte_eth_event
>framework. FW error and FW reset conditions would be managed internally
>by PMD without needing application intervention.
>In such cases, PMD would need reset/recovery events to notify application
>that PMD is undergoing a reset.
>
>Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
>Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
>---
> doc/guides/prog_guide/poll_mode_drv.rst | 18 ++++++++++++++++++
> lib/librte_ethdev/rte_ethdev.h          | 17 +++++++++++++++++
> 2 files changed, 35 insertions(+)
>
>diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>b/doc/guides/prog_guide/poll_mode_drv.rst
>index 86e0a14..c03f0ef 100644
>--- a/doc/guides/prog_guide/poll_mode_drv.rst
>+++ b/doc/guides/prog_guide/poll_mode_drv.rst
>@@ -615,3 +615,21 @@ by application.
> The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger  the
>application to handle reset event. It is duty of application to  handle all
>synchronization before it calls rte_eth_dev_reset().
>+
>+Error recovery support
>+~~~~~~~~~~~~~~~~~~~~~~
>+
>+When the PMD detects a FW reset or error condition, it will try to
>+recover from the error without needing the application intervention. In
>+such cases, PMD would need events to notify the application that it is
>+undergoing an error recovery.
>+
>+The PMD will trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the
>+application that PMD detected a FW reset or FW error condition. PMD
>+will try to recover from the error by itself. Data path will be halted
>+and control path operations would fail during the recovery period.
>+
>+The PMD will trigger RTE_ETH_EVENT_RECOVERED event to notify the
>+application that the it has recovered from the error condition. Control
>+path and data path are up now. Since the device undergone a reset, flow
>+rules offloaded prior to the reset will be lost and the application has to
>recreate the rules again.
>diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
>index 9759f13..9b4b015 100644
>--- a/lib/librte_ethdev/rte_ethdev.h
>+++ b/lib/librte_ethdev/rte_ethdev.h
>@@ -3207,6 +3207,23 @@ enum rte_eth_event_type {
> 	RTE_ETH_EVENT_DESTROY,  /**< port is released */
> 	RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
> 	RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected
>*/
>+	RTE_ETH_EVENT_ERR_RECOVERING,
>+			/**< port recovering from an error
>+			 *
>+			 * PMD detected a FW reset or error condition.
>+			 * PMD will try to recover from the error.
>+			 * Data path will be halted and Control path operations
>+			 * would fail at this time.
>+			 */
>+	RTE_ETH_EVENT_RECOVERED,
>+			/**< port recovered from an error
>+			 *
>+			 * PMD has recovered from the error condition.
>+			 * Control path and Data path are up now.
>+			 * Since the device undergone a reset, flow rules
>+			 * offloaded prior to the reset will be lost and
>+			 * the application has to recreate the rules again.
>+			 */
> 	RTE_ETH_EVENT_MAX       /**< max value of this enum */
> };
>
>--
>2.10.1

LGTM
Reviewed-by: Asaf Penso <asafp@nvidia.com>
  

Patch

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index 86e0a14..c03f0ef 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -615,3 +615,21 @@  by application.
 The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger
 the application to handle reset event. It is duty of application to
 handle all synchronization before it calls rte_eth_dev_reset().
+
+Error recovery support
+~~~~~~~~~~~~~~~~~~~~~~
+
+When the PMD detects a FW reset or error condition, it will try to recover
+from the error without needing the application intervention. In such cases,
+PMD would need events to notify the application that it is undergoing
+an error recovery.
+
+The PMD will trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the
+application that PMD detected a FW reset or FW error condition. PMD will
+try to recover from the error by itself. Data path will be halted and
+control path operations would fail during the recovery period.
+
+The PMD will trigger RTE_ETH_EVENT_RECOVERED event to notify the application
+that the it has recovered from the error condition. Control path and data path
+are up now. Since the device undergone a reset, flow rules offloaded prior to
+the reset will be lost and the application has to recreate the rules again.
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 9759f13..9b4b015 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -3207,6 +3207,23 @@  enum rte_eth_event_type {
 	RTE_ETH_EVENT_DESTROY,  /**< port is released */
 	RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
 	RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
+	RTE_ETH_EVENT_ERR_RECOVERING,
+			/**< port recovering from an error
+			 *
+			 * PMD detected a FW reset or error condition.
+			 * PMD will try to recover from the error.
+			 * Data path will be halted and Control path operations
+			 * would fail at this time.
+			 */
+	RTE_ETH_EVENT_RECOVERED,
+			/**< port recovered from an error
+			 *
+			 * PMD has recovered from the error condition.
+			 * Control path and Data path are up now.
+			 * Since the device undergone a reset, flow rules
+			 * offloaded prior to the reset will be lost and
+			 * the application has to recreate the rules again.
+			 */
 	RTE_ETH_EVENT_MAX       /**< max value of this enum */
 };