From patchwork Fri Oct 9 03:48:30 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kalesh A P
X-Patchwork-Id: 80094
X-Patchwork-Delegate: ferruh.yigit@amd.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id CC33AA04BC;
 Fri, 9 Oct 2020 05:34:22 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 13D2F1BED5;
 Fri, 9 Oct 2020 05:34:06 +0200 (CEST)
Received: from relay.smtp-ext.broadcom.com (lpdvacalvio01.broadcom.com
 [192.19.229.182]) by dpdk.org (Postfix) with ESMTP id A5BA71BECA
 for ; Fri, 9 Oct 2020 05:34:03 +0200 (CEST)
Received: from dhcp-10-123-153-22.dhcp.broadcom.net
 (bgccx-dev-host-lnx2.bec.broadcom.net [10.123.153.22])
 by relay.smtp-ext.broadcom.com (Postfix) with ESMTP id 5B8547E072
 for ; Thu, 8 Oct 2020 20:34:01 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 relay.smtp-ext.broadcom.com 5B8547E072
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=broadcom.com;
 s=dkimrelay; t=1602214442;
 bh=dLbErOAA6HbQHq/MOt9o4uutyMHZFA2G015hxXE23uk=;
 h=From:To:Subject:Date:In-Reply-To:References:From;
 b=RHGo/JP3mMdx3uCPjhGM6BfTushvVmrQX0banG8fxEbLw8TIj/SaYANaT9jHIZPLH
 Eno5coY084jkR59DwDCm61FJEp7aFKuEu+ALVY+7v8NL1xU7WgplcK7974t6NUYogy
 WlsYJZr37m8vxWkkJF1cdEx1iRUkCO9uFiIgEtgM=
From: Kalesh A P
To: dev@dpdk.org
Date: Fri, 9 Oct 2020 09:18:30 +0530
Message-Id: <20201009034832.10302-2-kalesh-anakkur.purayil@broadcom.com>
X-Mailer: git-send-email 2.10.1
In-Reply-To: <20201009034832.10302-1-kalesh-anakkur.purayil@broadcom.com>
References: <20200122101654.20824-1-kalesh-anakkur.purayil@broadcom.com>
 <20201009034832.10302-1-kalesh-anakkur.purayil@broadcom.com>
Subject: [dpdk-dev] [PATCH v6 1/3] ethdev: support device reset and recovery
 events
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

From: Kalesh AP

Add support for device reset and recovery events to the
rte_eth_event framework. FW error and FW reset conditions are
managed internally by the PMD without application intervention.
In such cases, the PMD needs reset/recovery events to notify the
application that it is undergoing a reset.

Signed-off-by: Somnath Kotur
Signed-off-by: Kalesh AP
Reviewed-by: Ajit Khaparde
Reviewed-by: Asaf Penso
---
 doc/guides/prog_guide/poll_mode_drv.rst | 18 ++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h          | 17 +++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index 86e0a14..c03f0ef 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -615,3 +615,21 @@ by application.
 The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger
 the application to handle reset event. It is duty of application to
 handle all synchronization before it calls rte_eth_dev_reset().
+
+Error recovery support
+~~~~~~~~~~~~~~~~~~~~~~
+
+When the PMD detects a FW reset or error condition, it will try to recover
+from the error without needing the application intervention. In such cases,
+PMD would need events to notify the application that it is undergoing
+an error recovery.
+
+The PMD will trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the
+application that PMD detected a FW reset or FW error condition. PMD will
+try to recover from the error by itself. Data path will be halted and
+control path operations would fail during the recovery period.
+
+The PMD will trigger RTE_ETH_EVENT_RECOVERED event to notify the application
+that it has recovered from the error condition. Control path and data path
+are up now. Since the device has undergone a reset, flow rules offloaded prior
+to the reset will be lost and the application has to recreate the rules again.
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 9759f13..9b4b015 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -3207,6 +3207,23 @@ enum rte_eth_event_type {
 	RTE_ETH_EVENT_DESTROY,  /**< port is released */
 	RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
 	RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
+	RTE_ETH_EVENT_ERR_RECOVERING,
+		/**< port recovering from an error
+		 *
+		 * PMD detected a FW reset or error condition.
+		 * PMD will try to recover from the error.
+		 * Data path will be halted and Control path operations
+		 * would fail at this time.
+		 */
+	RTE_ETH_EVENT_RECOVERED,
+		/**< port recovered from an error
+		 *
+		 * PMD has recovered from the error condition.
+		 * Control path and Data path are up now.
+		 * Since the device has undergone a reset, flow rules
+		 * offloaded prior to the reset will be lost and
+		 * the application has to recreate the rules again.
+		 */
 	RTE_ETH_EVENT_MAX       /**< max value of this enum */
 };
 

From patchwork Fri Oct 9 03:48:31 2020
X-Patchwork-Submitter: Kalesh A P
X-Patchwork-Id: 80095
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Kalesh A P
To: dev@dpdk.org
Date: Fri, 9 Oct 2020 09:18:31 +0530
Message-Id: <20201009034832.10302-3-kalesh-anakkur.purayil@broadcom.com>
In-Reply-To: <20201009034832.10302-1-kalesh-anakkur.purayil@broadcom.com>
References: <20200122101654.20824-1-kalesh-anakkur.purayil@broadcom.com>
 <20201009034832.10302-1-kalesh-anakkur.purayil@broadcom.com>
Subject: [dpdk-dev] [PATCH v6 2/3] net/bnxt: notify applications about device
 reset/recovery

From: Kalesh AP

When the driver receives a RESET_NOTIFY async event from FW or detects
an error condition, it should notify the application that FW is
going to reset. Once the driver recovers from the reset, it should
report the recovery status to the application as well.

The recovery process is transparent to the application, as the driver
itself tries to recover from FW reset or FW error conditions.

Signed-off-by: Kalesh AP
Signed-off-by: Ajit Khaparde
Signed-off-by: Somnath Kotur
---
 drivers/net/bnxt/bnxt_cpr.c    | 3 +++
 drivers/net/bnxt/bnxt_ethdev.c | 9 +++++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 8311e26..987c010 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -129,6 +129,9 @@ void bnxt_handle_async_event(struct bnxt *bp,
 			bp->flags |= BNXT_FLAG_FATAL_ERROR;
 			return;
 		}
+		rte_eth_dev_callback_process(bp->eth_dev,
+					     RTE_ETH_EVENT_ERR_RECOVERING,
+					     NULL);
 
 		event_data = rte_le_to_cpu_32(async_cmp->event_data1);
 		/* timestamp_lo/hi values are in units of 100ms */
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index b99c712..e3798de 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4566,6 +4566,9 @@ static void bnxt_dev_recover(void *arg)
 		goto err_start;
 
 	PMD_DRV_LOG(INFO, "Recovered from FW reset\n");
+	rte_eth_dev_callback_process(bp->eth_dev,
+				     RTE_ETH_EVENT_RECOVERED,
+				     NULL);
 	return;
 err_start:
 	bnxt_dev_stop_op(bp->eth_dev);
@@ -4573,6 +4576,9 @@ static void bnxt_dev_recover(void *arg)
 	bp->flags |= BNXT_FLAG_FATAL_ERROR;
 	bnxt_uninit_resources(bp, false);
 	PMD_DRV_LOG(ERR, "Failed to recover from FW reset\n");
+	rte_eth_dev_callback_process(bp->eth_dev,
+				     RTE_ETH_EVENT_INTR_RMV,
+				     NULL);
 }
 
 void bnxt_dev_reset_and_resume(void *arg)
@@ -4708,6 +4714,9 @@ static void bnxt_check_fw_health(void *arg)
 	bp->flags |= BNXT_FLAG_FW_RESET;
 
 	PMD_DRV_LOG(ERR, "Detected FW dead condition\n");
+	rte_eth_dev_callback_process(bp->eth_dev,
+				     RTE_ETH_EVENT_ERR_RECOVERING,
+				     NULL);
 
 	if (bnxt_is_master_func(bp))
 		wait_msec = info->master_func_wait_period;

From patchwork Fri Oct 9 03:48:32 2020
X-Patchwork-Submitter: Kalesh A P
X-Patchwork-Id: 80096
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Kalesh A P
To: dev@dpdk.org
Date: Fri, 9 Oct 2020 09:18:32 +0530
Message-Id: <20201009034832.10302-4-kalesh-anakkur.purayil@broadcom.com>
In-Reply-To: <20201009034832.10302-1-kalesh-anakkur.purayil@broadcom.com>
References: <20200122101654.20824-1-kalesh-anakkur.purayil@broadcom.com>
 <20201009034832.10302-1-kalesh-anakkur.purayil@broadcom.com>
Subject: [dpdk-dev] [PATCH v6 3/3] app/testpmd: handle device recovery event

From: Kalesh AP

Add code to handle error recovery events in testpmd. These events
indicate that the PMD is undergoing an error recovery or has
recovered from the error condition.

Update the 20.11 release notes as well.

Signed-off-by: Kalesh AP
Signed-off-by: Somnath Kotur
Reviewed-by: Ajit Kumar Khaparde
---
 app/test-pmd/parameters.c              |  8 ++++++--
 app/test-pmd/testpmd.c                 |  6 +++++-
 doc/guides/rel_notes/release_20_11.rst | 10 ++++++++++
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 1ead595..560f9ba 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -192,9 +192,9 @@ usage(char* progname)
 	printf(" --no-rmv-interrupt: disable device removal interrupt.\n");
 	printf(" --bitrate-stats=N: set the logical core N to perform "
 		"bit-rate calculation.\n");
-	printf(" --print-event : "
+	printf(" --print-event : "
 	       "enable print of designated event or all of them.\n");
-	printf(" --mask-event : "
+	printf(" --mask-event : "
 	       "disable print of designated event or all of them.\n");
 	printf(" --flow-isolate-all: "
 	       "requests flow API isolated mode on all ports at initialization time.\n");
@@ -556,6 +556,10 @@ parse_event_printing_config(const char *optarg, int enable)
 		mask = UINT32_C(1) << RTE_ETH_EVENT_DESTROY;
 	else if (!strcmp(optarg, "flow_aged"))
 		mask = UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED;
+	else if (!strcmp(optarg, "err_recovering"))
+		mask = UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING;
+	else if (!strcmp(optarg, "recovered"))
+		mask = UINT32_C(1) << RTE_ETH_EVENT_RECOVERED;
 	else if (!strcmp(optarg, "all"))
 		mask = ~UINT32_C(0);
 	else {
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index fe6450c..80ae3fa 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
 	[RTE_ETH_EVENT_NEW] = "device probed",
 	[RTE_ETH_EVENT_DESTROY] = "device released",
 	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
+	[RTE_ETH_EVENT_ERR_RECOVERING] = "device error under recovery",
+	[RTE_ETH_EVENT_RECOVERED] = "device recovered",
 	[RTE_ETH_EVENT_MAX] = NULL,
 };
 
@@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
 			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
 			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
 			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
+			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_ERR_RECOVERING) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
 /*
  * Decide if all memory are locked for performance.
  */
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 4bcf220..f732ff6 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -78,11 +78,21 @@ New Features
   ``--portmask=N``
   where N represents the hexadecimal bitmask of ports used.
 
+* **Added error recovery support.**
+
+  Added error recovery support to detect and recover from device errors including:
+
+  * Added new event: ``RTE_ETH_EVENT_ERR_RECOVERING`` for the driver to report
+    that the port is recovering from an error.
+  * Added new event: ``RTE_ETH_EVENT_RECOVERED`` for the driver to report
+    that the port has recovered from an error.
+
 * **Updated Broadcom bnxt driver.**
 
   Updated the Broadcom bnxt driver with new features and improvements, including:
 
   * Added support for 200G PAM4 link speed.
+  * Added support to handle device recovery events.
 
 
 Removed Items
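
For application authors, the event sequence documented in patch 1/3 (halt on
RTE_ETH_EVENT_ERR_RECOVERING, resume and recreate flow rules on
RTE_ETH_EVENT_RECOVERED) could be handled along the lines below. This is only
a minimal sketch under assumptions, not code from this series: the names
app_eth_event, app_port_state, app_recovery_event_cb and app_port_usable are
hypothetical, and the enum is a stand-in so the snippet builds without DPDK
headers. A real application would include rte_ethdev.h, switch on the
rte_eth_event_type values added by this series, and register the handler with
rte_eth_dev_callback_register().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in (assumption) for the two enum rte_eth_event_type values added
 * by this series, so the sketch compiles without DPDK headers.
 */
enum app_eth_event {
	APP_EVENT_ERR_RECOVERING, /* maps to RTE_ETH_EVENT_ERR_RECOVERING */
	APP_EVENT_RECOVERED,      /* maps to RTE_ETH_EVENT_RECOVERED */
	APP_EVENT_OTHER,          /* any event this handler ignores */
};

/* Hypothetical per-port bookkeeping kept by the application. */
struct app_port_state {
	bool recovering;         /* true between ERR_RECOVERING and RECOVERED */
	bool flows_need_replay;  /* set on RECOVERED: offloaded rules were lost */
	unsigned int recoveries; /* number of completed recoveries */
};

/*
 * Event handler with the same parameter layout as rte_eth_dev_cb_fn:
 * (port_id, event, cb_arg, ret_param). cb_arg carries the port state.
 */
static int
app_recovery_event_cb(uint16_t port_id, enum app_eth_event event,
		      void *cb_arg, void *ret_param)
{
	struct app_port_state *st = cb_arg;

	(void)port_id;
	(void)ret_param;
	assert(st != NULL);

	switch (event) {
	case APP_EVENT_ERR_RECOVERING:
		/* Data path is halted and control path calls would fail. */
		st->recovering = true;
		break;
	case APP_EVENT_RECOVERED:
		/* Port is usable again, but flow rules must be recreated. */
		st->recovering = false;
		st->flows_need_replay = true;
		st->recoveries++;
		break;
	default:
		break;
	}
	return 0;
}

/* The application should skip Rx/Tx and control calls while this is false. */
static bool
app_port_usable(const struct app_port_state *st)
{
	return !st->recovering;
}
```

A forwarding loop could check app_port_usable() before each burst and, once
flows_need_replay is set, recreate its flow rules and clear the flag.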