From patchwork Fri Aug 17 10:48:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43762 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id CE1392A62; Fri, 17 Aug 2018 12:51:24 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 613B91E2F for ; Fri, 17 Aug 2018 12:51:23 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131817" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:20 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:28 +0800 Message-Id: <1534502916-31636-2-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 1/8] bus: add memory failure handler X-BeenThere: dev@dpdk.org 
X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" A memory failure and system crash can be caused if a device is hotplugged out but the application can still access the device by MMIO. This patch introduces bus ops to handle memory failures of illegal access, especially for hotplug. Each bus can implement its own case-dependent logic to handle the memory failures. Signed-off-by: Jeff Guo --- v10->v9: modify bus ops name --- lib/librte_eal/common/include/rte_bus.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h index b7b5b08..2606451 100644 --- a/lib/librte_eal/common/include/rte_bus.h +++ b/lib/librte_eal/common/include/rte_bus.h @@ -168,6 +168,20 @@ typedef int (*rte_bus_unplug_t)(struct rte_device *dev); typedef int (*rte_bus_parse_t)(const char *name, void *addr); /** + * Implement a specific memory failure handler, which is responsible for + * handle the failure of memory illegal access, especially for hotplug. When + * the event of hotplug-out be detected, it could call this function to handle + * the memory failure and avoid system crash. + * @param dev + * Pointer of the device structure. + * + * @return + * 0 on success. + * !0 on error. + */ +typedef int (*rte_bus_memory_failure_handler_t)(struct rte_device *dev); + +/** * Bus scan policies */ enum rte_bus_scan_mode { @@ -212,6 +226,8 @@ struct rte_bus { struct rte_bus_conf conf; /**< Bus configuration */ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */ rte_dev_iterate_t dev_iterate; /**< Device iterator. 
*/ + rte_bus_memory_failure_handler_t memory_failure_handler; + /**< handle memory failure on the bus */ }; /** From patchwork Fri Aug 17 10:48:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43763 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 9C572324D; Fri, 17 Aug 2018 12:51:27 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 2E0E62C6A for ; Fri, 17 Aug 2018 12:51:26 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131830" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:23 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:29 +0800 Message-Id: <1534502916-31636-3-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> 
<1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 2/8] bus/pci: implement memory failure handler ops X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch implements the ops to handle memory failures on the PCI bus. It avoids MMIO read/write errors by creating a new dummy memory to remap the memory where the failure is. Signed-off-by: Jeff Guo Acked-by: Shaopeng He --- v10->v9: change pci bus ops name --- drivers/bus/pci/pci_common.c | 28 ++++++++++++++++++++++++++++ drivers/bus/pci/pci_common_uio.c | 33 +++++++++++++++++++++++++++++++++ drivers/bus/pci/private.h | 12 ++++++++++++ 3 files changed, 73 insertions(+) diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c index 7736b3f..759ccc3 100644 --- a/drivers/bus/pci/pci_common.c +++ b/drivers/bus/pci/pci_common.c @@ -406,6 +406,33 @@ pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, } static int +pci_memory_failure_handler(struct rte_device *dev) +{ + struct rte_pci_device *pdev = NULL; + int ret = 0; + + pdev = RTE_DEV_TO_PCI(dev); + if (!pdev) + return -1; + + switch (pdev->kdrv) { + case RTE_KDRV_IGB_UIO: + case RTE_KDRV_UIO_GENERIC: + case RTE_KDRV_NIC_UIO: + /* mmio resource is invalid, remap it to be safe. 
*/ + ret = pci_uio_remap_resource(pdev); + break; + default: + RTE_LOG(DEBUG, EAL, + "Not managed by a supported kernel driver, skipped\n"); + ret = -1; + break; + } + + return ret; +} + +static int pci_plug(struct rte_device *dev) { return pci_probe_all_drivers(RTE_DEV_TO_PCI(dev)); @@ -435,6 +462,7 @@ struct rte_pci_bus rte_pci_bus = { .unplug = pci_unplug, .parse = pci_parse, .get_iommu_class = rte_pci_get_iommu_class, + .memory_failure_handler = pci_memory_failure_handler, }, .device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list), .driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list), diff --git a/drivers/bus/pci/pci_common_uio.c b/drivers/bus/pci/pci_common_uio.c index 54bc20b..7ea73db 100644 --- a/drivers/bus/pci/pci_common_uio.c +++ b/drivers/bus/pci/pci_common_uio.c @@ -146,6 +146,39 @@ pci_uio_unmap(struct mapped_pci_resource *uio_res) } } +/* remap the PCI resource of a PCI device in anonymous virtual memory */ +int +pci_uio_remap_resource(struct rte_pci_device *dev) +{ + int i; + void *map_address; + + if (dev == NULL) + return -1; + + /* Remap all BARs */ + for (i = 0; i != PCI_MAX_RESOURCE; i++) { + /* skip empty BAR */ + if (dev->mem_resource[i].phys_addr == 0) + continue; + map_address = mmap(dev->mem_resource[i].addr, + (size_t)dev->mem_resource[i].len, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0); + if (map_address == MAP_FAILED) { + RTE_LOG(ERR, EAL, + "Cannot remap resource for device %s\n", + dev->name); + return -1; + } + RTE_LOG(INFO, EAL, + "Successful remap resource for device %s\n", + dev->name); + } + + return 0; +} + static struct mapped_pci_resource * pci_uio_find_resource(struct rte_pci_device *dev) { diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h index 8ddd03e..6b312e5 100644 --- a/drivers/bus/pci/private.h +++ b/drivers/bus/pci/private.h @@ -123,6 +123,18 @@ void pci_uio_free_resource(struct rte_pci_device *dev, struct mapped_pci_resource *uio_res); /** + * Remap the 
PCI resource of a PCI device in anonymous virtual memory. + * + * @param dev + * Point to the struct rte pci device. + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int +pci_uio_remap_resource(struct rte_pci_device *dev); + +/** * Map device memory to uio resource * * This function is private to EAL. From patchwork Fri Aug 17 10:48:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43764 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 424932B83; Fri, 17 Aug 2018 12:51:31 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 12337493D for ; Fri, 17 Aug 2018 12:51:28 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:28 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131836" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:26 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:30 +0800 Message-Id: 
<1534502916-31636-4-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 3/8] bus: add sigbus handler X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When a device is hotplugged out, a sigbus error will occur if the datapath can still read/write to the device. A handler is required here to capture the sigbus signal and handle it appropriately. This patch introduces bus ops to handle sigbus errors. Each bus can implement its own case-dependent logic to handle the sigbus errors. Signed-off-by: Jeff Guo Acked-by: Shaopeng He --- v10->v9: refine commit log --- lib/librte_eal/common/include/rte_bus.h | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h index 2606451..ddb29dd 100644 --- a/lib/librte_eal/common/include/rte_bus.h +++ b/lib/librte_eal/common/include/rte_bus.h @@ -182,6 +182,21 @@ typedef int (*rte_bus_parse_t)(const char *name, void *addr); typedef int (*rte_bus_memory_failure_handler_t)(struct rte_device *dev); /** + * Implement a specific sigbus handler, which is responsible for handle + * the sigbus error which is either original memory error, or specific memory + * error that caused of device be hotplug-out. When sigbus error be captured, + * it could call this function to handle sigbus error. + * @param failure_addr + * Pointer of the fault address of the sigbus error. + * + * @return + * 0 for success handle the sigbus. + * 1 for no bus handle the sigbus. 
+ * -1 for failed to handle the sigbus + */ +typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr); + +/** * Bus scan policies */ enum rte_bus_scan_mode { @@ -228,6 +243,9 @@ struct rte_bus { rte_dev_iterate_t dev_iterate; /**< Device iterator. */ rte_bus_memory_failure_handler_t memory_failure_handler; /**< handle memory failure on the bus */ + rte_bus_sigbus_handler_t sigbus_handler; + /**< handle sigbus error on the bus */ + }; /** From patchwork Fri Aug 17 10:48:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43765 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 8B8B34C6F; Fri, 17 Aug 2018 12:51:33 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id EEBAD49E1 for ; Fri, 17 Aug 2018 12:51:31 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131843" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:28 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, 
helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:31 +0800 Message-Id: <1534502916-31636-5-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 4/8] bus/pci: implement sigbus handler ops X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch implements the ops for the PCI bus sigbus handler. It finds the PCI device that is being hotplugged out and calls the relevant ops of the memory failure handler to handle the failure of the device. Signed-off-by: Jeff Guo Acked-by: Shaopeng He --- v10->v9: refine doc. --- drivers/bus/pci/pci_common.c | 53 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 53 insertions(+) diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c index 759ccc3..b8f3244 100644 --- a/drivers/bus/pci/pci_common.c +++ b/drivers/bus/pci/pci_common.c @@ -405,6 +405,36 @@ pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, return NULL; } +/** + * find the device which encounter the failure, by iterate all device on + * PCI bus to check if the memory failure address is located in the range + * of the BARs of any device. 
+ */ +static struct rte_pci_device * +pci_find_device_by_addr(const void *failure_addr) +{ + struct rte_pci_device *pdev = NULL; + int i; + + FOREACH_DEVICE_ON_PCIBUS(pdev) { + for (i = 0; i != RTE_DIM(pdev->mem_resource); i++) { + if ((uint64_t)(uintptr_t)failure_addr >= + (uint64_t)(uintptr_t)pdev->mem_resource[i].addr && + (uint64_t)(uintptr_t)failure_addr < + (uint64_t)(uintptr_t)pdev->mem_resource[i].addr + + pdev->mem_resource[i].len) { + RTE_LOG(INFO, EAL, "Failure address " + "%16.16"PRIx64" belongs to " + "device %s!\n", + (uint64_t)(uintptr_t)failure_addr, + pdev->device.name); + return pdev; + } + } + } + return NULL; +} + static int pci_memory_failure_handler(struct rte_device *dev) { @@ -433,6 +463,28 @@ pci_memory_failure_handler(struct rte_device *dev) } static int +pci_sigbus_handler(const void *failure_addr) +{ + struct rte_pci_device *pdev = NULL; + int ret = 0; + + pdev = pci_find_device_by_addr(failure_addr); + if (!pdev) { + /* It is a generic sigbus error, no bus would handle it. */ + ret = 1; + } else { + /* The sigbus error is caused of hotplug-out. 
*/ + ret = pci_memory_failure_handler(&pdev->device); + if (ret) { + RTE_LOG(ERR, EAL, "Failed to handle failure for " + "device %s", pdev->name); + ret = -1; + } + } + return ret; +} + +static int pci_plug(struct rte_device *dev) { return pci_probe_all_drivers(RTE_DEV_TO_PCI(dev)); @@ -463,6 +515,7 @@ struct rte_pci_bus rte_pci_bus = { .parse = pci_parse, .get_iommu_class = rte_pci_get_iommu_class, .memory_failure_handler = pci_memory_failure_handler, + .sigbus_handler = pci_sigbus_handler, }, .device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list), .driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list), From patchwork Fri Aug 17 10:48:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43766 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 090F84C8C; Fri, 17 Aug 2018 12:51:36 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id DCB934C95 for ; Fri, 17 Aug 2018 12:51:34 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131855" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:31 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, 
shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:32 +0800 Message-Id: <1534502916-31636-6-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 5/8] bus: add helper to handle sigbus X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch aims to add a helper to iterate through all buses to find the relevant bus to handle the sigbus error. Signed-off-by: Jeff Guo Acked-by: Shaopeng He --- v10->v9: refine commit log --- lib/librte_eal/common/eal_common_bus.c | 43 ++++++++++++++++++++++++++++++++++ lib/librte_eal/common/eal_private.h | 12 ++++++++++ 2 files changed, 55 insertions(+) diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c index 0943851..62b7318 100644 --- a/lib/librte_eal/common/eal_common_bus.c +++ b/lib/librte_eal/common/eal_common_bus.c @@ -37,6 +37,7 @@ #include #include #include +#include #include "eal_private.h" @@ -242,3 +243,45 @@ rte_bus_get_iommu_class(void) } return mode; } + +static int +bus_handle_sigbus(const struct rte_bus *bus, + const void *failure_addr) +{ + int ret; + + if (!bus->sigbus_handler) + return -1; + + ret = bus->sigbus_handler(failure_addr); + + /* find bus but handle failed, keep the errno be set. 
*/ + if (ret < 0 && rte_errno == 0) + rte_errno = ENOTSUP; + + return ret > 0; +} + +int +rte_bus_sigbus_handler(const void *failure_addr) +{ + struct rte_bus *bus; + + int ret = 0; + int old_errno = rte_errno; + + rte_errno = 0; + + bus = rte_bus_find(NULL, bus_handle_sigbus, failure_addr); + /* can not find bus. */ + if (!bus) + return 1; + /* find bus but handle failed, pass on the new errno. */ + else if (rte_errno != 0) + return -1; + + /* restore the old errno. */ + rte_errno = old_errno; + + return ret; +} diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h index 4f809a8..168430e 100644 --- a/lib/librte_eal/common/eal_private.h +++ b/lib/librte_eal/common/eal_private.h @@ -304,4 +304,16 @@ int rte_devargs_layers_parse(struct rte_devargs *devargs, const char *devstr); +/** + * Iterate all buses to find the corresponding bus to handle the sigbus error. + * @param failure_addr + * Pointer of the fault address of the sigbus error. + * + * @return + * 0 success to handle the sigbus. 
+ * -1 failed to handle the sigbus + * 1 no bus can handler the sigbus + */ +int rte_bus_sigbus_handler(const void *failure_addr); + #endif /* _EAL_PRIVATE_H_ */ From patchwork Fri Aug 17 10:48:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43768 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id CF2A74C57; Fri, 17 Aug 2018 12:51:42 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id B39BE4CBB for ; Fri, 17 Aug 2018 12:51:40 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131875" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:37 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:34 +0800 Message-Id: <1534502916-31636-8-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: 
<1498711073-42917-1-git-send-email-jia.guo@intel.com> <1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 6/8] eal: add failure handle mechanism for hotplug X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The mechanism can initially register the sigbus handler after the device event monitor is enabled. When a sigbus error is captured, the mechanism will check the failure address and accordingly remap the invalid memory for the corresponding device. This can prevent the application from crashing when a device is hotplugged out. With this patch, users can call the newly added APIs below to enable or disable the device failure handling mechanism: - rte_dev_hotplug_handle_enable - rte_dev_hotplug_handle_disable Signed-off-by: Jeff Guo --- v10->v9: add new APIs to enable/disable hotplug handling --- doc/guides/rel_notes/release_18_08.rst | 5 + lib/librte_eal/bsdapp/eal/eal_dev.c | 14 +++ lib/librte_eal/common/eal_private.h | 26 ++++++ lib/librte_eal/common/include/rte_dev.h | 26 ++++++ lib/librte_eal/linuxapp/eal/eal_dev.c | 159 +++++++++++++++++++++++++++++++- lib/librte_eal/rte_eal_version.map | 2 + 6 files changed, 231 insertions(+), 1 deletion(-) diff --git a/doc/guides/rel_notes/release_18_08.rst b/doc/guides/rel_notes/release_18_08.rst index 321fa84..95dc1e0 100644 --- a/doc/guides/rel_notes/release_18_08.rst +++ b/doc/guides/rel_notes/release_18_08.rst @@ -117,6 +117,11 @@ New Features Added support for chained mbufs (input and output). +* **Added failure handle mechanism for hotplug.** + + ``rte_dev_hotplug_handle_enable`` and ``rte_dev_hotplug_handle_disable`` are + for enable or disable failure handle mechanism for hotplug. 
+ API Changes ----------- diff --git a/lib/librte_eal/bsdapp/eal/eal_dev.c b/lib/librte_eal/bsdapp/eal/eal_dev.c index 1c6c51b..ae1c558 100644 --- a/lib/librte_eal/bsdapp/eal/eal_dev.c +++ b/lib/librte_eal/bsdapp/eal/eal_dev.c @@ -19,3 +19,17 @@ rte_dev_event_monitor_stop(void) RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n"); return -1; } + +int +rte_dev_hotplug_handle_enable(void) +{ + RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n"); + return -1; +} + +int +rte_dev_hotplug_handle_disable(void) +{ + RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n"); + return -1; +} diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h index 168430e..3cf0357 100644 --- a/lib/librte_eal/common/eal_private.h +++ b/lib/librte_eal/common/eal_private.h @@ -316,4 +316,30 @@ rte_devargs_layers_parse(struct rte_devargs *devargs, */ int rte_bus_sigbus_handler(const void *failure_addr); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Register the sigbus hander. + * + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int +rte_dev_sigbus_handler_register(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Unregister the sigbus hander. + * + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int +rte_dev_sigbus_handler_unregister(void); + #endif /* _EAL_PRIVATE_H_ */ diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h index b80a805..ff580a0 100644 --- a/lib/librte_eal/common/include/rte_dev.h +++ b/lib/librte_eal/common/include/rte_dev.h @@ -460,4 +460,30 @@ rte_dev_event_monitor_start(void); int __rte_experimental rte_dev_event_monitor_stop(void); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Enable hotplug handling for devices. + * + * @return + * - On success, zero. 
+ * - On failure, a negative value. + */ +int __rte_experimental +rte_dev_hotplug_handle_enable(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Disable hotplug handling for devices. + * + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int __rte_experimental +rte_dev_hotplug_handle_disable(void); + #endif /* _RTE_DEV_H_ */ diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c index 1cf6aeb..fa5cb9b 100644 --- a/lib/librte_eal/linuxapp/eal/eal_dev.c +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c @@ -4,6 +4,8 @@ #include #include +#include +#include #include #include @@ -14,15 +16,31 @@ #include #include #include +#include +#include +#include +#include #include "eal_private.h" static struct rte_intr_handle intr_handle = {.fd = -1 }; static bool monitor_started; +static bool hotplug_handle; #define EAL_UEV_MSG_LEN 4096 #define EAL_UEV_MSG_ELEM_LEN 128 +/* + * spinlock for device failure handle, if try to access bus or device, + * such as handle sigbus on bus or handle memory failure for device just use + * this lock. It could protect the bus and the device to avoid race condition. + */ +static rte_spinlock_t failure_handle_lock = RTE_SPINLOCK_INITIALIZER; + +static struct sigaction sigbus_action_old; + +static int sigbus_need_recover; + static void dev_uev_handler(__rte_unused void *param); /* identify the system layer which reports this event. 
*/ @@ -33,6 +51,49 @@ enum eal_dev_event_subsystem { EAL_DEV_EVENT_SUBSYSTEM_MAX }; +static void +sigbus_action_recover(void) +{ + if (sigbus_need_recover) { + sigaction(SIGBUS, &sigbus_action_old, NULL); + sigbus_need_recover = 0; + } +} + +static void sigbus_handler(int signum, siginfo_t *info, + void *ctx __rte_unused) +{ + int ret; + + RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n", + (int)pthread_self(), info->si_addr); + + rte_spinlock_lock(&failure_handle_lock); + ret = rte_bus_sigbus_handler(info->si_addr); + rte_spinlock_unlock(&failure_handle_lock); + if (ret == -1) { + rte_exit(EXIT_FAILURE, + "Failed to handle SIGBUS for hotplug, " + "(rte_errno: %s)!", strerror(rte_errno)); + } else if (ret == 1) { + if (sigbus_action_old.sa_handler) + (*(sigbus_action_old.sa_handler))(signum); + else + rte_exit(EXIT_FAILURE, + "Failed to handle generic SIGBUS!"); + } + + RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hotplug!\n"); +} + +static int cmp_dev_name(const struct rte_device *dev, + const void *_name) +{ + const char *name = _name; + + return strcmp(dev->name, name); +} + static int dev_uev_socket_fd_create(void) { @@ -147,6 +208,9 @@ dev_uev_handler(__rte_unused void *param) struct rte_dev_event uevent; int ret; char buf[EAL_UEV_MSG_LEN]; + struct rte_bus *bus; + struct rte_device *dev; + const char *busname = ""; memset(&uevent, 0, sizeof(struct rte_dev_event)); memset(buf, 0, EAL_UEV_MSG_LEN); @@ -171,8 +235,43 @@ dev_uev_handler(__rte_unused void *param) RTE_LOG(DEBUG, EAL, "receive uevent(name:%s, type:%d, subsystem:%d)\n", uevent.devname, uevent.type, uevent.subsystem); - if (uevent.devname) + switch (uevent.subsystem) { + case EAL_DEV_EVENT_SUBSYSTEM_PCI: + case EAL_DEV_EVENT_SUBSYSTEM_UIO: + busname = "pci"; + break; + default: + break; + } + + if (uevent.devname) { + if (uevent.type == RTE_DEV_EVENT_REMOVE && hotplug_handle) { + rte_spinlock_lock(&failure_handle_lock); + bus = rte_bus_find_by_name(busname); + if (bus == NULL) { + 
RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n", + busname); + return; + } + + dev = bus->find_device(NULL, cmp_dev_name, + uevent.devname); + if (dev == NULL) { + RTE_LOG(ERR, EAL, "Cannot find device (%s) on " + "bus (%s)\n", uevent.devname, busname); + return; + } + + ret = bus->memory_failure_handler(dev); + rte_spinlock_unlock(&failure_handle_lock); + if (ret) { + RTE_LOG(ERR, EAL, "Can not handle hotplug for " + "device (%s)\n", dev->name); + return; + } + } dev_callback_process(uevent.devname, uevent.type); + } } int __rte_experimental @@ -220,5 +319,63 @@ rte_dev_event_monitor_stop(void) close(intr_handle.fd); intr_handle.fd = -1; monitor_started = false; + return 0; } + +int +rte_dev_sigbus_handler_register(void) +{ + sigset_t mask; + struct sigaction action; + + rte_errno = 0; + + sigemptyset(&mask); + sigaddset(&mask, SIGBUS); + action.sa_flags = SA_SIGINFO; + action.sa_mask = mask; + action.sa_sigaction = sigbus_handler; + sigbus_need_recover = !sigaction(SIGBUS, &action, &sigbus_action_old); + + return rte_errno; +} + +int +rte_dev_sigbus_handler_unregister(void) +{ + rte_errno = 0; + sigbus_need_recover = 1; + + sigbus_action_recover(); + + return rte_errno; +} + +int +rte_dev_hotplug_handle_enable(void) +{ + int ret = 0; + + ret = rte_dev_sigbus_handler_register(); + if (ret < 0) + RTE_LOG(ERR, EAL, "fail to register sigbus handler for devices.\n"); + + hotplug_handle = true; + + return ret; +} + +int +rte_dev_hotplug_handle_disable(void) +{ + int ret = 0; + + ret = rte_dev_sigbus_handler_unregister(); + if (ret < 0) + RTE_LOG(ERR, EAL, "fail to unregister sigbus handler for devices.\n"); + + hotplug_handle = false; + + return ret; +} diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index 344a43d..996e709 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -274,6 +274,8 @@ EXPERIMENTAL { rte_dev_event_callback_unregister; rte_dev_event_monitor_start; rte_dev_event_monitor_stop; + 
rte_dev_hotplug_handle_enable; + rte_dev_hotplug_handle_disable; rte_dev_iterator_init; rte_dev_iterator_next; rte_devargs_add; From patchwork Fri Aug 17 10:48:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43769 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 4B99B5323; Fri, 17 Aug 2018 12:51:47 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id ADCCB4CE4 for ; Fri, 17 Aug 2018 12:51:43 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131882" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:40 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:35 +0800 Message-Id: <1534502916-31636-9-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> 
<1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 7/8] igb_uio: fix unexpected remove issue for hotplug X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev"

When a device is hotplugged out, the kernel releases its PCI resources, the UIO file descriptor disappears and the irq is released. If the igb_uio driver then tries to access or release these resources, the kernel crashes.

In addition, when a device is hotplugged out, uio_remove is called unexpectedly before uio_release, and the uio_remove procedure frees resources that uio_release still requires. This later breaks interrupt handling, because the interrupt can only be disabled in uio_release.

To prevent this, the hotplug removal needs to be identified and handled accordingly in the igb_uio driver.

This patch adds enum rte_udev_state to the rte_uio_pci_dev struct. It stores the state of the uio device as one of the following: probed/opened/released/removed.

This patch also checks the kobject's state_remove_uevent_sent flag to detect whether the removal is a hotplug-out. Once a hotplug-out is detected, the driver calls uio_release and sets the uio state to "removed". After that, uio_release checks the state: if the uio device has already been removed, it only frees the remaining uio resources.

Signed-off-by: Jeff Guo
Acked-by: Shaopeng He
---
v10->v9: refine commit log.
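The ordering described in the commit log (remove arriving before release) can be illustrated with a small userspace sketch. This is not the driver code: the `fake_*` names, the `irq_enabled` and `freed` flags, and the `remove_uevent_sent` parameter are hypothetical stand-ins chosen for illustration, and locking and reference counting are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative copies of the four states named in the commit log. */
enum fake_udev_state {
	FAKE_UDEV_PROBED,
	FAKE_UDEV_OPENED,
	FAKE_UDEV_RELEASED,
	FAKE_UDEV_REMOVED,
};

/* Hypothetical stand-in for the driver's private device structure. */
struct fake_udev {
	enum fake_udev_state state;
	bool irq_enabled; /* models the interrupt enabled at open time */
	bool freed;       /* models the final resource teardown */
};

/* Open: enable the interrupt and record the opened state. */
static void fake_open(struct fake_udev *u)
{
	assert(u != NULL);
	u->irq_enabled = true;
	u->state = FAKE_UDEV_OPENED;
}

/*
 * Release: on the normal path, disable the interrupt. If the device was
 * already removed by a hotplug-out, only free what is left.
 */
static void fake_release(struct fake_udev *u)
{
	assert(u != NULL);
	if (u->state == FAKE_UDEV_REMOVED) {
		u->freed = true;
		return;
	}
	u->irq_enabled = false;
	u->state = FAKE_UDEV_RELEASED;
}

/*
 * Remove: if the device is still open, or the removal uevent was already
 * sent, treat it as a hotplug-out. Run the release step first, mark the
 * device removed, and defer the final free to the later release call.
 */
static void fake_remove(struct fake_udev *u, bool remove_uevent_sent)
{
	assert(u != NULL);
	if (u->state == FAKE_UDEV_OPENED || remove_uevent_sent) {
		fake_release(u);
		u->state = FAKE_UDEV_REMOVED;
		return;
	}
	u->freed = true; /* orderly removal: tear down immediately */
}
```

In this model, a hotplug-out of an opened device leaves it in the removed state with the interrupt disabled and the structure still allocated; the final free happens only when the application later closes the file descriptor and release runs again.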
--- kernel/linux/igb_uio/igb_uio.c | 69 +++++++++++++++++++++++++++++++++--------- 1 file changed, 55 insertions(+), 14 deletions(-) diff --git a/kernel/linux/igb_uio/igb_uio.c b/kernel/linux/igb_uio/igb_uio.c index 3398eac..d126371 100644 --- a/kernel/linux/igb_uio/igb_uio.c +++ b/kernel/linux/igb_uio/igb_uio.c @@ -19,6 +19,14 @@ #include "compat.h" +/* uio pci device state */ +enum rte_udev_state { + RTE_UDEV_PROBED, + RTE_UDEV_OPENNED, + RTE_UDEV_RELEASED, + RTE_UDEV_REMOVED, +}; + /** * A structure describing the private information for a uio device. */ @@ -28,6 +36,7 @@ struct rte_uio_pci_dev { enum rte_intr_mode mode; struct mutex lock; int refcnt; + enum rte_udev_state state; }; static int wc_activate; @@ -309,6 +318,17 @@ igbuio_pci_disable_interrupts(struct rte_uio_pci_dev *udev) #endif } +/* Unmap previously ioremap'd resources */ +static void +igbuio_pci_release_iomem(struct uio_info *info) +{ + int i; + + for (i = 0; i < MAX_UIO_MAPS; i++) { + if (info->mem[i].internal_addr) + iounmap(info->mem[i].internal_addr); + } +} /** * This gets called while opening uio device file. @@ -331,20 +351,35 @@ igbuio_pci_open(struct uio_info *info, struct inode *inode) /* enable interrupts */ err = igbuio_pci_enable_interrupts(udev); - mutex_unlock(&udev->lock); if (err) { dev_err(&dev->dev, "Enable interrupt fails\n"); + pci_clear_master(dev); + mutex_unlock(&udev->lock); return err; } + udev->state = RTE_UDEV_OPENNED; + mutex_unlock(&udev->lock); return 0; } +/** + * This gets called while closing uio device file. 
+ */ static int igbuio_pci_release(struct uio_info *info, struct inode *inode) { struct rte_uio_pci_dev *udev = info->priv; struct pci_dev *dev = udev->pdev; + if (udev->state == RTE_UDEV_REMOVED) { + mutex_destroy(&udev->lock); + igbuio_pci_release_iomem(&udev->info); + pci_disable_device(dev); + pci_set_drvdata(dev, NULL); + kfree(udev); + return 0; + } + mutex_lock(&udev->lock); if (--udev->refcnt > 0) { mutex_unlock(&udev->lock); @@ -356,7 +391,7 @@ igbuio_pci_release(struct uio_info *info, struct inode *inode) /* stop the device from further DMA */ pci_clear_master(dev); - + udev->state = RTE_UDEV_RELEASED; mutex_unlock(&udev->lock); return 0; } @@ -414,18 +449,6 @@ igbuio_pci_setup_ioport(struct pci_dev *dev, struct uio_info *info, return 0; } -/* Unmap previously ioremap'd resources */ -static void -igbuio_pci_release_iomem(struct uio_info *info) -{ - int i; - - for (i = 0; i < MAX_UIO_MAPS; i++) { - if (info->mem[i].internal_addr) - iounmap(info->mem[i].internal_addr); - } -} - static int igbuio_setup_bars(struct pci_dev *dev, struct uio_info *info) { @@ -562,6 +585,9 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) (unsigned long long)map_dma_addr, map_addr); } + mutex_lock(&udev->lock); + udev->state = RTE_UDEV_PROBED; + mutex_unlock(&udev->lock); return 0; fail_remove_group: @@ -579,6 +605,21 @@ static void igbuio_pci_remove(struct pci_dev *dev) { struct rte_uio_pci_dev *udev = pci_get_drvdata(dev); + struct pci_dev *pdev = udev->pdev; + int ret; + + /* handle unexpected removal */ + if (udev->state == RTE_UDEV_OPENNED || + (&pdev->dev.kobj)->state_remove_uevent_sent == 1) { + dev_notice(&dev->dev, "Unexpected removal!\n"); + ret = igbuio_pci_release(&udev->info, NULL); + if (ret) + return; + mutex_lock(&udev->lock); + udev->state = RTE_UDEV_REMOVED; + mutex_unlock(&udev->lock); + return; + } mutex_destroy(&udev->lock); sysfs_remove_group(&dev->dev.kobj, &dev_attr_grp); From patchwork Fri Aug 17 10:48:36 2018 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Jia" X-Patchwork-Id: 43770 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 9E66058C3; Fri, 17 Aug 2018 12:51:50 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 9A8584C72 for ; Fri, 17 Aug 2018 12:51:46 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Aug 2018 03:51:46 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,250,1531810800"; d="scan'208";a="63131886" Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain) ([10.67.104.10]) by fmsmga007.fm.intel.com with ESMTP; 17 Aug 2018 03:51:43 -0700 From: Jeff Guo To: stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, jia.guo@intel.com, helin.zhang@intel.com Date: Fri, 17 Aug 2018 18:48:36 +0800 Message-Id: <1534502916-31636-10-git-send-email-jia.guo@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1534502916-31636-1-git-send-email-jia.guo@intel.com> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1534502916-31636-1-git-send-email-jia.guo@intel.com> Subject: [dpdk-dev] [PATCH v10 8/8] testpmd: use hotplug failure handle mechanism X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: 
DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch uses testpmd as an example to show how an application can use the hotplug failure handling mechanism. Signed-off-by: Jeff Guo --- v10->v9: use new APIs to manage hotplug handling. --- app/test-pmd/testpmd.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index ee48db2..12fc497 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -2098,14 +2098,22 @@ pmd_test_exit(void) if (hot_plug) { ret = rte_dev_event_monitor_stop(); - if (ret) + if (ret) { RTE_LOG(ERR, EAL, "fail to stop device event monitor."); + return; + } ret = eth_dev_event_callback_unregister(); if (ret) + return; + + ret = rte_dev_hotplug_handle_disable(); + if (ret) { RTE_LOG(ERR, EAL, - "fail to unregister all event callbacks."); + "fail to disable hotplug handling."); + return; + } } printf("\nBye...\n"); @@ -2784,14 +2792,23 @@ main(int argc, char** argv) init_config(); if (hot_plug) { - /* enable hot plug monitoring */ + ret = rte_dev_hotplug_handle_enable(); + if (ret) { + RTE_LOG(ERR, EAL, + "fail to enable hotplug handling."); + return -1; + } + ret = rte_dev_event_monitor_start(); if (ret) { - rte_errno = EINVAL; + RTE_LOG(ERR, EAL, + "fail to start device event monitoring."); return -1; } - eth_dev_event_callback_register(); + ret = eth_dev_event_callback_register(); + if (ret) + return -1; } if (start_port(RTE_PORT_ALL) != 0)