[v2,1/4] bus/pci: handle device hot unplug

Message ID 1529668268-7462-2-git-send-email-jia.guo@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series hot plug failure handle mechanism |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation success Compilation OK

Commit Message

Guo, Jia June 22, 2018, 11:51 a.m. UTC
  When a hardware device is removed physically or the software disables
it, the hot unplug occur. App need to call ether dev API to detach the
device, to unplug the device at the bus level and make access to the device
invalid. But the problem is that, the removal of the device from the
software lists is not going to be instantaneous, at this time if the data
path still read/write the device, it will cause MMIO error and result of
the app crash out. So a hot unplug handle mechanism need to guaranty app
will not crash out when hot unplug device.

To handle device hot unplug is bus-specific behavior, this patch introduces
a bus ops so that each kind of bus can implement its own logic. Further,
this patch implements the ops for PCI bus: remap a dummy memory to avoid
bus read/write error.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
v2->v1(v21):
refind commit log
---
 drivers/bus/pci/pci_common.c            | 65 +++++++++++++++++++++++++++++++++
 drivers/bus/pci/pci_common_uio.c        | 33 +++++++++++++++++
 drivers/bus/pci/private.h               | 12 ++++++
 lib/librte_eal/common/include/rte_bus.h | 16 ++++++++
 4 files changed, 126 insertions(+)
  

Comments

Gaëtan Rivet June 22, 2018, 12:59 p.m. UTC | #1
Hi Jeff,

Sorry, I followed this development from afar,
I have a remark regarding this API, I think it can be made simpler.
Details below.

On Fri, Jun 22, 2018 at 07:51:05PM +0800, Jeff Guo wrote:
> When a hardware device is removed physically or the software disables
> it, the hot unplug occur. App need to call ether dev API to detach the
> device, to unplug the device at the bus level and make access to the device
> invalid. But the problem is that, the removal of the device from the
> software lists is not going to be instantaneous, at this time if the data
> path still read/write the device, it will cause MMIO error and result of
> the app crash out. So a hot unplug handle mechanism need to guaranty app
> will not crash out when hot unplug device.
> 
> To handle device hot unplug is bus-specific behavior, this patch introduces
> a bus ops so that each kind of bus can implement its own logic. Further,
> this patch implements the ops for PCI bus: remap a dummy memory to avoid
> bus read/write error.
> 
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> ---
> v2->v1(v21):
> refind commit log
> ---
>  drivers/bus/pci/pci_common.c            | 65 +++++++++++++++++++++++++++++++++
>  drivers/bus/pci/pci_common_uio.c        | 33 +++++++++++++++++
>  drivers/bus/pci/private.h               | 12 ++++++
>  lib/librte_eal/common/include/rte_bus.h | 16 ++++++++
>  4 files changed, 126 insertions(+)
> 
> diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
> index 7215aae..74d9aa8 100644
> --- a/drivers/bus/pci/pci_common.c
> +++ b/drivers/bus/pci/pci_common.c
> @@ -472,6 +472,70 @@ pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
>  	return NULL;
>  }
>  
> +static struct rte_pci_device *
> +pci_find_device_by_addr(void *failure_addr)
> +{
> +	struct rte_pci_device *pdev = NULL;
> +	int i;
> +
> +	FOREACH_DEVICE_ON_PCIBUS(pdev) {
> +		for (i = 0; i != RTE_DIM(pdev->mem_resource); i++) {
> +			if ((uint64_t)(uintptr_t)failure_addr >=
> +			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr &&
> +			    (uint64_t)(uintptr_t)failure_addr <
> +			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr +
> +			    pdev->mem_resource[i].len) {
> +				RTE_LOG(ERR, EAL, "Failure address "
> +					"%16.16"PRIx64" belongs to "
> +					"device %s!\n",
> +					(uint64_t)(uintptr_t)failure_addr,
> +					pdev->device.name);
> +				return pdev;
> +			}
> +		}
> +	}
> +	return NULL;
> +}

You define here a new bus ops that takes either an rte_device or an
arbitrary address as input.

In the uev handler that would call this ops afterward, you similarly try
to find either a bus using the device name, or then iterate over all
buses and try to find one able to handle the error.

This seems redundant and prone to ambiguity: should one check that the
device address is actually linked with the provided address? If not, is
it an improper call or a special case? This is unclear.

Note: I haven't followed the previous discussion, maybe the
      dual dev_addr + failure_addr is warranted in the API here,
      if so why not. Otherwise it just seems redundant:
      the dev addr will never be within a physical BAR mapping,
      and for all buses / drivers not using physical mappings,
      the addr is only meaningful as a cue to find an internal
      resource.

You can use the bus->find_device() to iterate over buses, and design
your bus ops such that when provided with an addr, would do whatever it
needs internally to find a relevant resource and either handle the
error, or return that the error was not handled.

Something like that:

/* new bus ops: */
/* If generic error codes are defined as part of the API,
   >0 should mean that the sigbus was not handled,
   <0 that an error occured but that one should stop trying,
   0 that everything is ok.
 */
int (*handle_sigbus)(void *addr);

/* new rte_bus API: */

static int
bus_handle_sigbus(const struct rte_bus *bus,
                  const void *addr)
{
        /* If additional error codes are defined as part of the API,
           negative values should stop the iteration.
           In this case, rte_errno would need to be set as well.
         */
        return !(bus->handle_sigbus && bus->handle_sigbus(addr) <= 0);
}

int
rte_bus_sigbus_handler(void *addr)
{
        struct rte_bus *bus;
        int old_errno = rte_errno;

        rte_errno = 0;
        bus = rte_bus_find(NULL, bus_handle_sigbus, addr);
        if (bus == NULL) {
                /* ERROR: no bus could handle the error. */
                RTE_LOG(ERR, EAL, "No bus was able to handle the error");
                return -1;
        } else if {rte_errno != 0) {
                /* ERROR: a generic sigbus handling error. */
                RTE_LOG(ERR, EAL, "Say what the error is");
                return -1;
        }
        rte_errno = old_errno;
        return 0;
}

Which would afterward be implemented, for example in PCI bus:

static rte_pci_device *
pci_find_device_by_addr(void *addr)
{
        struct rte_pci_device *pdev;

        FOREACH_DEVICE_ON_PCIBUS(pdev)
                if (&pdev->device == addr ||
                    /* addr within mappings of pdev */)
                        return pdev;
        return NULL;
}

static int
pci_handle_sigbus(void *addr)
{
        static rte_pci_device *pdev;

        pdev = pci_find_device_by_addr(addr);
        if (pdev == NULL)
                return -1;
        /* Here, handle uio remap if needed. */
}

-------------------------

- Leave the bus iteration and check within rte_bus. Centralize the call
  in a tighter rte_bus API, do not use directly your OPS from your other
  EAL facility.

- You have left several error messages for signaling success (!), or
  even simply that a bus could not handle a specific address. This is
  bad. An error should only appear on error. Otherwise, all of this can
  easily be traced using a debugger, so I don't think it's necessary
  to leave it at a DEBUG level.

  But in any case, remove all ERR-level messages for success.

<...>

> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
> index eb9eded..6a5609f 100644
> --- a/lib/librte_eal/common/include/rte_bus.h
> +++ b/lib/librte_eal/common/include/rte_bus.h
> @@ -168,6 +168,20 @@ typedef int (*rte_bus_unplug_t)(struct rte_device *dev);
>  typedef int (*rte_bus_parse_t)(const char *name, void *addr);
>  
>  /**
> + * Implementation a specific hot unplug handler, which is responsible
> + * for handle the failure when hot unplug the device, guaranty the system
> + * would not hung in the case.
> + * @param dev
> + *	Pointer of the device structure.
> + *
> + * @return
> + *	0 on success.
> + *	!0 on error.
> + */
> +typedef int (*rte_bus_handle_hot_unplug_t)(struct rte_device *dev,
> +						void *dev_addr);
> +

I don't like the name of the OPS.
The documentation evokes only "the failure".
So is it a handle for any and all error possibly happening to a device?
If so, where is the input to describe the error?

If it is only meant to handle SIGBUS, because it is a very specific
error state only meant to happen on certain parts of the bus (the queue
mappings, if relevant), then it makes sense to only have an arbitrary
address as context for handling.

But then, it needs to be called as such. The expected failure to be
handled should be explicit in the name of the ops, and the documentation
should be more precise about what a bus developper should do with the
input.

> +/**
>   * Bus scan policies
>   */
>  enum rte_bus_scan_mode {
> @@ -209,6 +223,8 @@ struct rte_bus {
>  	rte_bus_plug_t plug;         /**< Probe single device for drivers */
>  	rte_bus_unplug_t unplug;     /**< Remove single device from driver */
>  	rte_bus_parse_t parse;       /**< Parse a device name */
> +	rte_bus_handle_hot_unplug_t handle_hot_unplug; /**< handle hot unplug
> +							device event */

The new ops should be added at the end of the structure.

Regards,
  
Guo, Jia June 26, 2018, 3:30 p.m. UTC | #2
hi, gaetan,

thanks for your review, see comment as bellow


On 6/22/2018 8:59 PM, Gaëtan Rivet wrote:
> Hi Jeff,
>
> Sorry, I followed this development from afar,
> I have a remark regarding this API, I think it can be made simpler.
> Details below.
>
> On Fri, Jun 22, 2018 at 07:51:05PM +0800, Jeff Guo wrote:
>> When a hardware device is removed physically or the software disables
>> it, the hot unplug occur. App need to call ether dev API to detach the
>> device, to unplug the device at the bus level and make access to the device
>> invalid. But the problem is that, the removal of the device from the
>> software lists is not going to be instantaneous, at this time if the data
>> path still read/write the device, it will cause MMIO error and result of
>> the app crash out. So a hot unplug handle mechanism need to guaranty app
>> will not crash out when hot unplug device.
>>
>> To handle device hot unplug is bus-specific behavior, this patch introduces
>> a bus ops so that each kind of bus can implement its own logic. Further,
>> this patch implements the ops for PCI bus: remap a dummy memory to avoid
>> bus read/write error.
>>
>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>> ---
>> v2->v1(v21):
>> refind commit log
>> ---
>>   drivers/bus/pci/pci_common.c            | 65 +++++++++++++++++++++++++++++++++
>>   drivers/bus/pci/pci_common_uio.c        | 33 +++++++++++++++++
>>   drivers/bus/pci/private.h               | 12 ++++++
>>   lib/librte_eal/common/include/rte_bus.h | 16 ++++++++
>>   4 files changed, 126 insertions(+)
>>
>> diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
>> index 7215aae..74d9aa8 100644
>> --- a/drivers/bus/pci/pci_common.c
>> +++ b/drivers/bus/pci/pci_common.c
>> @@ -472,6 +472,70 @@ pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
>>   	return NULL;
>>   }
>>   
>> +static struct rte_pci_device *
>> +pci_find_device_by_addr(void *failure_addr)
>> +{
>> +	struct rte_pci_device *pdev = NULL;
>> +	int i;
>> +
>> +	FOREACH_DEVICE_ON_PCIBUS(pdev) {
>> +		for (i = 0; i != RTE_DIM(pdev->mem_resource); i++) {
>> +			if ((uint64_t)(uintptr_t)failure_addr >=
>> +			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr &&
>> +			    (uint64_t)(uintptr_t)failure_addr <
>> +			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr +
>> +			    pdev->mem_resource[i].len) {
>> +				RTE_LOG(ERR, EAL, "Failure address "
>> +					"%16.16"PRIx64" belongs to "
>> +					"device %s!\n",
>> +					(uint64_t)(uintptr_t)failure_addr,
>> +					pdev->device.name);
>> +				return pdev;
>> +			}
>> +		}
>> +	}
>> +	return NULL;
>> +}
> You define here a new bus ops that takes either an rte_device or an
> arbitrary address as input.
>
> In the uev handler that would call this ops afterward, you similarly try
> to find either a bus using the device name, or then iterate over all
> buses and try to find one able to handle the error.
>
> This seems redundant and prone to ambiguity: should one check that the
> device address is actually linked with the provided address? If not, is
> it an improper call or a special case? This is unclear.
>
> Note: I haven't followed the previous discussion, maybe the
>        dual dev_addr + failure_addr is warranted in the API here,
>        if so why not. Otherwise it just seems redundant:
>        the dev addr will never be within a physical BAR mapping,
>        and for all buses / drivers not using physical mappings,
>        the addr is only meaningful as a cue to find an internal
>        resource.
>
> You can use the bus->find_device() to iterate over buses, and design
> your bus ops such that when provided with an addr, would do whatever it
> needs internally to find a relevant resource and either handle the
> error, or return that the error was not handled.

if bus->find_device() can make think more simpler, why not? i will check 
that and modify it.

> Something like that:
>
> /* new bus ops: */
> /* If generic error codes are defined as part of the API,
>     >0 should mean that the sigbus was not handled,
>     <0 that an error occured but that one should stop trying,
>     0 that everything is ok.
>   */
> int (*handle_sigbus)(void *addr);
>
> /* new rte_bus API: */
>
> static int
> bus_handle_sigbus(const struct rte_bus *bus,
>                    const void *addr)
> {
>          /* If additional error codes are defined as part of the API,
>             negative values should stop the iteration.
>             In this case, rte_errno would need to be set as well.
>           */
>          return !(bus->handle_sigbus && bus->handle_sigbus(addr) <= 0);
> }
>
> int
> rte_bus_sigbus_handler(void *addr)
> {
>          struct rte_bus *bus;
>          int old_errno = rte_errno;
>
>          rte_errno = 0;
>          bus = rte_bus_find(NULL, bus_handle_sigbus, addr);
>          if (bus == NULL) {
>                  /* ERROR: no bus could handle the error. */
>                  RTE_LOG(ERR, EAL, "No bus was able to handle the error");
>                  return -1;
>          } else if {rte_errno != 0) {
>                  /* ERROR: a generic sigbus handling error. */
>                  RTE_LOG(ERR, EAL, "Say what the error is");
>                  return -1;
>          }
>          rte_errno = old_errno;
>          return 0;
> }
>
> Which would afterward be implemented, for example in PCI bus:
>
> static rte_pci_device *
> pci_find_device_by_addr(void *addr)
> {
>          struct rte_pci_device *pdev;
>
>          FOREACH_DEVICE_ON_PCIBUS(pdev)
>                  if (&pdev->device == addr ||
>                      /* addr within mappings of pdev */)
>                          return pdev;
>          return NULL;
> }
>
> static int
> pci_handle_sigbus(void *addr)
> {
>          static rte_pci_device *pdev;
>
>          pdev = pci_find_device_by_addr(addr);
>          if (pdev == NULL)
>                  return -1;
>          /* Here, handle uio remap if needed. */
> }
>
> -------------------------
>
> - Leave the bus iteration and check within rte_bus. Centralize the call
>    in a tighter rte_bus API, do not use directly your OPS from your other
>    EAL facility.
>
> - You have left several error messages for signaling success (!), or
>    even simply that a bus could not handle a specific address. This is
>    bad. An error should only appear on error. Otherwise, all of this can
>    easily be traced using a debugger, so I don't think it's necessary
>    to leave it at a DEBUG level.
>
>    But in any case, remove all ERR-level messages for success.

make sense. i will check which debug message is no need at least.

> <...>
>
>> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
>> index eb9eded..6a5609f 100644
>> --- a/lib/librte_eal/common/include/rte_bus.h
>> +++ b/lib/librte_eal/common/include/rte_bus.h
>> @@ -168,6 +168,20 @@ typedef int (*rte_bus_unplug_t)(struct rte_device *dev);
>>   typedef int (*rte_bus_parse_t)(const char *name, void *addr);
>>   
>>   /**
>> + * Implementation a specific hot unplug handler, which is responsible
>> + * for handle the failure when hot unplug the device, guaranty the system
>> + * would not hung in the case.
>> + * @param dev
>> + *	Pointer of the device structure.
>> + *
>> + * @return
>> + *	0 on success.
>> + *	!0 on error.
>> + */
>> +typedef int (*rte_bus_handle_hot_unplug_t)(struct rte_device *dev,
>> +						void *dev_addr);
>> +
> I don't like the name of the OPS.
> The documentation evokes only "the failure".
> So is it a handle for any and all error possibly happening to a device?
> If so, where is the input to describe the error?
> If it is only meant to handle SIGBUS, because it is a very specific
> error state only meant to happen on certain parts of the bus (the queue
> mappings, if relevant), then it makes sense to only have an arbitrary
> address as context for handling.
>
> But then, it needs to be called as such. The expected failure to be
> handled should be explicit in the name of the ops, and the documentation
> should be more precise about what a bus developper should do with the
> input.

I agree with your point of let the name more explicit, but i think here 
maybe we should spit it into two ops, the one is hotplug_handler, the 
other is sigbus_handler, because there are
2 path that both, data path and control path, they are also need to call 
remap function when detect the hot remove event,even there are no sigbus 
happen.

>> +/**
>>    * Bus scan policies
>>    */
>>   enum rte_bus_scan_mode {
>> @@ -209,6 +223,8 @@ struct rte_bus {
>>   	rte_bus_plug_t plug;         /**< Probe single device for drivers */
>>   	rte_bus_unplug_t unplug;     /**< Remove single device from driver */
>>   	rte_bus_parse_t parse;       /**< Parse a device name */
>> +	rte_bus_handle_hot_unplug_t handle_hot_unplug; /**< handle hot unplug
>> +							device event */
> The new ops should be added at the end of the structure.

ok.

> Regards,
  

Patch

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 7215aae..74d9aa8 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -472,6 +472,70 @@  pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 	return NULL;
 }
 
+static struct rte_pci_device *
+pci_find_device_by_addr(void *failure_addr)
+{
+	struct rte_pci_device *pdev = NULL;
+	int i;
+
+	FOREACH_DEVICE_ON_PCIBUS(pdev) {
+		for (i = 0; i != RTE_DIM(pdev->mem_resource); i++) {
+			if ((uint64_t)(uintptr_t)failure_addr >=
+			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr &&
+			    (uint64_t)(uintptr_t)failure_addr <
+			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr +
+			    pdev->mem_resource[i].len) {
+				RTE_LOG(ERR, EAL, "Failure address "
+					"%16.16"PRIx64" belongs to "
+					"device %s!\n",
+					(uint64_t)(uintptr_t)failure_addr,
+					pdev->device.name);
+				return pdev;
+			}
+		}
+	}
+	return NULL;
+}
+static int
+pci_handle_hot_unplug(struct rte_device *dev, void *failure_addr)
+{
+	struct rte_pci_device *pdev = NULL;
+	int ret = 0;
+
+	if (dev != NULL)
+		pdev = RTE_DEV_TO_PCI(dev);
+	else
+		pdev = pci_find_device_by_addr(failure_addr);
+
+	if (!pdev)
+		return -1;
+
+	/* remap resources for devices */
+	switch (pdev->kdrv) {
+	case RTE_KDRV_VFIO:
+#ifdef VFIO_PRESENT
+		/* TODO */
+		ret = -1;
+#endif
+		break;
+	case RTE_KDRV_IGB_UIO:
+	case RTE_KDRV_UIO_GENERIC:
+	case RTE_KDRV_NIC_UIO:
+		ret = pci_uio_remap_resource(pdev);
+		break;
+	default:
+		RTE_LOG(DEBUG, EAL,
+			"Not managed by a supported kernel driver, skipped\n");
+		ret = -1;
+		break;
+	}
+
+	if (ret != 0)
+		RTE_LOG(ERR, EAL, "Failed to handle hot unplug of device %s",
+			pdev->name);
+	return ret;
+}
+
 static int
 pci_plug(struct rte_device *dev)
 {
@@ -502,6 +566,7 @@  struct rte_pci_bus rte_pci_bus = {
 		.unplug = pci_unplug,
 		.parse = pci_parse,
 		.get_iommu_class = rte_pci_get_iommu_class,
+		.handle_hot_unplug = pci_handle_hot_unplug,
 	},
 	.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
 	.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/drivers/bus/pci/pci_common_uio.c b/drivers/bus/pci/pci_common_uio.c
index 54bc20b..7ea73db 100644
--- a/drivers/bus/pci/pci_common_uio.c
+++ b/drivers/bus/pci/pci_common_uio.c
@@ -146,6 +146,39 @@  pci_uio_unmap(struct mapped_pci_resource *uio_res)
 	}
 }
 
+/* remap the PCI resource of a PCI device in anonymous virtual memory */
+int
+pci_uio_remap_resource(struct rte_pci_device *dev)
+{
+	int i;
+	void *map_address;
+
+	if (dev == NULL)
+		return -1;
+
+	/* Remap all BARs */
+	for (i = 0; i != PCI_MAX_RESOURCE; i++) {
+		/* skip empty BAR */
+		if (dev->mem_resource[i].phys_addr == 0)
+			continue;
+		map_address = mmap(dev->mem_resource[i].addr,
+				(size_t)dev->mem_resource[i].len,
+				PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
+		if (map_address == MAP_FAILED) {
+			RTE_LOG(ERR, EAL,
+				"Cannot remap resource for device %s\n",
+				dev->name);
+			return -1;
+		}
+		RTE_LOG(INFO, EAL,
+			"Successful remap resource for device %s\n",
+			dev->name);
+	}
+
+	return 0;
+}
+
 static struct mapped_pci_resource *
 pci_uio_find_resource(struct rte_pci_device *dev)
 {
diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
index 88fa587..5551506 100644
--- a/drivers/bus/pci/private.h
+++ b/drivers/bus/pci/private.h
@@ -173,6 +173,18 @@  void pci_uio_free_resource(struct rte_pci_device *dev,
 		struct mapped_pci_resource *uio_res);
 
 /**
+ * Remap the PCI resource of a PCI device in anonymous virtual memory.
+ *
+ * @param dev
+ *   Point to the struct rte pci device.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+pci_uio_remap_resource(struct rte_pci_device *dev);
+
+/**
  * Map device memory to uio resource
  *
  * This function is private to EAL.
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index eb9eded..6a5609f 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -168,6 +168,20 @@  typedef int (*rte_bus_unplug_t)(struct rte_device *dev);
 typedef int (*rte_bus_parse_t)(const char *name, void *addr);
 
 /**
+ * Implementation a specific hot unplug handler, which is responsible
+ * for handle the failure when hot unplug the device, guaranty the system
+ * would not hung in the case.
+ * @param dev
+ *	Pointer of the device structure.
+ *
+ * @return
+ *	0 on success.
+ *	!0 on error.
+ */
+typedef int (*rte_bus_handle_hot_unplug_t)(struct rte_device *dev,
+						void *dev_addr);
+
+/**
  * Bus scan policies
  */
 enum rte_bus_scan_mode {
@@ -209,6 +223,8 @@  struct rte_bus {
 	rte_bus_plug_t plug;         /**< Probe single device for drivers */
 	rte_bus_unplug_t unplug;     /**< Remove single device from driver */
 	rte_bus_parse_t parse;       /**< Parse a device name */
+	rte_bus_handle_hot_unplug_t handle_hot_unplug; /**< handle hot unplug
+							device event */
 	struct rte_bus_conf conf;    /**< Bus configuration */
 	rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
 };