[V4,1/5] drivers/bus: restore driver assignment at front of probing

Message ID 20221206092649.8287-2-lihuisong@huawei.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series app/testpmd: support mulitple process attach and detach port |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

lihuisong (C) Dec. 6, 2022, 9:26 a.m. UTC
  The driver assignment was moved back at the end of the device probing
because there is no something to use rte_driver during the phase of
probing. See commit 391797f04208 ("drivers/bus: move driver assignment
to end of probing")

However, it is necessary for probing callback to reference rte_driver
before probing. For example, probing callback may call some APIs which
access the rte_pci_driver::driver by the device::driver pointer to get
driver information. In this case, a segment fault will occur in probing
callback if there is not this assignment.

Further, some comments in code need to be updated if we do that. The
driver pointer in rte_device is set before probing and needs to be reset
if probing failed. And rte_dev_is_probed can not be called inside probing.

Fixes: 391797f04208 ("drivers/bus: move driver assignment to end of probing")
Cc: stable@dpdk.org

Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
 drivers/bus/auxiliary/auxiliary_common.c |  9 +++++++--
 drivers/bus/dpaa/dpaa_bus.c              |  9 +++++++--
 drivers/bus/fslmc/fslmc_bus.c            |  8 +++++++-
 drivers/bus/ifpga/ifpga_bus.c            | 12 +++++++++---
 drivers/bus/pci/pci_common.c             |  9 +++++++--
 drivers/bus/vdev/vdev.c                  | 10 ++++++++--
 drivers/bus/vmbus/vmbus_common.c         |  9 +++++++--
 7 files changed, 52 insertions(+), 14 deletions(-)
  

Comments

Ferruh Yigit Jan. 11, 2023, 12:51 p.m. UTC | #1
On 12/6/2022 9:26 AM, Huisong Li wrote:
> The driver assignment was moved back at the end of the device probing
> because there is no something to use rte_driver during the phase of
> probing. See commit 391797f04208 ("drivers/bus: move driver assignment
> to end of probing")
> 
> However, it is necessary for probing callback to reference rte_driver
> before probing. For example, probing callback may call some APIs which
> access the rte_pci_driver::driver by the device::driver pointer to get
> driver information. In this case, a segment fault will occur in probing
> callback if there is not this assignment.
> 

Probing callback gets driver as parameter, so callback function can
access it via 'drv->driver', is there a specific usecase that
'dev->device->driver' needs to be accessed explicitly?

I assume this is related to coming patches that setting up device in
testpmd event callback, but can you please clarify exact need.

> Further, some comments in code need to be updated if we do that. The
> driver pointer in rte_device is set before probing and needs to be reset
> if probing failed. And rte_dev_is_probed can not be called inside probing.
> 
> Fixes: 391797f04208 ("drivers/bus: move driver assignment to end of probing")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> ---
>  drivers/bus/auxiliary/auxiliary_common.c |  9 +++++++--
>  drivers/bus/dpaa/dpaa_bus.c              |  9 +++++++--
>  drivers/bus/fslmc/fslmc_bus.c            |  8 +++++++-
>  drivers/bus/ifpga/ifpga_bus.c            | 12 +++++++++---
>  drivers/bus/pci/pci_common.c             |  9 +++++++--
>  drivers/bus/vdev/vdev.c                  | 10 ++++++++--
>  drivers/bus/vmbus/vmbus_common.c         |  9 +++++++--
>  7 files changed, 52 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/bus/auxiliary/auxiliary_common.c b/drivers/bus/auxiliary/auxiliary_common.c
> index ff1369353a..13cb3fe0f8 100644
> --- a/drivers/bus/auxiliary/auxiliary_common.c
> +++ b/drivers/bus/auxiliary/auxiliary_common.c
> @@ -132,16 +132,21 @@ rte_auxiliary_probe_one_driver(struct rte_auxiliary_driver *drv,
>  	}
>  
>  	dev->driver = drv;
> +	/*
> +	 * Reference rte_driver before probing so as to this pointer can
> +	 * be used to get driver information in case of segment fault in
> +	 * probing callback.
> +	 */
> +	dev->device.driver = &drv->driver;
>  
>  	AUXILIARY_LOG(INFO, "Probe auxiliary driver: %s device: %s (NUMA node %i)",
>  		      drv->driver.name, dev->name, dev->device.numa_node);
>  	ret = drv->probe(drv, dev);
>  	if (ret != 0) {
>  		dev->driver = NULL;
> +		dev->device.driver = NULL;
>  		rte_intr_instance_free(dev->intr_handle);
>  		dev->intr_handle = NULL;
> -	} else {
> -		dev->device.driver = &drv->driver;
>  	}
>  
>  	return ret;
> diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
> index e57159f5d8..f1b817e58c 100644
> --- a/drivers/bus/dpaa/dpaa_bus.c
> +++ b/drivers/bus/dpaa/dpaa_bus.c
> @@ -693,17 +693,22 @@ rte_dpaa_bus_probe(void)
>  			    (dev->device.devargs &&
>  			     dev->device.devargs->policy == RTE_DEV_BLOCKED))
>  				continue;
> -
> +			/*
> +			 * Reference rte_driver before probing so as to this
> +			 * pointer can be used to get driver information in case
> +			 * of segment fault in probing callback.
> +			 */
> +			dev->device.driver = &drv->driver;
>  			if (probe_all ||
>  			    (dev->device.devargs &&
>  			     dev->device.devargs->policy == RTE_DEV_ALLOWED)) {
>  				ret = drv->probe(drv, dev);
>  				if (ret) {
> +					dev->device.driver = NULL;
>  					DPAA_BUS_ERR("unable to probe:%s",
>  						     dev->name);
>  				} else {
>  					dev->driver = drv;
> -					dev->device.driver = &drv->driver;
>  				}
>  			}
>  			break;
> diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
> index 57bfb5111a..4bc0c6d3d4 100644
> --- a/drivers/bus/fslmc/fslmc_bus.c
> +++ b/drivers/bus/fslmc/fslmc_bus.c
> @@ -471,15 +471,21 @@ rte_fslmc_probe(void)
>  				continue;
>  			}
>  
> +			/*
> +			 * Reference rte_driver before probing so as to this
> +			 * pointer can be used to get driver information in case
> +			 * of segment fault in probing callback.
> +			 */
> +			dev->device.driver = &drv->driver;
>  			if (probe_all ||
>  			   (dev->device.devargs &&
>  			    dev->device.devargs->policy == RTE_DEV_ALLOWED)) {
>  				ret = drv->probe(drv, dev);
>  				if (ret) {
> +					dev->device.driver = NULL;
>  					DPAA2_BUS_ERR("Unable to probe");
>  				} else {
>  					dev->driver = drv;
> -					dev->device.driver = &drv->driver;
>  				}
>  			}
>  			break;
> diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
> index bb943b58b5..5f23446f41 100644
> --- a/drivers/bus/ifpga/ifpga_bus.c
> +++ b/drivers/bus/ifpga/ifpga_bus.c
> @@ -293,13 +293,19 @@ ifpga_probe_one_driver(struct rte_afu_driver *drv,
>  
>  	/* reference driver structure */
>  	afu_dev->driver = drv;
> +	/*
> +	 * Reference rte_driver before probing so as to this pointer can
> +	 * be used to get driver information in case of segment fault in
> +	 * probing callback.
> +	 */
> +	afu_dev->device.driver = &drv->driver;
>  
>  	/* call the driver probe() function */
>  	ret = drv->probe(afu_dev);
> -	if (ret)
> +	if (ret) {
>  		afu_dev->driver = NULL;
> -	else
> -		afu_dev->device.driver = &drv->driver;
> +		afu_dev->device.driver = NULL;
> +	}
>  
>  	return ret;
>  }
> diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
> index bc3a7f39fe..efaa1792e9 100644
> --- a/drivers/bus/pci/pci_common.c
> +++ b/drivers/bus/pci/pci_common.c
> @@ -302,6 +302,12 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>  				return ret;
>  			}
>  		}
> +		/*
> +		 * Reference rte_driver before probing so as to this pointer can
> +		 * be used to get driver information in case of segment fault in
> +		 * probing callback.
> +		 */
> +		dev->device.driver = &dr->driver;
>  	}
>  
>  	RTE_LOG(INFO, EAL, "Probe PCI driver: %s (%x:%x) device: "PCI_PRI_FMT" (socket %i)\n",
> @@ -314,6 +320,7 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>  		return ret; /* no rollback if already succeeded earlier */
>  	if (ret) {
>  		dev->driver = NULL;
> +		dev->device.driver = NULL;
>  		if ((dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING) &&
>  			/* Don't unmap if device is unsupported and
>  			 * driver needs mapped resources.
> @@ -325,8 +332,6 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>  		dev->vfio_req_intr_handle = NULL;
>  		rte_intr_instance_free(dev->intr_handle);
>  		dev->intr_handle = NULL;
> -	} else {
> -		dev->device.driver = &dr->driver;
>  	}
>  
>  	return ret;
> diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
> index 41bc07dde7..2e3f0f2e12 100644
> --- a/drivers/bus/vdev/vdev.c
> +++ b/drivers/bus/vdev/vdev.c
> @@ -207,9 +207,15 @@ vdev_probe_all_drivers(struct rte_vdev_device *dev)
>  		return -1;
>  	}
>  
> +	/*
> +	 * Reference rte_driver before probing so as to this pointer can
> +	 * be used to get driver information in case of segment fault in
> +	 * probing callback.
> +	 */
> +	dev->device.driver = &driver->driver;
>  	ret = driver->probe(dev);
> -	if (ret == 0)
> -		dev->device.driver = &driver->driver;
> +	if (ret != 0)
> +		dev->device.driver = NULL;
>  	return ret;
>  }
>  
> diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
> index 8d32d66504..feb1651984 100644
> --- a/drivers/bus/vmbus/vmbus_common.c
> +++ b/drivers/bus/vmbus/vmbus_common.c
> @@ -110,6 +110,12 @@ vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
>  
>  	/* reference driver structure */
>  	dev->driver = dr;
> +	/*
> +	 * Reference rte_driver before probing so as to this pointer can
> +	 * be used to get driver information in case of segment fault in
> +	 * probing callback.
> +	 */
> +	dev->device.driver = &dr->driver;
>  
>  	if (dev->device.numa_node < 0 && rte_socket_count() > 1)
>  		VMBUS_LOG(INFO, "Device %s is not NUMA-aware", guid);
> @@ -119,9 +125,8 @@ vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
>  	ret = dr->probe(dr, dev);
>  	if (ret) {
>  		dev->driver = NULL;
> +		dev->device.driver = NULL;
>  		rte_vmbus_unmap_device(dev);
> -	} else {
> -		dev->device.driver = &dr->driver;
>  	}
>  
>  	return ret;
  
lihuisong (C) Jan. 12, 2023, 2:44 a.m. UTC | #2
在 2023/1/11 20:51, Ferruh Yigit 写道:
> On 12/6/2022 9:26 AM, Huisong Li wrote:
>> The driver assignment was moved back at the end of the device probing
>> because there is no something to use rte_driver during the phase of
>> probing. See commit 391797f04208 ("drivers/bus: move driver assignment
>> to end of probing")
>>
>> However, it is necessary for probing callback to reference rte_driver
>> before probing. For example, probing callback may call some APIs which
>> access the rte_pci_driver::driver by the device::driver pointer to get
>> driver information. In this case, a segment fault will occur in probing
>> callback if there is not this assignment.
>>
> Probing callback gets driver as parameter, so callback function can
> access it via 'drv->driver', is there a specific usecase that
> 'dev->device->driver' needs to be accessed explicitly?
>
> I assume this is related to coming patches that setting up device in
> testpmd event callback, but can you please clarify exact need.
For example, rte_eth_dev_info_get is called in this event callback to get
driver name.
>
>> Further, some comments in code need to be updated if we do that. The
>> driver pointer in rte_device is set before probing and needs to be reset
>> if probing failed. And rte_dev_is_probed can not be called inside probing.
>>
>> Fixes: 391797f04208 ("drivers/bus: move driver assignment to end of probing")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> ---
>>   drivers/bus/auxiliary/auxiliary_common.c |  9 +++++++--
>>   drivers/bus/dpaa/dpaa_bus.c              |  9 +++++++--
>>   drivers/bus/fslmc/fslmc_bus.c            |  8 +++++++-
>>   drivers/bus/ifpga/ifpga_bus.c            | 12 +++++++++---
>>   drivers/bus/pci/pci_common.c             |  9 +++++++--
>>   drivers/bus/vdev/vdev.c                  | 10 ++++++++--
>>   drivers/bus/vmbus/vmbus_common.c         |  9 +++++++--
>>   7 files changed, 52 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/bus/auxiliary/auxiliary_common.c b/drivers/bus/auxiliary/auxiliary_common.c
>> index ff1369353a..13cb3fe0f8 100644
>> --- a/drivers/bus/auxiliary/auxiliary_common.c
>> +++ b/drivers/bus/auxiliary/auxiliary_common.c
>> @@ -132,16 +132,21 @@ rte_auxiliary_probe_one_driver(struct rte_auxiliary_driver *drv,
>>   	}
>>   
>>   	dev->driver = drv;
>> +	/*
>> +	 * Reference rte_driver before probing so as to this pointer can
>> +	 * be used to get driver information in case of segment fault in
>> +	 * probing callback.
>> +	 */
>> +	dev->device.driver = &drv->driver;
>>   
>>   	AUXILIARY_LOG(INFO, "Probe auxiliary driver: %s device: %s (NUMA node %i)",
>>   		      drv->driver.name, dev->name, dev->device.numa_node);
>>   	ret = drv->probe(drv, dev);
>>   	if (ret != 0) {
>>   		dev->driver = NULL;
>> +		dev->device.driver = NULL;
>>   		rte_intr_instance_free(dev->intr_handle);
>>   		dev->intr_handle = NULL;
>> -	} else {
>> -		dev->device.driver = &drv->driver;
>>   	}
>>   
>>   	return ret;
>> diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
>> index e57159f5d8..f1b817e58c 100644
>> --- a/drivers/bus/dpaa/dpaa_bus.c
>> +++ b/drivers/bus/dpaa/dpaa_bus.c
>> @@ -693,17 +693,22 @@ rte_dpaa_bus_probe(void)
>>   			    (dev->device.devargs &&
>>   			     dev->device.devargs->policy == RTE_DEV_BLOCKED))
>>   				continue;
>> -
>> +			/*
>> +			 * Reference rte_driver before probing so as to this
>> +			 * pointer can be used to get driver information in case
>> +			 * of segment fault in probing callback.
>> +			 */
>> +			dev->device.driver = &drv->driver;
>>   			if (probe_all ||
>>   			    (dev->device.devargs &&
>>   			     dev->device.devargs->policy == RTE_DEV_ALLOWED)) {
>>   				ret = drv->probe(drv, dev);
>>   				if (ret) {
>> +					dev->device.driver = NULL;
>>   					DPAA_BUS_ERR("unable to probe:%s",
>>   						     dev->name);
>>   				} else {
>>   					dev->driver = drv;
>> -					dev->device.driver = &drv->driver;
>>   				}
>>   			}
>>   			break;
>> diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
>> index 57bfb5111a..4bc0c6d3d4 100644
>> --- a/drivers/bus/fslmc/fslmc_bus.c
>> +++ b/drivers/bus/fslmc/fslmc_bus.c
>> @@ -471,15 +471,21 @@ rte_fslmc_probe(void)
>>   				continue;
>>   			}
>>   
>> +			/*
>> +			 * Reference rte_driver before probing so as to this
>> +			 * pointer can be used to get driver information in case
>> +			 * of segment fault in probing callback.
>> +			 */
>> +			dev->device.driver = &drv->driver;
>>   			if (probe_all ||
>>   			   (dev->device.devargs &&
>>   			    dev->device.devargs->policy == RTE_DEV_ALLOWED)) {
>>   				ret = drv->probe(drv, dev);
>>   				if (ret) {
>> +					dev->device.driver = NULL;
>>   					DPAA2_BUS_ERR("Unable to probe");
>>   				} else {
>>   					dev->driver = drv;
>> -					dev->device.driver = &drv->driver;
>>   				}
>>   			}
>>   			break;
>> diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
>> index bb943b58b5..5f23446f41 100644
>> --- a/drivers/bus/ifpga/ifpga_bus.c
>> +++ b/drivers/bus/ifpga/ifpga_bus.c
>> @@ -293,13 +293,19 @@ ifpga_probe_one_driver(struct rte_afu_driver *drv,
>>   
>>   	/* reference driver structure */
>>   	afu_dev->driver = drv;
>> +	/*
>> +	 * Reference rte_driver before probing so as to this pointer can
>> +	 * be used to get driver information in case of segment fault in
>> +	 * probing callback.
>> +	 */
>> +	afu_dev->device.driver = &drv->driver;
>>   
>>   	/* call the driver probe() function */
>>   	ret = drv->probe(afu_dev);
>> -	if (ret)
>> +	if (ret) {
>>   		afu_dev->driver = NULL;
>> -	else
>> -		afu_dev->device.driver = &drv->driver;
>> +		afu_dev->device.driver = NULL;
>> +	}
>>   
>>   	return ret;
>>   }
>> diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
>> index bc3a7f39fe..efaa1792e9 100644
>> --- a/drivers/bus/pci/pci_common.c
>> +++ b/drivers/bus/pci/pci_common.c
>> @@ -302,6 +302,12 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>>   				return ret;
>>   			}
>>   		}
>> +		/*
>> +		 * Reference rte_driver before probing so as to this pointer can
>> +		 * be used to get driver information in case of segment fault in
>> +		 * probing callback.
>> +		 */
>> +		dev->device.driver = &dr->driver;
>>   	}
>>   
>>   	RTE_LOG(INFO, EAL, "Probe PCI driver: %s (%x:%x) device: "PCI_PRI_FMT" (socket %i)\n",
>> @@ -314,6 +320,7 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>>   		return ret; /* no rollback if already succeeded earlier */
>>   	if (ret) {
>>   		dev->driver = NULL;
>> +		dev->device.driver = NULL;
>>   		if ((dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING) &&
>>   			/* Don't unmap if device is unsupported and
>>   			 * driver needs mapped resources.
>> @@ -325,8 +332,6 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr,
>>   		dev->vfio_req_intr_handle = NULL;
>>   		rte_intr_instance_free(dev->intr_handle);
>>   		dev->intr_handle = NULL;
>> -	} else {
>> -		dev->device.driver = &dr->driver;
>>   	}
>>   
>>   	return ret;
>> diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
>> index 41bc07dde7..2e3f0f2e12 100644
>> --- a/drivers/bus/vdev/vdev.c
>> +++ b/drivers/bus/vdev/vdev.c
>> @@ -207,9 +207,15 @@ vdev_probe_all_drivers(struct rte_vdev_device *dev)
>>   		return -1;
>>   	}
>>   
>> +	/*
>> +	 * Reference rte_driver before probing so as to this pointer can
>> +	 * be used to get driver information in case of segment fault in
>> +	 * probing callback.
>> +	 */
>> +	dev->device.driver = &driver->driver;
>>   	ret = driver->probe(dev);
>> -	if (ret == 0)
>> -		dev->device.driver = &driver->driver;
>> +	if (ret != 0)
>> +		dev->device.driver = NULL;
>>   	return ret;
>>   }
>>   
>> diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
>> index 8d32d66504..feb1651984 100644
>> --- a/drivers/bus/vmbus/vmbus_common.c
>> +++ b/drivers/bus/vmbus/vmbus_common.c
>> @@ -110,6 +110,12 @@ vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
>>   
>>   	/* reference driver structure */
>>   	dev->driver = dr;
>> +	/*
>> +	 * Reference rte_driver before probing so as to this pointer can
>> +	 * be used to get driver information in case of segment fault in
>> +	 * probing callback.
>> +	 */
>> +	dev->device.driver = &dr->driver;
>>   
>>   	if (dev->device.numa_node < 0 && rte_socket_count() > 1)
>>   		VMBUS_LOG(INFO, "Device %s is not NUMA-aware", guid);
>> @@ -119,9 +125,8 @@ vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
>>   	ret = dr->probe(dr, dev);
>>   	if (ret) {
>>   		dev->driver = NULL;
>> +		dev->device.driver = NULL;
>>   		rte_vmbus_unmap_device(dev);
>> -	} else {
>> -		dev->device.driver = &dr->driver;
>>   	}
>>   
>>   	return ret;
> .
  
Ferruh Yigit Feb. 15, 2023, 4:09 p.m. UTC | #3
On 1/12/2023 2:44 AM, lihuisong (C) wrote:
> 
> 在 2023/1/11 20:51, Ferruh Yigit 写道:
>> On 12/6/2022 9:26 AM, Huisong Li wrote:
>>> The driver assignment was moved back at the end of the device probing
>>> because there is no something to use rte_driver during the phase of
>>> probing. See commit 391797f04208 ("drivers/bus: move driver assignment
>>> to end of probing")
>>>
>>> However, it is necessary for probing callback to reference rte_driver
>>> before probing. For example, probing callback may call some APIs which
>>> access the rte_pci_driver::driver by the device::driver pointer to get
>>> driver information. In this case, a segment fault will occur in probing
>>> callback if there is not this assignment.
>>>
>> Probing callback gets driver as parameter, so callback function can
>> access it via 'drv->driver', is there a specific usecase that
>> 'dev->device->driver' needs to be accessed explicitly?
>>
>> I assume this is related to coming patches that setting up device in
>> testpmd event callback, but can you please clarify exact need.
>
> For example, rte_eth_dev_info_get is called in this event callback to get
> driver name.
>

Why 'rte_eth_dev_info_get()' is called in the event called at first place?


This set updates multiple things to extend 'RTE_ETH_EVENT_NEW' event
callback function support, like:

- Assign device driver *before* probing completed, so that even callback
can run 'rte_eth_dev_info_get()'

- Add a new ethdev state so that port can be recognized as valid port in
the even callback.

- Stop forwarding implicitly in even callback in case event callback run
while forwarding is on.


All looks to me hack/complexity to make a specific case work, which is
make secondary *testmp* application work with attached/detached device.

And finally patch 4/5 adds port setup to testpmd event callback for this.


I understand the intention, but I disagree with bus and ethdev level
changes for this.



Event callback may not be only way to share port attach/detach
information between primary and secondary, there is a MP socket and
'rte_mp_handle' thread to handle communication between primary and
secondary process, this should be able to use carrying device
information, as far as I remember this is why it is introduced at first
place.

Did you consider using MP socket for your use case?



Following is a sample usage:

Primary:
started as:
sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 0-1
--log-level=*:debug -- -i --num-procs=2 --proc-id=0

``
testpmd> show port summary all
Number of available ports: 0
Port MAC Address       Name         Driver         Status   Link

testpmd> port attach net_null0
Attaching a new port...
dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
vdev_probe_all_drivers(): Search driver to probe device net_null0
rte_pmd_null_probe(): Initializing pmd_null for net_null0
rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet copy
is disabled
eth_dev_null_create(): Creating null ethdev on numa socket 0
EAL: request: eal_dev_mp_request
EAL: msg: bus_vdev_mp
vdev_action(): send vdev, net_null0
EAL: sendmsg: bus_vdev_mp
EAL: reply: bus_vdev_mp
EAL: msg: eal_dev_mp_request
Port 0 is attached. Now total ports is 1
Done

testpmd> show port summary all
Number of available ports: 1
Port MAC Address       Name         Driver         Status   Link
0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps

testpmd> port detach 0
Port was not closed
Removing a device...
EAL: request: eal_dev_mp_request
EAL: msg: eal_dev_mp_request
eth_dev_close(): Closing null ethdev on NUMA socket 0
Port 0 is closed
Device is detached
Now total ports is 0
Done

testpmd> show port summary all
Number of available ports: 0
Port MAC Address       Name         Driver         Status   Link
testpmd>

``

Secondary:
started as:
sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 2-3
--log-level=*:debug -- -i --num-procs=2 --proc-id=1

``
testpmd> show port summary all
Number of available ports: 0
Port MAC Address       Name         Driver         Status   Link

testpmd> EAL: msg: eal_dev_mp_request
dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
EAL: request: bus_vdev_mp
EAL: msg: bus_vdev_mp
vdev_action(): receive vdev, net_null0
EAL: msg: bus_vdev_mp
vdev_scan(): Received 1 vdevs
vdev_probe_all_drivers(): Search driver to probe device net_null0
rte_pmd_null_probe(): Initializing pmd_null for net_null0
EAL: reply: eal_dev_mp_request

testpmd> show port summary all
Number of available ports: 1
Port MAC Address       Name         Driver         Status   Link
0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps

testpmd> EAL: msg: eal_dev_mp_request
dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
eth_dev_close(): Closing null ethdev on NUMA socket 4294967295
Port 0 is closed
EAL: reply: eal_dev_mp_request

testpmd> show port summary all
Number of available ports: 0
Port MAC Address       Name         Driver         Status   Link
testpmd>
``
  
lihuisong (C) Feb. 28, 2023, 2:21 a.m. UTC | #4
在 2023/2/16 0:09, Ferruh Yigit 写道:
> On 1/12/2023 2:44 AM, lihuisong (C) wrote:
>> 在 2023/1/11 20:51, Ferruh Yigit 写道:
>>> On 12/6/2022 9:26 AM, Huisong Li wrote:
>>>> The driver assignment was moved back at the end of the device probing
>>>> because there is no something to use rte_driver during the phase of
>>>> probing. See commit 391797f04208 ("drivers/bus: move driver assignment
>>>> to end of probing")
>>>>
>>>> However, it is necessary for probing callback to reference rte_driver
>>>> before probing. For example, probing callback may call some APIs which
>>>> access the rte_pci_driver::driver by the device::driver pointer to get
>>>> driver information. In this case, a segment fault will occur in probing
>>>> callback if there is not this assignment.
>>>>
>>> Probing callback gets driver as parameter, so callback function can
>>> access it via 'drv->driver', is there a specific usecase that
>>> 'dev->device->driver' needs to be accessed explicitly?
>>>
>>> I assume this is related to coming patches that setting up device in
>>> testpmd event callback, but can you please clarify exact need.
>> For example, rte_eth_dev_info_get is called in this event callback to get
>> driver name.
>>
Hi Ferruh,

Sorry for the delay. I missed this email.
Thanks for your review.
> Why 'rte_eth_dev_info_get()' is called in the event called at first place?
There is no limitation on rte_eth_dev_info_get() in event callback.
The upper layer is entirely possible to use.
After all, it is more convenient for app to use this pointer to get 
driver information in a probing callback.
>
> This set updates multiple things to extend 'RTE_ETH_EVENT_NEW' event
> callback function support, like:
>
> - Assign device driver *before* probing completed, so that even callback
> can run 'rte_eth_dev_info_get()'
>
> - Add a new ethdev state so that port can be recognized as valid port in
> the even callback.
Yes
>
> - Stop forwarding implicitly in even callback in case event callback run
> while forwarding is on.
But this is a modification for tesptmd to make it work well.
>
>
> All looks to me hack/complexity to make a specific case work, which is
> make secondary *testmp* application work with attached/detached device.
It is not just a testpmd application case, but, I think, still exists in 
actual case.
Because eal lib supports hotplugging device on primary and secondary 
process and the communication each other when attach or detach device.
The reason why no one has ever put forward this question, I think it may 
be attributed to the fact that the scene is not or rarely tested.
>
> And finally patch 4/5 adds port setup to testpmd event callback for this.
>
>
> I understand the intention, but I disagree with bus and ethdev level
> changes for this.
I'm just raising this issue, and we can discuss how to deal with it 
together.😁
>
>
>
> Event callback may not be only way to share port attach/detach
> information between primary and secondary, there is a MP socket and
> 'rte_mp_handle' thread to handle communication between primary and
> secondary process, this should be able to use carrying device
> information, as far as I remember this is why it is introduced at first
> place.
>
> Did you consider using MP socket for your use case?
Actually, the probing event callback called in patch 4/5 is the result 
of the MP socket communication (please see hogplug_mp.c).
>
>
>
> Following is a sample usage:
>
> Primary:
> started as:
> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 0-1
> --log-level=*:debug -- -i --num-procs=2 --proc-id=0
>
> ``
> testpmd> show port summary all
> Number of available ports: 0
> Port MAC Address       Name         Driver         Status   Link
>
> testpmd> port attach net_null0
> Attaching a new port...
> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
> vdev_probe_all_drivers(): Search driver to probe device net_null0
> rte_pmd_null_probe(): Initializing pmd_null for net_null0
> rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet copy
> is disabled
> eth_dev_null_create(): Creating null ethdev on numa socket 0
> EAL: request: eal_dev_mp_request
> EAL: msg: bus_vdev_mp
> vdev_action(): send vdev, net_null0
> EAL: sendmsg: bus_vdev_mp
> EAL: reply: bus_vdev_mp
> EAL: msg: eal_dev_mp_request
> Port 0 is attached. Now total ports is 1
> Done
>
> testpmd> show port summary all
> Number of available ports: 1
> Port MAC Address       Name         Driver         Status   Link
> 0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps
>
> testpmd> port detach 0
> Port was not closed
> Removing a device...
> EAL: request: eal_dev_mp_request
> EAL: msg: eal_dev_mp_request
> eth_dev_close(): Closing null ethdev on NUMA socket 0
> Port 0 is closed
> Device is detached
> Now total ports is 0
Please note the log *"Now total ports is 0"*,
which indicates the port number is updated if we detached device in this 
process.
> Done
>
> testpmd> show port summary all
> Number of available ports: 0
> Port MAC Address       Name         Driver         Status   Link
> testpmd>
>
> ``
>
> Secondary:
> started as:
> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 2-3
> --log-level=*:debug -- -i --num-procs=2 --proc-id=1
>
> ``
> testpmd> show port summary all
> Number of available ports: 0
> Port MAC Address       Name         Driver         Status   Link
>
> testpmd> EAL: msg: eal_dev_mp_request
> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
> EAL: request: bus_vdev_mp
> EAL: msg: bus_vdev_mp
> vdev_action(): receive vdev, net_null0
> EAL: msg: bus_vdev_mp
> vdev_scan(): Received 1 vdevs
> vdev_probe_all_drivers(): Search driver to probe device net_null0
> rte_pmd_null_probe(): Initializing pmd_null for net_null0
> EAL: reply: eal_dev_mp_request
>
> testpmd> show port summary all
> Number of available ports: 1
> Port MAC Address       Name         Driver         Status   Link
> 0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps
>
> testpmd> EAL: msg: eal_dev_mp_request
> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
> eth_dev_close(): Closing null ethdev on NUMA socket 4294967295
> Port 0 is closed
> EAL: reply: eal_dev_mp_request
But the port number in this process does not be updated after finishing 
the destroy event.
That's the problem.
> testpmd> show port summary all
> Number of available ports: 0
> Port MAC Address       Name         Driver         Status   Link
> testpmd>
> ``
>
>
> .
  
Ferruh Yigit June 6, 2023, 4:12 p.m. UTC | #5
On 2/28/2023 2:21 AM, lihuisong (C) wrote:
> 
> 在 2023/2/16 0:09, Ferruh Yigit 写道:
>> On 1/12/2023 2:44 AM, lihuisong (C) wrote:
>>> 在 2023/1/11 20:51, Ferruh Yigit 写道:
>>>> On 12/6/2022 9:26 AM, Huisong Li wrote:
>>>>> The driver assignment was moved back at the end of the device probing
>>>>> because there is no something to use rte_driver during the phase of
>>>>> probing. See commit 391797f04208 ("drivers/bus: move driver assignment
>>>>> to end of probing")
>>>>>
>>>>> However, it is necessary for probing callback to reference rte_driver
>>>>> before probing. For example, probing callback may call some APIs which
>>>>> access the rte_pci_driver::driver by the device::driver pointer to get
>>>>> driver information. In this case, a segment fault will occur in probing
>>>>> callback if there is not this assignment.
>>>>>
>>>> Probing callback gets driver as parameter, so callback function can
>>>> access it via 'drv->driver', is there a specific usecase that
>>>> 'dev->device->driver' needs to be accessed explicitly?
>>>>
>>>> I assume this is related to coming patches that setting up device in
>>>> testpmd event callback, but can you please clarify exact need.
>>> For example, rte_eth_dev_info_get is called in this event callback to get
>>> driver name.
>>>
> Hi Ferruh,
> 
> Sorry for the delay. I missed this email.
> Thanks for your review.
>> Why 'rte_eth_dev_info_get()' is called in the event called at first place?
> There is no limitation on rte_eth_dev_info_get() in event callback.
> The upper layer is entirely possible to use.
> After all, it is more convenient for app to use this pointer to get
> driver information in a probing callback.
>

I think there is a logical limit in calling 'rte_eth_dev_info_get()' in
NEW event callback, we may document this if it is missing.

The NEW event callback is to let application do proper tasks before port
addition finalized in the ethdev layer, but you are trying to call
'rte_eth_dev_info_get()' in the callback when it is not ready yet.


>> This set updates multiple things to extend 'RTE_ETH_EVENT_NEW' event
>> callback function support, like:
>>
>> - Assign device driver *before* probing completed, so that even callback
>> can run 'rte_eth_dev_info_get()'
>>
>> - Add a new ethdev state so that port can be recognized as valid port in
>> the even callback.
> Yes
>> - Stop forwarding implicitly in even callback in case event callback run
>> while forwarding is on.
> But this is a modification for tesptmd to make it work well.
>> All looks to me hack/complexity to make a specific case work, which is
>> make secondary *testmp* application work with attached/detached device.
> It is not just a testpmd application case, but, I think, still exists in
> actual case.
> Because eal lib supports hotplugging device on primary and secondary
> process and the communication each other when attach or detach device.
> The reason why no one has ever put forward this question, I think it may
> be attributed to the fact that the scene is not or rarely tested.
>> And finally patch 4/5 adds port setup to testpmd event callback for this.
>>
>>
>> I understand the intention, but I disagree with bus and ethdev level
>> changes for this.
> I'm just raising this issue, and we can discuss how to deal with it
> together.😁
>>
>> Event callback may not be only way to share port attach/detach
>> information between primary and secondary, there is a MP socket and
>> 'rte_mp_handle' thread to handle communication between primary and
>> secondary process, this should be able to use carrying device
>> information, as far as I remember this is why it is introduced at first
>> place.
>>
>> Did you consider using MP socket for your use case?
> Actually, the probing event callback called in patch 4/5 is the result
> of the MP socket communication (please see hogplug_mp.c).
>>
>> Following is a sample usage:
>>
>> Primary:
>> started as:
>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 0-1
>> --log-level=*:debug -- -i --num-procs=2 --proc-id=0
>>
>> ``
>> testpmd> show port summary all
>> Number of available ports: 0
>> Port MAC Address       Name         Driver         Status   Link
>>
>> testpmd> port attach net_null0
>> Attaching a new port...
>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>> vdev_probe_all_drivers(): Search driver to probe device net_null0
>> rte_pmd_null_probe(): Initializing pmd_null for net_null0
>> rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet copy
>> is disabled
>> eth_dev_null_create(): Creating null ethdev on numa socket 0
>> EAL: request: eal_dev_mp_request
>> EAL: msg: bus_vdev_mp
>> vdev_action(): send vdev, net_null0
>> EAL: sendmsg: bus_vdev_mp
>> EAL: reply: bus_vdev_mp
>> EAL: msg: eal_dev_mp_request
>> Port 0 is attached. Now total ports is 1
>> Done
>>
>> testpmd> show port summary all
>> Number of available ports: 1
>> Port MAC Address       Name         Driver         Status   Link
>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps
>>
>> testpmd> port detach 0
>> Port was not closed
>> Removing a device...
>> EAL: request: eal_dev_mp_request
>> EAL: msg: eal_dev_mp_request
>> eth_dev_close(): Closing null ethdev on NUMA socket 0
>> Port 0 is closed
>> Device is detached
>> Now total ports is 0
> Please note the log *"Now total ports is 0"*,
> which indicates the port number is updated if we detached device in this
> process.
>

Yes, as expected.


>> Done
>>
>> testpmd> show port summary all
>> Number of available ports: 0
>> Port MAC Address       Name         Driver         Status   Link
>> testpmd>
>>
>> ``
>>
>> Secondary:
>> started as:
>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 2-3
>> --log-level=*:debug -- -i --num-procs=2 --proc-id=1
>>
>> ``
>> testpmd> show port summary all
>> Number of available ports: 0
>> Port MAC Address       Name         Driver         Status   Link
>>
>> testpmd> EAL: msg: eal_dev_mp_request
>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>> EAL: request: bus_vdev_mp
>> EAL: msg: bus_vdev_mp
>> vdev_action(): receive vdev, net_null0
>> EAL: msg: bus_vdev_mp
>> vdev_scan(): Received 1 vdevs
>> vdev_probe_all_drivers(): Search driver to probe device net_null0
>> rte_pmd_null_probe(): Initializing pmd_null for net_null0
>> EAL: reply: eal_dev_mp_request
>>
>> testpmd> show port summary all
>> Number of available ports: 1
>> Port MAC Address       Name         Driver         Status   Link
>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps
>>
>> testpmd> EAL: msg: eal_dev_mp_request
>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>> eth_dev_close(): Closing null ethdev on NUMA socket 4294967295
>> Port 0 is closed
>> EAL: reply: eal_dev_mp_request
> But the port number in this process does not be updated after finishing
> the destroy event.
> That's the problem.
>

Please see following command output,
it says "Number of available ports: 0",
so that shows port number is updated after destroy event, isn't it?


>> testpmd> show port summary all
>> Number of available ports: 0
>> Port MAC Address       Name         Driver         Status   Link
>> testpmd>
>> ``
>>
>>
>> .
  
lihuisong (C) June 7, 2023, 10:11 a.m. UTC | #6
在 2023/6/7 0:12, Ferruh Yigit 写道:
> On 2/28/2023 2:21 AM, lihuisong (C) wrote:
>> 在 2023/2/16 0:09, Ferruh Yigit 写道:
>>> On 1/12/2023 2:44 AM, lihuisong (C) wrote:
>>>> 在 2023/1/11 20:51, Ferruh Yigit 写道:
>>>>> On 12/6/2022 9:26 AM, Huisong Li wrote:
>>>>>> The driver assignment was moved back at the end of the device probing
>>>>>> because there is no something to use rte_driver during the phase of
>>>>>> probing. See commit 391797f04208 ("drivers/bus: move driver assignment
>>>>>> to end of probing")
>>>>>>
>>>>>> However, it is necessary for probing callback to reference rte_driver
>>>>>> before probing. For example, probing callback may call some APIs which
>>>>>> access the rte_pci_driver::driver by the device::driver pointer to get
>>>>>> driver information. In this case, a segment fault will occur in probing
>>>>>> callback if there is not this assignment.
>>>>>>
>>>>> Probing callback gets driver as parameter, so callback function can
>>>>> access it via 'drv->driver', is there a specific usecase that
>>>>> 'dev->device->driver' needs to be accessed explicitly?
>>>>>
>>>>> I assume this is related to coming patches that setting up device in
>>>>> testpmd event callback, but can you please clarify exact need.
>>>> For example, rte_eth_dev_info_get is called in this event callback to get
>>>> driver name.
>>>>
>> Hi Ferruh,
>>
>> Sorry for the delay. I missed this email.
>> Thanks for your review.
>>> Why 'rte_eth_dev_info_get()' is called in the event called at first place?
>> There is no limitation on rte_eth_dev_info_get() in event callback.
>> The upper layer is entirely possible to use.
>> After all, it is more convenient for app to use this pointer to get
>> driver information in a probing callback.
>>
> I think there is a logical limit in calling 'rte_eth_dev_info_get()' in
> NEW event callback, we may document this if it is missing.
>
> The NEW event callback is to let application do proper tasks before port
> addition finalized in the ethdev layer, but you are trying to call
> 'rte_eth_dev_info_get()' in the callback when it is not ready yet.
We probably shouldn't add this logical limit.
In terms of application perception, app may need to do something when a 
new or destory event is received, instead of just for printing a message.
>
>>> This set updates multiple things to extend 'RTE_ETH_EVENT_NEW' event
>>> callback function support, like:
>>>
>>> - Assign device driver *before* probing completed, so that even callback
>>> can run 'rte_eth_dev_info_get()'
>>>
>>> - Add a new ethdev state so that port can be recognized as valid port in
>>> the even callback.
>> Yes
>>> - Stop forwarding implicitly in even callback in case event callback run
>>> while forwarding is on.
>> But this is a modification for tesptmd to make it work well.
>>> All looks to me hack/complexity to make a specific case work, which is
>>> make secondary *testmp* application work with attached/detached device.
>> It is not just a testpmd application case, but, I think, still exists in
>> actual case.
>> Because eal lib supports hotplugging device on primary and secondary
>> process and the communication each other when attach or detach device.
>> The reason why no one has ever put forward this question, I think it may
>> be attributed to the fact that the scene is not or rarely tested.
>>> And finally patch 4/5 adds port setup to testpmd event callback for this.
>>>
>>>
>>> I understand the intention, but I disagree with bus and ethdev level
>>> changes for this.
>> I'm just raising this issue, and we can discuss how to deal with it
>> together.😁
>>> Event callback may not be only way to share port attach/detach
>>> information between primary and secondary, there is a MP socket and
>>> 'rte_mp_handle' thread to handle communication between primary and
>>> secondary process, this should be able to use carrying device
>>> information, as far as I remember this is why it is introduced at first
>>> place.
>>>
>>> Did you consider using MP socket for your use case?
>> Actually, the probing event callback called in patch 4/5 is the result
>> of the MP socket communication (please see hogplug_mp.c).
>>> Following is a sample usage:
>>>
>>> Primary:
>>> started as:
>>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 0-1
>>> --log-level=*:debug -- -i --num-procs=2 --proc-id=0
>>>
>>> ``
>>> testpmd> show port summary all
>>> Number of available ports: 0
>>> Port MAC Address       Name         Driver         Status   Link
>>>
>>> testpmd> port attach net_null0
>>> Attaching a new port...
>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>>> vdev_probe_all_drivers(): Search driver to probe device net_null0
>>> rte_pmd_null_probe(): Initializing pmd_null for net_null0
>>> rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet copy
>>> is disabled
>>> eth_dev_null_create(): Creating null ethdev on numa socket 0
>>> EAL: request: eal_dev_mp_request
>>> EAL: msg: bus_vdev_mp
>>> vdev_action(): send vdev, net_null0
>>> EAL: sendmsg: bus_vdev_mp
>>> EAL: reply: bus_vdev_mp
>>> EAL: msg: eal_dev_mp_request
>>> Port 0 is attached. Now total ports is 1
>>> Done
>>>
>>> testpmd> show port summary all
>>> Number of available ports: 1
>>> Port MAC Address       Name         Driver         Status   Link
>>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps
>>>
>>> testpmd> port detach 0
>>> Port was not closed
>>> Removing a device...
>>> EAL: request: eal_dev_mp_request
>>> EAL: msg: eal_dev_mp_request
>>> eth_dev_close(): Closing null ethdev on NUMA socket 0
>>> Port 0 is closed
>>> Device is detached
>>> Now total ports is 0
>> Please note the log *"Now total ports is 0"*,
>> which indicates the port number is updated if we detached device in this
>> process.
>>
> Yes, as expected.
Yeah, it is just expected for the process did attatch or dettach operation.
Because the thread for updating the port number info (in 
setup_attached_port) and doing probe thread is the same.
please see attach_port().
This is ok for testpmd that does not support multiple processes. But 
testpmd support multiple processes now.
>
>
>>> Done
>>>
>>> testpmd> show port summary all
>>> Number of available ports: 0
>>> Port MAC Address       Name         Driver         Status   Link
>>> testpmd>
>>>
>>> ``
>>>
>>> Secondary:
>>> started as:
>>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 2-3
>>> --log-level=*:debug -- -i --num-procs=2 --proc-id=1
>>>
>>> ``
>>> testpmd> show port summary all
>>> Number of available ports: 0
>>> Port MAC Address       Name         Driver         Status   Link
>>>
>>> testpmd> EAL: msg: eal_dev_mp_request
>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>>> EAL: request: bus_vdev_mp
>>> EAL: msg: bus_vdev_mp
>>> vdev_action(): receive vdev, net_null0
>>> EAL: msg: bus_vdev_mp
>>> vdev_scan(): Received 1 vdevs
>>> vdev_probe_all_drivers(): Search driver to probe device net_null0
>>> rte_pmd_null_probe(): Initializing pmd_null for net_null0
>>> EAL: reply: eal_dev_mp_request
>>>
>>> testpmd> show port summary all
>>> Number of available ports: 1
>>> Port MAC Address       Name         Driver         Status   Link
>>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down     10 Gbps
>>>
>>> testpmd> EAL: msg: eal_dev_mp_request
>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>>> eth_dev_close(): Closing null ethdev on NUMA socket 4294967295
>>> Port 0 is closed
>>> EAL: reply: eal_dev_mp_request
>> But the port number in this process does not be updated after finishing
>> the destroy event.
>> That's the problem.
>>
> Please see following command output,
> it says "Number of available ports: 0",
Here comes from rte_eth_dev_count_avail().
> so that shows port number is updated after destroy event, isn't it?
no.
There are three global variables,  "fwd_ports_ids[]", "ports[]", 
"nb_ports" maintained in testpmd.
The secondary doesn't update these info after receiving a destory or new 
message in your test.
>
>
>>> testpmd> show port summary all
>>> Number of available ports: 0
>>> Port MAC Address       Name         Driver         Status   Link
>>> testpmd>
>>> ``
>>>
>>>
>>> .
> .
  
lihuisong (C) June 15, 2023, 2:21 a.m. UTC | #7
Hi Ferruh,

add the call stack as belows.


在 2023/6/7 18:11, lihuisong (C) 写道:
>
> 在 2023/6/7 0:12, Ferruh Yigit 写道:
>> On 2/28/2023 2:21 AM, lihuisong (C) wrote:
>>> 在 2023/2/16 0:09, Ferruh Yigit 写道:
>>>> On 1/12/2023 2:44 AM, lihuisong (C) wrote:
>>>>> 在 2023/1/11 20:51, Ferruh Yigit 写道:
>>>>>> On 12/6/2022 9:26 AM, Huisong Li wrote:
>>>>>>> The driver assignment was moved back at the end of the device 
>>>>>>> probing
>>>>>>> because there is no something to use rte_driver during the phase of
>>>>>>> probing. See commit 391797f04208 ("drivers/bus: move driver 
>>>>>>> assignment
>>>>>>> to end of probing")
>>>>>>>
>>>>>>> However, it is necessary for probing callback to reference 
>>>>>>> rte_driver
>>>>>>> before probing. For example, probing callback may call some APIs 
>>>>>>> which
>>>>>>> access the rte_pci_driver::driver by the device::driver pointer 
>>>>>>> to get
>>>>>>> driver information. In this case, a segment fault will occur in 
>>>>>>> probing
>>>>>>> callback if there is not this assignment.
>>>>>>>
>>>>>> Probing callback gets driver as parameter, so callback function can
>>>>>> access it via 'drv->driver', is there a specific usecase that
>>>>>> 'dev->device->driver' needs to be accessed explicitly?
>>>>>>
>>>>>> I assume this is related to coming patches that setting up device in
>>>>>> testpmd event callback, but can you please clarify exact need.
>>>>> For example, rte_eth_dev_info_get is called in this event callback 
>>>>> to get
>>>>> driver name.
>>>>>
>>> Hi Ferruh,
>>>
>>> Sorry for the delay. I missed this email.
>>> Thanks for your review.
>>>> Why 'rte_eth_dev_info_get()' is called in the event called at first 
>>>> place?
>>> There is no limitation on rte_eth_dev_info_get() in event callback.
>>> The upper layer is entirely possible to use.
>>> After all, it is more convenient for app to use this pointer to get
>>> driver information in a probing callback.
>>>
>> I think there is a logical limit in calling 'rte_eth_dev_info_get()' in
>> NEW event callback, we may document this if it is missing.
>>
>> The NEW event callback is to let application do proper tasks before port
>> addition finalized in the ethdev layer, but you are trying to call
>> 'rte_eth_dev_info_get()' in the callback when it is not ready yet.
> We probably shouldn't add this logical limit.
> In terms of application perception, app may need to do something when 
> a new or destory event is received, instead of just for printing a 
> message.
In addition, if no this patch(1/5), a segment fault will occur when 
attach a new device. The call stack is as belows:
-->
testpmd> port attach 0000:7d:00.0
Attaching a new port...
EAL: Using IOMMU type 1 (Type 1)
EAL: Ignore mapping IO port bar(1)
EAL: Ignore mapping IO port bar(3)
EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0)

Thread 1 "dpdk-testpmd" received signal SIGSEGV, Segmentation fault.
rte_eth_dev_info_get (port_id=0, dev_info=0x17f95ba80) at 
../lib/ethdev/rte_ethdev.c:3708
3708        dev_info->driver_name = dev->device->driver->name;
Missing separate debuginfos, use: dnf debuginfo-install 
libatomic-10.3.1-10.oe2203.aarch64 libnl3-3.5.0-4.oe2203.aarch64 
openssl-libs-1.1.1m-1.oe2203.aarch64 zlib-1.2.11-19.oe2203.aarch64
(gdb) bt
#0  rte_eth_dev_info_get (port_id=0, dev_info=0x17f95ba80)
     at ../lib/ethdev/rte_ethdev.c:3708
#1  0x00000000005e6038 in eth_dev_info_get_print_err (port_id=0, 
dev_info=0x17f95ba80)
     at ../app/test-pmd/util.c:441
#2  0x00000000005d5654 in init_config_port_offloads (pid=0, socket_id=0)
     at ../app/test-pmd/testpmd.c:1632
#3  0x00000000005d5ec0 in reconfig (new_port_id=0, socket_id=0)
     at ../app/test-pmd/testpmd.c:1828
#4  0x00000000005dafa4 in setup_attached_port (pi=0) at 
../app/test-pmd/testpmd.c:3639
#5  0x00000000005dbd14 in eth_event_callback (port_id=0, 
type=RTE_ETH_EVENT_NEW,
     param=0x0, ret_param=0x0) at ../app/test-pmd/testpmd.c:3959
#6  0x00000000008677dc in rte_eth_dev_callback_process (dev=0x1826fd80 
<rte_eth_devices>,
     event=RTE_ETH_EVENT_NEW, ret_param=0x0) at 
../lib/ethdev/ethdev_driver.c:188
#7  0x0000000000867898 in rte_eth_dev_probing_finish (dev=0x1826fd80 
<rte_eth_devices>)
     at ../lib/ethdev/ethdev_driver.c:212
#8  0x00000000055621c8 in rte_eth_dev_pci_generic_probe 
(pci_dev=0x18500470,
     private_data_size=8448, dev_init=0x5571454 <hns3_dev_init>)
     at ../lib/ethdev/ethdev_pci.h:139
#9  0x00000000055717bc in eth_hns3_pci_probe (pci_drv=0x17e46728 
<rte_hns3_pmd>,
     pci_dev=0x18500470) at ../drivers/net/hns3/hns3_ethdev.c:6577
#10 0x0000000000970474 in rte_pci_probe_one_driver (dr=0x17e46728 
<rte_hns3_pmd>,
     dev=0x18500470) at ../drivers/bus/pci/pci_common.c:312
#11 0x00000000009706e0 in pci_probe_all_drivers (dev=0x18500470)
     at ../drivers/bus/pci/pci_common.c:396
#12 0x0000000000970f78 in pci_plug (dev=0x18500480) at 
../drivers/bus/pci/pci_common.c:670
#13 0x00000000008e31d4 in local_dev_probe (devargs=0xffffffffb6f0 
"0000:7d:00.0",
     new_dev=0xffffffff9478) at ../lib/eal/common/eal_common_dev.c:212
#14 0x00000000008e3330 in rte_dev_probe (devargs=0xffffffffb6f0 
"0000:7d:00.0")
     at ../lib/eal/common/eal_common_dev.c:264
#15 0x00000000005daeb4 in attach_port (identifier=0xffffffffb6f0 
"0000:7d:00.0")
     at ../app/test-pmd/testpmd.c:3608
#16 0x0000000000579a74 in cmd_operate_attach_port_parsed 
(parsed_result=0xffffffffb5f0,
     cl=0x184f9010, data=0x0) at ../app/test-pmd/cmdline.c:1194
--Type <RET> for more, q to quit, c to continue without paging--
#17 0x00000000008624a4 in __cmdline_parse (cl=0x184f9010,
     buf=0x184f9058 "port attach 0000:7d:00.0\n", call_fn=true)
     at ../lib/cmdline/cmdline_parse.c:294
#18 0x00000000008624dc in cmdline_parse (cl=0x184f9010,
     buf=0x184f9058 "port attach 0000:7d:00.0\n") at 
../lib/cmdline/cmdline_parse.c:302
#19 0x0000000000860308 in cmdline_valid_buffer (rdl=0x184f9020,
     buf=0x184f9058 "port attach 0000:7d:00.0\n", size=26) at 
../lib/cmdline/cmdline.c:24
#20 0x000000000086588c in rdline_char_in (rdl=0x184f9020, c=10 '\n')
     at ../lib/cmdline/cmdline_rdline.c:444
#21 0x0000000000860750 in cmdline_in (cl=0x184f9010,
     buf=0xfffffffff77f "\n\220\367\377\377\377\377", size=1)
     at ../lib/cmdline/cmdline.c:146
#22 0x0000000000860a14 in cmdline_interact (cl=0x184f9010) at 
../lib/cmdline/cmdline.c:226
#23 0x0000000000587908 in prompt () at ../app/test-pmd/cmdline.c:13065
#24 0x00000000005ddaf4 in main (argc=6, argv=0xfffffffffad8)
     at ../app/test-pmd/testpmd.c:4690
>>
>>>> This set updates multiple things to extend 'RTE_ETH_EVENT_NEW' event
>>>> callback function support, like:
>>>>
>>>> - Assign device driver *before* probing completed, so that even 
>>>> callback
>>>> can run 'rte_eth_dev_info_get()'
>>>>
>>>> - Add a new ethdev state so that port can be recognized as valid 
>>>> port in
>>>> the even callback.
>>> Yes
>>>> - Stop forwarding implicitly in even callback in case event 
>>>> callback run
>>>> while forwarding is on.
>>> But this is a modification for tesptmd to make it work well.
>>>> All looks to me hack/complexity to make a specific case work, which is
>>>> make secondary *testmp* application work with attached/detached 
>>>> device.
>>> It is not just a testpmd application case, but, I think, still 
>>> exists in
>>> actual case.
>>> Because eal lib supports hotplugging device on primary and secondary
>>> process and the communication each other when attach or detach device.
>>> The reason why no one has ever put forward this question, I think it 
>>> may
>>> be attributed to the fact that the scene is not or rarely tested.
>>>> And finally patch 4/5 adds port setup to testpmd event callback for 
>>>> this.
>>>>
>>>>
>>>> I understand the intention, but I disagree with bus and ethdev level
>>>> changes for this.
>>> I'm just raising this issue, and we can discuss how to deal with it
>>> together.😁
>>>> Event callback may not be only way to share port attach/detach
>>>> information between primary and secondary, there is a MP socket and
>>>> 'rte_mp_handle' thread to handle communication between primary and
>>>> secondary process, this should be able to use carrying device
>>>> information, as far as I remember this is why it is introduced at 
>>>> first
>>>> place.
>>>>
>>>> Did you consider using MP socket for your use case?
>>> Actually, the probing event callback called in patch 4/5 is the result
>>> of the MP socket communication (please see hogplug_mp.c).
>>>> Following is a sample usage:
>>>>
>>>> Primary:
>>>> started as:
>>>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 0-1
>>>> --log-level=*:debug -- -i --num-procs=2 --proc-id=0
>>>>
>>>> ``
>>>> testpmd> show port summary all
>>>> Number of available ports: 0
>>>> Port MAC Address       Name         Driver         Status Link
>>>>
>>>> testpmd> port attach net_null0
>>>> Attaching a new port...
>>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>>>> vdev_probe_all_drivers(): Search driver to probe device net_null0
>>>> rte_pmd_null_probe(): Initializing pmd_null for net_null0
>>>> rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet 
>>>> copy
>>>> is disabled
>>>> eth_dev_null_create(): Creating null ethdev on numa socket 0
>>>> EAL: request: eal_dev_mp_request
>>>> EAL: msg: bus_vdev_mp
>>>> vdev_action(): send vdev, net_null0
>>>> EAL: sendmsg: bus_vdev_mp
>>>> EAL: reply: bus_vdev_mp
>>>> EAL: msg: eal_dev_mp_request
>>>> Port 0 is attached. Now total ports is 1
>>>> Done
>>>>
>>>> testpmd> show port summary all
>>>> Number of available ports: 1
>>>> Port MAC Address       Name         Driver         Status Link
>>>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down 10 Gbps
>>>>
>>>> testpmd> port detach 0
>>>> Port was not closed
>>>> Removing a device...
>>>> EAL: request: eal_dev_mp_request
>>>> EAL: msg: eal_dev_mp_request
>>>> eth_dev_close(): Closing null ethdev on NUMA socket 0
>>>> Port 0 is closed
>>>> Device is detached
>>>> Now total ports is 0
>>> Please note the log *"Now total ports is 0"*,
>>> which indicates the port number is updated if we detached device in 
>>> this
>>> process.
>>>
>> Yes, as expected.
> Yeah, it is just expected for the process did attatch or dettach 
> operation.
> Because the thread for updating the port number info (in 
> setup_attached_port) and doing probe thread is the same.
> please see attach_port().
> This is ok for testpmd that does not support multiple processes. But 
> testpmd support multiple processes now.
>>
>>
>>>> Done
>>>>
>>>> testpmd> show port summary all
>>>> Number of available ports: 0
>>>> Port MAC Address       Name         Driver         Status Link
>>>> testpmd>
>>>>
>>>> ``
>>>>
>>>> Secondary:
>>>> started as:
>>>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 2-3
>>>> --log-level=*:debug -- -i --num-procs=2 --proc-id=1
>>>>
>>>> ``
>>>> testpmd> show port summary all
>>>> Number of available ports: 0
>>>> Port MAC Address       Name         Driver         Status Link
>>>>
>>>> testpmd> EAL: msg: eal_dev_mp_request
>>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>>>> EAL: request: bus_vdev_mp
>>>> EAL: msg: bus_vdev_mp
>>>> vdev_action(): receive vdev, net_null0
>>>> EAL: msg: bus_vdev_mp
>>>> vdev_scan(): Received 1 vdevs
>>>> vdev_probe_all_drivers(): Search driver to probe device net_null0
>>>> rte_pmd_null_probe(): Initializing pmd_null for net_null0
>>>> EAL: reply: eal_dev_mp_request
>>>>
>>>> testpmd> show port summary all
>>>> Number of available ports: 1
>>>> Port MAC Address       Name         Driver         Status Link
>>>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down 10 Gbps
>>>>
>>>> testpmd> EAL: msg: eal_dev_mp_request
>>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 )
>>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 )
>>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 )
>>>> eth_dev_close(): Closing null ethdev on NUMA socket 4294967295
>>>> Port 0 is closed
>>>> EAL: reply: eal_dev_mp_request
>>> But the port number in this process does not be updated after finishing
>>> the destroy event.
>>> That's the problem.
>>>
>> Please see following command output,
>> it says "Number of available ports: 0",
> Here comes from rte_eth_dev_count_avail().
>> so that shows port number is updated after destroy event, isn't it?
> no.
> There are three global variables,  "fwd_ports_ids[]", "ports[]", 
> "nb_ports" maintained in testpmd.
> The secondary doesn't update these info after receiving a destory or 
> new message in your test.
>>
>>
>>>> testpmd> show port summary all
>>>> Number of available ports: 0
>>>> Port MAC Address       Name         Driver         Status Link
>>>> testpmd>
>>>> ``
>>>>
>>>>
>>>> .
>> .
> .
  

Patch

diff --git a/drivers/bus/auxiliary/auxiliary_common.c b/drivers/bus/auxiliary/auxiliary_common.c
index ff1369353a..13cb3fe0f8 100644
--- a/drivers/bus/auxiliary/auxiliary_common.c
+++ b/drivers/bus/auxiliary/auxiliary_common.c
@@ -132,16 +132,21 @@  rte_auxiliary_probe_one_driver(struct rte_auxiliary_driver *drv,
 	}
 
 	dev->driver = drv;
+	/*
+	 * Reference rte_driver before probing so as to this pointer can
+	 * be used to get driver information in case of segment fault in
+	 * probing callback.
+	 */
+	dev->device.driver = &drv->driver;
 
 	AUXILIARY_LOG(INFO, "Probe auxiliary driver: %s device: %s (NUMA node %i)",
 		      drv->driver.name, dev->name, dev->device.numa_node);
 	ret = drv->probe(drv, dev);
 	if (ret != 0) {
 		dev->driver = NULL;
+		dev->device.driver = NULL;
 		rte_intr_instance_free(dev->intr_handle);
 		dev->intr_handle = NULL;
-	} else {
-		dev->device.driver = &drv->driver;
 	}
 
 	return ret;
diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
index e57159f5d8..f1b817e58c 100644
--- a/drivers/bus/dpaa/dpaa_bus.c
+++ b/drivers/bus/dpaa/dpaa_bus.c
@@ -693,17 +693,22 @@  rte_dpaa_bus_probe(void)
 			    (dev->device.devargs &&
 			     dev->device.devargs->policy == RTE_DEV_BLOCKED))
 				continue;
-
+			/*
+			 * Reference rte_driver before probing so as to this
+			 * pointer can be used to get driver information in case
+			 * of segment fault in probing callback.
+			 */
+			dev->device.driver = &drv->driver;
 			if (probe_all ||
 			    (dev->device.devargs &&
 			     dev->device.devargs->policy == RTE_DEV_ALLOWED)) {
 				ret = drv->probe(drv, dev);
 				if (ret) {
+					dev->device.driver = NULL;
 					DPAA_BUS_ERR("unable to probe:%s",
 						     dev->name);
 				} else {
 					dev->driver = drv;
-					dev->device.driver = &drv->driver;
 				}
 			}
 			break;
diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index 57bfb5111a..4bc0c6d3d4 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -471,15 +471,21 @@  rte_fslmc_probe(void)
 				continue;
 			}
 
+			/*
+			 * Reference rte_driver before probing so as to this
+			 * pointer can be used to get driver information in case
+			 * of segment fault in probing callback.
+			 */
+			dev->device.driver = &drv->driver;
 			if (probe_all ||
 			   (dev->device.devargs &&
 			    dev->device.devargs->policy == RTE_DEV_ALLOWED)) {
 				ret = drv->probe(drv, dev);
 				if (ret) {
+					dev->device.driver = NULL;
 					DPAA2_BUS_ERR("Unable to probe");
 				} else {
 					dev->driver = drv;
-					dev->device.driver = &drv->driver;
 				}
 			}
 			break;
diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
index bb943b58b5..5f23446f41 100644
--- a/drivers/bus/ifpga/ifpga_bus.c
+++ b/drivers/bus/ifpga/ifpga_bus.c
@@ -293,13 +293,19 @@  ifpga_probe_one_driver(struct rte_afu_driver *drv,
 
 	/* reference driver structure */
 	afu_dev->driver = drv;
+	/*
+	 * Reference rte_driver before probing so as to this pointer can
+	 * be used to get driver information in case of segment fault in
+	 * probing callback.
+	 */
+	afu_dev->device.driver = &drv->driver;
 
 	/* call the driver probe() function */
 	ret = drv->probe(afu_dev);
-	if (ret)
+	if (ret) {
 		afu_dev->driver = NULL;
-	else
-		afu_dev->device.driver = &drv->driver;
+		afu_dev->device.driver = NULL;
+	}
 
 	return ret;
 }
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index bc3a7f39fe..efaa1792e9 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -302,6 +302,12 @@  rte_pci_probe_one_driver(struct rte_pci_driver *dr,
 				return ret;
 			}
 		}
+		/*
+		 * Reference rte_driver before probing so as to this pointer can
+		 * be used to get driver information in case of segment fault in
+		 * probing callback.
+		 */
+		dev->device.driver = &dr->driver;
 	}
 
 	RTE_LOG(INFO, EAL, "Probe PCI driver: %s (%x:%x) device: "PCI_PRI_FMT" (socket %i)\n",
@@ -314,6 +320,7 @@  rte_pci_probe_one_driver(struct rte_pci_driver *dr,
 		return ret; /* no rollback if already succeeded earlier */
 	if (ret) {
 		dev->driver = NULL;
+		dev->device.driver = NULL;
 		if ((dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING) &&
 			/* Don't unmap if device is unsupported and
 			 * driver needs mapped resources.
@@ -325,8 +332,6 @@  rte_pci_probe_one_driver(struct rte_pci_driver *dr,
 		dev->vfio_req_intr_handle = NULL;
 		rte_intr_instance_free(dev->intr_handle);
 		dev->intr_handle = NULL;
-	} else {
-		dev->device.driver = &dr->driver;
 	}
 
 	return ret;
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 41bc07dde7..2e3f0f2e12 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -207,9 +207,15 @@  vdev_probe_all_drivers(struct rte_vdev_device *dev)
 		return -1;
 	}
 
+	/*
+	 * Reference rte_driver before probing so as to this pointer can
+	 * be used to get driver information in case of segment fault in
+	 * probing callback.
+	 */
+	dev->device.driver = &driver->driver;
 	ret = driver->probe(dev);
-	if (ret == 0)
-		dev->device.driver = &driver->driver;
+	if (ret != 0)
+		dev->device.driver = NULL;
 	return ret;
 }
 
diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
index 8d32d66504..feb1651984 100644
--- a/drivers/bus/vmbus/vmbus_common.c
+++ b/drivers/bus/vmbus/vmbus_common.c
@@ -110,6 +110,12 @@  vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
 
 	/* reference driver structure */
 	dev->driver = dr;
+	/*
+	 * Reference rte_driver before probing so as to this pointer can
+	 * be used to get driver information in case of segment fault in
+	 * probing callback.
+	 */
+	dev->device.driver = &dr->driver;
 
 	if (dev->device.numa_node < 0 && rte_socket_count() > 1)
 		VMBUS_LOG(INFO, "Device %s is not NUMA-aware", guid);
@@ -119,9 +125,8 @@  vmbus_probe_one_driver(struct rte_vmbus_driver *dr,
 	ret = dr->probe(dr, dev);
 	if (ret) {
 		dev->driver = NULL;
+		dev->device.driver = NULL;
 		rte_vmbus_unmap_device(dev);
-	} else {
-		dev->device.driver = &dr->driver;
 	}
 
 	return ret;