[dpdk-dev] rte_eth_dev_attach returns 0, although device is not attached

Message ID 57A34175.1040204@intel.com (mailing list archive)
State Not Applicable, archived

Commit Message

Ferruh Yigit Aug. 4, 2016, 1:21 p.m. UTC
  On 8/4/2016 12:51 PM, Igor Ryzhov wrote:
> Hello Ferruh,
> 
>> On Aug 4, 2016, at 14:33, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>
>> Hi Igor,
>>
>> On 8/3/2016 5:58 PM, Igor Ryzhov wrote:
>>> Hello.
>>>
>>> Function rte_eth_dev_attach can return false positive result.
>>> It happens because rte_eal_pci_probe_one returns zero if no driver is found for the device:
>>> ret = pci_probe_all_drivers(dev);
>>> if (ret < 0)
>>> 	goto err_return;
>>> return 0;
>>> (pci_probe_all_drivers returns 1 in that case)
>>>
>>> For example, it can be easily reproduced by trying to attach virtio device, managed by kernel driver.
>>
>> You are right, and I was able to reproduce this issue with virtio as you
>> suggest.
>>
>> But I wonder why rte_eth_dev_get_port_by_addr() is not catching this.
>> Perhaps a dev->attached check needs to be added into this function.

With a second check, rte_eth_dev_get_port_by_addr() catches the case where
the driver is missing.

But in the virtio case, the problem is not a missing driver.
The problem is that eth_virtio_dev_init() returns a positive value on failure.

Call stack is:
rte_eal_pci_probe_one
    pci_probe_all_drivers
        rte_eal_pci_probe_one_driver
            rte_eth_dev_init
               eth_virtio_dev_init

So rte_eal_pci_probe_one_driver() also returns a positive value, as if no
driver had been found, and rte_eth_dev_get_port_by_addr() returns a valid
port_id, since rte_eth_dev_init() allocated an eth_dev.

In short, this can be fixed in the virtio PMD instead of in the EAL PCI code.
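
To make the return-value propagation above concrete, here is a minimal sketch. The mock_driver_init() and probe_one_*() functions are hypothetical stand-ins, not the real EAL or virtio code. The only thing taken from this thread is the convention that a negative value is an error, zero is success, and a positive value means the device was not taken over.

```c
#include <assert.h>

/* Hypothetical stand-in for a PMD init function such as the virtio one.
 * Assumed convention: <0 error, 0 success, >0 "device not taken over". */
static int mock_driver_init(int kernel_managed)
{
	assert(kernel_managed == 0 || kernel_managed == 1);
	return kernel_managed ? 1 : 0;
}

/* Mimics the current check: only negative values count as failure. */
static int probe_one_current(int kernel_managed)
{
	int ret = mock_driver_init(kernel_managed);

	if (ret < 0)
		return -1;
	return 0;	/* a positive ret is reported as success */
}

/* Mimics the proposed check: any non-zero value counts as failure. */
static int probe_one_proposed(int kernel_managed)
{
	int ret = mock_driver_init(kernel_managed);

	if (ret)
		return -1;
	return 0;
}
```

With a kernel-managed device, probe_one_current(1) returns 0 while probe_one_proposed(1) returns -1, which is the false positive discussed here in miniature.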

>>
>>>
>>> I think it should be:
>>> ret = pci_probe_all_drivers(dev);
>>> if (ret)
>>> 	goto err_return;
>>> return 0;
>>
>> Your proposal looks good to me. Will you send a patch?
> 

The original code silently ignores the case where the driver is missing for
that dev. Although that is still questionable, I think we can keep it as it is.

> Patch sent.

Sorry for this, but can you please test with the following modification in
virtio:
index 07d6449..c74eeee 100644


> 
>>
>>> Best regards,
>>> Igor
>>>
>>
>
  

Comments

Igor Ryzhov Aug. 4, 2016, 2:54 p.m. UTC | #1
> On Aug 4, 2016, at 16:21, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> 
> Sorry for this, but can you please test with the following modification in
> virtio:
> index 07d6449..c74eeee 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1156,7 +1156,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
>        if (pci_dev) {
>                ret = vtpci_init(pci_dev, hw, &dev_flags);
>                if (ret)
> -                       return ret;
> +                       return -1;
>        }
> 
>        /* Reset the device although not necessary at startup */

I think it's not a good change, because it will break the idea of this patch - http://dpdk.org/browse/dpdk/commit/?id=ac5e1d83

Also, with your patch the application will not start, because rte_eal_pci_probe will fail:

	if (ret < 0)
		rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
			 " cannot be used\n", dev->addr.domain, dev->addr.bus,
			 dev->addr.devid, dev->addr.function);

And now I think that maybe we should change the way rte_eal_pci_probe works.
I think we shouldn't stop the application just because one of the PCI devices is not probed successfully.
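
A rough sketch of that idea, assuming a probe loop that logs a failure and carries on instead of exiting. struct mock_dev and probe_all_keep_going() are hypothetical names for illustration only; the real rte_eal_pci_probe walks the EAL device list and calls the drivers rather than reading a stored result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical device record, not the real struct rte_pci_device. */
struct mock_dev {
	const char *name;
	int probe_result;	/* <0 error, 0 success, >0 not taken over */
};

/* Walk all devices, log each failure, and keep going.
 * Returns the number of devices that failed to probe. */
static int probe_all_keep_going(const struct mock_dev *devs, size_t n)
{
	int failed = 0;
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (devs[i].probe_result < 0) {
			fprintf(stderr, "Requested device %s cannot be used\n",
				devs[i].name);
			failed++;	/* count it, but do not exit */
		}
	}
	return failed;
}
```

The caller can then decide what to do with a non-zero count, instead of the decision being made inside the probe loop.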

  
Ferruh Yigit Aug. 4, 2016, 3:47 p.m. UTC | #2
On 8/4/2016 3:54 PM, Igor Ryzhov wrote:
> 
>> On Aug 4, 2016, at 16:21, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>
>> Sorry for this, but can you please test with the following modification in
>> virtio:
>> index 07d6449..c74eeee 100644
>> --- a/drivers/net/virtio/virtio_ethdev.c
>> +++ b/drivers/net/virtio/virtio_ethdev.c
>> @@ -1156,7 +1156,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
>>        if (pci_dev) {
>>                ret = vtpci_init(pci_dev, hw, &dev_flags);
>>                if (ret)
>> -                       return ret;
>> +                       return -1;
>>        }
>>
>>        /* Reset the device although not necessary at startup */
> 
> I think it's not a good change, because it will break the idea of this
> patch - http://dpdk.org/browse/dpdk/commit/?id=ac5e1d83

Yes, it breaks that one; I wasn't aware of this patch. But its commit log
says: "return 1 to tell the upper layer we don't take over this device.",
and I am not sure the upper layer is designed for this.

> 
> Also, with your patch the application will not start, because
> rte_eal_pci_probe will fail:
> 
> if (ret < 0)
> rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
>  " cannot be used\n", dev->addr.domain, dev->addr.bus,
>  dev->addr.devid, dev->addr.function);

Yes, it fails, and this looks like intended behavior. This failure is
correct according to the code.

> 
> And now I think that maybe we should change the way rte_eal_pci_probe works.
> I think we shouldn't stop the application if just one of PCI devices is
> not probed successfully.

Agreed. Overall rte_exit() usage has already been discussed a few times.

I think the best option is:
- don't exit the app if rte_eal_pci_probe() fails, only print an error.
- have eth_virtio_dev_init() return a negative error value for all error
cases (including a device managed by the kernel)

Or perhaps the RTE_KDRV_UNKNOWN check can be moved from the virtio PMD to a
higher level and done for all devices. For example,
pci_probe_one_driver() could fail if the device's kernel driver is
RTE_KDRV_UNKNOWN.

Any comments?
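
A minimal sketch of the second alternative, assuming the kernel-driver check sits in the generic probe path and runs before any PMD init. Everything here (enum mock_kdrv, struct mock_pci_dev, mock_probe_one_driver(), mock_pmd_init()) is a hypothetical stand-in, not the real EAL types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the kernel-driver kinds; not the real enum. */
enum mock_kdrv { MOCK_KDRV_UNKNOWN = 0, MOCK_KDRV_UIO, MOCK_KDRV_VFIO };

struct mock_pci_dev {
	enum mock_kdrv kdrv;
};

typedef int (*mock_init_fn)(struct mock_pci_dev *dev);

/* Counts how often the PMD init stand-in was reached. */
static int init_calls;

/* Stand-in for a PMD init function that always succeeds. */
static int mock_pmd_init(struct mock_pci_dev *dev)
{
	(void)dev;
	init_calls++;
	return 0;
}

/* Generic probe step: reject a device with an unknown kernel driver
 * before calling the PMD, and map any non-zero init result to -1. */
static int mock_probe_one_driver(struct mock_pci_dev *dev, mock_init_fn init)
{
	assert(dev != NULL && init != NULL);
	if (dev->kdrv == MOCK_KDRV_UNKNOWN)
		return -1;
	return init(dev) ? -1 : 0;
}
```

With the check at this level, the PMD no longer needs a positive "not taken over" return value for this case, because its init function is never reached.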


  
Bruce Richardson Aug. 5, 2016, 12:29 p.m. UTC | #3
On Thu, Aug 04, 2016 at 04:47:25PM +0100, Ferruh Yigit wrote:
> On 8/4/2016 3:54 PM, Igor Ryzhov wrote:
> > And now I think that maybe we should change the way rte_eal_pci_probe works.
> > I think we shouldn't stop the application just because one of the PCI
> > devices is not probed successfully.
> 
> Agreed. Overall rte_exit() usage has already been discussed a few times.
> 
> I think the best option is:
> - don't exit the app if rte_eal_pci_probe() fails, only print an error.

Whether or not the PCI probe exits the app, I think it should signal
a serious error if the probe fails and a device was explicitly whitelisted on
the command line. Given that the user explicitly requested the device, a
failure to use it is probably a problem that the user needs to fix before
running the app.
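
One possible reading of that policy, as a sketch only. enum probe_action and classify_probe() are hypothetical names, and treating a positive "not taken over" result as fatal for a whitelisted device is an assumption made for this example, not something settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* What the caller should do with a probe result (hypothetical). */
enum probe_action { PROBE_OK, PROBE_WARN, PROBE_FATAL };

/* ret follows the convention <0 error, 0 success, >0 not taken over.
 * whitelisted is true when the user named the device explicitly. */
static enum probe_action classify_probe(int ret, bool whitelisted)
{
	if (ret == 0)
		return PROBE_OK;
	if (whitelisted)
		return PROBE_FATAL;	/* user asked for it and cannot have it */
	return ret < 0 ? PROBE_WARN : PROBE_OK;
}
```

Whether PROBE_FATAL ends in rte_exit() or in an error returned to the application is a separate choice. The sketch only separates "serious" from "worth a log line".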

/Bruce
  

Patch

--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1156,7 +1156,7 @@  eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
        if (pci_dev) {
                ret = vtpci_init(pci_dev, hw, &dev_flags);
                if (ret)
-                       return ret;
+                       return -1;
        }

        /* Reset the device although not necessary at startup */