[v5,3/3] PCI: don't use vfio ioctl call to access PIO resource

Message ID 1603381885-88819-4-git-send-email-huawei.xhw@alibaba-inc.com (mailing list archive)
State Rejected, archived
Delegated to: Maxime Coquelin
Series: support both PIO and MMIO BAR for virtio PMD

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/travis-robot warning Travis build: failed
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

谢华伟(此时此刻) Oct. 22, 2020, 3:51 p.m. UTC
  From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>

VFIO should use the same way to map/read/write PORT IO as UIO, for
virtio PMD.

Signed-off-by: huawei.xhw <huawei.xhw@alibaba-inc.com>
---
 drivers/bus/pci/linux/pci.c     | 8 ++++----
 drivers/bus/pci/linux/pci_uio.c | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)
  

Comments

Maxime Coquelin Jan. 12, 2021, 9:37 a.m. UTC | #1
bus/pci: ...

On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
> 
> VFIO should use the same way to map/read/write PORT IO as UIO, for
> virtio PMD.

Please provide more details in the commit message on why the way VFIO
works today is wrong (The cover letter is lost once applied).

> Signed-off-by: huawei.xhw <huawei.xhw@alibaba-inc.com>

Same comment about name format as on previous patches.

> ---
>  drivers/bus/pci/linux/pci.c     | 8 ++++----
>  drivers/bus/pci/linux/pci_uio.c | 4 +++-
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
> index 0dc99e9..2ed9f2b 100644
> --- a/drivers/bus/pci/linux/pci.c
> +++ b/drivers/bus/pci/linux/pci.c
> @@ -687,7 +687,7 @@ int rte_pci_write_config(const struct rte_pci_device *device,
>  #ifdef VFIO_PRESENT
>  	case RTE_PCI_KDRV_VFIO:
>  		if (pci_vfio_is_enabled())
> -			ret = pci_vfio_ioport_map(dev, bar, p);
> +			ret = pci_uio_ioport_map(dev, bar, p);

Doesn't it create a regression with regards to needed capabilities?
My understanding is that before this patch we don't need to call iopl(),
whereas once applied it is required, correct?

Regards,
Maxime
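
The one-line change under discussion can be sketched as a dispatch on the bound kernel driver; the enum values and stub map functions below are simplified stand-ins for illustration, not the actual DPDK symbols:

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins for DPDK's PCI driver kinds and the two ioport
 * mapping paths; names are illustrative, not the real DPDK code. */
enum pci_kdrv { KDRV_IGB_UIO, KDRV_UIO_GENERIC, KDRV_VFIO };

static const char *uio_ioport_map(void)  { return "uio";  }  /* sysfs/portio + inb/outb */
static const char *vfio_ioport_map(void) { return "vfio"; }  /* pread/pwrite on device fd */

/* Before the patch the VFIO case takes the VFIO path; after it, the VFIO
 * case is routed through the UIO-style mapping, as the hunk shows. */
static const char *ioport_map(enum pci_kdrv kdrv, int patch_applied)
{
    switch (kdrv) {
    case KDRV_IGB_UIO:
    case KDRV_UIO_GENERIC:
        return uio_ioport_map();
    case KDRV_VFIO:
        return patch_applied ? uio_ioport_map() : vfio_ioport_map();
    }
    return NULL;
}
```

Maxime's question is exactly about this rerouting: the UIO path touches ports directly, hence the capability concern below.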
  
Maxime Coquelin Jan. 12, 2021, 4:58 p.m. UTC | #2
On 1/12/21 10:37 AM, Maxime Coquelin wrote:
> bus/pci: ...
> 
> On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>
>> VFIO should use the same way to map/read/write PORT IO as UIO, for
>> virtio PMD.
> 
> Please provide more details in the commit message on why the way VFIO
> works today is wrong (The cover letter is lost once applied).
> 
>> Signed-off-by: huawei.xhw <huawei.xhw@alibaba-inc.com>
> 
> Same comment about name format as on previous patches.
> 
>> ---
>>  drivers/bus/pci/linux/pci.c     | 8 ++++----
>>  drivers/bus/pci/linux/pci_uio.c | 4 +++-
>>  2 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
>> index 0dc99e9..2ed9f2b 100644
>> --- a/drivers/bus/pci/linux/pci.c
>> +++ b/drivers/bus/pci/linux/pci.c
>> @@ -687,7 +687,7 @@ int rte_pci_write_config(const struct rte_pci_device *device,
>>  #ifdef VFIO_PRESENT
>>  	case RTE_PCI_KDRV_VFIO:
>>  		if (pci_vfio_is_enabled())
>> -			ret = pci_vfio_ioport_map(dev, bar, p);
>> +			ret = pci_uio_ioport_map(dev, bar, p);
> 
> Doesn't it create a regression with regards to needed capabilities?
> My understanding is that before this patch we don't need to call iopl(),
> whereas once applied it is required, correct?

I did some testing today, and think it is not a regression with para-
virtualized Virtio devices.

Indeed, I thought it would be a regression with Legacy devices when
IOMMU is enabled and the program is run as non-root (IOMMU enabled
just to support IOVA as VA mode). But it turns out para-virtualized
Virtio legacy device and vIOMMU enabled is not a supported configuration
by QEMU.

Note that when noiommu mode is enabled, the app needs cap_sys_rawio, the
same as with iopl(). No regression in this case either.

That said, with a real (non para-virtualized) Virtio device using PIO
like yours, doesn't your patch introduce a restriction that your device
will require cap_sys_rawio whereas it would otherwise not be needed?

Thanks,
Maxime

> Regards,
> Maxime
>
  
谢华伟(此时此刻) Jan. 20, 2021, 2:54 p.m. UTC | #3
On 2021/1/13 0:58, Maxime Coquelin wrote:
>
> On 1/12/21 10:37 AM, Maxime Coquelin wrote:
>> bus/pci: ...
>>
>> On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
>>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>>
>>> VFIO should use the same way to map/read/write PORT IO as UIO, for
>>> virtio PMD.
>> Please provide more details in the commit message on why the way VFIO
>> works today is wrong (The cover letter is lost once applied).
ok
>>
>>> Signed-off-by: huawei.xhw <huawei.xhw@alibaba-inc.com>
>> Same comment about name format as on previous patches.
>>
>>> ---
>>>   drivers/bus/pci/linux/pci.c     | 8 ++++----
>>>   drivers/bus/pci/linux/pci_uio.c | 4 +++-
>>>   2 files changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
>>> index 0dc99e9..2ed9f2b 100644
>>> --- a/drivers/bus/pci/linux/pci.c
>>> +++ b/drivers/bus/pci/linux/pci.c
>>> @@ -687,7 +687,7 @@ int rte_pci_write_config(const struct rte_pci_device *device,
>>>   #ifdef VFIO_PRESENT
>>>   	case RTE_PCI_KDRV_VFIO:
>>>   		if (pci_vfio_is_enabled())
>>> -			ret = pci_vfio_ioport_map(dev, bar, p);
>>> +			ret = pci_uio_ioport_map(dev, bar, p);
>> Doesn't it create a regression with regards to needed capabilities?
>> My understanding is that before this patch we don't need to call iopl(),
>> whereas once applied it is required, correct?
> I did some testing today, and think it is not a regression with para-
> virtualized Virtio devices.
>
> Indeed, I thought it would be a regression with Legacy devices when
> IOMMU is enabled and the program is run as non-root (IOMMU enabled
> just to support IOVA as VA mode). But it turns out para-virtualized
> Virtio legacy device and vIOMMU enabled is not a supported configuration
> by QEMU.
>
> Note that when noiommu mode is enabled, the app needs cap_sys_rawio, so
> same as iopl(). No regression in this case too.
>
> That said, with real (non para-virtualized) Virtio device using PIO like
> yours, doesn't your patch introduce a restriction for your device that
> it will require cap_sys_rawio whereas it would not be needed?

I don't see the regression issue.

With a real virtio device (hardware implemented), if it is using MMIO,
no cap_sys_rawio is required.

If it is using PIO, iopl() is always required.


> Thanks,
> Maxime
>
>> Regards,
>> Maxime
>>
  
Maxime Coquelin Jan. 21, 2021, 8:29 a.m. UTC | #4
On 1/20/21 3:54 PM, 谢华伟(此时此刻) wrote:
> 
> On 2021/1/13 0:58, Maxime Coquelin wrote:
>>
>> On 1/12/21 10:37 AM, Maxime Coquelin wrote:
>>> bus/pci: ...
>>>
>>> On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
>>>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>>>
>>>> VFIO should use the same way to map/read/write PORT IO as UIO, for
>>>> virtio PMD.
>>> Please provide more details in the commit message on why the way VFIO
>>> works today is wrong (The cover letter is lost once applied).
> ok
>>>
>>>> Signed-off-by: huawei.xhw <huawei.xhw@alibaba-inc.com>
>>> Same comment about name format as on previous patches.
>>>
>>>> ---
>>>>   drivers/bus/pci/linux/pci.c     | 8 ++++----
>>>>   drivers/bus/pci/linux/pci_uio.c | 4 +++-
>>>>   2 files changed, 7 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
>>>> index 0dc99e9..2ed9f2b 100644
>>>> --- a/drivers/bus/pci/linux/pci.c
>>>> +++ b/drivers/bus/pci/linux/pci.c
>>>> @@ -687,7 +687,7 @@ int rte_pci_write_config(const struct
>>>> rte_pci_device *device,
>>>>   #ifdef VFIO_PRESENT
>>>>       case RTE_PCI_KDRV_VFIO:
>>>>           if (pci_vfio_is_enabled())
>>>> -            ret = pci_vfio_ioport_map(dev, bar, p);
>>>> +            ret = pci_uio_ioport_map(dev, bar, p);
>>> Doesn't it create a regression with regards to needed capabilities?
>>> My understanding is that before this patch we don't need to call iopl(),
>>> whereas once applied it is required, correct?
>> I did some testing today, and think it is not a regression with para-
>> virtualized Virtio devices.
>>
>> Indeed, I thought it would be a regression with Legacy devices when
>> IOMMU is enabled and the program is run as non-root (IOMMU enabled
>> just to support IOVA as VA mode). But it turns out para-virtualized
>> Virtio legacy device and vIOMMU enabled is not a supported configuration
>> by QEMU.
>>
>> Note that when noiommu mode is enabled, the app needs cap_sys_rawio, so
>> same as iopl(). No regression in this case too.
>>
>> That said, with real (non para-virtualized) Virtio device using PIO like
>> yours, doesn't your patch introduce a restriction for your device that
>> it will require cap_sys_rawio whereas it would not be needed?
> 
> I don't catch the regression issue.
> 
> With real virtio device(hardware implemented), if it is using MMIO, no
> cap_sys_rawio is required.
> 
> If it is using PIO, iopl is required always.

My understanding of the Kernel VFIO driver is that cap_sys_rawio is only
necessary in noiommu mode, i.e. when VFIO is loaded with
enable_unsafe_noiommu parameter set. The doc for this parameters seems
to validate my understanding of the code:
"
MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU
mode.  This mode provides no device isolation, no DMA translation, no
host kernel protection, cannot be used for device assignment to virtual
machines, requires RAWIO permissions, and will taint the kernel.  If you
do not know what this is for, step away. (default: false)");
"

I think that using inb/outb in the case of VFIO with IOMMU enabled won't
work without cap_sys_rawio, and using it in the case of VFIO with IOMMU
disabled just bypasses VFIO and so is not correct.

In my opinion, what we should do is to add something like this in the
DPDK documentation:

 - MMIO BAR: VFIO with IOMMU enabled is recommended. Performance is
equivalent to IGB UIO or VFIO with NOIOMMU, and VFIO with IOMMU is
preferred for security reasons.
 - PIO BAR: VFIO with IOMMU enabled is recommended for security reasons,
providing proper isolation and not requiring cap_sys_rawio. However, use
of the IOMMU is not always possible (e.g. with a para-virtualized
Virtio-net legacy device). Also, using VFIO for PIO BAR accesses has an
impact on performance, as it uses pread/pwrite syscalls, whereas UIO
drivers use inb/outb. If security is not a concern or an IOMMU is not
available, one might consider using a UIO driver in this case for
performance reasons.

What do you think?

Thanks,
Maxime

> 
>> Thanks,
>> Maxime
>>
>>> Regards,
>>> Maxime
>>>
>
  
谢华伟(此时此刻) Jan. 21, 2021, 2:57 p.m. UTC | #5
On 2021/1/21 16:29, Maxime Coquelin wrote:
>
> On 1/20/21 3:54 PM, 谢华伟(此时此刻) wrote:
>> On 2021/1/13 0:58, Maxime Coquelin wrote:
>>> On 1/12/21 10:37 AM, Maxime Coquelin wrote:
>>>> bus/pci: ...
>>>>
>>>> On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
>>>>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>>>>
>>>>> VFIO should use the same way to map/read/write PORT IO as UIO, for
>>>>> virtio PMD.
>>>> Please provide more details in the commit message on why the way VFIO
>>>> works today is wrong (The cover letter is lost once applied).
>> ok
>>>>> Signed-off-by: huawei.xhw <huawei.xhw@alibaba-inc.com>
>>>> Same comment about name format as on previous patches.
>>>>
>>>>> ---
>>>>>    drivers/bus/pci/linux/pci.c     | 8 ++++----
>>>>>    drivers/bus/pci/linux/pci_uio.c | 4 +++-
>>>>>    2 files changed, 7 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
>>>>> index 0dc99e9..2ed9f2b 100644
>>>>> --- a/drivers/bus/pci/linux/pci.c
>>>>> +++ b/drivers/bus/pci/linux/pci.c
>>>>> @@ -687,7 +687,7 @@ int rte_pci_write_config(const struct
>>>>> rte_pci_device *device,
>>>>>    #ifdef VFIO_PRESENT
>>>>>        case RTE_PCI_KDRV_VFIO:
>>>>>            if (pci_vfio_is_enabled())
>>>>> -            ret = pci_vfio_ioport_map(dev, bar, p);
>>>>> +            ret = pci_uio_ioport_map(dev, bar, p);
>>>> Doesn't it create a regression with regards to needed capabilities?
>>>> My understanding is that before this patch we don't need to call iopl(),
>>>> whereas once applied it is required, correct?
>>> I did some testing today, and think it is not a regression with para-
>>> virtualized Virtio devices.
>>>
>>> Indeed, I thought it would be a regression with Legacy devices when
>>> IOMMU is enabled and the program is run as non-root (IOMMU enabled
>>> just to support IOVA as VA mode). But it turns out para-virtualized
>>> Virtio legacy device and vIOMMU enabled is not a supported configuration
>>> by QEMU.
>>>
>>> Note that when noiommu mode is enabled, the app needs cap_sys_rawio, so
>>> same as iopl(). No regression in this case too.
>>>
>>> That said, with real (non para-virtualized) Virtio device using PIO like
>>> yours, doesn't your patch introduce a restriction for your device that
>>> it will require cap_sys_rawio whereas it would not be needed?
>> I don't catch the regression issue.
>>
>> With real virtio device(hardware implemented), if it is using MMIO, no
>> cap_sys_rawio is required.
>>
>> If it is using PIO, iopl is required always.
> My understanding of the Kernel VFIO driver is that cap_sys_rawio is only
> necessary in noiommu mode, i.e. when VFIO is loaded with
> enable_unsafe_noiommu parameter set. The doc for this parameters seems
> to validate my understanding of the code:
> "
> MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU
> mode.  This mode provides no device isolation, no DMA translation, no
> host kernel protection, cannot be used for device assignment to virtual
> machines, requires RAWIO permissions, and will taint the kernel.  If you
> do not know what this is for, step away. (default: false)");
> "
>
> I think that using inb/outb in the case of VFIO with IOMMU enabled won't
> work without cap_sys_rawio, and using it in the case of VFIO with IOMMU
> disabled just bypasses VFIO and so is not correct.

I get your concern.

PIO bar:

     HW virtio on HW machine: does any vendor implement hardware virtio
using a PIO bar? I think this isn't right. And I doubt if VFIO doesn't
check rawio permission in the syscall in this case.

     Para virtio:  you have no choice to enable unsafe no-iommu mode.  
You must have RAWIO permission.

So with a PIO bar, the regression doesn't exist in the real world.

Btw, our virtio device basically uses an MMIO bar, either on a hardware
machine or in a passed-through virtual machine.


Do you mean we apply or abandon patch 3? Either is OK with me. The
first priority to me is to enable MMIO bar support.


> In my opinion, what we should do is to add something like this in the
> DPDK documentation:
>
>   - MMIO BAR: VFIO with IOMMU enabled recommended. Equivalent performance
> as with IGB UIO or VFIO with NOIOMMU. VFIO with IOMMU is recommended for
> security reasons.
>   - PIO BAR: VFIO with IOMMU enabled is recommended for security reasons,
> providing proper isolation and not requiring cap_sys_rawio. However, use
> of IOMMU is not always possible in some cases (e.g. para-virtualized
> Virtio-net legacy device). Also, performance of using VFIO for PIO BARs
> accesses has an impact on performance as it uses pread/pwrite syscalls,
> whereas UIO drivers use inb/outb. If security is not a concern or IOMMU
> is not available, one might consider using UIO driver in this case for
> performance reasons.
>
> What do you think?
>>> Thanks,
>>> Maxime
>>>
>>>> Regards,
>>>> Maxime
>>>>
  
谢华伟(此时此刻) Jan. 21, 2021, 3 p.m. UTC | #6
>> "
>>
>> I think that using inb/outb in the case of VFIO with IOMMU enabled won't
>> work without cap_sys_rawio, and using it in the case of VFIO with IOMMU
>> disabled just bypasses VFIO and so is not correct.
>
> Get your concern.
>
> PIO bar:
>
>     HW virtio on HW machine: any vendor implements hardware virtio 
> using PIO bar? I think this isn't right. And I doubt if vfio doesn't
> check rawio permission in the syscall in this case.
>
>     Para virtio:  you have no choice to enable unsafe no-iommu mode.  
> You must have RAWIO permission.
Sorry. typo.  "you have no choice but to enable unsafe no-iommu mode. "
>
> so with PIO bar, the regression doesn't exist in real world.
>
> Btw, our virtio device is basically MMIO bar, either in hardware 
> machine or in pass-throughed virtual machine.
>
>
> Do you mean we apply or abandon patch 3? I am both OK. The first 
> priority to me is to enable MMIO bar support.
>
>
  
Maxime Coquelin Jan. 21, 2021, 3:38 p.m. UTC | #7
On 1/21/21 3:57 PM, 谢华伟(此时此刻) wrote:
> 
> On 2021/1/21 16:29, Maxime Coquelin wrote:
>>
>> On 1/20/21 3:54 PM, 谢华伟(此时此刻) wrote:
>>> On 2021/1/13 0:58, Maxime Coquelin wrote:
>>>> On 1/12/21 10:37 AM, Maxime Coquelin wrote:
>>>>> bus/pci: ...
>>>>>
>>>>> On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
>>>>>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>>>>>
>>>>>> VFIO should use the same way to map/read/write PORT IO as UIO, for
>>>>>> virtio PMD.
>>>>> Please provide more details in the commit message on why the way VFIO
>>>>> works today is wrong (The cover letter is lost once applied).
>>> ok
>>>>>> Signed-off-by: huawei.xhw <huawei.xhw@alibaba-inc.com>
>>>>> Same comment about name format as on previous patches.
>>>>>
>>>>>> ---
>>>>>>    drivers/bus/pci/linux/pci.c     | 8 ++++----
>>>>>>    drivers/bus/pci/linux/pci_uio.c | 4 +++-
>>>>>>    2 files changed, 7 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/bus/pci/linux/pci.c
>>>>>> b/drivers/bus/pci/linux/pci.c
>>>>>> index 0dc99e9..2ed9f2b 100644
>>>>>> --- a/drivers/bus/pci/linux/pci.c
>>>>>> +++ b/drivers/bus/pci/linux/pci.c
>>>>>> @@ -687,7 +687,7 @@ int rte_pci_write_config(const struct
>>>>>> rte_pci_device *device,
>>>>>>    #ifdef VFIO_PRESENT
>>>>>>        case RTE_PCI_KDRV_VFIO:
>>>>>>            if (pci_vfio_is_enabled())
>>>>>> -            ret = pci_vfio_ioport_map(dev, bar, p);
>>>>>> +            ret = pci_uio_ioport_map(dev, bar, p);
>>>>> Doesn't it create a regression with regards to needed capabilities?
>>>>> My understanding is that before this patch we don't need to call
>>>>> iopl(),
>>>>> whereas once applied it is required, correct?
>>>> I did some testing today, and think it is not a regression with para-
>>>> virtualized Virtio devices.
>>>>
>>>> Indeed, I thought it would be a regression with Legacy devices when
>>>> IOMMU is enabled and the program is run as non-root (IOMMU enabled
>>>> just to support IOVA as VA mode). But it turns out para-virtualized
>>>> Virtio legacy device and vIOMMU enabled is not a supported
>>>> configuration
>>>> by QEMU.
>>>>
>>>> Note that when noiommu mode is enabled, the app needs cap_sys_rawio, so
>>>> same as iopl(). No regression in this case too.
>>>>
>>>> That said, with real (non para-virtualized) Virtio device using PIO
>>>> like
>>>> yours, doesn't your patch introduce a restriction for your device that
>>>> it will require cap_sys_rawio whereas it would not be needed?
>>> I don't catch the regression issue.
>>>
>>> With real virtio device(hardware implemented), if it is using MMIO, no
>>> cap_sys_rawio is required.
>>>
>>> If it is using PIO, iopl is required always.
>> My understanding of the Kernel VFIO driver is that cap_sys_rawio is only
>> necessary in noiommu mode, i.e. when VFIO is loaded with
>> enable_unsafe_noiommu parameter set. The doc for this parameters seems
>> to validate my understanding of the code:
>> "
>> MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU
>> mode.  This mode provides no device isolation, no DMA translation, no
>> host kernel protection, cannot be used for device assignment to virtual
>> machines, requires RAWIO permissions, and will taint the kernel.  If you
>> do not know what this is for, step away. (default: false)");
>> "
>>
>> I think that using inb/outb in the case of VFIO with IOMMU enabled won't
>> work without cap_sys_rawio, and using it in the case of VFIO with IOMMU
>> disabled just bypasses VFIO and so is not correct.
> 
> Get your concern.
> 
> PIO bar:
> 
>     HW virtio on HW machine: any vendor implements hardware virtio using
> PIO bar? I think this isn't right. And I doubt if vfio doesn't check
> rawio permission in the syscall in this case.

I checked the VFIO code, and it only checks for rawio permission if
noiommu mode is enabled.

>     Para virtio:  you have no choice to enable unsafe no-iommu mode. 
> You must have RAWIO permission.
> 
> so with PIO bar, the regression doesn't exist in real world.
>
> 
> Btw, our virtio device is basically MMIO bar, either in hardware machine
> or in pass-throughed virtual machine.

OK, that thing was not clear to me.

> 
> Do you mean we apply or abandon patch 3? I am both OK. The first
> priority to me is to enable MMIO bar support.

OK, so yes, I think we should abandon patch 2 and patch 3.
For patch 1, it looks valid to me, but I'll let Ferruh decide.

For your device, if my understanding is correct, what we need to do is
to support MMIO for legacy devices. Correct?

If so, the change should be in virtio_pci.c. In vtpci_init(), after
modern detection has failed, we should check whether the BAR is PIO or
MMIO based on the flag. The result can be saved in struct virtio_pci_dev.


We would introduce new wrappers like vtpci_legacy_read,
vtpci_legacy_write that would either call rte_pci_ioport_read,
rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
of MMIO.
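
A minimal sketch of this wrapper idea; the field names and stub backends are assumptions for illustration (the real code would keep a PIO/MMIO flag in struct virtio_pci_dev and call rte_pci_ioport_read() or rte_read32() accordingly):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag saved at init time, e.g. in struct virtio_pci_dev. */
struct virtio_pci_dev {
    int bar_is_pio;     /* set once in vtpci_init() after modern detection fails */
    uint32_t fake_reg;  /* stands in for the real BAR register */
};

/* Stub backends tagging the value so the dispatch is observable. */
static uint32_t pio_read32(const struct virtio_pci_dev *dev)  { return dev->fake_reg | 0x1; }
static uint32_t mmio_read32(const struct virtio_pci_dev *dev) { return dev->fake_reg | 0x2; }

/* The wrapper hides PIO vs MMIO from the rest of the virtio PMD. */
static uint32_t vtpci_legacy_read32(const struct virtio_pci_dev *dev)
{
    return dev->bar_is_pio ? pio_read32(dev) : mmio_read32(dev);
}
```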

It is not too late for this release, as the change will not be that
intrusive. But if you prepare such a patch, please base it on top of my
virtio rework series. To make it easier for you, I added it to the dpdk-
next-virtio tree:
https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2

Thanks,
Maxime

> 
>> In my opinion, what we should do is to add something like this in the
>> DPDK documentation:
>>
>>   - MMIO BAR: VFIO with IOMMU enabled recommended. Equivalent performance
>> as with IGB UIO or VFIO with NOIOMMU. VFIO with IOMMU is recommended for
>> security reasons.
>>   - PIO BAR: VFIO with IOMMU enabled is recommended for security reasons,
>> providing proper isolation and not requiring cap_sys_rawio. However, use
>> of IOMMU is not always possible in some cases (e.g. para-virtualized
>> Virtio-net legacy device). Also, performance of using VFIO for PIO BARs
>> accesses has an impact on performance as it uses pread/pwrite syscalls,
>> whereas UIO drivers use inb/outb. If security is not a concern or IOMMU
>> is not available, one might consider using UIO driver in this case for
>> performance reasons.
>>
>> What do you think?
>>>> Thanks,
>>>> Maxime
>>>>
>>>>> Regards,
>>>>> Maxime
>>>>>
>
  
谢华伟(此时此刻) Jan. 22, 2021, 7:25 a.m. UTC | #8
On 2021/1/21 23:38, Maxime Coquelin wrote:
>> Do you mean we apply or abandon patch 3? I am both OK. The first
>> priority to me is to enable MMIO bar support.
> OK, so yes, I think we should abandon patch 2 and patch 3.
> For patch 1, it looks valid to me, but I'll let Ferruh decide.
>
> For your device, if my understanding is correct, what we need to do is
> to support MMIO for legacy devices. Correct?
yes.
> If so, the change should be in virtio_pci.c. In vtpci_init(), after
> modern detection has failed, we should check whether the BAR is PIO or
> MMIO based on the flag. The result can be saved in struct virtio_pci_dev.
>
>
> We would introduce new wrappers like vtpci_legacy_read,
> vtpci_legacy_write that would either call rte_pci_ioport_read,
> rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
> of MMIO.

There are two choices.

1, apply patch 2.

     IO/MMIO ports are mapped and accessed using the same API. The kernel
does it the same way, as in the following:

             io_addr = pci_iomap

                 get PIO directly or ioremap

             iowrite16/32(val, io_addr + offset)

I think applying patch 2 is the correct choice. It is a fix. The driver
had better not know whether the bar is PIO or MMIO. The "ioport" in the
ioport_xx API means IO, not PIO.

Btw, it only affects the virtio PMD, so it is not that intrusive.

  2, virtio specific change to enable MMIO support.

Comparing with choice 1, I feel it is not that clean and pretty.
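
The kernel pattern referenced in choice 1 (pci_iomap() returning a cookie that iowrite32() dispatches on) can be illustrated with a small sketch; the cookie threshold and fake port table here are assumptions for the sketch, not the kernel's actual implementation:

```c
#include <assert.h>
#include <stdint.h>

/* The mapped cookie either encodes a port number (PIO) or is a real
 * pointer (MMIO); the accessor dispatches on its range. */
#define PIO_COOKIE_LIMIT ((uintptr_t)0x10000)  /* x86 has 64K I/O ports */

static uint32_t fake_port_space[16];           /* stand-in for outl() */

static void unified_write32(uint32_t val, void *io_addr)
{
    uintptr_t a = (uintptr_t)io_addr;
    if (a < PIO_COOKIE_LIMIT)
        fake_port_space[a % 16] = val;         /* real code: outl(val, a) */
    else
        *(volatile uint32_t *)io_addr = val;   /* real code: MMIO store */
}
```

This is what lets callers stay oblivious to whether the BAR is PIO or MMIO, which is the point huawei makes about the ioport_xx API.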

>
> It is not too late for this release, as the change will not be that
> intrusive. But if you prepare such patch, please base it on top of my
> virtio rework series; To make it easier to you, I added it to the dpdk-
> next-virtio tree:
> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>
> Thanks,
> Maxime
>
  
Maxime Coquelin Jan. 26, 2021, 10:44 a.m. UTC | #9
On 1/22/21 8:25 AM, 谢华伟(此时此刻) wrote:
> 
> On 2021/1/21 23:38, Maxime Coquelin wrote:
>>> Do you mean we apply or abandon patch 3? I am both OK. The first
>>> priority to me is to enable MMIO bar support.
>> OK, so yes, I think we should abandon patch 2 and patch 3.
>> For patch 1, it looks valid to me, but I'll let Ferruh decide.
>>
>> For your device, if my understanding is correct, what we need to do is
>> to support MMIO for legacy devices. Correct?
> yes.
>> If so, the change should be in virtio_pci.c. In vtpci_init(), after
>> modern detection has failed, we should check whether the BAR is PIO or
>> MMIO based on the flag. The result can be saved in struct virtio_pci_dev.
>>
>>
>> We would introduce new wrappers like vtpci_legacy_read,
>> vtpci_legacy_write that would either call rte_pci_ioport_read,
>> rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
>> of MMIO.
> 
> There are two choices.
> 
> 1, apply patch 2.
> 
>     IO/MMIO port are mapped and accessed using the same API. Kernel is
> doing in the same way like the following.
> 
>             io_addr = pci_iomap
> 
>                 get PIO directly or ioremap
> 
>             iowrite16/32(val, io_addr + offset)
> 
> I think applying patch 2 is a correct choice. It is a fix. Driver had
> better not know if bar is PIO or MMIO.  ioport in ioport_xx API means
> IO, not PIO.
> 
> Btw, it only affects virtio PMD,  not that intrusive.
> 
>  2, virtio specific change to enable MMIO support.
> 
> Comparing with choice 1, i feels it is not that clean and pretty.

OK, that makes sense. I am OK with keeping patch 2, but would like
Ferruh's ACK.

Could you please post v6?

Thanks,
Maxime

>>
>> It is not too late for this release, as the change will not be that
>> intrusive. But if you prepare such patch, please base it on top of my
>> virtio rework series; To make it easier to you, I added it to the dpdk-
>> next-virtio tree:
>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>
>> Thanks,
>> Maxime
>>
>
  
谢华伟(此时此刻) Jan. 26, 2021, 12:30 p.m. UTC | #10
On 2021/1/22 15:25, chris wrote:
>
> On 2021/1/21 23:38, Maxime Coquelin wrote:
>>> Do you mean we apply or abandon patch 3? I am both OK. The first
>>> priority to me is to enable MMIO bar support.
>> OK, so yes, I think we should abandon patch 2 and patch 3.
>> For patch 1, it looks valid to me, but I'll let Ferruh decide.
>>
>> For your device, if my understanding is correct, what we need to do is
>> to support MMIO for legacy devices. Correct?
> yes.
>> If so, the change should be in virtio_pci.c. In vtpci_init(), after
>> modern detection has failed, we should check whether the BAR is PIO or
>> MMIO based on the flag. The result can be saved in struct virtio_pci_dev.
>>
>>
>> We would introduce new wrappers like vtpci_legacy_read,
>> vtpci_legacy_write that would either call rte_pci_ioport_read,
>> rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
>> of MMIO.
>
> There are two choices.
>
> 1, apply patch 2.
>
>     IO/MMIO port are mapped and accessed using the same API. Kernel is 
> doing in the same way like the following.
>
>             io_addr = pci_iomap
>
>                 get PIO directly or ioremap
>
>             iowrite16/32(val, io_addr + offset)
>
> I think applying patch 2 is a correct choice. It is a fix. Driver had 
> better not know if bar is PIO or MMIO.  ioport in ioport_xx API means 
> IO, not PIO.
>
> Btw, it only affects virtio PMD,  not that intrusive.
>
>  2, virtio specific change to enable MMIO support.
>
> Comparing with choice 1, I feel it is not that clean and pretty.
>
>>
>> It is not too late for this release, as the change will not be that
>> intrusive. But if you prepare such patch, please base it on top of my
>> virtio rework series; To make it easier to you, I added it to the dpdk-
>> next-virtio tree:
>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>
Hi Maxime:

Decision on patch 2?

I still think current patch 2 is cleaner.

Thanks,  huawei


>> Maxime
>>
  
Maxime Coquelin Jan. 26, 2021, 12:35 p.m. UTC | #11
On 1/26/21 1:30 PM, 谢华伟(此时此刻) wrote:
> 
> On 2021/1/22 15:25, chris wrote:
>>
>> On 2021/1/21 23:38, Maxime Coquelin wrote:
>>>> Do you mean we apply or abandon patch 3? I am both OK. The first
>>>> priority to me is to enable MMIO bar support.
>>> OK, so yes, I think we should abandon patch 2 and patch 3.
>>> For patch 1, it looks valid to me, but I'll let Ferruh decide.
>>>
>>> For your device, if my understanding is correct, what we need to do is
>>> to support MMIO for legacy devices. Correct?
>> yes.
>>> If so, the change should be in virtio_pci.c. In vtpci_init(), after
>>> modern detection has failed, we should check whether the BAR is PIO or
>>> MMIO based on the flag. The result can be saved in struct virtio_pci_dev.
>>>
>>>
>>> We would introduce new wrappers like vtpci_legacy_read,
>>> vtpci_legacy_write that would either call rte_pci_ioport_read,
>>> rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
>>> of MMIO.
>>
>> There are two choices.
>>
>> 1, apply patch 2.
>>
>>     IO/MMIO port are mapped and accessed using the same API. Kernel is
>> doing in the same way like the following.
>>
>>             io_addr = pci_iomap
>>
>>                 get PIO directly or ioremap
>>
>>             iowrite16/32(val, io_addr + offset)
>>
>> I think applying patch 2 is a correct choice. It is a fix. Driver had
>> better not know if bar is PIO or MMIO.  ioport in ioport_xx API means
>> IO, not PIO.
>>
>> Btw, it only affects virtio PMD,  not that intrusive.
>>
>>  2, virtio specific change to enable MMIO support.
>>
>> Comparing with choice 1, I feel it is not that clean and pretty.
>>
>>>
>>> It is not too late for this release, as the change will not be that
>>> intrusive. But if you prepare such patch, please base it on top of my
>>> virtio rework series; To make it easier to you, I added it to the dpdk-
>>> next-virtio tree:
>>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>>
> Hi Maxime:
> 
> Decision on patch 2?
> 
> I still think current patch 2 is cleaner.

Hi,

I actually replied one hour ago:
"
OK, that makes sense. I am OK with keeping patch 2, but would like
Ferruh's ACK.

Could you please post v6?
"

Thanks,
Maxime


> Thanks,  huawei
> 
> 
>>> Maxime
>>>
>
  
谢华伟(此时此刻) Jan. 26, 2021, 2:24 p.m. UTC | #12
On 2021/1/26 20:35, Maxime Coquelin wrote:
>
> On 1/26/21 1:30 PM, 谢华伟(此时此刻) wrote:
>> On 2021/1/22 15:25, chris wrote:
>>> On 2021/1/21 23:38, Maxime Coquelin wrote:
>>>>> Do you mean we apply or abandon patch 3? I am both OK. The first
>>>>> priority to me is to enable MMIO bar support.
>>>> OK, so yes, I think we should abandon patch 2 and patch 3.
>>>> For patch 1, it looks valid to me, but I'll let Ferruh decide.
>>>>
>>>> For your device, if my understanding is correct, what we need to do is
>>>> to support MMIO for legacy devices. Correct?
>>> yes.
>>>> If so, the change should be in virtio_pci.c. In vtpci_init(), after
>>>> modern detection has failed, we should check whether the BAR is PIO or MMIO
>>>> based on the flag. The result can be saved in struct virtio_pci_dev.
>>>>
>>>>
>>>> We would introduce new wrappers like vtpci_legacy_read,
>>>> vtpci_legacy_write that would either call rte_pci_ioport_read,
>>>> rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
>>>> of MMIO.
>>> There are two choices.
>>>
>>> 1, apply patch 2.
>>>
>>>      I/O and MMIO ports are mapped and accessed through the same API. The
>>> kernel does it the same way, as follows:
>>>
>>>              io_addr = pci_iomap
>>>
>>>                  get PIO directly, or ioremap
>>>
>>>              iowrite16/32(val, io_addr + offset)
>>>
>>> I think applying patch 2 is the correct choice. It is a fix. The driver
>>> had better not know whether the BAR is PIO or MMIO. The "ioport" in the
>>> ioport_xx API means I/O, not PIO.
>>>
>>> Btw, it only affects the virtio PMD, so it is not that intrusive.
>>>
>>>   2, virtio-specific change to enable MMIO support.
>>>
>>> Compared with choice 1, I feel it is not as clean and pretty.
>>>
>>>> It is not too late for this release, as the change will not be that
>>>> intrusive. But if you prepare such patch, please base it on top of my
>>>> virtio rework series; To make it easier to you, I added it to the dpdk-
>>>> next-virtio tree:
>>>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>>>
>> Hi Maxime:
>>
>> Decision on patch 2?
>>
>> I still think current patch 2 is cleaner.
> Hi,
>
> I actually replied one hour ago:
> "
> OK, that makes sense. I am OK with keeping patch 2, but would like
> Ferruh's ACK.
>
> Could you please post v6?
> "
Sorry, I missed it. Will do.
>
> Thanks,
> Maxime
>
>
>> Thanks,  huawei
>>
>>
>>>> Maxime
>>>>
  
Ferruh Yigit Jan. 27, 2021, 10:32 a.m. UTC | #13
On 1/26/2021 10:44 AM, Maxime Coquelin wrote:
> 
> 
> On 1/22/21 8:25 AM, 谢华伟(此时此刻) wrote:
>>
>> On 2021/1/21 23:38, Maxime Coquelin wrote:
>>>> Do you mean we apply or abandon patch 3? I am both OK. The first
>>>> priority to me is to enable MMIO bar support.
>>> OK, so yes, I think we should abandon patch 2 and patch 3.
>>> For patch 1, it looks valid to me, but I'll let Ferruh decide.
>>>
>>> For your device, if my understanding is correct, what we need to do is
>>> to support MMIO for legacy devices. Correct?
>> yes.
>>> If so, the change should be in virtio_pci.c. In vtpci_init(), after
>>> modern detection has failed, we should check whether the BAR is PIO or MMIO
>>> based on the flag. The result can be saved in struct virtio_pci_dev.
>>>
>>>
>>> We would introduce new wrappers like vtpci_legacy_read,
>>> vtpci_legacy_write that would either call rte_pci_ioport_read,
>>> rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
>>> of MMIO.
>>
>> There are two choices.
>>
>> 1, apply patch 2.
>>
>>      I/O and MMIO ports are mapped and accessed through the same API. The
>> kernel does it the same way, as follows:
>>
>>              io_addr = pci_iomap
>>
>>                  get PIO directly, or ioremap
>>
>>              iowrite16/32(val, io_addr + offset)
>>
>> I think applying patch 2 is the correct choice. It is a fix. The driver
>> had better not know whether the BAR is PIO or MMIO. The "ioport" in the
>> ioport_xx API means I/O, not PIO.
>>
>> Btw, it only affects the virtio PMD, so it is not that intrusive.
>>
>>   2, virtio-specific change to enable MMIO support.
>>
>> Compared with choice 1, I feel it is not as clean and pretty.
> 
> OK, that makes sense. I am OK with keeping patch 2, but would like
> Ferruh's ACK.
> 

I was waiting for clarification on whether this could be solved in virtio; that 
now seems clarified, and the decision is to go with this patch, so I am OK to 
proceed with patches 1 & 2.

But the first patch changes how the PIO address is obtained; it changes the Linux 
interface used to get the PIO.
And as far as I can see, the second patch requires this new interface to be able 
to access the MEM resources.

I have a concern that this interface change may cause issues with various 
distros, kernel versions, etc., and I would prefer it go through a full -rc1 
validation cycle.

Huawei, I am aware the patch has been around for a while, but to play it safe, I 
suggest considering it for early next release, so it can be tested enough, 
instead of getting it in for -rc2/3 in this release.

Thanks,
ferruh


> Could you please post v6?
> 
> Thanks,
> Maxime
> 
>>>
>>> It is not too late for this release, as the change will not be that
>>> intrusive. But if you prepare such patch, please base it on top of my
>>> virtio rework series; To make it easier to you, I added it to the dpdk-
>>> next-virtio tree:
>>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>>
>>> Thanks,
>>> Maxime
>>>
>>
>
  
Maxime Coquelin Jan. 27, 2021, 12:17 p.m. UTC | #14
On 1/27/21 11:32 AM, Ferruh Yigit wrote:
> On 1/26/2021 10:44 AM, Maxime Coquelin wrote:
>>
>>
>> On 1/22/21 8:25 AM, 谢华伟(此时此刻) wrote:
>>>
>>> On 2021/1/21 23:38, Maxime Coquelin wrote:
>>>>> Do you mean we apply or abandon patch 3? I am both OK. The first
>>>>> priority to me is to enable MMIO bar support.
>>>> OK, so yes, I think we should abandon patch 2 and patch 3.
>>>> For patch 1, it looks valid to me, but I'll let Ferruh decide.
>>>>
>>>> For your device, if my understanding is correct, what we need to do is
>>>> to support MMIO for legacy devices. Correct?
>>> yes.
>>>> If so, the change should be in virtio_pci.c. In vtpci_init(), after
>>>> modern detection has failed, we should check whether the BAR is PIO or MMIO
>>>> based on the flag. The result can be saved in struct virtio_pci_dev.
>>>>
>>>>
>>>> We would introduce new wrappers like vtpci_legacy_read,
>>>> vtpci_legacy_write that would either call rte_pci_ioport_read,
>>>> rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
>>>> of MMIO.
>>>
>>> There are two choices.
>>>
>>> 1, apply patch 2.
>>>
>>>      I/O and MMIO ports are mapped and accessed through the same API. The
>>> kernel does it the same way, as follows:
>>>
>>>              io_addr = pci_iomap
>>>
>>>                  get PIO directly, or ioremap
>>>
>>>              iowrite16/32(val, io_addr + offset)
>>>
>>> I think applying patch 2 is the correct choice. It is a fix. The driver
>>> had better not know whether the BAR is PIO or MMIO. The "ioport" in the
>>> ioport_xx API means I/O, not PIO.
>>>
>>> Btw, it only affects the virtio PMD, so it is not that intrusive.
>>>
>>>   2, virtio-specific change to enable MMIO support.
>>>
>>> Compared with choice 1, I feel it is not as clean and pretty.
>>
>> OK, that makes sense. I am OK with keeping patch 2, but would like
>> Ferruh's ACK.
>>
> 
> I was waiting for clarification if this can be solved in virtio, which
> seems clarified and decided to go with this patch, I am OK to proceed
> with patch 1 & 2.
> 
> But first patch changes how PIO address get, it changes the Linux
> interface used to get the PIO.
> And as far as I can see second patch requires this new interface to be
> able to access the MEM resources.
> 
> I have a concern that this interface change may cause issues with
> various distros, kernel versions etc.. And prefer it goes through a full
> -rc1 validation cycle.

While I think the risk for patch 2 is close to zero, I understand your
concern on patch 1 (especially with the upcoming holidays in China,
which will have an impact on QE capacity).

Huawei, do you think patch 2 can be slightly modified to be applied
alone, without patch 1? If possible, we may be able to pick patch 2 for
this release and postpone patch 1 to v21.05?

> Huawei, I am aware the patch is around for a while but to play safe, I
> suggest considering it for early next release, so it can be tested
> enough, instead of getting it in for -rc2/3 in this release.
> 
> Thanks,
> ferruh
> 
> 
>> Could you please post v6?
>>
>> Thanks,
>> Maxime
>>
>>>>
>>>> It is not too late for this release, as the change will not be that
>>>> intrusive. But if you prepare such patch, please base it on top of my
>>>> virtio rework series; To make it easier to you, I added it to the dpdk-
>>>> next-virtio tree:
>>>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>>>
>>>> Thanks,
>>>> Maxime
>>>>
>>>
>>
>
  
谢华伟(此时此刻) Jan. 27, 2021, 2:43 p.m. UTC | #15
On 2021/1/27 18:32, Ferruh Yigit wrote:
> I was waiting for clarification if this can be solved in virtio, which 
> seems clarified and decided to go with this patch, I am OK to proceed 
> with patch 1 & 2.
>
> But first patch changes how PIO address get, it changes the Linux 
> interface used to get the PIO.
> And as far as I can see second patch requires this new interface to be 
> able to access the MEM resources.
>
> I have a concern that this interface change may cause issues with 
> various distros, kernel versions etc.. And prefer it goes through a 
> full -rc1 validation cycle.
>
> Huawei, I am aware the patch is around for a while but to play safe, I 
> suggest considering it for early next release, so it can be tested 
> enough, instead of getting it in for -rc2/3 in this release.
>
> Thanks,
> ferruh
>
Hi Ferruh and Maxime:

The igb_uio kernel driver gets the resource through pci_resource_start, i.e., 
(dev)->resource[(bar)].start

uio_pci_generic and the generic way in my patch 1 get the resource through 
the same interface:

         the PCI device driver exports the BAR resource attributes to 
userspace through pci_dev->resource (see resource_show() in the kernel's 
drivers/pci/pci-sysfs.c)

Arches other than x86 use the same interface in their pci_uio_ioport_map().

So patch 1 is the most generic way and shouldn't break things. 
/proc/ioports should be fully dropped.

Using /proc/ioports was partly my fault at the very beginning. It has 
caused so much mess.

Could you please confirm this?

Thanks huawei

>> Could you please post v6?
>>
>> Thanks,
>> Maxime
>>
>>>>
>>>> It is not too late for this release, as the change will not be that
>>>> intrusive. But if you prepare such patch, please base it on top of my
>>>> virtio rework series; To make it easier to you, I added it to the 
>>>> dpdk-
>>>> next-virtio tree:
>>>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>>>
>>>> Thanks,
>>>> Maxime
>>>>
>>>
  
Ferruh Yigit Jan. 27, 2021, 4:45 p.m. UTC | #16
On 1/27/2021 2:43 PM, 谢华伟(此时此刻) wrote:
> 
> On 2021/1/27 18:32, Ferruh Yigit wrote:
>> I was waiting for clarification if this can be solved in virtio, which seems 
>> clarified and decided to go with this patch, I am OK to proceed with patch 1 & 2.
>>
>> But first patch changes how PIO address get, it changes the Linux interface 
>> used to get the PIO.
>> And as far as I can see second patch requires this new interface to be able to 
>> access the MEM resources.
>>
>> I have a concern that this interface change may cause issues with various 
>> distros, kernel versions etc.. And prefer it goes through a full -rc1 
>> validation cycle.
>>
>> Huawei, I am aware the patch is around for a while but to play safe, I suggest 
>> considering it for early next release, so it can be tested enough, instead of 
>> getting it in for -rc2/3 in this release.
>>
>> Thanks,
>> ferruh
>>
> Hi Ferruh and Maxime:
> 
> igb_uio kernel driver gets resource through pci_resource_start, i.e, 
> (dev)->resource[(bar)].start
> 
> uio_pci_generic and the generic way in my patch 1 gets resource through the same 
> interface:
> 
>          pci dev driver exports to userspace the bar resource attributes through 
> pci_dev->resource (check resource_show in kernel's drivers/pci/pci-sysfs.c)
> 
> Other arch than x86 uses the same interface in their pci_uio_ioport_map.
> 
> So patch 1 is the most generic way and shouldn't break things. /proc/ioports 
> should be fully dropped.
> 
> Using /proc/ioport is partly my fault at the very beginning. It causes so much 
> mess.
> 
> Could you please confirm this?
> 

Hi Huawei,

I confirm that interface is already in use; 'pci_parse_sysfs_resource()' does 
similar parsing.

Most probably it is safe, as you and Maxime said, and I am not trying to be 
difficult but extra cautious here.

Will it cause too much trouble to consider the patch for early next release? 
This gives more time and testing after the patch is merged.

Thanks,
ferruh

> Thanks huawei
> 
>>> Could you please post v6?
>>>
>>> Thanks,
>>> Maxime
>>>
>>>>>
>>>>> It is not too late for this release, as the change will not be that
>>>>> intrusive. But if you prepare such patch, please base it on top of my
>>>>> virtio rework series; To make it easier to you, I added it to the dpdk-
>>>>> next-virtio tree:
>>>>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2
>>>>>
>>>>> Thanks,
>>>>> Maxime
>>>>>
>>>>
  
谢华伟(此时此刻) Jan. 28, 2021, 1:43 p.m. UTC | #17
On 2021/1/28 0:45, Ferruh Yigit wrote:
> On 1/27/2021 2:43 PM, 谢华伟(此时此刻) wrote:
>>
>> On 2021/1/27 18:32, Ferruh Yigit wrote:
>>> I was waiting for clarification if this can be solved in virtio, 
>>> which seems clarified and decided to go with this patch, I am OK to 
>>> proceed with patch 1 & 2.
>>>
>>> But first patch changes how PIO address get, it changes the Linux 
>>> interface used to get the PIO.
>>> And as far as I can see second patch requires this new interface to 
>>> be able to access the MEM resources.
>>>
>>> I have a concern that this interface change may cause issues with 
>>> various distros, kernel versions etc.. And prefer it goes through a 
>>> full -rc1 validation cycle.
>>>
>>> Huawei, I am aware the patch is around for a while but to play safe, 
>>> I suggest considering it for early next release, so it can be tested 
>>> enough, instead of getting it in for -rc2/3 in this release.
>>>
>>> Thanks,
>>> ferruh
>>>
>> Hi Ferruh and Maxime:
>>
>> igb_uio kernel driver gets resource through pci_resource_start, i.e, 
>> (dev)->resource[(bar)].start
>>
>> uio_pci_generic and the generic way in my patch 1 gets resource 
>> through the same interface:
>>
>>          pci dev driver exports to userspace the bar resource 
>> attributes through pci_dev->resource (check resource_show in kernel's 
>> drivers/pci/pci-sysfs.c)
>>
>> Other arch than x86 uses the same interface in their pci_uio_ioport_map.
>>
>> So patch 1 is the most generic way and shouldn't break things. 
>> /proc/ioports should be fully dropped.
>>
>> Using /proc/ioport is partly my fault at the very beginning. It 
>> causes so much mess.
>>
>> Could you please confirm this?
>>
>
> Hi Huawei,
>
> I confirm that interface is already in use, 
> 'pci_parse_sysfs_resource()' does similar parsing.
>
> Most probably it is safe as you and Maxime said, and I am not trying 
> to be difficult but extra cautious here.
>
> Will it cause too much trouble to consider the patch early next 
> release? This gives more time and testing after the patch merged.
>
> Thanks,
> ferruh

Hi Ferruh:

If it goes into early next release, what would the schedule be, early next February?


In summary, patch 1 is simple and straightforward. It just doesn't use 
/proc/ioports or the resource attribute created by igb_uio, and uses 
the standard resource attribute under /sys/pci/.

As I explained, the resource attribute created by igb_uio is exactly the 
same thing as the standard resource attribute under /sys/pci/.

Patch 1 fixes messy things. Patch 2 fixes a wrong assumption (that the BAR is an I/O BAR).


Customers have been pushing us for quite a long time. Besides, at least 
in China, virtio in a VM with an MMIO BAR is a de-facto implementation. It 
brings quite a lot of trouble not only to cloud users, but also to ourselves 
when we run DPDK.


>
>> Thanks huawei
>>
>>>> Could you please post v6?
>>>>
>>>> Thanks,
>>>> Maxime
>>>>
>>>>>>
>>>>>> It is not too late for this release, as the change will not be that
>>>>>> intrusive. But if you prepare such patch, please base it on top 
>>>>>> of my
>>>>>> virtio rework series; To make it easier to you, I added it to the 
>>>>>> dpdk-
>>>>>> next-virtio tree:
>>>>>> https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2 
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Maxime
>>>>>>
>>>>>
  

Patch

diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index 0dc99e9..2ed9f2b 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -687,7 +687,7 @@  int rte_pci_write_config(const struct rte_pci_device *device,
 #ifdef VFIO_PRESENT
 	case RTE_PCI_KDRV_VFIO:
 		if (pci_vfio_is_enabled())
-			ret = pci_vfio_ioport_map(dev, bar, p);
+			ret = pci_uio_ioport_map(dev, bar, p);
 		break;
 #endif
 	case RTE_PCI_KDRV_IGB_UIO:
@@ -711,7 +711,7 @@  int rte_pci_write_config(const struct rte_pci_device *device,
 	switch (p->dev->kdrv) {
 #ifdef VFIO_PRESENT
 	case RTE_PCI_KDRV_VFIO:
-		pci_vfio_ioport_read(p, data, len, offset);
+		pci_uio_ioport_read(p, data, len, offset);
 		break;
 #endif
 	case RTE_PCI_KDRV_IGB_UIO:
@@ -730,7 +730,7 @@  int rte_pci_write_config(const struct rte_pci_device *device,
 	switch (p->dev->kdrv) {
 #ifdef VFIO_PRESENT
 	case RTE_PCI_KDRV_VFIO:
-		pci_vfio_ioport_write(p, data, len, offset);
+		pci_uio_ioport_write(p, data, len, offset);
 		break;
 #endif
 	case RTE_PCI_KDRV_IGB_UIO:
@@ -751,7 +751,7 @@  int rte_pci_write_config(const struct rte_pci_device *device,
 #ifdef VFIO_PRESENT
 	case RTE_PCI_KDRV_VFIO:
 		if (pci_vfio_is_enabled())
-			ret = pci_vfio_ioport_unmap(p);
+			ret = pci_uio_ioport_unmap(p);
 		break;
 #endif
 	case RTE_PCI_KDRV_IGB_UIO:
diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c
index c19382f..463792b 100644
--- a/drivers/bus/pci/linux/pci_uio.c
+++ b/drivers/bus/pci/linux/pci_uio.c
@@ -429,7 +429,9 @@ 
 	}
 
 	/* FIXME only for primary process ? */
-	if (dev->intr_handle.type == RTE_INTR_HANDLE_UNKNOWN) {
+	if (dev->intr_handle.type == RTE_INTR_HANDLE_UNKNOWN &&
+		(dev->kdrv == RTE_PCI_KDRV_IGB_UIO ||
+		 dev->kdrv == RTE_PCI_KDRV_UIO_GENERIC)) {
 		int uio_num = pci_get_uio_dev(dev, dirname, sizeof(dirname), 0);
 		if (uio_num < 0) {
 			RTE_LOG(ERR, EAL, "cannot open %s: %s\n",