[RFC,00/14] mlx5: support SubFunction

Message ID 20210527133759.17401-1-xuemingl@nvidia.com (mailing list archive)

Message

Xueming Li May 27, 2021, 1:37 p.m. UTC
A SubFunction (SF) [1] is a portion of a PCI device. An SF netdev has its
own dedicated queues (txq, rxq). An SF shares PCI-level resources with
other SFs and/or with its parent PCI function. The auxiliary bus [2] is
the foundation of SF support.

This patch set introduces SubFunction support for the mlx5 PMDs, covering
the net, regex, vdpa and compress classes.
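
For illustration only, here is a minimal sketch of how an SF could be
recognized by its auxiliary device name. It assumes the kernel naming
"<module>.<kind>.<id>" with an mlx5 SF appearing as "mlx5_core.sf.<N>";
the struct and both helper names are hypothetical and are not taken from
this series.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Parts of an auxiliary device name such as "mlx5_core.sf.4". */
struct aux_name {
	char module[32];  /* kernel module that registered the device */
	char kind[16];    /* device kind; "sf" is assumed for a SubFunction */
	unsigned long id; /* instance number */
};

/* Hypothetical helper: split "<module>.<kind>.<id>". Returns 0 on success. */
static int
aux_name_parse(const char *name, struct aux_name *out)
{
	const char *first = strchr(name, '.');
	const char *last = strrchr(name, '.');
	size_t mod_len, kind_len, i;

	/* At least two dots are required. */
	if (first == NULL || first == last)
		return -1;
	mod_len = (size_t)(first - name);
	kind_len = (size_t)(last - first) - 1;
	if (mod_len == 0 || mod_len >= sizeof(out->module) ||
	    kind_len == 0 || kind_len >= sizeof(out->kind))
		return -1;
	/* The id must be a non-empty run of decimal digits. */
	if (last[1] == '\0')
		return -1;
	for (i = 1; last[i] != '\0'; i++)
		if (!isdigit((unsigned char)last[i]))
			return -1;
	memcpy(out->module, name, mod_len);
	out->module[mod_len] = '\0';
	memcpy(out->kind, first + 1, kind_len);
	out->kind[kind_len] = '\0';
	out->id = strtoul(last + 1, NULL, 10);
	return 0;
}

/* Returns 1 if the name looks like an mlx5 SubFunction, else 0. */
static int
aux_name_is_mlx5_sf(const char *name)
{
	struct aux_name n;

	return aux_name_parse(name, &n) == 0 &&
	       strcmp(n.module, "mlx5_core") == 0 &&
	       strcmp(n.kind, "sf") == 0;
}
```

A bus scan could feed each entry name found under the auxiliary bus sysfs
directory to aux_name_is_mlx5_sf() and skip everything that is not an SF.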

Depends-on: series-16904 ("bus/auxiliary: introduce auxiliary bus")

Version history:
  RFC:
    initial version

[1] SubFunction in kernel:
https://lore.kernel.org/netdev/20201112192424.2742-1-parav@nvidia.com/

[2] Auxiliary bus:
http://patchwork.dpdk.org/project/dpdk/patch/20210510134732.2174-1-xuemingl@nvidia.com/


Thomas Monjalon (5):
  common/mlx5: move description of PCI sysfs functions
  vdpa/mlx5: fix driver name
  vdpa/mlx5: remove PCI specifics
  common/mlx5: get PCI device address from any bus
  vdpa/mlx5: support SubFunction

Xueming Li (9):
  common/mlx5: add common device driver
  net/mlx5: remove PCI dependency
  net/mlx5: migrate to bus-agnostic common driver
  regex/mlx5: migrate to common driver
  compress/mlx5: migrate to common driver
  common/mlx5: clean up legacy PCI bus driver
  bus/auxiliary: introduce auxiliary bus
  common/mlx5: support auxiliary bus
  net/mlx5: support SubFunction

 MAINTAINERS                                   |   5 +
 doc/guides/nics/mlx5.rst                      | 339 ++++++++++-
 drivers/bus/auxiliary/auxiliary_common.c      | 408 +++++++++++++
 drivers/bus/auxiliary/auxiliary_params.c      |  58 ++
 drivers/bus/auxiliary/linux/auxiliary.c       | 147 +++++
 drivers/bus/auxiliary/meson.build             |  11 +
 drivers/bus/auxiliary/private.h               | 120 ++++
 drivers/bus/auxiliary/rte_bus_auxiliary.h     | 199 +++++++
 drivers/bus/auxiliary/version.map             |  10 +
 drivers/bus/meson.build                       |   1 +
 drivers/common/mlx5/linux/meson.build         |   3 +
 .../common/mlx5/linux/mlx5_common_auxiliary.c | 188 ++++++
 drivers/common/mlx5/linux/mlx5_common_os.c    |  53 +-
 drivers/common/mlx5/linux/mlx5_common_os.h    |   4 -
 drivers/common/mlx5/linux/mlx5_common_verbs.c |  24 +-
 drivers/common/mlx5/mlx5_common.c             | 340 ++++++++++-
 drivers/common/mlx5/mlx5_common.h             | 176 +++++-
 drivers/common/mlx5/mlx5_common_pci.c         | 554 ++++--------------
 drivers/common/mlx5/mlx5_common_pci.h         |  77 ---
 drivers/common/mlx5/mlx5_common_private.h     |  50 ++
 drivers/common/mlx5/version.map               |  14 +-
 drivers/compress/mlx5/mlx5_compress.c         |  71 +--
 drivers/net/mlx5/linux/mlx5_ethdev_os.c       |  14 +-
 drivers/net/mlx5/linux/mlx5_os.c              | 193 ++++--
 drivers/net/mlx5/linux/mlx5_os.h              |   5 +-
 drivers/net/mlx5/mlx5.c                       |  97 +--
 drivers/net/mlx5/mlx5.h                       |  12 +-
 drivers/net/mlx5/mlx5_ethdev.c                |   2 +-
 drivers/net/mlx5/mlx5_mr.c                    |  46 +-
 drivers/net/mlx5/mlx5_rxmode.c                |   8 +-
 drivers/net/mlx5/mlx5_rxtx.h                  |   9 +-
 drivers/net/mlx5/mlx5_trigger.c               |  14 +-
 drivers/net/mlx5/mlx5_txq.c                   |   2 +-
 drivers/net/mlx5/windows/mlx5_os.c            |  20 +-
 drivers/regex/mlx5/mlx5_regex.c               |  49 +-
 drivers/regex/mlx5/mlx5_regex.h               |   1 -
 drivers/vdpa/mlx5/mlx5_vdpa.c                 | 128 ++--
 drivers/vdpa/mlx5/mlx5_vdpa.h                 |   1 -
 38 files changed, 2532 insertions(+), 921 deletions(-)
 create mode 100644 drivers/bus/auxiliary/auxiliary_common.c
 create mode 100644 drivers/bus/auxiliary/auxiliary_params.c
 create mode 100644 drivers/bus/auxiliary/linux/auxiliary.c
 create mode 100644 drivers/bus/auxiliary/meson.build
 create mode 100644 drivers/bus/auxiliary/private.h
 create mode 100644 drivers/bus/auxiliary/rte_bus_auxiliary.h
 create mode 100644 drivers/bus/auxiliary/version.map
 create mode 100644 drivers/common/mlx5/linux/mlx5_common_auxiliary.c
 delete mode 100644 drivers/common/mlx5/mlx5_common_pci.h
 create mode 100644 drivers/common/mlx5/mlx5_common_private.h
  

Comments

Ferruh Yigit June 10, 2021, 10:33 a.m. UTC | #1
On 5/27/2021 2:37 PM, Xueming Li wrote:
> SubFunction [1] is a portion of the PCI device, a SF netdev has its own
> dedicated queues(txq, rxq). A SF shares PCI level resources with other
> SFs and/or with its parent PCI function. Auxiliary bus is the
> fundamental of SF.
> 
> This patch set introduces SubFunction support for mlx5 PMD driver
> including class net, regex, vdpa and compress.
> 

There is already an mdev patch, which originated long ago. Aren't subfunctions
presented as mdev devices? If so, can't we use mdev for them?
  
Thomas Monjalon June 10, 2021, 1:23 p.m. UTC | #2
10/06/2021 12:33, Ferruh Yigit:
> On 5/27/2021 2:37 PM, Xueming Li wrote:
> > SubFunction [1] is a portion of the PCI device, a SF netdev has its own
> > dedicated queues(txq, rxq). A SF shares PCI level resources with other
> > SFs and/or with its parent PCI function. Auxiliary bus is the
> > fundamental of SF.
> > 
> > This patch set introduces SubFunction support for mlx5 PMD driver
> > including class net, regex, vdpa and compress.
> > 
> 
> There is already an mdev patch, originated from long ago. Aren't subfunctions
> presented as mdev device? If so can't we use mdev for it?

No, unfortunately that's different.
mlx5 SF is built on top of the auxiliary bus in the kernel/sysfs.
  
Chenbo Xia June 11, 2021, 5:14 a.m. UTC | #3
Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, June 10, 2021 9:23 PM
> To: Yigit, Ferruh <ferruh.yigit@intel.com>
> Cc: Xueming Li <xuemingl@nvidia.com>; Viacheslav Ovsiienko
> <viacheslavo@nvidia.com>; dev@dpdk.org; Xia, Chenbo <chenbo.xia@intel.com>
> Subject: Re: [dpdk-dev] [RFC 00/14] mlx5: support SubFunction
> 
> 10/06/2021 12:33, Ferruh Yigit:
> > On 5/27/2021 2:37 PM, Xueming Li wrote:
> > > SubFunction [1] is a portion of the PCI device, a SF netdev has its
> own
> > > dedicated queues(txq, rxq). A SF shares PCI level resources with other
> > > SFs and/or with its parent PCI function. Auxiliary bus is the
> > > fundamental of SF.
> > >
> > > This patch set introduces SubFunction support for mlx5 PMD driver
> > > including class net, regex, vdpa and compress.
> > >
> >
> > There is already an mdev patch, originated from long ago. Aren't
> subfunctions
> > presented as mdev device? If so can't we use mdev for it?
> 
> No unfortunately that's different.
> mlx5 SF is based on top of auxiliary bus in the kernel/sysfs.
> 

Just out of curiosity:

Did SF use mdev before the aux bus was introduced in the kernel? I see some
history of it but am not sure: [1] suggests SF was based on mdev, and [2]
suggests BlueField software v2.5 uses mdev for SF. I saw this yesterday and
tried to figure out the history. Since you are here, I guess you know
something 😊

[1] https://patchwork.ozlabs.org/project/netdev/cover/20191107160448.20962-1-parav@mellanox.com/
[2] https://docs.mellanox.com/display/BlueFieldSWv25011176/Mediated+Devices
  
Thomas Monjalon June 11, 2021, 7:54 a.m. UTC | #4
11/06/2021 07:14, Xia, Chenbo:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 10/06/2021 12:33, Ferruh Yigit:
> > > On 5/27/2021 2:37 PM, Xueming Li wrote:
> > > > SubFunction [1] is a portion of the PCI device, a SF netdev has its
> > own
> > > > dedicated queues(txq, rxq). A SF shares PCI level resources with other
> > > > SFs and/or with its parent PCI function. Auxiliary bus is the
> > > > fundamental of SF.
> > > >
> > > > This patch set introduces SubFunction support for mlx5 PMD driver
> > > > including class net, regex, vdpa and compress.
> > > >
> > >
> > > There is already an mdev patch, originated from long ago. Aren't
> > subfunctions
> > > presented as mdev device? If so can't we use mdev for it?
> > 
> > No unfortunately that's different.
> > mlx5 SF is based on top of auxiliary bus in the kernel/sysfs.
> 
> Just out of curiosity:
> 
> Does SF use mdev before aux bus is introduced in kernel. I see some history
> of it but am not sure: [1] seems SF was base on mdev. [2] seems BlueField
> software v2.5 is using mdev for SF. I saw it yesterday and try to figure
> out the history. Since you are here, guess you know something 😊
> 
> [1] https://patchwork.ozlabs.org/project/netdev/cover/20191107160448.20962-1-parav@mellanox.com/
> [2] https://docs.mellanox.com/display/BlueFieldSWv25011176/Mediated+Devices

Kernel maintainers rejected the use of mdev for this purpose
and suggested using a real bus.
You can follow the discussion here:
https://lore.kernel.org/netdev/20191108205204.GB1277001@kroah.com/

Does Intel plan to use mdev for SF?

For the sake of follow-up discussion, this is the official mdev doc:
https://www.kernel.org/doc/Documentation/vfio-mediated-device.txt
  
Chenbo Xia June 15, 2021, 2:10 a.m. UTC | #5
Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, June 11, 2021 3:54 PM
> To: Yigit, Ferruh <ferruh.yigit@intel.com>; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: Xueming Li <xuemingl@nvidia.com>; Viacheslav Ovsiienko
> <viacheslavo@nvidia.com>; dev@dpdk.org; parav@nvidia.com; jgg@nvidia.com
> Subject: Re: [dpdk-dev] [RFC 00/14] mlx5: support SubFunction
> 
> 11/06/2021 07:14, Xia, Chenbo:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 10/06/2021 12:33, Ferruh Yigit:
> > > > On 5/27/2021 2:37 PM, Xueming Li wrote:
> > > > > SubFunction [1] is a portion of the PCI device, a SF netdev has its
> > > own
> > > > > dedicated queues(txq, rxq). A SF shares PCI level resources with other
> > > > > SFs and/or with its parent PCI function. Auxiliary bus is the
> > > > > fundamental of SF.
> > > > >
> > > > > This patch set introduces SubFunction support for mlx5 PMD driver
> > > > > including class net, regex, vdpa and compress.
> > > > >
> > > >
> > > > There is already an mdev patch, originated from long ago. Aren't
> > > subfunctions
> > > > presented as mdev device? If so can't we use mdev for it?
> > >
> > > No unfortunately that's different.
> > > mlx5 SF is based on top of auxiliary bus in the kernel/sysfs.
> >
> > Just out of curiosity:
> >
> > Does SF use mdev before aux bus is introduced in kernel. I see some history
> > of it but am not sure: [1] seems SF was base on mdev. [2] seems BlueField
> > software v2.5 is using mdev for SF. I saw it yesterday and try to figure
> > out the history. Since you are here, guess you know something 😊
> >
> > > [1] https://patchwork.ozlabs.org/project/netdev/cover/20191107160448.20962-1-parav@mellanox.com/
> > [2] https://docs.mellanox.com/display/BlueFieldSWv25011176/Mediated+Devices
> 
> Kernel maintainers rejected the use of mdev for this purpose
> and suggested to use a real bus.
> You can follow the discussion here:
> https://lore.kernel.org/netdev/20191108205204.GB1277001@kroah.com/

OK. Thanks for the info.

> 
> Does Intel plan to use mdev for SF?

Yes. In our terms it's called an Assignable Device Interface (ADI), introduced in Intel
Scalable IOV (https://01.org/blogs/2019/assignable-interfaces-intel-scalable-i/o-virtualization-linux)

vfio-mdev was chosen as the software framework for it. I am starting to realize there
is a difference between SF and ADI: SF targets multi-function devices, which may include
net/regex/vdpa/... ADI, however, focuses only on virtualizing the device: splitting it
into logical parts and providing a huge number of interfaces to host applications. I think
SF also considers this but is mainly used for multi-function devices (like a DPU in your
terms? Correct me if I'm wrong).

I also noticed that the mdev-based interface can only be used from userspace, whereas the
aux-based interface can also be used by other kernel subsystems (e.g. for net, wrapped as
a netdev).

Thanks,
Chenbo

> 
> For the sake of follow-up discussion, this is the official mdev doc:
> https://www.kernel.org/doc/Documentation/vfio-mediated-device.txt
> 
>
  
Parav Pandit June 15, 2021, 4:04 a.m. UTC | #6
Hi Chenbo,

> From: Xia, Chenbo <chenbo.xia@intel.com>
> Sent: Tuesday, June 15, 2021 7:41 AM
> 
> Hi Thomas,
> 
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Friday, June 11, 2021 3:54 PM
[..]

> 
> Yes. In our term it's called Assignable Device Interface (ADI) introduced in
> Intel Scalable IOV (https://01.org/blogs/2019/assignable-interfaces-intel-
> scalable-i/o-virtualization-linux)
> 
> And vfio-mdev is chosen to be the software framework for it. I start to realize
> there is difference between SF and ADI: SF considers multi-function devices
> which may include net/regex/vdpa/... 
Yes. net, rdma, vdpa, regex ++.
And eventually a vfio_device to map to a VM too.

A non-mdev framework was chosen so that all use cases, whether kernel-only,
user-only or mixed mode, can be supported.

> But ADI only focuses on the
> virtualization of the devices and splitting devices to logic parts and providing
> huge number of interfaces to host APP. I think SF also considers this but is
> mainly used for multi-function devices (like DPU in your term?
> Correct me if I'm wrong).
> 
SF supports DPU mode too, but that is in addition to the above use cases.
SF will expose an mdev (or a vfio_device) to map to a VM.

> And I also noticed that the mdev-based interface can only be used in
> userspace but aux-based interface can also be used by other kernel sub-
> system (like for net, wrap it as netdev).
Correct.
  
Chenbo Xia June 15, 2021, 5:33 a.m. UTC | #7
Hi Parav,

> -----Original Message-----
> From: Parav Pandit <parav@nvidia.com>
> Sent: Tuesday, June 15, 2021 12:05 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>
> Cc: Xueming(Steven) Li <xuemingl@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; dev@dpdk.org; Jason Gunthorpe <jgg@nvidia.com>
> Subject: RE: [dpdk-dev] [RFC 00/14] mlx5: support SubFunction
> 
> Hi Chenbo,
> 
> > From: Xia, Chenbo <chenbo.xia@intel.com>
> > Sent: Tuesday, June 15, 2021 7:41 AM
> >
> > Hi Thomas,
> >
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > Sent: Friday, June 11, 2021 3:54 PM
> [..]
> 
> >
> > Yes. In our term it's called Assignable Device Interface (ADI) introduced in
> > Intel Scalable IOV (https://01.org/blogs/2019/assignable-interfaces-intel-
> > scalable-i/o-virtualization-linux)
> >
> > And vfio-mdev is chosen to be the software framework for it. I start to
> realize
> > there is difference between SF and ADI: SF considers multi-function devices
> > which may include net/regex/vdpa/...
> Yes. net, rdma, vdpa, regex ++.
> And eventually vfio_device to map to VM too.
> 
> Non mdev framework is chosen so that all the use cases of kernel only, or user
> only or mix modes can be supported.

OK. Got it.

> 
> > But ADI only focuses on the
> > virtualization of the devices and splitting devices to logic parts and
> providing
> > huge number of interfaces to host APP. I think SF also considers this but is
> > mainly used for multi-function devices (like DPU in your term?
> > Correct me if I'm wrong).
> >
> SF also supports DPU mode too but it is in addition to above use cases.
> SF will expose mdev (or a vfio_device) to map to a VM.

So your SW actually supports vfio-mdev? I suppose the device-specific mdev
kernel module is out-of-tree?

Just FYI:

We are introducing a new mdev bus for DPDK:
http://patchwork.dpdk.org/project/dpdk/cover/20210601030644.3318-1-chenbo.xia@intel.com/

Thanks,
Chenbo

> 
> > And I also noticed that the mdev-based interface can only be used in
> > userspace but aux-based interface can also be used by other kernel sub-
> > system (like for net, wrap it as netdev).
> Correct.
  
Parav Pandit June 15, 2021, 5:43 a.m. UTC | #8
> From: Xia, Chenbo <chenbo.xia@intel.com>
> Sent: Tuesday, June 15, 2021 11:03 AM
> 
> Hi Parav,
> 
> > -----Original Message-----
> > From: Parav Pandit <parav@nvidia.com>
> > Sent: Tuesday, June 15, 2021 12:05 PM
> > To: Xia, Chenbo <chenbo.xia@intel.com>; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>
> > Cc: Xueming(Steven) Li <xuemingl@nvidia.com>; Slava Ovsiienko
> > <viacheslavo@nvidia.com>; dev@dpdk.org; Jason Gunthorpe
> > <jgg@nvidia.com>
> > Subject: RE: [dpdk-dev] [RFC 00/14] mlx5: support SubFunction
> >
> > Hi Chenbo,
> >
> > > From: Xia, Chenbo <chenbo.xia@intel.com>
> > > Sent: Tuesday, June 15, 2021 7:41 AM
> > >
> > > Hi Thomas,
> > >
> > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > Sent: Friday, June 11, 2021 3:54 PM
> > [..]
> >
> > >
> > > Yes. In our term it's called Assignable Device Interface (ADI)
> > > introduced in Intel Scalable IOV
> > > (https://01.org/blogs/2019/assignable-interfaces-intel-
> > > scalable-i/o-virtualization-linux)
> > >
> > > And vfio-mdev is chosen to be the software framework for it. I start
> > > to
> > realize
> > > there is difference between SF and ADI: SF considers multi-function
> > > devices which may include net/regex/vdpa/...
> > Yes. net, rdma, vdpa, regex ++.
> > And eventually vfio_device to map to VM too.
> >
> > Non mdev framework is chosen so that all the use cases of kernel only,
> > or user only or mix modes can be supported.
> 
> OK. Got it.
> 
> >
> > > But ADI only focuses on the
> > > virtualization of the devices and splitting devices to logic parts
> > > and
> > providing
> > > huge number of interfaces to host APP. I think SF also considers
> > > this but is mainly used for multi-function devices (like DPU in your term?
> > > Correct me if I'm wrong).
> > >
> > SF also supports DPU mode too but it is in addition to above use cases.
> > SF will expose mdev (or a vfio_device) to map to a VM.
> 
> So your SW actually supports vfio-mdev? I suppose the device-specific mdev
> Kernel module is out-of-tree?
> 
The mlx5 driver doesn't support vfio_device for SFs.
Kernel plumbing for PASID assignment to SFs is currently WIP in the kernel community.
We do not have any out-of-tree kernel module.

> Just FYI:
> 
> We are introducing a new mdev bus for DPDK:
> http://patchwork.dpdk.org/project/dpdk/cover/20210601030644.3318-1-chenbo.xia@intel.com/
> 
I have yet to read about it, but I am not sure what value it adds.
A user can open a vfio device through the vfio subsystem and operate on it.
A vfio device can be created by binding a PCI VF/PF to the vfio-pci driver, or by binding an SF to a vfio_foo driver.
There is kernel work in progress to use the vfio core as a library.
So we do not anticipate adding an mdev layer and a UUID to create a vfio device for an SF.

For Intel, will an ADI never have any netdevs or rdma devs?
  
Chenbo Xia June 15, 2021, 11:19 a.m. UTC | #9
Hi Parav,

> -----Original Message-----
> From: Parav Pandit <parav@nvidia.com>
> Sent: Tuesday, June 15, 2021 1:43 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>
> Cc: Xueming(Steven) Li <xuemingl@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; dev@dpdk.org; Jason Gunthorpe <jgg@nvidia.com>
> Subject: RE: [dpdk-dev] [RFC 00/14] mlx5: support SubFunction
> 
> 
> 
> > From: Xia, Chenbo <chenbo.xia@intel.com>
> > Sent: Tuesday, June 15, 2021 11:03 AM
> >
> > Hi Parav,
> >
> > > -----Original Message-----
> > > From: Parav Pandit <parav@nvidia.com>
> > > Sent: Tuesday, June 15, 2021 12:05 PM
> > > To: Xia, Chenbo <chenbo.xia@intel.com>; NBU-Contact-Thomas Monjalon
> > > <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>
> > > Cc: Xueming(Steven) Li <xuemingl@nvidia.com>; Slava Ovsiienko
> > > <viacheslavo@nvidia.com>; dev@dpdk.org; Jason Gunthorpe
> > > <jgg@nvidia.com>
> > > Subject: RE: [dpdk-dev] [RFC 00/14] mlx5: support SubFunction
> > >
> > > Hi Chenbo,
> > >
> > > > From: Xia, Chenbo <chenbo.xia@intel.com>
> > > > Sent: Tuesday, June 15, 2021 7:41 AM
> > > >
> > > > Hi Thomas,
> > > >
> > > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > Sent: Friday, June 11, 2021 3:54 PM
> > > [..]
> > >
> > > >
> > > > Yes. In our term it's called Assignable Device Interface (ADI)
> > > > introduced in Intel Scalable IOV
> > > > (https://01.org/blogs/2019/assignable-interfaces-intel-
> > > > scalable-i/o-virtualization-linux)
> > > >
> > > > And vfio-mdev is chosen to be the software framework for it. I start
> > > > to
> > > realize
> > > > there is difference between SF and ADI: SF considers multi-function
> > > > devices which may include net/regex/vdpa/...
> > > Yes. net, rdma, vdpa, regex ++.
> > > And eventually vfio_device to map to VM too.
> > >
> > > Non mdev framework is chosen so that all the use cases of kernel only,
> > > or user only or mix modes can be supported.
> >
> > OK. Got it.
> >
> > >
> > > > But ADI only focuses on the
> > > > virtualization of the devices and splitting devices to logic parts
> > > > and
> > > providing
> > > > huge number of interfaces to host APP. I think SF also considers
> > > > this but is mainly used for multi-function devices (like DPU in your
> term?
> > > > Correct me if I'm wrong).
> > > >
> > > SF also supports DPU mode too but it is in addition to above use cases.
> > > SF will expose mdev (or a vfio_device) to map to a VM.
> >
> > So your SW actually supports vfio-mdev? I suppose the device-specific mdev
> > Kernel module is out-of-tree?
> >
> mlx5 driver doesn't support vfio_device for SFs.
> Kernel plumbing for PASID assignment to SF is WIP currently kernel community.
> We do not have any out-of-tree kernel module.
> 
> > Just FYI:
> >
> > We are introducing a new mdev bus for DPDK:
> > http://patchwork.dpdk.org/project/dpdk/cover/20210601030644.3318-1-chenbo.xia@intel.com/
> >
> I am yet to read about it. But I am not sure what value does it add.
> A user can open a vfio device using vfio subsystem and operate on it.
> A vfio device can be a create as a result of binding PCI VF/PF to vfio-pci
> driver or a SF by binding SF to vfio_foo driver.

Yes, in general that is the way. With vfio-mdev, you bind vfio-mdev to the
parent device and echo a UUID to create a virtual device. A VFIO application like
DPDK, as you said, should work the same way with the VFIO UAPI for vfio-pci devices
and for mdev-based devices. But currently DPDK only handles vfio-pci devices and
does not handle other cases such as mdev-based PCI devices. For example, it does
not scan /sys/bus/mdev, and it always uses the PCI BDF as the device address, which
mdev-based PCI devices do not have. That is why I sent that patchset.
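
To make the addressing difference concrete, here is a small sketch that
tells a PCI BDF apart from an mdev-style UUID purely by string shape. The
enum and function names are hypothetical and do not come from DPDK or from
the mdev patchset; the check is deliberately loose (any hex digit is
accepted in every position).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum addr_kind { ADDR_UNKNOWN, ADDR_PCI_BDF, ADDR_UUID };

/*
 * Match s against a pattern in which 'x' stands for one hex digit and
 * every other character must match literally.
 */
static int
match_hex_pattern(const char *s, const char *pattern)
{
	size_t i;

	if (strlen(s) != strlen(pattern))
		return 0;
	for (i = 0; pattern[i] != '\0'; i++) {
		if (pattern[i] == 'x') {
			if (!isxdigit((unsigned char)s[i]))
				return 0;
		} else if (s[i] != pattern[i]) {
			return 0;
		}
	}
	return 1;
}

/* Classify a device address string by its shape only. */
static enum addr_kind
classify_addr(const char *s)
{
	/* domain:bus:device.function, e.g. 0000:3b:00.0 */
	if (match_hex_pattern(s, "xxxx:xx:xx.x"))
		return ADDR_PCI_BDF;
	/* 8-4-4-4-12 hex digits, the textual UUID layout */
	if (match_hex_pattern(s, "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"))
		return ADDR_UUID;
	return ADDR_UNKNOWN;
}
```

Code that assumes every VFIO device has a BDF would only ever take the
ADDR_PCI_BDF branch here; an mdev-created device needs the UUID branch.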

> There is kernel work in progress to use vfio core as library.

OK. Could you share a link to it? Much appreciated.

> So we do not anticipate to use add mdev layer and uuid to create a vfio device
> for a SF.

OK. For now, we are following the vfio-mdev standard, using UUID to create vfio
devices.

> 
> For Intel, ADI will never has any netdevs or rdma dev?

I think technically it could. But some devices, like our DMA devices,
just use mdev:

https://www.spinics.net/lists/kvm/msg244417.html

Thanks,
Chenbo
  
Parav Pandit June 15, 2021, 12:47 p.m. UTC | #10
> From: Xia, Chenbo <chenbo.xia@intel.com>
> Sent: Tuesday, June 15, 2021 4:49 PM
> 
> >
> > > Just FYI:
> > >
> > > We are introducing a new mdev bus for DPDK:
> > > > http://patchwork.dpdk.org/project/dpdk/cover/20210601030644.3318-1-chenbo.xia@intel.com/
> > >
> > I am yet to read about it. But I am not sure what value does it add.
> > A user can open a vfio device using vfio subsystem and operate on it.
> > A vfio device can be a create as a result of binding PCI VF/PF to
> > vfio-pci driver or a SF by binding SF to vfio_foo driver.
> 
> Yes, in general it is the way. For vfio-mdev, it works as binding the vfio-mdev
> to parent device and echo uuid to create a virtual device. VFIO APP like
> DPDK, as you said, should work similar with VFIO UAPI for vfio-pci devices or
> mdev-based devices. But currently DPDK only cares about vfio-pci devices
> and does not care things for other cases like mdev-based pci devices. For
> example, it does not scan /sys/bus/mdev and it always uses pci bdf as device
> address, which mdev-based pci devices do not have. Therefore I sent that
> patchset.
mdev devices reside on the mdev bus, so DPDK should identify the mdev object by specifying bus type = mdev and device id = uuid.
There should not be any attachment to PCI, as Thomas said.
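
As a rough sketch of that idea, a device reference can be modeled as a
(bus type, device id) pair parsed from a "<bus>:<id>" string. The string
format, the struct and the function name are assumptions made for this
example only; they are not claimed to match the DPDK devargs syntax.

```c
#include <stddef.h>
#include <string.h>

/* A device identified by bus type plus a bus-specific id. */
struct dev_ref {
	char bus[16]; /* e.g. "pci", "auxiliary" or a hypothetical "mdev" */
	char id[64];  /* BDF, auxiliary device name, UUID, ... */
};

/*
 * Hypothetical helper: split "<bus>:<id>" at the first ':' so that ids
 * which themselves contain ':' (such as a PCI BDF) stay intact.
 * Returns 0 on success, -1 on malformed or oversized input.
 */
static int
dev_ref_parse(const char *s, struct dev_ref *out)
{
	const char *sep = strchr(s, ':');
	size_t bus_len, id_len;

	if (sep == NULL)
		return -1;
	bus_len = (size_t)(sep - s);
	id_len = strlen(sep + 1);
	if (bus_len == 0 || bus_len >= sizeof(out->bus) ||
	    id_len == 0 || id_len >= sizeof(out->id))
		return -1;
	memcpy(out->bus, s, bus_len);
	out->bus[bus_len] = '\0';
	memcpy(out->id, sep + 1, id_len + 1); /* includes the terminator */
	return 0;
}
```

With this shape, "mdev:<uuid>" and "pci:<bdf>" are handled by the same
code path, and nothing about the mdev case depends on a PCI address.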

> 
> > There is kernel work in progress to use vfio core as library.
> 
> OK. Could you share me some link to it? Much appreciated.
> 
[1] https://lore.kernel.org/kvm/20210603160809.15845-1-mgurtovoy@nvidia.com/

> > So we do not anticipate to use add mdev layer and uuid to create a
> > vfio device for a SF.
> 
> OK. For now, we are following the vfio-mdev standard, using UUID to create
> vfio devices.
> 
If this layer is going to work on top of VFIO devices, does it really care whether the device is an mdev?
Can it identify the vfio device by its vfio device node and UAPI in a uniform way,
such as open("/dev/vfio/98" ..);?


> >
> > For Intel, ADI will never has any netdevs or rdma dev?
> 
> I think technically it could have. 
Unlikely. As I explained in a previous email, creating netdev and rdma devices on the mdev bus was already rejected in my earlier patches,
so we moved forward with the auxiliary bus.

> But for some devices like our dma devices,
> it's just using mdev:
> 
> https://www.spinics.net/lists/kvm/msg244417.html
Possibly yes. Some devices might live on the mdev bus.
You should wait for the kernel patches to be merged, as Jason said.

I still think that identifying a vfio device by its /dev/vfio/<id> will go a long way, regardless of its bus type.
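
A minimal sketch of that bus-agnostic addressing, assuming the VFIO group
interface where /dev/vfio/<N> names an IOMMU group node. Both function
names are hypothetical; only open(2) is used here, and a real consumer
would continue with the VFIO ioctls on the returned descriptor.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/* Build "/dev/vfio/<group>". Returns 0, or -1 if buf is too small. */
static int
vfio_group_path(unsigned int group, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/dev/vfio/%u", group);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Open the group node read/write. Returns a file descriptor on success
 * or a negative errno value on failure. Nothing here depends on whether
 * the underlying device sits on PCI, mdev or another bus.
 */
static int
vfio_group_open(unsigned int group)
{
	char path[32];
	int fd;

	if (vfio_group_path(group, path, sizeof(path)) != 0)
		return -ENAMETOOLONG;
	fd = open(path, O_RDWR);
	return fd < 0 ? -errno : fd;
}
```

The caller only needs the group number, for example vfio_group_open(98),
and is expected to close() the descriptor when done.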
  
Jason Gunthorpe June 15, 2021, 3:19 p.m. UTC | #11
On Tue, Jun 15, 2021 at 12:47:13PM +0000, Parav Pandit wrote:

> > But for some devices like our dma devices,
> > it's just using mdev:
> > 
> > https://www.spinics.net/lists/kvm/msg244417.html
> Possibly yes. Some devices might live on mdev bus.
> You should wait for kernel patches to be merged as Jason said.

Also I would not expect dpdk to consume IDXD via vfio, but instead via
the normal multi-process host IOCTL interface it has.

Jason