
[RFC,00/27] Add VDUSE support to Vhost library

Message ID 20230331154259.1447831-1-maxime.coquelin@redhat.com (mailing list archive)

Message

Maxime Coquelin March 31, 2023, 3:42 p.m. UTC
  This series introduces a new type of backend, VDUSE,
to the Vhost library.

VDUSE stands for vDPA Device in Userspace. It enables
implementing a Virtio device in userspace and attaching
it to the Kernel vDPA bus.

Once attached to the vDPA bus, it can be used either by
Kernel Virtio drivers, like virtio-net in our case, via
the virtio-vdpa driver. In that case, the device is visible
to the Kernel networking stack and is exposed to userspace
as a regular netdev.

Or it can be exposed to userspace via the vhost-vdpa
driver, as a vhost-vdpa chardev that can be passed to
QEMU or the Virtio-user PMD.

While VDUSE support is already available in the upstream
Kernel, a couple of patches are required to support the
network device type:

https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc

In order to attach the created VDUSE device to the vDPA
bus, a recent iproute2 version containing the vdpa tool is
required.

Usage:
======

1. Probe required Kernel modules
# modprobe vdpa
# modprobe vduse
# modprobe virtio-vdpa

2. Build (requires the VDUSE kernel headers to be available)
# meson build
# ninja -C build

3. Create a VDUSE device (vduse0) using the Vhost PMD with
testpmd (4 queue pairs in this example; a library-level sketch
of this step is shown after step 7)
# ./build/app/dpdk-testpmd --no-pci --vdev=net_vhost0,iface=/dev/vduse/vduse0,queues=4 --log-level=*:9  -- -i --txq=4 --rxq=4
 
4. Attach the VDUSE device to the vDPA bus
# vdpa dev add name vduse0 mgmtdev vduse
=> The virtio-net netdev shows up (eth0 here)
# ip l show eth0
21: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether c2:73:ea:a7:68:6d brd ff:ff:ff:ff:ff:ff

5. Start/stop traffic in testpmd
testpmd> start
testpmd> show port stats 0
  ######################## NIC statistics for port 0  ########################
  RX-packets: 11         RX-missed: 0          RX-bytes:  1482
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 1          TX-errors: 0          TX-bytes:  62

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################
testpmd> stop

6. Detach the VDUSE device from the vDPA bus
# vdpa dev del vduse0

7. Quit testpmd
testpmd> quit
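
For reference, here is a minimal sketch of what step 3 amounts to at
the Vhost library level, for applications not going through the Vhost
PMD. It assumes that registering a path under /dev/vduse/ with
rte_vhost_driver_register() creates a VDUSE device instead of a
Vhost-user socket, and the name of the max queue pairs setter is only
assumed from the patch titles below, so take it as illustrative only:

    #include <rte_vhost.h>

    /* Illustrative sketch, not the actual Vhost PMD code. */
    static int
    create_vduse_port(const char *path, uint32_t nb_queue_pairs)
    {
        /* e.g. path = "/dev/vduse/vduse0", nb_queue_pairs = 4 */
        if (rte_vhost_driver_register(path, 0) < 0)
            return -1;

        /* New API from this series ("vhost: add API to set max
         * queue pairs"); the exact function name is assumed here. */
        if (rte_vhost_driver_set_max_queue_num(path, nb_queue_pairs) < 0)
            goto err;

        /* The VDUSE device itself is expected to be created when
         * the driver is started. */
        if (rte_vhost_driver_start(path) < 0)
            goto err;

        return 0;
    err:
        rte_vhost_driver_unregister(path);
        return -1;
    }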

Known issues & remaining work:
==============================
- Fix an issue in the FD manager (still polling even though the FD has been removed)
- Add Netlink support in Vhost library
- Support device reconnection
- Support packed ring
- Enable & test more Virtio features
- Provide performance benchmark results


Maxime Coquelin (27):
  vhost: fix missing guest notif stat increment
  vhost: fix invalid call FD handling
  vhost: fix IOTLB entries overlap check with previous entry
  vhost: add helper of IOTLB entries coredump
  vhost: add helper for IOTLB entries shared page check
  vhost: don't dump unneeded pages with IOTLB
  vhost: change to single IOTLB cache per device
  vhost: add offset field to IOTLB entries
  vhost: add page size info to IOTLB entry
  vhost: retry translating IOVA after IOTLB miss
  vhost: introduce backend ops
  vhost: add IOTLB cache entry removal callback
  vhost: add helper for IOTLB misses
  vhost: add helper for interrupt injection
  vhost: add API to set max queue pairs
  net/vhost: use API to set max queue pairs
  vhost: add control virtqueue support
  vhost: add VDUSE device creation and destruction
  vhost: add VDUSE callback for IOTLB miss
  vhost: add VDUSE callback for IOTLB entry removal
  vhost: add VDUSE callback for IRQ injection
  vhost: add VDUSE events handler
  vhost: add support for virtqueue state get event
  vhost: add support for VDUSE status set event
  vhost: add support for VDUSE IOTLB update event
  vhost: add VDUSE device startup
  vhost: add multiqueue support to VDUSE

 doc/guides/prog_guide/vhost_lib.rst |   4 +
 drivers/net/vhost/rte_eth_vhost.c   |   3 +
 lib/vhost/iotlb.c                   | 333 +++++++++--------
 lib/vhost/iotlb.h                   |  45 ++-
 lib/vhost/meson.build               |   5 +
 lib/vhost/rte_vhost.h               |  17 +
 lib/vhost/socket.c                  |  72 +++-
 lib/vhost/vduse.c                   | 553 ++++++++++++++++++++++++++++
 lib/vhost/vduse.h                   |  33 ++
 lib/vhost/version.map               |   3 +
 lib/vhost/vhost.c                   |  51 ++-
 lib/vhost/vhost.h                   |  90 +++--
 lib/vhost/vhost_user.c              |  53 ++-
 lib/vhost/vhost_user.h              |   2 +-
 lib/vhost/virtio_net_ctrl.c         | 282 ++++++++++++++
 lib/vhost/virtio_net_ctrl.h         |  10 +
 16 files changed, 1317 insertions(+), 239 deletions(-)
 create mode 100644 lib/vhost/vduse.c
 create mode 100644 lib/vhost/vduse.h
 create mode 100644 lib/vhost/virtio_net_ctrl.c
 create mode 100644 lib/vhost/virtio_net_ctrl.h
  

Comments

Yongji Xie April 6, 2023, 3:44 a.m. UTC | #1
Hi Maxime,

On Fri, Mar 31, 2023 at 11:43 PM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
> This series introduces a new type of backend, VDUSE,
> to the Vhost library.
>
> VDUSE stands for vDPA device in Userspace, it enables
> implementing a Virtio device in userspace and have it
> attached to the Kernel vDPA bus.
>
> Once attached to the vDPA bus, it can be used either by
> Kernel Virtio drivers, like virtio-net in our case, via
> the virtio-vdpa driver. Doing that, the device is visible
> to the Kernel networking stack and is exposed to userspace
> as a regular netdev.
>
> It can also be exposed to userspace thanks to the
> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> passed to QEMU or Virtio-user PMD.
>
> While VDUSE support is already available in upstream
> Kernel, a couple of patches are required to support
> network device type:
>
> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>
> In order to attach the created VDUSE device to the vDPA
> bus, a recent iproute2 version containing the vdpa tool is
> required.
>
> Usage:
> ======
>
> 1. Probe required Kernel modules
> # modprobe vdpa
> # modprobe vduse
> # modprobe virtio-vdpa
>
> 2. Build (require vduse kernel headers to be available)
> # meson build
> # ninja -C build
>
> 3. Create a VDUSE device (vduse0) using Vhost PMD with
> testpmd (with 4 queue pairs in this example)
> # ./build/app/dpdk-testpmd --no-pci --vdev=net_vhost0,iface=/dev/vduse/vduse0,queues=4 --log-level=*:9  -- -i --txq=4 --rxq=4
>
> 4. Attach the VDUSE device to the vDPA bus
> # vdpa dev add name vduse0 mgmtdev vduse
> => The virtio-net netdev shows up (eth0 here)
> # ip l show eth0
> 21: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
>     link/ether c2:73:ea:a7:68:6d brd ff:ff:ff:ff:ff:ff
>
> 5. Start/stop traffic in testpmd
> testpmd> start
> testpmd> show port stats 0
>   ######################## NIC statistics for port 0  ########################
>   RX-packets: 11         RX-missed: 0          RX-bytes:  1482
>   RX-errors: 0
>   RX-nombuf:  0
>   TX-packets: 1          TX-errors: 0          TX-bytes:  62
>
>   Throughput (since last show)
>   Rx-pps:            0          Rx-bps:            0
>   Tx-pps:            0          Tx-bps:            0
>   ############################################################################
> testpmd> stop
>
> 6. Detach the VDUSE device from the vDPA bus
> # vdpa dev del vduse0
>
> 7. Quit testpmd
> testpmd> quit
>
> Known issues & remaining work:
> ==============================
> - Fix issue in FD manager (still polling while FD has been removed)
> - Add Netlink support in Vhost library
> - Support device reconnection
> - Support packed ring
> - Enable & test more Virtio features
> - Provide performance benchmark results
>

Nice work! Thanks for bringing VDUSE to the network area. I wonder if
you have any plans to support userspace memory registration [1]? I
think this feature could benefit performance, since an extra data
copy could be eliminated in our case.

[1] https://lwn.net/Articles/902809/

Thanks,
Yongji
  
Maxime Coquelin April 6, 2023, 8:16 a.m. UTC | #2
Hi Yongji,

On 4/6/23 05:44, Yongji Xie wrote:
> Hi Maxime,
> 
> On Fri, Mar 31, 2023 at 11:43 PM Maxime Coquelin
> <maxime.coquelin@redhat.com> wrote:
>>
>> This series introduces a new type of backend, VDUSE,
>> to the Vhost library.
>>
>> VDUSE stands for vDPA device in Userspace, it enables
>> implementing a Virtio device in userspace and have it
>> attached to the Kernel vDPA bus.
>>
>> Once attached to the vDPA bus, it can be used either by
>> Kernel Virtio drivers, like virtio-net in our case, via
>> the virtio-vdpa driver. Doing that, the device is visible
>> to the Kernel networking stack and is exposed to userspace
>> as a regular netdev.
>>
>> It can also be exposed to userspace thanks to the
>> vhost-vdpa driver, via a vhost-vdpa chardev that can be
>> passed to QEMU or Virtio-user PMD.
>>
>> While VDUSE support is already available in upstream
>> Kernel, a couple of patches are required to support
>> network device type:
>>
>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>>
>> In order to attach the created VDUSE device to the vDPA
>> bus, a recent iproute2 version containing the vdpa tool is
>> required.
>>
>> Usage:
>> ======
>>
>> 1. Probe required Kernel modules
>> # modprobe vdpa
>> # modprobe vduse
>> # modprobe virtio-vdpa
>>
>> 2. Build (require vduse kernel headers to be available)
>> # meson build
>> # ninja -C build
>>
>> 3. Create a VDUSE device (vduse0) using Vhost PMD with
>> testpmd (with 4 queue pairs in this example)
>> # ./build/app/dpdk-testpmd --no-pci --vdev=net_vhost0,iface=/dev/vduse/vduse0,queues=4 --log-level=*:9  -- -i --txq=4 --rxq=4
>>
>> 4. Attach the VDUSE device to the vDPA bus
>> # vdpa dev add name vduse0 mgmtdev vduse
>> => The virtio-net netdev shows up (eth0 here)
>> # ip l show eth0
>> 21: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
>>      link/ether c2:73:ea:a7:68:6d brd ff:ff:ff:ff:ff:ff
>>
>> 5. Start/stop traffic in testpmd
>> testpmd> start
>> testpmd> show port stats 0
>>    ######################## NIC statistics for port 0  ########################
>>    RX-packets: 11         RX-missed: 0          RX-bytes:  1482
>>    RX-errors: 0
>>    RX-nombuf:  0
>>    TX-packets: 1          TX-errors: 0          TX-bytes:  62
>>
>>    Throughput (since last show)
>>    Rx-pps:            0          Rx-bps:            0
>>    Tx-pps:            0          Tx-bps:            0
>>    ############################################################################
>> testpmd> stop
>>
>> 6. Detach the VDUSE device from the vDPA bus
>> # vdpa dev del vduse0
>>
>> 7. Quit testpmd
>> testpmd> quit
>>
>> Known issues & remaining work:
>> ==============================
>> - Fix issue in FD manager (still polling while FD has been removed)
>> - Add Netlink support in Vhost library
>> - Support device reconnection
>> - Support packed ring
>> - Enable & test more Virtio features
>> - Provide performance benchmark results
>>
> 
> Nice work! Thanks for bringing VDUSE to the network area. I wonder if
> you have some plan to support userspace memory registration [1]? I
> think this feature can benefit the performance since an extra data
> copy could be eliminated in our case.

I plan to have a closer look later, once VDUSE support has been added.
I think it will be difficult to support it in the case of DPDK for
networking:

  - For the dequeue path, it would basically mean re-introducing the
dequeue zero-copy support that we removed some time ago. It was a hack
where we replaced the regular mbuf buffer with the descriptor one,
increased the reference counter, and at subsequent dequeue API calls
checked whether the former mbuf's refcount was back to 1 to restore the
mbuf. The issue is that physical NIC drivers usually release sent mbufs
in batches, once a certain threshold is met. So it can starve the
virtqueue, as the descriptors are not written back into the used ring
for quite some time, depending on the NIC/traffic/... A rough sketch of
this trick is shown below.

- For the enqueue path, I don't think this is possible with virtual
switches by design: when an mbuf is received on a physical port, we
don't know which Vhost/VDUSE port it will be switched to. And for VM to
VM communication, should it use the src VM buffer or the dest VM one?

The only case where it could work is a simple forwarder between a VDUSE
device and a physical port. But I don't think there is much interest in
such a use-case.
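
To make the dequeue point above a bit more concrete, here is a rough,
purely illustrative sketch of the refcount trick; zcopy_entry and
vhost_flush_used() are made-up names for this sketch, not the code
that was removed:

    #include <rte_mbuf.h>

    struct zcopy_entry {
        struct rte_mbuf *mbuf; /* mbuf wrapping the guest buffer */
        uint16_t desc_idx;     /* descriptor to complete in the used ring */
    };

    /* Stand-in for writing a descriptor back to the used ring. */
    static void
    vhost_flush_used(uint16_t desc_idx) { (void)desc_idx; }

    static void
    reclaim_zcopy_mbufs(struct zcopy_entry *entries, unsigned int nb)
    {
        for (unsigned int i = 0; i < nb; i++) {
            struct rte_mbuf *m = entries[i].mbuf;

            /* Still referenced by the application/NIC driver? */
            if (rte_mbuf_refcnt_read(m) != 1)
                continue;

            /* Buffer released: the guest descriptor can finally be
             * written back to the used ring, possibly much later than
             * the packet was actually transmitted. */
            vhost_flush_used(entries[i].desc_idx);
            rte_pktmbuf_free(m);
        }
    }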

Any thoughts?

Thanks,
Maxime

> [1] https://lwn.net/Articles/902809/
> 
> Thanks,
> Yongji
>
  
Yongji Xie April 6, 2023, 11:04 a.m. UTC | #3
On Thu, Apr 6, 2023 at 4:17 PM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
> Hi Yongji,
>
> On 4/6/23 05:44, Yongji Xie wrote:
> > Hi Maxime,
> >
> > On Fri, Mar 31, 2023 at 11:43 PM Maxime Coquelin
> > <maxime.coquelin@redhat.com> wrote:
> >>
> >> This series introduces a new type of backend, VDUSE,
> >> to the Vhost library.
> >>
> >> VDUSE stands for vDPA device in Userspace, it enables
> >> implementing a Virtio device in userspace and have it
> >> attached to the Kernel vDPA bus.
> >>
> >> Once attached to the vDPA bus, it can be used either by
> >> Kernel Virtio drivers, like virtio-net in our case, via
> >> the virtio-vdpa driver. Doing that, the device is visible
> >> to the Kernel networking stack and is exposed to userspace
> >> as a regular netdev.
> >>
> >> It can also be exposed to userspace thanks to the
> >> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> >> passed to QEMU or Virtio-user PMD.
> >>
> >> While VDUSE support is already available in upstream
> >> Kernel, a couple of patches are required to support
> >> network device type:
> >>
> >> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> >>
> >> In order to attach the created VDUSE device to the vDPA
> >> bus, a recent iproute2 version containing the vdpa tool is
> >> required.
> >>
> >> Usage:
> >> ======
> >>
> >> 1. Probe required Kernel modules
> >> # modprobe vdpa
> >> # modprobe vduse
> >> # modprobe virtio-vdpa
> >>
> >> 2. Build (require vduse kernel headers to be available)
> >> # meson build
> >> # ninja -C build
> >>
> >> 3. Create a VDUSE device (vduse0) using Vhost PMD with
> >> testpmd (with 4 queue pairs in this example)
> >> # ./build/app/dpdk-testpmd --no-pci --vdev=net_vhost0,iface=/dev/vduse/vduse0,queues=4 --log-level=*:9  -- -i --txq=4 --rxq=4
> >>
> >> 4. Attach the VDUSE device to the vDPA bus
> >> # vdpa dev add name vduse0 mgmtdev vduse
> >> => The virtio-net netdev shows up (eth0 here)
> >> # ip l show eth0
> >> 21: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
> >>      link/ether c2:73:ea:a7:68:6d brd ff:ff:ff:ff:ff:ff
> >>
> >> 5. Start/stop traffic in testpmd
> >> testpmd> start
> >> testpmd> show port stats 0
> >>    ######################## NIC statistics for port 0  ########################
> >>    RX-packets: 11         RX-missed: 0          RX-bytes:  1482
> >>    RX-errors: 0
> >>    RX-nombuf:  0
> >>    TX-packets: 1          TX-errors: 0          TX-bytes:  62
> >>
> >>    Throughput (since last show)
> >>    Rx-pps:            0          Rx-bps:            0
> >>    Tx-pps:            0          Tx-bps:            0
> >>    ############################################################################
> >> testpmd> stop
> >>
> >> 6. Detach the VDUSE device from the vDPA bus
> >> # vdpa dev del vduse0
> >>
> >> 7. Quit testpmd
> >> testpmd> quit
> >>
> >> Known issues & remaining work:
> >> ==============================
> >> - Fix issue in FD manager (still polling while FD has been removed)
> >> - Add Netlink support in Vhost library
> >> - Support device reconnection
> >> - Support packed ring
> >> - Enable & test more Virtio features
> >> - Provide performance benchmark results
> >>
> >
> > Nice work! Thanks for bringing VDUSE to the network area. I wonder if
> > you have some plan to support userspace memory registration [1]? I
> > think this feature can benefit the performance since an extra data
> > copy could be eliminated in our case.
>
> I plan to have a closer look later, once VDUSE support will be added.
> I think it will be difficult to support it in the case of DPDK for
> networking:
>
>   - For dequeue path it would be basically re-introducing dequeue zero-
> copy support that we removed some time ago. It was a hack where we
> replaced the regular mbuf buffer with the descriptor one, increased the
> reference counter, and at next dequeue API calls checked if the former
> mbufs ref counter is 1 and restore the mbuf. Issue is that physical NIC
> drivers usually release sent mbufs by pool, once a certain threshold is
> met. So it can cause draining of the virtqueue as the descs are not
> written back into the used ring for quite some time, depending on the
> NIC/traffic/...
>

OK, I see. Could this issue be mitigated by releasing sent mbufs one
by one as they are sent out, or simply by increasing the virtqueue size?

> - For enqueue path, I don't think this is possible with virtual switches
> by design, as when a mbuf is received on a physical port, we don't know
> in which Vhost/VDUSE port it will be switched to. And for VM to VM
> communication, should it use the src VM buffer or the dest VM one?
>

Yes, I agree that it's hard to achieve that in the enqueue path.

> Only case it could work is if you had a simple forwarder between a VDUSE
> device and a physical port. But I don't think there is much interest in
> such use-case.
>

OK, I get it.

Thanks,
Yongji
  
Ferruh Yigit April 12, 2023, 11:33 a.m. UTC | #4
On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
> This series introduces a new type of backend, VDUSE,
> to the Vhost library.
> 
> VDUSE stands for vDPA device in Userspace, it enables
> implementing a Virtio device in userspace and have it
> attached to the Kernel vDPA bus.
> 
> Once attached to the vDPA bus, it can be used either by
> Kernel Virtio drivers, like virtio-net in our case, via
> the virtio-vdpa driver. Doing that, the device is visible
> to the Kernel networking stack and is exposed to userspace
> as a regular netdev.
> 
> It can also be exposed to userspace thanks to the
> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> passed to QEMU or Virtio-user PMD.
> 
> While VDUSE support is already available in upstream
> Kernel, a couple of patches are required to support
> network device type:
> 
> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> 
> In order to attach the created VDUSE device to the vDPA
> bus, a recent iproute2 version containing the vdpa tool is
> required.

Hi Maxime,

Is this a replacement for the existing DPDK vDPA framework? What is the
plan for the long term?
  
Maxime Coquelin April 12, 2023, 3:28 p.m. UTC | #5
Hi Ferruh,

On 4/12/23 13:33, Ferruh Yigit wrote:
> On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
>> This series introduces a new type of backend, VDUSE,
>> to the Vhost library.
>>
>> VDUSE stands for vDPA device in Userspace, it enables
>> implementing a Virtio device in userspace and have it
>> attached to the Kernel vDPA bus.
>>
>> Once attached to the vDPA bus, it can be used either by
>> Kernel Virtio drivers, like virtio-net in our case, via
>> the virtio-vdpa driver. Doing that, the device is visible
>> to the Kernel networking stack and is exposed to userspace
>> as a regular netdev.
>>
>> It can also be exposed to userspace thanks to the
>> vhost-vdpa driver, via a vhost-vdpa chardev that can be
>> passed to QEMU or Virtio-user PMD.
>>
>> While VDUSE support is already available in upstream
>> Kernel, a couple of patches are required to support
>> network device type:
>>
>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>>
>> In order to attach the created VDUSE device to the vDPA
>> bus, a recent iproute2 version containing the vdpa tool is
>> required.
> 
> Hi Maxime,
> 
> Is this a replacement to the existing DPDK vDPA framework? What is the
> plan for long term?
> 

No, this is not a replacement for the DPDK vDPA framework.

We (Red Hat) don't have plans to support the DPDK vDPA framework in our
products, but there are still contributions to DPDK vDPA from several
vDPA hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is
going to be deprecated soon.

Regards,
Maxime
  
Morten Brørup April 12, 2023, 7:40 p.m. UTC | #6
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Wednesday, 12 April 2023 17.28
> 
> Hi Ferruh,
> 
> On 4/12/23 13:33, Ferruh Yigit wrote:
> > On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
> >> This series introduces a new type of backend, VDUSE,
> >> to the Vhost library.
> >>
> >> VDUSE stands for vDPA device in Userspace, it enables
> >> implementing a Virtio device in userspace and have it
> >> attached to the Kernel vDPA bus.
> >>
> >> Once attached to the vDPA bus, it can be used either by
> >> Kernel Virtio drivers, like virtio-net in our case, via
> >> the virtio-vdpa driver. Doing that, the device is visible
> >> to the Kernel networking stack and is exposed to userspace
> >> as a regular netdev.
> >>
> >> It can also be exposed to userspace thanks to the
> >> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> >> passed to QEMU or Virtio-user PMD.
> >>
> >> While VDUSE support is already available in upstream
> >> Kernel, a couple of patches are required to support
> >> network device type:
> >>
> >> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> >>
> >> In order to attach the created VDUSE device to the vDPA
> >> bus, a recent iproute2 version containing the vdpa tool is
> >> required.
> >
> > Hi Maxime,
> >
> > Is this a replacement to the existing DPDK vDPA framework? What is the
> > plan for long term?
> >
> 
> No, this is not a replacement for DPDK vDPA framework.
> 
> We (Red Hat) don't have plans to support DPDK vDPA framework in our
> products, but there are still contribution to DPDK vDPA by several vDPA
> hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is going
> to be deprecated soon.

Ferruh's question made me curious...

I don't know anything about VDUSE or vDPA, and don't use any of it, so consider me ignorant in this area.

Is VDUSE an alternative to the existing DPDK vDPA framework? What are the differences, e.g. in which cases would an application developer (or user) choose one or the other?

And if it is a better alternative, perhaps the documentation should mention that it is recommended over DPDK vDPA, just like we started recommending alternatives to the KNI driver so that we could phase it out and eventually get rid of it.

> 
> Regards,
> Maxime
  
Chenbo Xia April 13, 2023, 7:08 a.m. UTC | #7
> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Thursday, April 13, 2023 3:41 AM
> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ferruh Yigit
> <ferruh.yigit@amd.com>; dev@dpdk.org; david.marchand@redhat.com; Xia,
> Chenbo <chenbo.xia@intel.com>; mkp@redhat.com; fbl@redhat.com;
> jasowang@redhat.com; Liang, Cunming <cunming.liang@intel.com>; Xie, Yongji
> <xieyongji@bytedance.com>; echaudro@redhat.com; eperezma@redhat.com;
> amorenoz@redhat.com
> Subject: RE: [RFC 00/27] Add VDUSE support to Vhost library
> 
> > From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> > Sent: Wednesday, 12 April 2023 17.28
> >
> > Hi Ferruh,
> >
> > On 4/12/23 13:33, Ferruh Yigit wrote:
> > > On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
> > >> This series introduces a new type of backend, VDUSE,
> > >> to the Vhost library.
> > >>
> > >> VDUSE stands for vDPA device in Userspace, it enables
> > >> implementing a Virtio device in userspace and have it
> > >> attached to the Kernel vDPA bus.
> > >>
> > >> Once attached to the vDPA bus, it can be used either by
> > >> Kernel Virtio drivers, like virtio-net in our case, via
> > >> the virtio-vdpa driver. Doing that, the device is visible
> > >> to the Kernel networking stack and is exposed to userspace
> > >> as a regular netdev.
> > >>
> > >> It can also be exposed to userspace thanks to the
> > >> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> > >> passed to QEMU or Virtio-user PMD.
> > >>
> > >> While VDUSE support is already available in upstream
> > >> Kernel, a couple of patches are required to support
> > >> network device type:
> > >>
> > >> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> > >>
> > >> In order to attach the created VDUSE device to the vDPA
> > >> bus, a recent iproute2 version containing the vdpa tool is
> > >> required.
> > >
> > > Hi Maxime,
> > >
> > > Is this a replacement to the existing DPDK vDPA framework? What is the
> > > plan for long term?
> > >
> >
> > No, this is not a replacement for DPDK vDPA framework.
> >
> > We (Red Hat) don't have plans to support DPDK vDPA framework in our
> > products, but there are still contribution to DPDK vDPA by several vDPA
> > hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is going
> > to be deprecated soon.
> 
> Ferruh's question made me curious...
> 
> I don't know anything about VDUSE or vDPA, and don't use any of it, so
> consider me ignorant in this area.
> 
> Is VDUSE an alternative to the existing DPDK vDPA framework? What are the
> differences, e.g. in which cases would an application developer (or user)
> choose one or the other?

Maxime can give a better explanation, but let me just explain a bit.

Vendors have vDPA HW that supports the vDPA framework (most likely in their DPU/IPU
products). This work introduces a way to emulate a SW vDPA device in
userspace (DPDK), and this SW vDPA device also supports the vDPA framework.

So it's not an alternative to the existing DPDK vDPA framework :)

Thanks,
Chenbo

> 
> And if it is a better alternative, perhaps the documentation should
> mention that it is recommended over DPDK vDPA. Just like we started
> recommending alternatives to the KNI driver, so we could phase it out and
> eventually get rid of it.
> 
> >
> > Regards,
> > Maxime
  
Morten Brørup April 13, 2023, 7:58 a.m. UTC | #8
> From: Xia, Chenbo [mailto:chenbo.xia@intel.com]
> Sent: Thursday, 13 April 2023 09.08
> 
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Thursday, April 13, 2023 3:41 AM
> >
> > > From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> > > Sent: Wednesday, 12 April 2023 17.28
> > >
> > > Hi Ferruh,
> > >
> > > On 4/12/23 13:33, Ferruh Yigit wrote:
> > > > On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
> > > >> This series introduces a new type of backend, VDUSE,
> > > >> to the Vhost library.
> > > >>
> > > >> VDUSE stands for vDPA device in Userspace, it enables
> > > >> implementing a Virtio device in userspace and have it
> > > >> attached to the Kernel vDPA bus.
> > > >>
> > > >> Once attached to the vDPA bus, it can be used either by
> > > >> Kernel Virtio drivers, like virtio-net in our case, via
> > > >> the virtio-vdpa driver. Doing that, the device is visible
> > > >> to the Kernel networking stack and is exposed to userspace
> > > >> as a regular netdev.
> > > >>
> > > >> It can also be exposed to userspace thanks to the
> > > >> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> > > >> passed to QEMU or Virtio-user PMD.
> > > >>
> > > >> While VDUSE support is already available in upstream
> > > >> Kernel, a couple of patches are required to support
> > > >> network device type:
> > > >>
> > > >> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> > > >>
> > > >> In order to attach the created VDUSE device to the vDPA
> > > >> bus, a recent iproute2 version containing the vdpa tool is
> > > >> required.
> > > >
> > > > Hi Maxime,
> > > >
> > > > Is this a replacement to the existing DPDK vDPA framework? What is the
> > > > plan for long term?
> > > >
> > >
> > > No, this is not a replacement for DPDK vDPA framework.
> > >
> > > We (Red Hat) don't have plans to support DPDK vDPA framework in our
> > > products, but there are still contribution to DPDK vDPA by several vDPA
> > > hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is going
> > > to be deprecated soon.
> >
> > Ferruh's question made me curious...
> >
> > I don't know anything about VDUSE or vDPA, and don't use any of it, so
> > consider me ignorant in this area.
> >
> > Is VDUSE an alternative to the existing DPDK vDPA framework? What are the
> > differences, e.g. in which cases would an application developer (or user)
> > choose one or the other?
> 
> Maxime should give better explanation.. but let me just explain a bit.
> 
> Vendors have vDPA HW that support vDPA framework (most likely in their DPU/IPU
> products). This work is introducing a way to emulate a SW vDPA device in
> userspace (DPDK), and this SW vDPA device also supports vDPA framework.
> 
> So it's not an alternative to existing DPDK vDPA framework :)
> 
> Thanks,
> Chenbo

Not an alternative, then nothing further from me. :-)

> 
> >
> > And if it is a better alternative, perhaps the documentation should
> > mention that it is recommended over DPDK vDPA. Just like we started
> > recommending alternatives to the KNI driver, so we could phase it out and
> > eventually get rid of it.
> >
> > >
> > > Regards,
> > > Maxime
  
Maxime Coquelin April 13, 2023, 7:59 a.m. UTC | #9
Hi,

On 4/13/23 09:08, Xia, Chenbo wrote:
>> -----Original Message-----
>> From: Morten Brørup <mb@smartsharesystems.com>
>> Sent: Thursday, April 13, 2023 3:41 AM
>> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ferruh Yigit
>> <ferruh.yigit@amd.com>; dev@dpdk.org; david.marchand@redhat.com; Xia,
>> Chenbo <chenbo.xia@intel.com>; mkp@redhat.com; fbl@redhat.com;
>> jasowang@redhat.com; Liang, Cunming <cunming.liang@intel.com>; Xie, Yongji
>> <xieyongji@bytedance.com>; echaudro@redhat.com; eperezma@redhat.com;
>> amorenoz@redhat.com
>> Subject: RE: [RFC 00/27] Add VDUSE support to Vhost library
>>
>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>> Sent: Wednesday, 12 April 2023 17.28
>>>
>>> Hi Ferruh,
>>>
>>> On 4/12/23 13:33, Ferruh Yigit wrote:
>>>> On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
>>>>> This series introduces a new type of backend, VDUSE,
>>>>> to the Vhost library.
>>>>>
>>>>> VDUSE stands for vDPA device in Userspace, it enables
>>>>> implementing a Virtio device in userspace and have it
>>>>> attached to the Kernel vDPA bus.
>>>>>
>>>>> Once attached to the vDPA bus, it can be used either by
>>>>> Kernel Virtio drivers, like virtio-net in our case, via
>>>>> the virtio-vdpa driver. Doing that, the device is visible
>>>>> to the Kernel networking stack and is exposed to userspace
>>>>> as a regular netdev.
>>>>>
>>>>> It can also be exposed to userspace thanks to the
>>>>> vhost-vdpa driver, via a vhost-vdpa chardev that can be
>>>>> passed to QEMU or Virtio-user PMD.
>>>>>
>>>>> While VDUSE support is already available in upstream
>>>>> Kernel, a couple of patches are required to support
>>>>> network device type:
>>>>>
>>>>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>>>>>
>>>>> In order to attach the created VDUSE device to the vDPA
>>>>> bus, a recent iproute2 version containing the vdpa tool is
>>>>> required.
>>>>
>>>> Hi Maxime,
>>>>
>>>> Is this a replacement to the existing DPDK vDPA framework? What is the
>>>> plan for long term?
>>>>
>>>
>>> No, this is not a replacement for DPDK vDPA framework.
>>>
>>> We (Red Hat) don't have plans to support DPDK vDPA framework in our
>>> products, but there are still contribution to DPDK vDPA by several vDPA
>>> hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is going
>>> to be deprecated soon.
>>
>> Ferruh's question made me curious...
>>
>> I don't know anything about VDUSE or vDPA, and don't use any of it, so
>> consider me ignorant in this area.
>>
>> Is VDUSE an alternative to the existing DPDK vDPA framework? What are the
>> differences, e.g. in which cases would an application developer (or user)
>> choose one or the other?
> 
> Maxime should give better explanation.. but let me just explain a bit.
> 
> Vendors have vDPA HW that support vDPA framework (most likely in their DPU/IPU
> products). This work is introducing a way to emulate a SW vDPA device in
> userspace (DPDK), and this SW vDPA device also supports vDPA framework.
> 
> So it's not an alternative to existing DPDK vDPA framework :)

Correct.

When using DPDK vDPA, the datapath of a Vhost-user port is offloaded to
a compatible physical NIC (i.e. a NIC that implements Virtio ring
support), while the control path remains the same as for a regular
Vhost-user port, i.e. it provides a Vhost-user Unix socket to the
application (like QEMU or the DPDK Virtio-user PMD).

When using Kernel vDPA, the datapath is also offloaded to a vDPA
compatible device, and the control path is managed by the vDPA bus.
It can either be consumed by a Kernel Virtio driver (here Virtio-net)
when using Virtio-vDPA; in this case the device is exposed as a regular
netdev and, in the case of Kubernetes, can be used as the primary
interface for the pods.
Or it can be exposed to user-space via Vhost-vDPA, a chardev that can be
seen as an alternative to Vhost-user sockets; in this case it can for
example be used by QEMU or the DPDK Virtio-user PMD, and in Kubernetes
it can be used as a secondary interface.

Now comes VDUSE. VDUSE is a Kernel vDPA device, but instead of being a
physical device where the Virtio datapath is offloaded, the Virtio
datapath is offloaded to a user-space application. With this series, a
DPDK application, like OVS-DPDK for instance, can create VDUSE devices
and expose them either as regular netdevs, when binding them to the
Kernel Virtio-net driver via Virtio-vDPA, or as Vhost-vDPA interfaces to
be consumed by another userspace application like QEMU or a DPDK
application using the Virtio-user PMD. With this solution, OVS-DPDK
could serve both primary and secondary interfaces of Kubernetes pods.

I hope this clarifies things; I will add this information to the
cover letter for the next revisions. Let me know if anything is still
unclear.

I did a presentation at the last DPDK summit [0], maybe the diagrams
will help clarify things further.

Regards,
Maxime

> Thanks,
> Chenbo
> 
>>
>> And if it is a better alternative, perhaps the documentation should
>> mention that it is recommended over DPDK vDPA. Just like we started
>> recommending alternatives to the KNI driver, so we could phase it out and
>> eventually get rid of it.
>>
>>>
>>> Regards,
>>> Maxime
> 

[0]: 
https://static.sched.com/hosted_files/dpdkuserspace22/9f/Open%20DPDK%20to%20containers%20networking%20with%20VDUSE.pdf
  
Ferruh Yigit April 14, 2023, 10:48 a.m. UTC | #10
On 4/13/2023 8:59 AM, Maxime Coquelin wrote:
> Hi,
> 
> On 4/13/23 09:08, Xia, Chenbo wrote:
>>> -----Original Message-----
>>> From: Morten Brørup <mb@smartsharesystems.com>
>>> Sent: Thursday, April 13, 2023 3:41 AM
>>> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ferruh Yigit
>>> <ferruh.yigit@amd.com>; dev@dpdk.org; david.marchand@redhat.com; Xia,
>>> Chenbo <chenbo.xia@intel.com>; mkp@redhat.com; fbl@redhat.com;
>>> jasowang@redhat.com; Liang, Cunming <cunming.liang@intel.com>; Xie,
>>> Yongji
>>> <xieyongji@bytedance.com>; echaudro@redhat.com; eperezma@redhat.com;
>>> amorenoz@redhat.com
>>> Subject: RE: [RFC 00/27] Add VDUSE support to Vhost library
>>>
>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>> Sent: Wednesday, 12 April 2023 17.28
>>>>
>>>> Hi Ferruh,
>>>>
>>>> On 4/12/23 13:33, Ferruh Yigit wrote:
>>>>> On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
>>>>>> This series introduces a new type of backend, VDUSE,
>>>>>> to the Vhost library.
>>>>>>
>>>>>> VDUSE stands for vDPA device in Userspace, it enables
>>>>>> implementing a Virtio device in userspace and have it
>>>>>> attached to the Kernel vDPA bus.
>>>>>>
>>>>>> Once attached to the vDPA bus, it can be used either by
>>>>>> Kernel Virtio drivers, like virtio-net in our case, via
>>>>>> the virtio-vdpa driver. Doing that, the device is visible
>>>>>> to the Kernel networking stack and is exposed to userspace
>>>>>> as a regular netdev.
>>>>>>
>>>>>> It can also be exposed to userspace thanks to the
>>>>>> vhost-vdpa driver, via a vhost-vdpa chardev that can be
>>>>>> passed to QEMU or Virtio-user PMD.
>>>>>>
>>>>>> While VDUSE support is already available in upstream
>>>>>> Kernel, a couple of patches are required to support
>>>>>> network device type:
>>>>>>
>>>>>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>>>>>>
>>>>>> In order to attach the created VDUSE device to the vDPA
>>>>>> bus, a recent iproute2 version containing the vdpa tool is
>>>>>> required.
>>>>>
>>>>> Hi Maxime,
>>>>>
>>>>> Is this a replacement to the existing DPDK vDPA framework? What is the
>>>>> plan for long term?
>>>>>
>>>>
>>>> No, this is not a replacement for DPDK vDPA framework.
>>>>
>>>> We (Red Hat) don't have plans to support DPDK vDPA framework in our
>>>> products, but there are still contribution to DPDK vDPA by several vDPA
>>>> hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is going
>>>> to be deprecated soon.
>>>
>>> Ferruh's question made me curious...
>>>
>>> I don't know anything about VDUSE or vDPA, and don't use any of it, so
>>> consider me ignorant in this area.
>>>
>>> Is VDUSE an alternative to the existing DPDK vDPA framework? What are
>>> the
>>> differences, e.g. in which cases would an application developer (or
>>> user)
>>> choose one or the other?
>>
>> Maxime should give better explanation.. but let me just explain a bit.
>>
>> Vendors have vDPA HW that support vDPA framework (most likely in their
>> DPU/IPU
>> products). This work is introducing a way to emulate a SW vDPA device in
>> userspace (DPDK), and this SW vDPA device also supports vDPA framework.
>>
>> So it's not an alternative to existing DPDK vDPA framework :)
> 
> Correct.
> 
> When using DPDK vDPA, the datapath of a Vhost-user port is offloaded to
> a compatible physical NIC (i.e. a NIC that implements Virtio rings
> support), the control path remains the same as a regular Vhost-user
> port, i.e. it provides a Vhost-user unix socket to the application (like
> QEMU or DPDK Virtio-user PMD).
> 
> When using Kernel vDPA, the datapath is also offloaded to a vDPA
> compatible device, and the control path is managed by the vDPA bus.
> It can either be consumed by a Kernel Virtio device (here Virtio-net)
> when using Virtio-vDPA. In this case the device is exposed as a regular
> netdev and, in the case of Kubernetes, can be used as primary interfaces
> for the pods.
> Or it can be exposed to user-space via Vhost-vDPA, a chardev that can be
> seen as an alternative to Vhost-user sockets. In this case it can for
> example be used by QEMU or DPDK Virtio-user PMD. In Kubernetes, it can
> be used as a secondary interface.
> 
> Now comes VDUSE. VDUSE is a Kernel vDPA device, but instead of being a
> physical device where the Virtio datapath is offloaded, the Virtio
> datapath is offloaded to a user-space application. With this series, a
> DPDK application, like OVS-DPDK for instance, can create VDUSE device
> and expose them either as regular netdev when binding them to Kernel
> Virtio-net driver via Virtio-vDPA, or as Vhost-vDPA interface to be
> consumed by another userspace appliation like QEMU or DPDK application
> using Virtio-user PMD. With this solution, OVS-DPDK could serve both
> primary and secondary interfaces of Kubernetes pods.
> 
> I hope it clarifies, I will add these information in the cover-letter
> for next revisions. Let me know if anything is still unclear.
> 
> I did a presentation at last DPDK summit [0], maybe the diagrams will
> help to clarify furthermore.
> 

Thanks Chenbo and Maxime for the clarification.

After reading a little more I think I get it better; the slides [0]
were useful.

So this is more like a backend/handler, similar to vhost-user, although
it is vDPA device emulation.
Can you please describe the benefits of VDUSE compared to vhost-user in
more detail?

Also, what is the "VDUSE daemon" that is referred to a few times in the
documentation? Is it another userspace implementation of VDUSE?


>>>
>>> And if it is a better alternative, perhaps the documentation should
>>> mention that it is recommended over DPDK vDPA. Just like we started
>>> recommending alternatives to the KNI driver, so we could phase it out
>>> and
>>> eventually get rid of it.
>>>
>>>>
>>>> Regards,
>>>> Maxime
>>
> 
> [0]:
> https://static.sched.com/hosted_files/dpdkuserspace22/9f/Open%20DPDK%20to%20containers%20networking%20with%20VDUSE.pdf
>
  
Maxime Coquelin April 14, 2023, 12:06 p.m. UTC | #11
On 4/14/23 12:48, Ferruh Yigit wrote:
> On 4/13/2023 8:59 AM, Maxime Coquelin wrote:
>> Hi,
>>
>> On 4/13/23 09:08, Xia, Chenbo wrote:
>>>> -----Original Message-----
>>>> From: Morten Brørup <mb@smartsharesystems.com>
>>>> Sent: Thursday, April 13, 2023 3:41 AM
>>>> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ferruh Yigit
>>>> <ferruh.yigit@amd.com>; dev@dpdk.org; david.marchand@redhat.com; Xia,
>>>> Chenbo <chenbo.xia@intel.com>; mkp@redhat.com; fbl@redhat.com;
>>>> jasowang@redhat.com; Liang, Cunming <cunming.liang@intel.com>; Xie,
>>>> Yongji
>>>> <xieyongji@bytedance.com>; echaudro@redhat.com; eperezma@redhat.com;
>>>> amorenoz@redhat.com
>>>> Subject: RE: [RFC 00/27] Add VDUSE support to Vhost library
>>>>
>>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>>> Sent: Wednesday, 12 April 2023 17.28
>>>>>
>>>>> Hi Ferruh,
>>>>>
>>>>> On 4/12/23 13:33, Ferruh Yigit wrote:
>>>>>> On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
>>>>>>> This series introduces a new type of backend, VDUSE,
>>>>>>> to the Vhost library.
>>>>>>>
>>>>>>> VDUSE stands for vDPA device in Userspace, it enables
>>>>>>> implementing a Virtio device in userspace and have it
>>>>>>> attached to the Kernel vDPA bus.
>>>>>>>
>>>>>>> Once attached to the vDPA bus, it can be used either by
>>>>>>> Kernel Virtio drivers, like virtio-net in our case, via
>>>>>>> the virtio-vdpa driver. Doing that, the device is visible
>>>>>>> to the Kernel networking stack and is exposed to userspace
>>>>>>> as a regular netdev.
>>>>>>>
>>>>>>> It can also be exposed to userspace thanks to the
>>>>>>> vhost-vdpa driver, via a vhost-vdpa chardev that can be
>>>>>>> passed to QEMU or Virtio-user PMD.
>>>>>>>
>>>>>>> While VDUSE support is already available in upstream
>>>>>>> Kernel, a couple of patches are required to support
>>>>>>> network device type:
>>>>>>>
>>>>>>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>>>>>>>
>>>>>>> In order to attach the created VDUSE device to the vDPA
>>>>>>> bus, a recent iproute2 version containing the vdpa tool is
>>>>>>> required.
>>>>>>
>>>>>> Hi Maxime,
>>>>>>
>>>>>> Is this a replacement to the existing DPDK vDPA framework? What is the
>>>>>> plan for long term?
>>>>>>
>>>>>
>>>>> No, this is not a replacement for DPDK vDPA framework.
>>>>>
>>>>> We (Red Hat) don't have plans to support DPDK vDPA framework in our
>>>>> products, but there are still contribution to DPDK vDPA by several vDPA
>>>>> hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is going
>>>>> to be deprecated soon.
>>>>
>>>> Ferruh's question made me curious...
>>>>
>>>> I don't know anything about VDUSE or vDPA, and don't use any of it, so
>>>> consider me ignorant in this area.
>>>>
>>>> Is VDUSE an alternative to the existing DPDK vDPA framework? What are
>>>> the
>>>> differences, e.g. in which cases would an application developer (or
>>>> user)
>>>> choose one or the other?
>>>
>>> Maxime should give better explanation.. but let me just explain a bit.
>>>
>>> Vendors have vDPA HW that support vDPA framework (most likely in their
>>> DPU/IPU
>>> products). This work is introducing a way to emulate a SW vDPA device in
>>> userspace (DPDK), and this SW vDPA device also supports vDPA framework.
>>>
>>> So it's not an alternative to existing DPDK vDPA framework :)
>>
>> Correct.
>>
>> When using DPDK vDPA, the datapath of a Vhost-user port is offloaded to
>> a compatible physical NIC (i.e. a NIC that implements Virtio rings
>> support), the control path remains the same as a regular Vhost-user
>> port, i.e. it provides a Vhost-user unix socket to the application (like
>> QEMU or DPDK Virtio-user PMD).
>>
>> When using Kernel vDPA, the datapath is also offloaded to a vDPA
>> compatible device, and the control path is managed by the vDPA bus.
>> It can either be consumed by a Kernel Virtio device (here Virtio-net)
>> when using Virtio-vDPA. In this case the device is exposed as a regular
>> netdev and, in the case of Kubernetes, can be used as primary interfaces
>> for the pods.
>> Or it can be exposed to user-space via Vhost-vDPA, a chardev that can be
>> seen as an alternative to Vhost-user sockets. In this case it can for
>> example be used by QEMU or DPDK Virtio-user PMD. In Kubernetes, it can
>> be used as a secondary interface.
>>
>> Now comes VDUSE. VDUSE is a Kernel vDPA device, but instead of being a
>> physical device where the Virtio datapath is offloaded, the Virtio
>> datapath is offloaded to a user-space application. With this series, a
>> DPDK application, like OVS-DPDK for instance, can create VDUSE device
>> and expose them either as regular netdev when binding them to Kernel
>> Virtio-net driver via Virtio-vDPA, or as Vhost-vDPA interface to be
>> consumed by another userspace appliation like QEMU or DPDK application
>> using Virtio-user PMD. With this solution, OVS-DPDK could serve both
>> primary and secondary interfaces of Kubernetes pods.
>>
>> I hope it clarifies, I will add these information in the cover-letter
>> for next revisions. Let me know if anything is still unclear.
>>
>> I did a presentation at last DPDK summit [0], maybe the diagrams will
>> help to clarify furthermore.
>>
> 
> Thanks Chenbo, Maxime for clarification.
> 
> After reading a little more (I think) I got it better, slides [0] were
> useful.
> 
> So this is more like a backend/handler, similar to vhost-user, although
> it is vDPA device emulation.
> Can you please describe more the benefit of vduse comparing to vhost-user?

The main benefit is that a VDUSE device can be exposed as a regular
netdev, while this is not possible with Vhost-user.

> Also what is "VDUSE daemon", which is referred a few times in
> documentation, is it another userspace implementation of the vduse?

The VDUSE daemon is the application that implements the VDUSE device,
e.g. OVS-DPDK with the DPDK Vhost library using this series in our case.

Maxime
> 
>>>>
>>>> And if it is a better alternative, perhaps the documentation should
>>>> mention that it is recommended over DPDK vDPA. Just like we started
>>>> recommending alternatives to the KNI driver, so we could phase it out
>>>> and
>>>> eventually get rid of it.
>>>>
>>>>>
>>>>> Regards,
>>>>> Maxime
>>>
>>
>> [0]:
>> https://static.sched.com/hosted_files/dpdkuserspace22/9f/Open%20DPDK%20to%20containers%20networking%20with%20VDUSE.pdf
>>
>
  
Ferruh Yigit April 14, 2023, 2:25 p.m. UTC | #12
On 4/14/2023 1:06 PM, Maxime Coquelin wrote:
> 
> 
> On 4/14/23 12:48, Ferruh Yigit wrote:
>> On 4/13/2023 8:59 AM, Maxime Coquelin wrote:
>>> Hi,
>>>
>>> On 4/13/23 09:08, Xia, Chenbo wrote:
>>>>> -----Original Message-----
>>>>> From: Morten Brørup <mb@smartsharesystems.com>
>>>>> Sent: Thursday, April 13, 2023 3:41 AM
>>>>> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ferruh Yigit
>>>>> <ferruh.yigit@amd.com>; dev@dpdk.org; david.marchand@redhat.com; Xia,
>>>>> Chenbo <chenbo.xia@intel.com>; mkp@redhat.com; fbl@redhat.com;
>>>>> jasowang@redhat.com; Liang, Cunming <cunming.liang@intel.com>; Xie,
>>>>> Yongji
>>>>> <xieyongji@bytedance.com>; echaudro@redhat.com; eperezma@redhat.com;
>>>>> amorenoz@redhat.com
>>>>> Subject: RE: [RFC 00/27] Add VDUSE support to Vhost library
>>>>>
>>>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>>>> Sent: Wednesday, 12 April 2023 17.28
>>>>>>
>>>>>> Hi Ferruh,
>>>>>>
>>>>>> On 4/12/23 13:33, Ferruh Yigit wrote:
>>>>>>> On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
>>>>>>>> This series introduces a new type of backend, VDUSE,
>>>>>>>> to the Vhost library.
>>>>>>>>
>>>>>>>> VDUSE stands for vDPA device in Userspace, it enables
>>>>>>>> implementing a Virtio device in userspace and have it
>>>>>>>> attached to the Kernel vDPA bus.
>>>>>>>>
>>>>>>>> Once attached to the vDPA bus, it can be used either by
>>>>>>>> Kernel Virtio drivers, like virtio-net in our case, via
>>>>>>>> the virtio-vdpa driver. Doing that, the device is visible
>>>>>>>> to the Kernel networking stack and is exposed to userspace
>>>>>>>> as a regular netdev.
>>>>>>>>
>>>>>>>> It can also be exposed to userspace thanks to the
>>>>>>>> vhost-vdpa driver, via a vhost-vdpa chardev that can be
>>>>>>>> passed to QEMU or Virtio-user PMD.
>>>>>>>>
>>>>>>>> While VDUSE support is already available in upstream
>>>>>>>> Kernel, a couple of patches are required to support
>>>>>>>> network device type:
>>>>>>>>
>>>>>>>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
>>>>>>>>
>>>>>>>> In order to attach the created VDUSE device to the vDPA
>>>>>>>> bus, a recent iproute2 version containing the vdpa tool is
>>>>>>>> required.
>>>>>>>
>>>>>>> Hi Maxime,
>>>>>>>
>>>>>>> Is this a replacement to the existing DPDK vDPA framework? What
>>>>>>> is the
>>>>>>> plan for long term?
>>>>>>>
>>>>>>
>>>>>> No, this is not a replacement for DPDK vDPA framework.
>>>>>>
>>>>>> We (Red Hat) don't have plans to support DPDK vDPA framework in our
>>>>>> products, but there are still contribution to DPDK vDPA by several
>>>>>> vDPA
>>>>>> hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is
>>>>>> going
>>>>>> to be deprecated soon.
>>>>>
>>>>> Ferruh's question made me curious...
>>>>>
>>>>> I don't know anything about VDUSE or vDPA, and don't use any of it, so
>>>>> consider me ignorant in this area.
>>>>>
>>>>> Is VDUSE an alternative to the existing DPDK vDPA framework? What are
>>>>> the
>>>>> differences, e.g. in which cases would an application developer (or
>>>>> user)
>>>>> choose one or the other?
>>>>
>>>> Maxime should give better explanation.. but let me just explain a bit.
>>>>
>>>> Vendors have vDPA HW that support vDPA framework (most likely in their
>>>> DPU/IPU
>>>> products). This work is introducing a way to emulate a SW vDPA
>>>> device in
>>>> userspace (DPDK), and this SW vDPA device also supports vDPA framework.
>>>>
>>>> So it's not an alternative to existing DPDK vDPA framework :)
>>>
>>> Correct.
>>>
>>> When using DPDK vDPA, the datapath of a Vhost-user port is offloaded to
>>> a compatible physical NIC (i.e. a NIC that implements Virtio rings
>>> support), the control path remains the same as a regular Vhost-user
>>> port, i.e. it provides a Vhost-user unix socket to the application (like
>>> QEMU or DPDK Virtio-user PMD).
>>>
>>> When using Kernel vDPA, the datapath is also offloaded to a vDPA
>>> compatible device, and the control path is managed by the vDPA bus.
>>> It can either be consumed by a Kernel Virtio device (here Virtio-net)
>>> when using Virtio-vDPA. In this case the device is exposed as a regular
>>> netdev and, in the case of Kubernetes, can be used as primary interfaces
>>> for the pods.
>>> Or it can be exposed to user-space via Vhost-vDPA, a chardev that can be
>>> seen as an alternative to Vhost-user sockets. In this case it can for
>>> example be used by QEMU or DPDK Virtio-user PMD. In Kubernetes, it can
>>> be used as a secondary interface.
>>>
>>> Now comes VDUSE. VDUSE is a Kernel vDPA device, but instead of being a
>>> physical device where the Virtio datapath is offloaded, the Virtio
>>> datapath is offloaded to a user-space application. With this series, a
>>> DPDK application, like OVS-DPDK for instance, can create VDUSE device
>>> and expose them either as regular netdev when binding them to Kernel
>>> Virtio-net driver via Virtio-vDPA, or as Vhost-vDPA interface to be
>>> consumed by another userspace appliation like QEMU or DPDK application
>>> using Virtio-user PMD. With this solution, OVS-DPDK could serve both
>>> primary and secondary interfaces of Kubernetes pods.
>>>
>>> I hope it clarifies, I will add these information in the cover-letter
>>> for next revisions. Let me know if anything is still unclear.
>>>
>>> I did a presentation at last DPDK summit [0], maybe the diagrams will
>>> help to clarify furthermore.
>>>
>>
>> Thanks Chenbo, Maxime for clarification.
>>
>> After reading a little more (I think) I got it better, slides [0] were
>> useful.
>>
>> So this is more like a backend/handler, similar to vhost-user, although
>> it is vDPA device emulation.
>> Can you please describe more the benefit of vduse comparing to
>> vhost-user?
> 
> The main benefit is that VDUSE device can be exposed as a regular
> netdev, while this is not possible with Vhost-user.
> 

Got it, thanks. I think it would be better to highlight this in the
commit logs.

And out of curiosity:
why is there no virtio (guest) to virtio (host) communication support
(without vDPA), something like adding virtio as a backend to vhost-net?
Is it not needed, or are there technical difficulties?

>> Also what is "VDUSE daemon", which is referred a few times in
>> documentation, is it another userspace implementation of the vduse?
> 
> VDUSE daemon is the application that implements the VDUSE device, e.g.
> OVS-DPDK with DPDK Vhost library using this series in our case.
> 
> Maxime
>>
>>>>>
>>>>> And if it is a better alternative, perhaps the documentation should
>>>>> mention that it is recommended over DPDK vDPA. Just like we started
>>>>> recommending alternatives to the KNI driver, so we could phase it out
>>>>> and
>>>>> eventually get rid of it.
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Maxime
>>>>
>>>
>>> [0]:
>>> https://static.sched.com/hosted_files/dpdkuserspace22/9f/Open%20DPDK%20to%20containers%20networking%20with%20VDUSE.pdf
>>>
>>
>
  
Jason Wang April 17, 2023, 3:10 a.m. UTC | #13
On Fri, Apr 14, 2023 at 10:25 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
>
> On 4/14/2023 1:06 PM, Maxime Coquelin wrote:
> >
> >
> > On 4/14/23 12:48, Ferruh Yigit wrote:
> >> On 4/13/2023 8:59 AM, Maxime Coquelin wrote:
> >>> Hi,
> >>>
> >>> On 4/13/23 09:08, Xia, Chenbo wrote:
> >>>>> -----Original Message-----
> >>>>> From: Morten Brørup <mb@smartsharesystems.com>
> >>>>> Sent: Thursday, April 13, 2023 3:41 AM
> >>>>> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ferruh Yigit
> >>>>> <ferruh.yigit@amd.com>; dev@dpdk.org; david.marchand@redhat.com; Xia,
> >>>>> Chenbo <chenbo.xia@intel.com>; mkp@redhat.com; fbl@redhat.com;
> >>>>> jasowang@redhat.com; Liang, Cunming <cunming.liang@intel.com>; Xie,
> >>>>> Yongji
> >>>>> <xieyongji@bytedance.com>; echaudro@redhat.com; eperezma@redhat.com;
> >>>>> amorenoz@redhat.com
> >>>>> Subject: RE: [RFC 00/27] Add VDUSE support to Vhost library
> >>>>>
> >>>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> >>>>>> Sent: Wednesday, 12 April 2023 17.28
> >>>>>>
> >>>>>> Hi Ferruh,
> >>>>>>
> >>>>>> On 4/12/23 13:33, Ferruh Yigit wrote:
> >>>>>>> On 3/31/2023 4:42 PM, Maxime Coquelin wrote:
> >>>>>>>> This series introduces a new type of backend, VDUSE,
> >>>>>>>> to the Vhost library.
> >>>>>>>>
> >>>>>>>> VDUSE stands for vDPA device in Userspace, it enables
> >>>>>>>> implementing a Virtio device in userspace and have it
> >>>>>>>> attached to the Kernel vDPA bus.
> >>>>>>>>
> >>>>>>>> Once attached to the vDPA bus, it can be used either by
> >>>>>>>> Kernel Virtio drivers, like virtio-net in our case, via
> >>>>>>>> the virtio-vdpa driver. Doing that, the device is visible
> >>>>>>>> to the Kernel networking stack and is exposed to userspace
> >>>>>>>> as a regular netdev.
> >>>>>>>>
> >>>>>>>> It can also be exposed to userspace thanks to the
> >>>>>>>> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> >>>>>>>> passed to QEMU or Virtio-user PMD.
> >>>>>>>>
> >>>>>>>> While VDUSE support is already available in upstream
> >>>>>>>> Kernel, a couple of patches are required to support
> >>>>>>>> network device type:
> >>>>>>>>
> >>>>>>>> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> >>>>>>>>
> >>>>>>>> In order to attach the created VDUSE device to the vDPA
> >>>>>>>> bus, a recent iproute2 version containing the vdpa tool is
> >>>>>>>> required.
> >>>>>>>
> >>>>>>> Hi Maxime,
> >>>>>>>
> >>>>>>> Is this a replacement to the existing DPDK vDPA framework? What
> >>>>>>> is the
> >>>>>>> plan for long term?
> >>>>>>>
> >>>>>>
> >>>>>> No, this is not a replacement for DPDK vDPA framework.
> >>>>>>
> >>>>>> We (Red Hat) don't have plans to support DPDK vDPA framework in our
> >>>>>> products, but there are still contribution to DPDK vDPA by several
> >>>>>> vDPA
> >>>>>> hardware vendors (Intel, Nvidia, Xilinx), so I don't think it is
> >>>>>> going
> >>>>>> to be deprecated soon.
> >>>>>
> >>>>> Ferruh's question made me curious...
> >>>>>
> >>>>> I don't know anything about VDUSE or vDPA, and don't use any of it, so
> >>>>> consider me ignorant in this area.
> >>>>>
> >>>>> Is VDUSE an alternative to the existing DPDK vDPA framework? What are
> >>>>> the
> >>>>> differences, e.g. in which cases would an application developer (or
> >>>>> user)
> >>>>> choose one or the other?
> >>>>
> >>>> Maxime should give better explanation.. but let me just explain a bit.
> >>>>
> >>>> Vendors have vDPA HW that support vDPA framework (most likely in their
> >>>> DPU/IPU
> >>>> products). This work is introducing a way to emulate a SW vDPA
> >>>> device in
> >>>> userspace (DPDK), and this SW vDPA device also supports vDPA framework.
> >>>>
> >>>> So it's not an alternative to existing DPDK vDPA framework :)
> >>>
> >>> Correct.
> >>>
> >>> When using DPDK vDPA, the datapath of a Vhost-user port is offloaded to
> >>> a compatible physical NIC (i.e. a NIC that implements Virtio rings
> >>> support), the control path remains the same as a regular Vhost-user
> >>> port, i.e. it provides a Vhost-user unix socket to the application (like
> >>> QEMU or DPDK Virtio-user PMD).
> >>>
> >>> When using Kernel vDPA, the datapath is also offloaded to a vDPA
> >>> compatible device, and the control path is managed by the vDPA bus.
> >>> It can either be consumed by a Kernel Virtio device (here Virtio-net)
> >>> when using Virtio-vDPA. In this case the device is exposed as a regular
> >>> netdev and, in the case of Kubernetes, can be used as primary interfaces
> >>> for the pods.
> >>> Or it can be exposed to user-space via Vhost-vDPA, a chardev that can be
> >>> seen as an alternative to Vhost-user sockets. In this case it can for
> >>> example be used by QEMU or DPDK Virtio-user PMD. In Kubernetes, it can
> >>> be used as a secondary interface.
> >>>
> >>> Now comes VDUSE. VDUSE is a Kernel vDPA device, but instead of being a
> >>> physical device where the Virtio datapath is offloaded, the Virtio
> >>> datapath is offloaded to a user-space application. With this series, a
> >>> DPDK application, like OVS-DPDK for instance, can create VDUSE device
> >>> and expose them either as regular netdev when binding them to Kernel
> >>> Virtio-net driver via Virtio-vDPA, or as Vhost-vDPA interface to be
> >>> consumed by another userspace appliation like QEMU or DPDK application
> >>> using Virtio-user PMD. With this solution, OVS-DPDK could serve both
> >>> primary and secondary interfaces of Kubernetes pods.
> >>>
> >>> I hope it clarifies, I will add these information in the cover-letter
> >>> for next revisions. Let me know if anything is still unclear.
> >>>
> >>> I did a presentation at last DPDK summit [0], maybe the diagrams will
> >>> help to clarify furthermore.
> >>>
> >>
> >> Thanks Chenbo, Maxime for clarification.
> >>
> >> After reading a little more (I think) I got it better, slides [0] were
> >> useful.
> >>
> >> So this is more like a backend/handler, similar to vhost-user, although
> >> it is vDPA device emulation.
> >> Can you please describe more the benefit of vduse comparing to
> >> vhost-user?
> >
> > The main benefit is that VDUSE device can be exposed as a regular
> > netdev, while this is not possible with Vhost-user.
> >
>
> Got it, thanks. I think better to highlight this in commit logs.
>
> And out of curiosity,
> Why there is no virtio(guest) to virtio(host) communication support
> (without vdpa), something like adding virtio as backend to vhost-net, is
> it not needed or technical difficulties?

The main reason is that a lot of operations are not supported by virtio yet:

1) virtqueue saving and restoring
2) provisioning and management
3) address space IDs, etc.

Thanks

>
> >> Also what is "VDUSE daemon", which is referred a few times in
> >> documentation, is it another userspace implementation of the vduse?
> >
> > VDUSE daemon is the application that implements the VDUSE device, e.g.
> > OVS-DPDK with DPDK Vhost library using this series in our case.
> >
> > Maxime
> >>
> >>>>>
> >>>>> And if it is a better alternative, perhaps the documentation should
> >>>>> mention that it is recommended over DPDK vDPA. Just like we started
> >>>>> recommending alternatives to the KNI driver, so we could phase it out
> >>>>> and
> >>>>> eventually get rid of it.
> >>>>>
> >>>>>>
> >>>>>> Regards,
> >>>>>> Maxime
> >>>>
> >>>
> >>> [0]:
> >>> https://static.sched.com/hosted_files/dpdkuserspace22/9f/Open%20DPDK%20to%20containers%20networking%20with%20VDUSE.pdf
> >>>
> >>
> >
>
  
Chenbo Xia May 5, 2023, 5:53 a.m. UTC | #14
Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
> <chenbo.xia@intel.com>; mkp@redhat.com; fbl@redhat.com;
> jasowang@redhat.com; Liang, Cunming <cunming.liang@intel.com>; Xie, Yongji
> <xieyongji@bytedance.com>; echaudro@redhat.com; eperezma@redhat.com;
> amorenoz@redhat.com
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Subject: [RFC 00/27] Add VDUSE support to Vhost library
> 
> This series introduces a new type of backend, VDUSE,
> to the Vhost library.
> 
> VDUSE stands for vDPA device in Userspace, it enables
> implementing a Virtio device in userspace and have it
> attached to the Kernel vDPA bus.
> 
> Once attached to the vDPA bus, it can be used either by
> Kernel Virtio drivers, like virtio-net in our case, via
> the virtio-vdpa driver. Doing that, the device is visible
> to the Kernel networking stack and is exposed to userspace
> as a regular netdev.
> 
> It can also be exposed to userspace thanks to the
> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> passed to QEMU or Virtio-user PMD.
> 
> While VDUSE support is already available in upstream
> Kernel, a couple of patches are required to support
> network device type:
> 
> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> 
> In order to attach the created VDUSE device to the vDPA
> bus, a recent iproute2 version containing the vdpa tool is
> required.
> --
> 2.39.2

Btw: when this series is merged in the future, will Red Hat run all the
VDUSE test cases in every release?

Thanks,
Chenbo