[v11,0/9] implement packed virtqueues

Message ID 20181203141515.28368-1-jfreimann@redhat.com (mailing list archive)

Message

Jens Freimann Dec. 3, 2018, 2:15 p.m. UTC
This is a basic implementation of packed virtqueues as specified in the
Virtio 1.1 draft. A compiled version of the current draft is available
at https://github.com/oasis-tcs/virtio-docs.git (or as .pdf at
https://github.com/oasis-tcs/virtio-docs/blob/master/virtio-v1.1-packed-wd10.pdf).

A packed virtqueue differs from a split virtqueue in that it consists of
only a single descriptor ring, which replaces the available ring, the
used ring, and their index and descriptor pointers.
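
To make the difference concrete, here is a rough sketch of the two
layouts. The field layout follows the Virtio spec as far as I know, but
the struct names are made up for illustration and are not the ones used
in this series; endianness and alignment attributes are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split layout: three separate areas that refer to each other by index. */
struct split_desc {
	uint64_t addr;   /* buffer address */
	uint32_t len;    /* buffer length */
	uint16_t flags;  /* e.g. chaining, write-only */
	uint16_t next;   /* index of the next descriptor in a chain */
};

struct split_avail {
	uint16_t flags;
	uint16_t idx;    /* where the driver puts the next entry */
	uint16_t ring[]; /* descriptor indices offered to the device */
};

struct split_used_elem {
	uint32_t id;     /* head of the used descriptor chain */
	uint32_t len;    /* bytes written by the device */
};

struct split_used {
	uint16_t flags;
	uint16_t idx;    /* where the device puts the next entry */
	struct split_used_elem ring[];
};

/* Packed layout: one ring; the flags carry the available/used state. */
struct packed_desc {
	uint64_t addr;   /* buffer address */
	uint32_t len;    /* buffer length */
	uint16_t id;     /* buffer id, returned by the device when used */
	uint16_t flags;  /* includes the available and used bits */
};

static_assert(sizeof(struct packed_desc) == 16,
	      "a packed descriptor is 16 bytes");
static_assert(offsetof(struct packed_desc, flags) == 14,
	      "flags are the last 16 bits of the descriptor");
```

Note that the packed descriptor has no next field and there are no
separate index fields: both sides walk the same ring in order.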

Each descriptor is readable and writable and has a flags field. The
flags mark whether a descriptor is available or used. To detect newly
available descriptors even after the ring has wrapped, the device and
the driver each keep a single-bit wrap counter that is flipped (from 0
to 1 and vice versa) every time the last descriptor in the ring is used
or made available.
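
A minimal sketch of how the flags and the wrap counter interact,
assuming the bit positions from the spec draft (available = bit 7,
used = bit 15) and wrap counters that start at 1. All names below are
hypothetical and not taken from the patches, and the memory barriers a
real driver needs around the flags access are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_F_AVAIL (1u << 7)   /* available bit (assumed from the spec) */
#define DESC_F_USED  (1u << 15)  /* used bit (assumed from the spec) */

/* Per-side ring state: position in the ring plus a 1-bit wrap counter. */
struct ring_side {
	uint16_t size;
	uint16_t next;
	bool wrap;
};

static void side_init(struct ring_side *s, uint16_t size)
{
	assert(size > 0);
	s->size = size;
	s->next = 0;
	s->wrap = true; /* wrap counters start at 1 */
}

/* Move to the next slot; flip the wrap counter after the last one. */
static void side_advance(struct ring_side *s)
{
	if (++s->next == s->size) {
		s->next = 0;
		s->wrap = !s->wrap;
	}
}

/* Driver: available bit = wrap counter, used bit = its inverse. */
static uint16_t flags_make_avail(bool wrap)
{
	return wrap ? DESC_F_AVAIL : DESC_F_USED;
}

/* Device: both bits = wrap counter. */
static uint16_t flags_make_used(bool wrap)
{
	return wrap ? (DESC_F_AVAIL | DESC_F_USED) : 0;
}

static bool flags_is_avail(uint16_t flags, bool wrap)
{
	bool avail = (flags & DESC_F_AVAIL) != 0;
	bool used = (flags & DESC_F_USED) != 0;

	return avail == wrap && used != wrap;
}

static bool flags_is_used(uint16_t flags, bool wrap)
{
	bool avail = (flags & DESC_F_AVAIL) != 0;
	bool used = (flags & DESC_F_USED) != 0;

	return avail == used && used == wrap;
}
```

With this encoding a descriptor that was marked used in the previous
lap does not look available in the next one, because the reader's wrap
counter has flipped in the meantime.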

With this patch set I see a small performance drop compared to split
virtqueues. I tested according to
http://doc.dpdk.org/guides/howto/pvp_reference_benchmark.html and I see
a drop of 3-4 percent in PVP and similar numbers when doing txonly
tests. I tested with DPDK v18.11 in the host and Wei's v3 QEMU series
for packed ring [1]. The root cause of this is still under investigation
and will hopefully be fixed in the next version of this patch set. I'm
posting this in its current state so people can review it and possibly
run their own tests.

regards,
Jens 

[1] https://github.com/jensfr/qemu/tree/wexu-packed-ring-v3

v10-v11:
 * this version includes some fixes from Tiwei, so I added his
   Signed-off-by to some of the patches
 * fix hang with mergeable rx buffers (Tiwei)
 * clean-up code and simplify buffer handling (Tiwei)
 * rebase to current virtio-next master branch

v9-v10:
 * don't mix index into buffer list and descriptors
 * whitespace and formatting issues
 * remove "VQ:" in dump virtqueue patch
 * add extra packed vring struct to virtqueue and change function
   prototypes and code accordingly
 * move wrap_counters to virtqueue
 * make if-conditions for packed and !packed more clear in
   set_rxtx_funcs()
 * initialize wrap counters in first patch, instead of rx and tx
   implementation patch
 * mark virtio-user as not supported with packed virtqueues (to
   be fixed in another patch set?)

v8-v9:
 * fix virtio_ring_free_chain_packed() to handle descriptors
   correctly in case of out-of-order
 * fix check in virtqueue_xmit_cleanup_packed() to improve performance

v7-v8:
 * move desc_is_used change to correct patch
 * remove trailing newline
 * correct xmit code, flags update and memory barrier
 * move packed desc init to dedicated function, split
   and packed variant


Jens Freimann (8):
  net/virtio: vring init for packed queues
  net/virtio: add packed virtqueue defines
  net/virtio: add packed virtqueue helpers
  net/virtio: dump packed virtqueue data
  net/virtio: implement transmit path for packed queues
  net/virtio: implement receive path for packed queues
  net/virtio: add virtio send command packed queue support
  net/virtio: enable packed virtqueues by default

Yuanhan Liu (1):
  net/virtio-user: add option to use packed queues

 drivers/net/virtio/virtio_ethdev.c            | 227 +++++--
 drivers/net/virtio/virtio_ethdev.h            |   8 +
 drivers/net/virtio/virtio_pci.h               |   7 +
 drivers/net/virtio/virtio_ring.h              |  64 +-
 drivers/net/virtio/virtio_rxtx.c              | 604 +++++++++++++++++-
 .../net/virtio/virtio_user/virtio_user_dev.c  |  19 +-
 .../net/virtio/virtio_user/virtio_user_dev.h  |   2 +-
 drivers/net/virtio/virtio_user_ethdev.c       |  14 +-
 drivers/net/virtio/virtqueue.c                |  22 +
 drivers/net/virtio/virtqueue.h                | 122 +++-
 10 files changed, 1014 insertions(+), 75 deletions(-)

Comments

Jens Freimann Dec. 3, 2018, 3:29 p.m. UTC | #1
On Mon, Dec 03, 2018 at 03:15:06PM +0100, Jens Freimann wrote:
>
>This is a basic implementation of packed virtqueues as specified in the
>Virtio 1.1 draft. A compiled version of the current draft is available
>at https://github.com/oasis-tcs/virtio-docs.git (or as .pdf at
>https://github.com/oasis-tcs/virtio-docs/blob/master/virtio-v1.1-packed-wd10.pdf
>
>A packed virtqueue is different from a split virtqueue in that it
>consists of only a single descriptor ring that replaces available and
>used ring, index and descriptor pointers.
>
>Each descriptor is readable and writable and has a flags field. These flags
>will mark if a descriptor is available or used.  To detect new available descriptors
>even after the ring has wrapped, device and driver each have a
>single-bit wrap counter that is flipped from 0 to 1 and vice versa every time
>the last descriptor in the ring is used/made available.
>
>With this patch set I see a slight performance drop compared to split
>virtqueues. I tested according to
>http://doc.dpdk.org/guides/howto/pvp_reference_benchmark.html and I see
>a small performance drop of 3-4 percent in PVP and similar numbers

It's actually bigger with mergeable rx buffers turned off. I measured
13% fewer Mpps with packed virtqueues.

regards,
Jens

Maxime Coquelin Dec. 3, 2018, 3:34 p.m. UTC | #2
On 12/3/18 4:29 PM, Jens Freimann wrote:
> On Mon, Dec 03, 2018 at 03:15:06PM +0100, Jens Freimann wrote:
>>
>> This is a basic implementation of packed virtqueues as specified in the
>> Virtio 1.1 draft. A compiled version of the current draft is available
>> at https://github.com/oasis-tcs/virtio-docs.git (or as .pdf at
>> https://github.com/oasis-tcs/virtio-docs/blob/master/virtio-v1.1-packed-wd10.pdf
>>
>> A packed virtqueue is different from a split virtqueue in that it
>> consists of only a single descriptor ring that replaces available and
>> used ring, index and descriptor pointers.
>>
>> Each descriptor is readable and writable and has a flags field. These flags
>> will mark if a descriptor is available or used.  To detect new available descriptors
>> even after the ring has wrapped, device and driver each have a
>> single-bit wrap counter that is flipped from 0 to 1 and vice versa every time
>> the last descriptor in the ring is used/made available.
>>
>> With this patch set I see a slight performance drop compared to split
>> virtqueues. I tested according to
>> http://doc.dpdk.org/guides/howto/pvp_reference_benchmark.html and I see
>> a small performance drop of 3-4 percent in PVP and similar numbers
> 
> It's actually bigger with mergeable rx buffers turned off. I measured
> 13% fewer Mpps with packed virtqueues.

That's interesting. I just tried the rxonly micro-benchmark, and I get
almost the same performance for the mergeable and non-mergeable cases:
MRG ON:  14.45 Mpps
MRG OFF: 14.57 Mpps

> regards,
> Jens