[v2] vhost: avoid memory barriers when no descriptors dequeued
Commit Message
In both the split and packed dequeue paths, the flush_shadow_used_ring
and vhost_vring_call variants get called even if no packets
have been dequeued, and so no descriptor updates happened.
This has an impact on the CPU pipeline, as memory barriers are used
in these functions.
This patch skips these calls if no descriptors have
been dequeued. There should be no performance gain with split ring
when dequeue zero-copy is disabled, but the gain should be
noticeable with packed ring or with dequeue zero-copy enabled.
Fixes: ae999ce49dcb ("vhost: add Tx support for packed ring")
Fixes: 915cf9404225 ("vhost: use shadow used ring in dequeue path")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
Changes in v2:
- Fix shadow_used_idx reset in error path (Tiwei)
lib/librte_vhost/virtio_net.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
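To illustrate the effect described in the commit message, here is a minimal standalone sketch, not the vhost code itself. The names mock_vq, mock_flush, mock_call, mock_dequeue and fences_on_empty_ring are hypothetical stand-ins, and the fences only model the barriers the commit message says the real functions use. It counts how many fences are issued when an empty ring is polled, with and without the guard on shadow_used_idx.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for struct vhost_virtqueue (illustration only). */
struct mock_vq {
	uint16_t shadow_used_idx; /* used entries waiting to be flushed */
	unsigned int barriers;    /* fences issued so far */
};

/* Models flush_shadow_used_ring_*(): publish entries behind a fence. */
static void mock_flush(struct mock_vq *vq)
{
	atomic_thread_fence(memory_order_release);
	vq->barriers++;
	vq->shadow_used_idx = 0;
}

/* Models vhost_vring_call_*(): assumed to fence before notifying. */
static void mock_call(struct mock_vq *vq)
{
	atomic_thread_fence(memory_order_seq_cst);
	vq->barriers++;
}

/* One poll that consumed 'dequeued' descriptors. */
static void mock_dequeue(struct mock_vq *vq, uint16_t dequeued, int guarded)
{
	vq->shadow_used_idx = dequeued;
	/* With the guard, nothing is flushed when nothing was dequeued. */
	if (!guarded || vq->shadow_used_idx) {
		mock_flush(vq);
		mock_call(vq);
	}
}

/* Poll an empty ring 'polls' times; return the number of fences issued. */
static unsigned int fences_on_empty_ring(unsigned int polls, int guarded)
{
	struct mock_vq vq = { 0, 0 };

	for (unsigned int n = 0; n < polls; n++)
		mock_dequeue(&vq, 0, guarded);

	assert(vq.shadow_used_idx == 0); /* nothing left pending either way */
	return vq.barriers;
}
```

In this model, fences_on_empty_ring(1000, 0) issues two fences per poll, while fences_on_empty_ring(1000, 1) issues none. The real cost per barrier depends on the CPU and is not measured here.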
Comments
On Tue, Oct 23, 2018 at 12:07:10PM +0200, Maxime Coquelin wrote:
>In both the split and packed dequeue paths, the flush_shadow_used_ring
>and vhost_vring_call variants get called even if no packets
>have been dequeued, and so no descriptor updates happened.
>
>This has an impact on the CPU pipeline, as memory barriers are used
>in these functions.
>
>This patch skips these calls if no descriptors have
>been dequeued. There should be no performance gain with split ring
>when dequeue zero-copy is disabled, but the gain should be
>noticeable with packed ring or with dequeue zero-copy enabled.
I tried this with the packed ring PMD patch series v8 and it works fine.
It doesn't hurt performance either; I see improvements when sending
packets from guest to host.
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Tested-by: Jens Freimann <jfreimann@redhat.com>
regards,
Jens
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Applied to dpdk-next-virtio/master.
Maxime
@@ -1360,8 +1360,10 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			}
 		}
 
-		flush_shadow_used_ring_split(dev, vq);
-		vhost_vring_call_split(dev, vq);
+		if (likely(vq->shadow_used_idx)) {
+			flush_shadow_used_ring_split(dev, vq);
+			vhost_vring_call_split(dev, vq);
+		}
 	}
 
 	rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
@@ -1440,8 +1442,10 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		do_data_copy_dequeue(vq);
 		if (unlikely(i < count))
 			vq->shadow_used_idx = i;
-		flush_shadow_used_ring_split(dev, vq);
-		vhost_vring_call_split(dev, vq);
+		if (likely(vq->shadow_used_idx)) {
+			flush_shadow_used_ring_split(dev, vq);
+			vhost_vring_call_split(dev, vq);
+		}
 	}
 
 	return i;
@@ -1476,8 +1480,10 @@ virtio_dev_tx_packed(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			}
 		}
 
-		flush_shadow_used_ring_packed(dev, vq);
-		vhost_vring_call_packed(dev, vq);
+		if (likely(vq->shadow_used_idx)) {
+			flush_shadow_used_ring_packed(dev, vq);
+			vhost_vring_call_packed(dev, vq);
+		}
 	}
 
 	VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
@@ -1555,8 +1561,10 @@ virtio_dev_tx_packed(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		do_data_copy_dequeue(vq);
 		if (unlikely(i < count))
 			vq->shadow_used_idx = i;
-		flush_shadow_used_ring_packed(dev, vq);
-		vhost_vring_call_packed(dev, vq);
+		if (likely(vq->shadow_used_idx)) {
+			flush_shadow_used_ring_packed(dev, vq);
+			vhost_vring_call_packed(dev, vq);
+		}
 	}
 
 	return i;
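
In the copy-path hunks, shadow_used_idx is first trimmed to i when fewer than count packets were produced, and only then tested by the new guard. The sketch below models that ordering in isolation. It is an illustration under assumptions, not the vhost implementation: tail_vq, tail_flush and dequeue_tail are hypothetical names, and the 'shadowed' argument stands for whatever number of shadow entries the dequeue loop left behind.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the virtqueue fields used by the hunk. */
struct tail_vq {
	uint16_t shadow_used_idx; /* shadow entries pending a flush */
	unsigned int flushes;     /* how many times a flush was performed */
};

/* Models the flush + call pair; the real functions do much more. */
static void tail_flush(struct tail_vq *vq)
{
	vq->flushes++;
	vq->shadow_used_idx = 0;
}

/*
 * Tail of a dequeue: 'shadowed' entries were recorded by the loop,
 * 'i' packets were actually produced out of 'count' requested.
 */
static uint16_t dequeue_tail(struct tail_vq *vq, uint16_t shadowed,
			     uint16_t i, uint16_t count)
{
	vq->shadow_used_idx = shadowed;

	/* Error path: keep only the entries matching returned packets. */
	if (i < count)
		vq->shadow_used_idx = i;

	/* Guard from the patch: skip the flush when nothing remains. */
	if (vq->shadow_used_idx)
		tail_flush(vq);

	return i;
}
```

With this ordering, a call that fails on its first packet (i == 0) resets shadow_used_idx to 0 and skips the flush, while a partial success still flushes the entries for the packets it returns.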