[dpdk-dev,v2,4/4] vhost: destroy unused virtqueues when multiqueue not negotiated
Commit Message
QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues
declared on the QEMU command line before the guest is started.
As a result, the DPDK vhost-user backend allocates vrings
for all queues declared by QEMU.
If the first driver used does not support multiqueue,
the device never changes to the VIRTIO_DEV_RUNNING state, as only
the first queue pair is initialized. One driver affected by
this bug is iPXE's virtio-net driver, which does not support
the VIRTIO_NET_F_MQ feature.
It is safe to destroy unused virtqueues in the SET_FEATURES request
handler, as the device is guaranteed not to be in the running state
at this stage, so virtqueues aren't being processed.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
Comments
Hi Maxime,
On 12/05/17 09:34, Maxime Coquelin wrote:
> QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues
> declared in QEMU command line before the guest is started.
> It has the effect in DPDK vhost-user backend to allocate vrings
> for all queues declared by QEMU.
>
> If the first driver being used does not support multiqueue,
> the device never changes to VIRTIO_DEV_RUNNING state as only
> the first queue pair is initialized. One driver impacted by
> this bug is virtio-net's iPXE driver which does not support
> VIRTIO_NET_F_MQ feature.
>
> It is safe to destroy unused virtqueues in SET_FEATURES request
> handler, as it is ensured the device is not in running state
> at this stage, so virtqueues aren't being processed.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_vhost/vhost_user.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index a5e1f2482..b17080215 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -173,6 +173,7 @@ vhost_user_get_features(struct virtio_net *dev)
> static int
> vhost_user_set_features(struct virtio_net *dev, uint64_t features)
> {
> + int i;
> uint64_t vhost_features = 0;
>
> rte_vhost_driver_get_features(dev->ifname, &vhost_features);
> @@ -216,6 +217,24 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
> (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
> (dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
>
> + if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
> + /*
> + * Remove all but first queue pair if MQ hasn't been
> + * negotiated. This is safe because the device is not
> + * running at this stage.
> + */
> + for (i = dev->nr_vring; i > 1; i--) {
> + struct vhost_virtqueue *vq = dev->virtqueue[i];
Sorry, I don't have any experience with DPDK.
But if "dev->virtqueue" has "dev->nr_vring" elements, then this loop is
off by one; dev->virtqueue[dev->nr_vring] should never be accessed. For
example, if you have three queues, numbered 0, 1 and 2, this loop will
access/release virtqueue[3] (bad) and virtqueue[2] (good).
Instead, I suggest:
i = dev->nr_vring;
while (i > 2) {
struct vhost_virtqueue *vq;
vq = dev->virtqueue[--i];
/* the rest here */
}
The loop body is entered with "i" standing for "how many queues are left
that should be freed".
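To make the index arithmetic concrete, here is a minimal sketch that only records which array indices each loop shape would touch. The helper names (visited_as_posted, visited_as_suggested) are hypothetical and not part of DPDK; no real virtqueue is involved.

```c
#include <stddef.h>

/*
 * Hypothetical helper: indices touched by the loop as posted in the patch,
 *     for (i = dev->nr_vring; i > 1; i--) ... dev->virtqueue[i]
 * The loop condition tests "i", so decrementing nr_vring in the body does
 * not change which indices are visited.
 */
static size_t visited_as_posted(int nr_vring, int *out)
{
    size_t n = 0;

    for (int i = nr_vring; i > 1; i--)
        out[n++] = i;           /* first visit is index nr_vring: one past the end */
    return n;
}

/*
 * Hypothetical helper: indices touched by the suggested loop,
 *     i = dev->nr_vring; while (i > 2) ... dev->virtqueue[--i]
 * "i" counts the rings still present, so --i is always a valid index.
 */
static size_t visited_as_suggested(int nr_vring, int *out)
{
    size_t n = 0;
    int i = nr_vring;

    while (i > 2)
        out[n++] = --i;         /* first visit is index nr_vring - 1 */
    return n;
}
```

With nr_vring = 4 the first helper records 4, 3, 2 and the second records 3, 2; in both cases indices 0 and 1 (the first queue pair) are left alone.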
Thanks
Laszlo
> +
> + if (!vq)
> + continue;
> +
> + cleanup_vq(vq, 1);
> + free_vq(vq);
> + dev->nr_vring--;
> + }
> + }
> +
> return 0;
> }
>
>
Hi Laszlo,
On 12/05/2017 03:40 PM, Laszlo Ersek wrote:
> Hi Maxime,
>
> On 12/05/17 09:34, Maxime Coquelin wrote:
>> QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues
>> declared in QEMU command line before the guest is started.
>> It has the effect in DPDK vhost-user backend to allocate vrings
>> for all queues declared by QEMU.
>>
>> If the first driver being used does not support multiqueue,
>> the device never changes to VIRTIO_DEV_RUNNING state as only
>> the first queue pair is initialized. One driver impacted by
>> this bug is virtio-net's iPXE driver which does not support
>> VIRTIO_NET_F_MQ feature.
>>
>> It is safe to destroy unused virtqueues in SET_FEATURES request
>> handler, as it is ensured the device is not in running state
>> at this stage, so virtqueues aren't being processed.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_vhost/vhost_user.c | 19 +++++++++++++++++++
>> 1 file changed, 19 insertions(+)
>>
>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>> index a5e1f2482..b17080215 100644
>> --- a/lib/librte_vhost/vhost_user.c
>> +++ b/lib/librte_vhost/vhost_user.c
>> @@ -173,6 +173,7 @@ vhost_user_get_features(struct virtio_net *dev)
>> static int
>> vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>> {
>> + int i;
>> uint64_t vhost_features = 0;
>>
>> rte_vhost_driver_get_features(dev->ifname, &vhost_features);
>> @@ -216,6 +217,24 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>> (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
>> (dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
>>
>> + if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
>> + /*
>> + * Remove all but first queue pair if MQ hasn't been
>> + * negotiated. This is safe because the device is not
>> + * running at this stage.
>> + */
>> + for (i = dev->nr_vring; i > 1; i--) {
>> + struct vhost_virtqueue *vq = dev->virtqueue[i];
>
> Sorry, I don't have any experience with dpdk.
>
> But, if "dev->virtqueue" has "dev->nr_vring" elements, then this loop is
> off-by one; dev->virtqueue[dev->nr_vring] should never be accessed. For
> example, if you have three queues, numbered 0, 1 and 2, this loop will
> access/release virtqueue[3] (bad) and virtqueue[2] (good).
Right, thanks for spotting this.
I didn't notice my mistake while testing it because of the NULL check
in the loop.
> Instead, I suggest:
>
> i = dev->nr_vring;
> while (i > 2) {
> struct vhost_virtqueue *vq;
>
> vq = dev->virtqueue[--i];
> /* the rest here */
> }
>
> The loop body is entered with "i" standing for "how many queues are left
> that should be freed".
Yes, that sounds cleaner. I think dev->nr_vring can safely be
decremented directly, so "i" could be skipped.
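A rough sketch of that idea, using stand-in types rather than the real DPDK structures: mock_dev, mock_vq, MOCK_MAX_VRING and mock_drop_unused_vrings are all hypothetical names, and free() stands in for the cleanup_vq()/free_vq() pair from the patch. One difference from the posted loop is an assumption of this sketch: the counter drops even when a slot is NULL, and the slot is cleared after release.

```c
#include <stdlib.h>

#define MOCK_MAX_VRING 8        /* assumed capacity, for illustration only */

struct mock_vq {
    int id;
};

struct mock_dev {
    unsigned int nr_vring;                      /* number of rings currently tracked */
    struct mock_vq *virtqueue[MOCK_MAX_VRING];
};

/* Release every ring above the first queue pair (indices 0 and 1). */
static void mock_drop_unused_vrings(struct mock_dev *dev)
{
    while (dev->nr_vring > 2) {
        /* Pre-decrement: the index used is always below the old count. */
        struct mock_vq *vq = dev->virtqueue[--dev->nr_vring];

        if (!vq)
            continue;           /* slot was never allocated */

        dev->virtqueue[dev->nr_vring] = NULL;
        free(vq);               /* stand-in for cleanup_vq() + free_vq() */
    }
}
```

Because the counter itself is the loop variable, there is no separate "i" that can drift out of step with nr_vring.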
Thanks!
Maxime
> Thanks
> Laszlo
>
>> +
>> + if (!vq)
>> + continue;
>> +
>> + cleanup_vq(vq, 1);
>> + free_vq(vq);
>> + dev->nr_vring--;
>> + }
>> + }
>> +
>> return 0;
>> }
>>
>>
>
@@ -173,6 +173,7 @@ vhost_user_get_features(struct virtio_net *dev)
static int
vhost_user_set_features(struct virtio_net *dev, uint64_t features)
{
+ int i;
uint64_t vhost_features = 0;
rte_vhost_driver_get_features(dev->ifname, &vhost_features);
@@ -216,6 +217,24 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
+ if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
+ /*
+ * Remove all but first queue pair if MQ hasn't been
+ * negotiated. This is safe because the device is not
+ * running at this stage.
+ */
+ for (i = dev->nr_vring; i > 1; i--) {
+ struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+ if (!vq)
+ continue;
+
+ cleanup_vq(vq, 1);
+ free_vq(vq);
+ dev->nr_vring--;
+ }
+ }
+
return 0;
}