vhost: flush shadow tx if there are no more packets

Message ID 20200129193310.9157-1-eperezma@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Maxime Coquelin
Series: vhost: flush shadow tx if there are no more packets

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/travis-robot warning Travis build: failed
ci/Intel-compilation fail apply issues

Commit Message

Eugenio Perez Martin Jan. 29, 2020, 7:33 p.m. UTC
The current implementation of vhost_net with packed vrings tries to
fill the shadow vector before sending any actual changes to the guest.
While this can be beneficial for throughput, it conflicts with some
bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
transmitting packets if there are too many bytes/buffers in the
driver.

To solve this, we flush the shadow packets at the end of
virtio_dev_tx_packed if we have starved the vring, i.e., the next
buffer is not available for the device.

Since this last check can be expensive because of the atomic, we only
perform it if we have not obtained the expected (count) packets. If we
happen to obtain "count" packets and there are no more available
packets, the caller needs to call virtio_dev_tx_packed again.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)
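
As an illustration of the last paragraph of the commit message: an
application that dequeues through rte_vhost_dequeue_burst() would typically
keep calling while full bursts come back, since a full burst means the flush
added here may have been skipped. The sketch below is hypothetical and not
part of the patch; drain_tx_queue, BURST_SIZE, vid, queue_id and mbuf_pool
are made-up names for illustration only.

    #include <rte_mbuf.h>
    #include <rte_mempool.h>
    #include <rte_vhost.h>

    #define BURST_SIZE 32   /* assumed burst size, not taken from the patch */

    /*
     * Hypothetical application-side drain loop: when exactly BURST_SIZE
     * packets are returned, the shadow flush in virtio_dev_tx_packed may
     * not have happened yet, so we simply dequeue again.
     */
    static void
    drain_tx_queue(int vid, uint16_t queue_id, struct rte_mempool *mbuf_pool)
    {
            struct rte_mbuf *pkts[BURST_SIZE];
            uint16_t nb_rx;

            do {
                    nb_rx = rte_vhost_dequeue_burst(vid, queue_id, mbuf_pool,
                                                    pkts, BURST_SIZE);
                    /* forward or otherwise consume the mbufs here */
                    rte_pktmbuf_free_bulk(pkts, nb_rx);
            } while (nb_rx == BURST_SIZE);
    }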
  

Comments

Kevin Traynor Jan. 31, 2020, 6:38 p.m. UTC | #1
Hi Eugenio,

On 29/01/2020 19:33, Eugenio Pérez wrote:
> The current implementation of vhost_net with packed vrings tries to
> fill the shadow vector before sending any actual changes to the guest.
> While this can be beneficial for throughput, it conflicts with some
> bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
> transmitting packets if there are too many bytes/buffers in the
> driver.
> 
> To solve this, we flush the shadow packets at the end of
> virtio_dev_tx_packed if we have starved the vring, i.e., the next
> buffer is not available for the device.
> 
> Since this last check can be expensive because of the atomic, we only
> perform it if we have not obtained the expected (count) packets. If we
> happen to obtain "count" packets and there are no more available
> packets, the caller needs to call virtio_dev_tx_packed again.
> 

It seems to be fixing an issue and should be considered for stable
branches? You can add the tags needed in the commit message here:

Fixes: <commit that introduced bug/missed this case>
Cc: stable@dpdk.org

> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 21c311732..ac2842b2d 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -2133,6 +2133,20 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>  	return pkt_idx;
>  }
>  
> +static __rte_always_inline bool
> +next_desc_is_avail(const struct vhost_virtqueue *vq)
> +{
> +	bool wrap_counter = vq->avail_wrap_counter;
> +	uint16_t next_used_idx = vq->last_used_idx + 1;
> +
> +	if (next_used_idx >= vq->size) {
> +		next_used_idx -= vq->size;
> +		wrap_counter ^= 1;
> +	}
> +
> +	return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
> +}
> +
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed(struct virtio_net *dev,
>  		     struct vhost_virtqueue *vq,
> @@ -2165,9 +2179,20 @@ virtio_dev_tx_packed(struct virtio_net *dev,
>  
>  	} while (remained);
>  
> -	if (vq->shadow_used_idx)
> +	if (vq->shadow_used_idx) {
>  		do_data_copy_dequeue(vq);
>  
> +		if (remained && !next_desc_is_avail(vq)) {
> +			/*
> +			 * The guest may be waiting to TX some buffers to
> +			 * enqueue more to avoid bufferbloat, so we try to
> +			 * reduce latency here.
> +			 */
> +			vhost_flush_dequeue_shadow_packed(dev, vq);
> +			vhost_vring_call_packed(dev, vq);
> +		}
> +	}
> +
>  	return pkt_idx;
>  }
>  
>
  
Eugenio Perez Martin Feb. 4, 2020, 9:23 a.m. UTC | #2
Hi Kevin!

Sorry, thanks for noticing it! It fixes commit 31d6c6a5b ("vhost: optimize
packed ring dequeue"), which was not present in the 18.11 version (I've
checked that v19.08 does not contain the failure).
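
For reference, the resulting tags would read roughly like this in the commit
message (the hash is abbreviated here exactly as quoted above; the full
commit hash would normally be used):

    Fixes: 31d6c6a5b ("vhost: optimize packed ring dequeue")
    Cc: stable@dpdk.org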

Do I need to send another patch version with corrected commit message?

Thanks!

On Fri, Jan 31, 2020 at 7:38 PM Kevin Traynor <ktraynor@redhat.com> wrote:

> Hi Eugenio,
>
> On 29/01/2020 19:33, Eugenio Pérez wrote:
> > The current implementation of vhost_net with packed vrings tries to
> > fill the shadow vector before sending any actual changes to the guest.
> > While this can be beneficial for throughput, it conflicts with some
> > bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
> > transmitting packets if there are too many bytes/buffers in the
> > driver.
> >
> > To solve this, we flush the shadow packets at the end of
> > virtio_dev_tx_packed if we have starved the vring, i.e., the next
> > buffer is not available for the device.
> >
> > Since this last check can be expensive because of the atomic, we only
> > perform it if we have not obtained the expected (count) packets. If we
> > happen to obtain "count" packets and there are no more available
> > packets, the caller needs to call virtio_dev_tx_packed again.
> >
>
> It seems to be fixing an issue and should be considered for stable
> branches? You can add the tags needed in the commit message here:
>
> Fixes: <commit that introduced bug/missed this case>
> Cc: stable@dpdk.org
>
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
> >  1 file changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> > index 21c311732..ac2842b2d 100644
> > --- a/lib/librte_vhost/virtio_net.c
> > +++ b/lib/librte_vhost/virtio_net.c
> > @@ -2133,6 +2133,20 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
> >       return pkt_idx;
> >  }
> >
> > +static __rte_always_inline bool
> > +next_desc_is_avail(const struct vhost_virtqueue *vq)
> > +{
> > +     bool wrap_counter = vq->avail_wrap_counter;
> > +     uint16_t next_used_idx = vq->last_used_idx + 1;
> > +
> > +     if (next_used_idx >= vq->size) {
> > +             next_used_idx -= vq->size;
> > +             wrap_counter ^= 1;
> > +     }
> > +
> > +     return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
> > +}
> > +
> >  static __rte_noinline uint16_t
> >  virtio_dev_tx_packed(struct virtio_net *dev,
> >                    struct vhost_virtqueue *vq,
> > @@ -2165,9 +2179,20 @@ virtio_dev_tx_packed(struct virtio_net *dev,
> >
> >       } while (remained);
> >
> > -     if (vq->shadow_used_idx)
> > +     if (vq->shadow_used_idx) {
> >               do_data_copy_dequeue(vq);
> >
> > +             if (remained && !next_desc_is_avail(vq)) {
> > +                     /*
> > +                      * The guest may be waiting to TX some buffers to
> > +                      * enqueue more to avoid bufferbloat, so we try to
> > +                      * reduce latency here.
> > +                      */
> > +                     vhost_flush_dequeue_shadow_packed(dev, vq);
> > +                     vhost_vring_call_packed(dev, vq);
> > +             }
> > +     }
> > +
> >       return pkt_idx;
> >  }
> >
> >
>
>
  
Kevin Traynor Feb. 4, 2020, 1:48 p.m. UTC | #3
On 04/02/2020 09:23, Eugenio Perez Martin wrote:
> Hi Kevin!
> 
> Sorry, thanks for noticing it! It fixes commit 31d6c6a5b ("vhost: optimize
> packed ring dequeue"), which was not present in the 18.11 version (I've
> checked that v19.08 does not contain the failure).
> 

Right, in that case the issue is present on 19.11 stable, so it's worth
adding the tags to get it fixed in 19.11 stable.

> Do I need to send another patch version with corrected commit message?
> 

Probably Maxime can do it on applying if you ask nicely :-)

> Thanks!
> 
> On Fri, Jan 31, 2020 at 7:38 PM Kevin Traynor <ktraynor@redhat.com> wrote:
> 
>> Hi Eugenio,
>>
>> On 29/01/2020 19:33, Eugenio Pérez wrote:
>>> The current implementation of vhost_net with packed vrings tries to
>>> fill the shadow vector before sending any actual changes to the guest.
>>> While this can be beneficial for throughput, it conflicts with some
>>> bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
>>> transmitting packets if there are too many bytes/buffers in the
>>> driver.
>>>
>>> To solve this, we flush the shadow packets at the end of
>>> virtio_dev_tx_packed if we have starved the vring, i.e., the next
>>> buffer is not available for the device.
>>>
>>> Since this last check can be expensive because of the atomic, we only
>>> perform it if we have not obtained the expected (count) packets. If we
>>> happen to obtain "count" packets and there are no more available
>>> packets, the caller needs to call virtio_dev_tx_packed again.
>>>
>>
>> It seems to be fixing an issue and should be considered for stable
>> branches? You can add the tags needed in the commit message here:
>>
>> Fixes: <commit that introduced bug/missed this case>
>> Cc: stable@dpdk.org
>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
>>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
>>> index 21c311732..ac2842b2d 100644
>>> --- a/lib/librte_vhost/virtio_net.c
>>> +++ b/lib/librte_vhost/virtio_net.c
>>> @@ -2133,6 +2133,20 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>>>       return pkt_idx;
>>>  }
>>>
>>> +static __rte_always_inline bool
>>> +next_desc_is_avail(const struct vhost_virtqueue *vq)
>>> +{
>>> +     bool wrap_counter = vq->avail_wrap_counter;
>>> +     uint16_t next_used_idx = vq->last_used_idx + 1;
>>> +
>>> +     if (next_used_idx >= vq->size) {
>>> +             next_used_idx -= vq->size;
>>> +             wrap_counter ^= 1;
>>> +     }
>>> +
>>> +     return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
>>> +}
>>> +
>>>  static __rte_noinline uint16_t
>>>  virtio_dev_tx_packed(struct virtio_net *dev,
>>>                    struct vhost_virtqueue *vq,
>>> @@ -2165,9 +2179,20 @@ virtio_dev_tx_packed(struct virtio_net *dev,
>>>
>>>       } while (remained);
>>>
>>> -     if (vq->shadow_used_idx)
>>> +     if (vq->shadow_used_idx) {
>>>               do_data_copy_dequeue(vq);
>>>
>>> +             if (remained && !next_desc_is_avail(vq)) {
>>> +                     /*
>>> +                      * The guest may be waiting to TX some buffers to
>>> +                      * enqueue more to avoid bufferbloat, so we try to
>>> +                      * reduce latency here.
>>> +                      */
>>> +                     vhost_flush_dequeue_shadow_packed(dev, vq);
>>> +                     vhost_vring_call_packed(dev, vq);
>>> +             }
>>> +     }
>>> +
>>>       return pkt_idx;
>>>  }
>>>
>>>
>>
>>
>
  
Eugenio Perez Martin Feb. 4, 2020, 3:05 p.m. UTC | #4
Ouch, my bad again, sorry :).

I've forwarded the patch to stable@, please let me know if I need to do
something else.

Maxime, please let me know if I need to send a new version with the "Fixes:"
tag :).

Thanks!

On Tue, Feb 4, 2020 at 2:49 PM Kevin Traynor <ktraynor@redhat.com> wrote:

> On 04/02/2020 09:23, Eugenio Perez Martin wrote:
> > Hi Kevin!
> >
> > Sorry, thanks for noticing it! It fixes commit 31d6c6a5b ("vhost:
> > optimize packed ring dequeue"), which was not present in the 18.11
> > version (I've checked that v19.08 does not contain the failure).
> >
>
> Right, in that case the issue is present on 19.11 stable, so it's worth
> adding the tags to get it fixed in 19.11 stable.
>
> > Do I need to send another patch version with corrected commit message?
> >
>
> Probably Maxime can do it on applying if you ask nicely :-)
>
> > Thanks!
> >
> > On Fri, Jan 31, 2020 at 7:38 PM Kevin Traynor <ktraynor@redhat.com>
> wrote:
> >
> >> Hi Eugenio,
> >>
> >> On 29/01/2020 19:33, Eugenio Pérez wrote:
> >>> The current implementation of vhost_net with packed vrings tries to
> >>> fill the shadow vector before sending any actual changes to the guest.
> >>> While this can be beneficial for throughput, it conflicts with some
> >>> bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
> >>> transmitting packets if there are too many bytes/buffers in the
> >>> driver.
> >>>
> >>> To solve this, we flush the shadow packets at the end of
> >>> virtio_dev_tx_packed if we have starved the vring, i.e., the next
> >>> buffer is not available for the device.
> >>>
> >>> Since this last check can be expensive because of the atomic, we only
> >>> perform it if we have not obtained the expected (count) packets. If we
> >>> happen to obtain "count" packets and there are no more available
> >>> packets, the caller needs to call virtio_dev_tx_packed again.
> >>>
> >>
> >> It seems to be fixing an issue and should be considered for stable
> >> branches? You can add the tags needed in the commit message here:
> >>
> >> Fixes: <commit that introduced bug/missed this case>
> >> Cc: stable@dpdk.org
> >>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
> >>>  1 file changed, 26 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> >>> index 21c311732..ac2842b2d 100644
> >>> --- a/lib/librte_vhost/virtio_net.c
> >>> +++ b/lib/librte_vhost/virtio_net.c
> >>> @@ -2133,6 +2133,20 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
> >>>       return pkt_idx;
> >>>  }
> >>>
> >>> +static __rte_always_inline bool
> >>> +next_desc_is_avail(const struct vhost_virtqueue *vq)
> >>> +{
> >>> +     bool wrap_counter = vq->avail_wrap_counter;
> >>> +     uint16_t next_used_idx = vq->last_used_idx + 1;
> >>> +
> >>> +     if (next_used_idx >= vq->size) {
> >>> +             next_used_idx -= vq->size;
> >>> +             wrap_counter ^= 1;
> >>> +     }
> >>> +
> >>> +     return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
> >>> +}
> >>> +
> >>>  static __rte_noinline uint16_t
> >>>  virtio_dev_tx_packed(struct virtio_net *dev,
> >>>                    struct vhost_virtqueue *vq,
> >>> @@ -2165,9 +2179,20 @@ virtio_dev_tx_packed(struct virtio_net *dev,
> >>>
> >>>       } while (remained);
> >>>
> >>> -     if (vq->shadow_used_idx)
> >>> +     if (vq->shadow_used_idx) {
> >>>               do_data_copy_dequeue(vq);
> >>>
> >>> +             if (remained && !next_desc_is_avail(vq)) {
> >>> +                     /*
> >>> +                      * The guest may be waiting to TX some buffers to
> >>> +                      * enqueue more to avoid bufferbloat, so we try to
> >>> +                      * reduce latency here.
> >>> +                      */
> >>> +                     vhost_flush_dequeue_shadow_packed(dev, vq);
> >>> +                     vhost_vring_call_packed(dev, vq);
> >>> +             }
> >>> +     }
> >>> +
> >>>       return pkt_idx;
> >>>  }
> >>>
> >>>
> >>
> >>
> >
>
>
  
Maxime Coquelin Feb. 4, 2020, 3:10 p.m. UTC | #5
On 2/4/20 4:05 PM, Eugenio Perez Martin wrote:
> Ouch, my bad again, sorry :).
> 
> I've forwarded the patch to stable@, please let me know if I need to do
> something else.
> 
> Maxime, please let me know if I need to send a new version with the "Fixes:"
> tag :).

Not needed, I will handle it.

Thanks for the fix,
Maxime

> Thanks!
> 
> On Tue, Feb 4, 2020 at 2:49 PM Kevin Traynor <ktraynor@redhat.com> wrote:
> 
>> On 04/02/2020 09:23, Eugenio Perez Martin wrote:
>>> Hi Kevin!
>>>
>>> Sorry, thanks for noticing it! It fixes commit 31d6c6a5b ("vhost:
>>> optimize packed ring dequeue"), which was not present in the 18.11
>>> version (I've checked that v19.08 does not contain the failure).
>>>
>>
>> Right, in that case the issue is present on 19.11 stable, so it's worth
>> adding the tags to get it fixed in 19.11 stable.
>>
>>> Do I need to send another patch version with corrected commit message?
>>>
>>
>> Probably Maxime can do it on applying if you ask nicely :-)
>>
>>> Thanks!
>>>
>>> On Fri, Jan 31, 2020 at 7:38 PM Kevin Traynor <ktraynor@redhat.com>
>> wrote:
>>>
>>>> Hi Eugenio,
>>>>
>>>> On 29/01/2020 19:33, Eugenio Pérez wrote:
>>>>> The current implementation of vhost_net with packed vrings tries to
>>>>> fill the shadow vector before sending any actual changes to the guest.
>>>>> While this can be beneficial for throughput, it conflicts with some
>>>>> bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
>>>>> transmitting packets if there are too many bytes/buffers in the
>>>>> driver.
>>>>>
>>>>> To solve this, we flush the shadow packets at the end of
>>>>> virtio_dev_tx_packed if we have starved the vring, i.e., the next
>>>>> buffer is not available for the device.
>>>>>
>>>>> Since this last check can be expensive because of the atomic, we only
>>>>> perform it if we have not obtained the expected (count) packets. If we
>>>>> happen to obtain "count" packets and there are no more available
>>>>> packets, the caller needs to call virtio_dev_tx_packed again.
>>>>>
>>>>
>>>> It seems to be fixing an issue and should be considered for stable
>>>> branches? You can add the tags needed in the commit message here:
>>>>
>>>> Fixes: <commit that introduced bug/missed this case>
>>>> Cc: stable@dpdk.org
>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
>>>>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
>>>>> index 21c311732..ac2842b2d 100644
>>>>> --- a/lib/librte_vhost/virtio_net.c
>>>>> +++ b/lib/librte_vhost/virtio_net.c
>>>>> @@ -2133,6 +2133,20 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>>>>>       return pkt_idx;
>>>>>  }
>>>>>
>>>>> +static __rte_always_inline bool
>>>>> +next_desc_is_avail(const struct vhost_virtqueue *vq)
>>>>> +{
>>>>> +     bool wrap_counter = vq->avail_wrap_counter;
>>>>> +     uint16_t next_used_idx = vq->last_used_idx + 1;
>>>>> +
>>>>> +     if (next_used_idx >= vq->size) {
>>>>> +             next_used_idx -= vq->size;
>>>>> +             wrap_counter ^= 1;
>>>>> +     }
>>>>> +
>>>>> +     return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
>>>>> +}
>>>>> +
>>>>>  static __rte_noinline uint16_t
>>>>>  virtio_dev_tx_packed(struct virtio_net *dev,
>>>>>                    struct vhost_virtqueue *vq,
>>>>> @@ -2165,9 +2179,20 @@ virtio_dev_tx_packed(struct virtio_net *dev,
>>>>>
>>>>>       } while (remained);
>>>>>
>>>>> -     if (vq->shadow_used_idx)
>>>>> +     if (vq->shadow_used_idx) {
>>>>>               do_data_copy_dequeue(vq);
>>>>>
>>>>> +             if (remained && !next_desc_is_avail(vq)) {
>>>>> +                     /*
>>>>> +                      * enqueue more to avoid bufferbloat, so we try to
>> to
>>>>> +                      * reduce latency here.
>>>>> +                      */
>>>>> +                     vhost_flush_dequeue_shadow_packed(dev, vq);
>>>>> +                     vhost_vring_call_packed(dev, vq);
>>>>> +             }
>>>>> +     }
>>>>> +
>>>>>       return pkt_idx;
>>>>>  }
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
  
Maxime Coquelin Feb. 5, 2020, 9:09 a.m. UTC | #6
On 2/4/20 3:47 PM, Eugenio Pérez wrote:
> The current implementation of vhost_net with packed vrings tries to
> fill the shadow vector before sending any actual changes to the guest.
> While this can be beneficial for throughput, it conflicts with some
> bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
> transmitting packets if there are too many bytes/buffers in the
> driver.
> 
> To solve this, we flush the shadow packets at the end of
> virtio_dev_tx_packed if we have starved the vring, i.e., the next
> buffer is not available for the device.
> 
> Since this last check can be expensive because of the atomic, we only
> perform it if we have not obtained the expected (count) packets. If we
> happen to obtain "count" packets and there are no more available
> packets, the caller needs to call virtio_dev_tx_packed again.
> 
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 21c311732..ac2842b2d 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -2133,6 +2133,20 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>  	return pkt_idx;
>  }
>  
> +static __rte_always_inline bool
> +next_desc_is_avail(const struct vhost_virtqueue *vq)
> +{
> +	bool wrap_counter = vq->avail_wrap_counter;
> +	uint16_t next_used_idx = vq->last_used_idx + 1;
> +
> +	if (next_used_idx >= vq->size) {
> +		next_used_idx -= vq->size;
> +		wrap_counter ^= 1;
> +	}
> +
> +	return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
> +}
> +
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed(struct virtio_net *dev,
>  		     struct vhost_virtqueue *vq,
> @@ -2165,9 +2179,20 @@ virtio_dev_tx_packed(struct virtio_net *dev,
>  
>  	} while (remained);
>  
> -	if (vq->shadow_used_idx)
> +	if (vq->shadow_used_idx) {
>  		do_data_copy_dequeue(vq);
>  
> +		if (remained && !next_desc_is_avail(vq)) {
> +			/*
> +			 * The guest may be waiting to TX some buffers to
> +			 * enqueue more to avoid bufferbloat, so we try to
> +			 * reduce latency here.
> +			 */
> +			vhost_flush_dequeue_shadow_packed(dev, vq);
> +			vhost_vring_call_packed(dev, vq);
> +		}
> +	}
> +
>  	return pkt_idx;
>  }
>  
> 

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

I'll fix the commit message while applying.

Thanks,
Maxime
  
Maxime Coquelin Feb. 5, 2020, 9:47 a.m. UTC | #7
On 2/4/20 3:47 PM, Eugenio Pérez wrote:
> The current implementation of vhost_net with packed vrings tries to
> fill the shadow vector before sending any actual changes to the guest.
> While this can be beneficial for throughput, it conflicts with some
> bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
> transmitting packets if there are too many bytes/buffers in the
> driver.
> 
> To solve this, we flush the shadow packets at the end of
> virtio_dev_tx_packed if we have starved the vring, i.e., the next
> buffer is not available for the device.
> 
> Since this last check can be expensive because of the atomic, we only
> perform it if we have not obtained the expected (count) packets. If we
> happen to obtain "count" packets and there are no more available
> packets, the caller needs to call virtio_dev_tx_packed again.
> 
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)

Applied to dpdk-next-virtio tree.

Thanks,
Maxime
  
Eugenio Perez Martin Feb. 5, 2020, 3:45 p.m. UTC | #8
Thanks!

On Wed, Feb 5, 2020 at 10:48 AM Maxime Coquelin <maxime.coquelin@redhat.com>
wrote:

>
>
> On 2/4/20 3:47 PM, Eugenio Pérez wrote:
> > The current implementation of vhost_net with packed vrings tries to
> > fill the shadow vector before sending any actual changes to the guest.
> > While this can be beneficial for throughput, it conflicts with some
> > bufferbloat mitigation methods, like the Linux kernel NAPI, which stop
> > transmitting packets if there are too many bytes/buffers in the
> > driver.
> >
> > To solve this, we flush the shadow packets at the end of
> > virtio_dev_tx_packed if we have starved the vring, i.e., the next
> > buffer is not available for the device.
> >
> > Since this last check can be expensive because of the atomic, we only
> > perform it if we have not obtained the expected (count) packets. If we
> > happen to obtain "count" packets and there are no more available
> > packets, the caller needs to call virtio_dev_tx_packed again.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
> >  1 file changed, 26 insertions(+), 1 deletion(-)
>
> Applied to dpdk-next-virtio tree.
>
> Thanks,
> Maxime
>
>
  

Patch

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 21c311732..ac2842b2d 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -2133,6 +2133,20 @@  virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
 	return pkt_idx;
 }
 
+static __rte_always_inline bool
+next_desc_is_avail(const struct vhost_virtqueue *vq)
+{
+	bool wrap_counter = vq->avail_wrap_counter;
+	uint16_t next_used_idx = vq->last_used_idx + 1;
+
+	if (next_used_idx >= vq->size) {
+		next_used_idx -= vq->size;
+		wrap_counter ^= 1;
+	}
+
+	return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
+}
+
 static __rte_noinline uint16_t
 virtio_dev_tx_packed(struct virtio_net *dev,
 		     struct vhost_virtqueue *vq,
@@ -2165,9 +2179,20 @@  virtio_dev_tx_packed(struct virtio_net *dev,
 
 	} while (remained);
 
-	if (vq->shadow_used_idx)
+	if (vq->shadow_used_idx) {
 		do_data_copy_dequeue(vq);
 
+		if (remained && !next_desc_is_avail(vq)) {
+			/*
+			 * The guest may be waiting to TX some buffers to
+			 * enqueue more to avoid bufferbloat, so we try to
+			 * reduce latency here.
+			 */
+			vhost_flush_dequeue_shadow_packed(dev, vq);
+			vhost_vring_call_packed(dev, vq);
+		}
+	}
+
 	return pkt_idx;
 }
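
For context on why the patch only performs this extra check when the burst
came up short: desc_is_avail() (already present in the vhost library, not
shown in this diff) has to read descriptor flags that the driver may be
updating concurrently, which is the atomic access the commit message refers
to. A rough sketch of that helper, for orientation only (the exact load and
barrier primitives differ between DPDK releases):

    static inline bool
    desc_is_avail(struct vring_packed_desc *desc, bool wrap_counter)
    {
            /* flags are written by the driver, so read them atomically */
            uint16_t flags = __atomic_load_n(&desc->flags, __ATOMIC_ACQUIRE);

            /*
             * A descriptor is available to the device when its AVAIL bit
             * matches the current wrap counter and its USED bit does not.
             */
            return wrap_counter == !!(flags & VRING_DESC_F_AVAIL) &&
                    wrap_counter != !!(flags & VRING_DESC_F_USED);
    }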