[dpdk-dev,v2] net/virtio: fix an incorrect behavior of device stop/start

Message ID BCFD2BB875535045A5368D9ADBF296993506013E@IRSMSX101.ger.corp.intel.com (mailing list archive)
State Not Applicable, archived
Delegated to: Yuanhan Liu
Headers

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation fail apply patch file failure

Commit Message

Antonio Fischetti Dec. 1, 2017, 5:17 p.m. UTC
  Hi All,
I've got an update on this.
I could replicate the same issue by using testpmd + a VM (= Virtual Machine).

The test topology I'm using is:


[Traffic gen]----[NIC port #0]----[testpmd]----[vhost port #2]----+
                                                                  | 
                                                                  |
                                                             [testpmd in
                                                                the VM]
                                                                  |
                                                                  |
[Traffic gen]----[NIC port #1]----[testpmd]----[vhost port #3]----+


So there's no OvS now in the picture: one testpmd running on the host
and one testpmd running on the VM.

The issue is that no packet goes through testpmd in the VM.
It seems this is happening after this patch was upstreamed.

Please note
-----------
To replicate this issue both the next 2 conditions must be met:
 - the traffic is already being sent before launching testpmd in the VM
 - there are at least 2 forwarding lcores.

I found that a way to fix this is:

or alternatively by using a commit before 

commit d8227497ec5c3de75fe378e09fc9673ae097fa73
Author: Tiwei Bie <tiwei.bie@intel.com>
Date:   Fri Oct 20 10:09:28 2017 +0800

    net/virtio: flush Rx queues on start


Other details of my testbench below.

- testpmd on the host has 2 forwarding lcores: #2 and #3 
(otherwise the issue doesn't show up with 1 fwd lcore).

- the VM (qemu) is using lcore #10

- the VM has 2 vhost ports, with 1 queue each

- in this case testpmd on the host uses vhostuser ports. 
In the other testbench with OvS - see previous email - OvS 
was using vhostuserclient ports and the VM was acting as a 
vhost server. The issue happens in both cases.

For further info, please let me know.


Thanks,
Antonio


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fischetti, Antonio
> Sent: Tuesday, November 14, 2017 5:39 PM
> To: Bie, Tiwei <tiwei.bie@intel.com>; dev@dpdk.org
> Cc: yliu@fridaylinux.org; maxime.coquelin@redhat.com;
> jfreimann@redhat.com; stable@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior
> of device stop/start
> 
> Hi Tiwei,
> 
> I'm doing some regression tests with v17.11-rc4. I ran
> into a hitch with testpmd running into a guest VM. It happens
> that no packet gets forwarded by testpmd.
> The issue seems to appear after this patch was upstreamed.
> 
> I saw there's a way to make it work, ie by avoiding to
> increment the last consumed descriptor:
> 
> --- a/drivers/net/virtio/virtqueue.c
> +++ b/drivers/net/virtio/virtqueue.c
> @@ -80,7 +80,7 @@ virtqueue_flush(struct virtqueue *vq)
>                         rte_pktmbuf_free(dxp->cookie);
>                         dxp->cookie = NULL;
>                 }
> -               vq->vq_used_cons_idx++;
> +               //vq->vq_used_cons_idx++;
>                 vq_ring_free_chain(vq, desc_idx);
> 
> Not quite sure if this change make any sense to you?
> 
> Some details below.
> 
> The issue appears only if the traffic generator is already
> sending packets before I launch testpmd in the guest.
> 
> In my testbench I have Open-vSwitch (OvS-DPDK) which launches
> a VM with 2 vhostuserclient ports (vhu0 and vhu1), each with
> a single queue.
> My OvS has 2 physical ports: dpdk0 and dpdk1.
> dpdk0 forwards packets back and forth from/to the generator
> to/from vhu0.
> Similarly, dpdk1 forwards packets back and forth from/to the generator
> to/from vhu1.
> 
> In OvS there are 2 different PMD threads serving the 2
> vhostuserclient ports.
> 
> While the traffic generator is already sending packets, in the
> guest VM I launch
>   ./testpmd -c 0x3 -n 4 --socket-mem 512 -- --burst=64 -i --
> txqflags=0xf00 --disable-hw-vlan
> 
> The issue is that I see no packet received on the traffic generator
> and in fact testpmd shows
> 
> ---------------------- Forward statistics for port 0  ------------------
> ----
>   RX-packets: 0              RX-dropped: 0             RX-total: 0
>   TX-packets: 0              TX-dropped: 0             TX-total: 0
>   ----------------------------------------------------------------------
> ------
> 
>   ---------------------- Forward statistics for port 1  ----------------
> ------
>   RX-packets: 0              RX-dropped: 0             RX-total: 0
>   TX-packets: 0              TX-dropped: 0             TX-total: 0
>   ----------------------------------------------------------------------
> ------
> 
>   +++++++++++++++ Accumulated forward statistics for all
> ports+++++++++++++++
>   RX-packets: 0              RX-dropped: 0             RX-total: 0
>   TX-packets: 0              TX-dropped: 0             TX-total: 0
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> ++++
> 
> Please let me know if I missed something or if you need
> more info on my testbench.
> 
> 
> Thanks,
> Antonio
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tiwei Bie
> > Sent: Friday, October 20, 2017 3:09 AM
> > To: dev@dpdk.org
> > Cc: yliu@fridaylinux.org; maxime.coquelin@redhat.com;
> > jfreimann@redhat.com; stable@dpdk.org
> > Subject: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior
> of
> > device stop/start
> >
> > After starting a device, the driver shouldn't deliver the
> > packets that already existed before the device is started
> > to applications. Otherwise it will lead to incorrect packet
> > collection for port state. This patch fixes this issue by
> > flushing the Rx queues when starting the device.
> >
> > Fixes: a85786dc816f ("virtio: fix states handling during
> > initialization")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > Reviewed-by: Jens Freimann <jfreimann@redhat.com>
> > ---
> > v2:
> > - Use the existing `for` loop
> > - Improve the commit log
> >
> >  drivers/net/virtio/virtio_ethdev.c |  2 ++
> >  drivers/net/virtio/virtio_rxtx.c   |  2 +-
> >  drivers/net/virtio/virtqueue.c     | 25 +++++++++++++++++++++++++
> >  drivers/net/virtio/virtqueue.h     |  5 +++++
> >  4 files changed, 33 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/virtio/virtio_ethdev.c
> > b/drivers/net/virtio/virtio_ethdev.c
> > index 42c2836..9ccb0f4 100644
> > --- a/drivers/net/virtio/virtio_ethdev.c
> > +++ b/drivers/net/virtio/virtio_ethdev.c
> > @@ -1819,6 +1819,8 @@ virtio_dev_start(struct rte_eth_dev *dev)
> >
> >  	for (i = 0; i < dev->data->nb_rx_queues; i++) {
> >  		rxvq = dev->data->rx_queues[i];
> > +		/* Flush the old packets */
> > +		virtqueue_flush(rxvq->vq);
> >  		virtqueue_notify(rxvq->vq);
> >  	}
> >
> > diff --git a/drivers/net/virtio/virtio_rxtx.c
> > b/drivers/net/virtio/virtio_rxtx.c
> > index 609b413..f5b6f94 100644
> > --- a/drivers/net/virtio/virtio_rxtx.c
> > +++ b/drivers/net/virtio/virtio_rxtx.c
> > @@ -80,7 +80,7 @@ virtio_dev_rx_queue_done(void *rxq, uint16_t offset)
> >  	return VIRTQUEUE_NUSED(vq) >= offset;
> >  }
> >
> > -static void
> > +void
> >  vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
> >  {
> >  	struct vring_desc *dp, *dp_tail;
> > diff --git a/drivers/net/virtio/virtqueue.c
> > b/drivers/net/virtio/virtqueue.c
> > index 9ad77b8..c3a536f 100644
> > --- a/drivers/net/virtio/virtqueue.c
> > +++ b/drivers/net/virtio/virtqueue.c
> > @@ -59,3 +59,28 @@ virtqueue_detatch_unused(struct virtqueue *vq)
> >  		}
> >  	return NULL;
> >  }
> > +
> > +/* Flush the elements in the used ring. */
> > +void
> > +virtqueue_flush(struct virtqueue *vq)
> > +{
> > +	struct vring_used_elem *uep;
> > +	struct vq_desc_extra *dxp;
> > +	uint16_t used_idx, desc_idx;
> > +	uint16_t nb_used, i;
> > +
> > +	nb_used = VIRTQUEUE_NUSED(vq);
> > +
> > +	for (i = 0; i < nb_used; i++) {
> > +		used_idx = vq->vq_used_cons_idx & (vq->vq_nentries - 1);
> > +		uep = &vq->vq_ring.used->ring[used_idx];
> > +		desc_idx = (uint16_t)uep->id;
> > +		dxp = &vq->vq_descx[desc_idx];
> > +		if (dxp->cookie != NULL) {
> > +			rte_pktmbuf_free(dxp->cookie);
> > +			dxp->cookie = NULL;
> > +		}
> > +		vq->vq_used_cons_idx++;
> > +		vq_ring_free_chain(vq, desc_idx);
> > +	}
> > +}
> > diff --git a/drivers/net/virtio/virtqueue.h
> > b/drivers/net/virtio/virtqueue.h
> > index 9c4f96d..11059fb 100644
> > --- a/drivers/net/virtio/virtqueue.h
> > +++ b/drivers/net/virtio/virtqueue.h
> > @@ -304,6 +304,9 @@ void virtqueue_dump(struct virtqueue *vq);
> >   */
> >  struct rte_mbuf *virtqueue_detatch_unused(struct virtqueue *vq);
> >
> > +/* Flush the elements in the used ring. */
> > +void virtqueue_flush(struct virtqueue *vq);
> > +
> >  static inline int
> >  virtqueue_full(const struct virtqueue *vq)
> >  {
> > @@ -312,6 +315,8 @@ virtqueue_full(const struct virtqueue *vq)
> >
> >  #define VIRTQUEUE_NUSED(vq) ((uint16_t)((vq)->vq_ring.used->idx -
> (vq)-
> > >vq_used_cons_idx))
> >
> > +void vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx);
> > +
> >  static inline void
> >  vq_update_avail_idx(struct virtqueue *vq)
> >  {
> > --
> > 2.7.4
  

Comments

Tiwei Bie Dec. 2, 2017, 4:30 a.m. UTC | #1
Hi Antonio,

On Sat, Dec 02, 2017 at 01:17:58AM +0800, Fischetti, Antonio wrote:
> Hi All,
> I've got an update on this.
> I could replicate the same issue by using testpmd + a VM (= Virtual Machine).
> 
> The test topology I'm using is:
> 
> 
> [Traffic gen]----[NIC port #0]----[testpmd]----[vhost port #2]----+
>                                                                   | 
>                                                                   |
>                                                              [testpmd in
>                                                                 the VM]
>                                                                   |
>                                                                   |
> [Traffic gen]----[NIC port #1]----[testpmd]----[vhost port #3]----+
> 
> 
> So there's no OvS now in the picture: one testpmd running on the host
> and one testpmd running on the VM.
> 
> The issue is that no packet goes through testpmd in the VM.
> It seems this is happening after this patch was upstreamed.
> 
> Please note
> -----------
> To replicate this issue both the next 2 conditions must be met:
>  - the traffic is already being sent before launching testpmd in the VM
>  - there are at least 2 forwarding lcores.
> 

Thanks again for reporting this issue with such detailed info!

Due to the current environment I have, I tried a slightly different
topo setup, instead of the above you described, my topo setup is:

[testpmd1/txonly]----[vhost port #0]----+
                                        |
                                        |
                                   [testpmd in
                                      the VM]
                                        |
                                        |
[testpmd2/rxonly]----[vhost port #1]----+

That is to say, I'm using testpmd/{txonly,rxonly} directly without
external traffic generators.

Below is the script I'm using to launch the testpmds:

# testpmd1:

sudo ./x86_64-native-linuxapp-gcc/app/testpmd -l 1,2 \
                --socket-mem 1024,1024 \
                --file-prefix=vhost0 \
                --no-pci \
                --vdev=net_vhost0,iface=/tmp/socket-0,queues=1 \
                -- \
                --port-topology=chained \
                -i \
                --nb-cores=1 \
                --forward-mode=txonly

# testpmd2:

sudo ./x86_64-native-linuxapp-gcc/app/testpmd -l 3,4 \
                --socket-mem 1024,1024 \
                --file-prefix=vhost1 \
                --no-pci \
                --vdev=net_vhost1,iface=/tmp/socket-1,queues=1 \
                -- \
                --port-topology=chained \
                -i \
                --nb-cores=1 \
                --forward-mode=rxonly

# testpmd in the VM:

sudo \
./x86_64-native-linuxapp-gcc/app/testpmd -l 1,2,3 \
        --socket-mem 512 \
        -- \
        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan \
        --nb-cores=2 \
        --forward-mode=mac

Unfortunately, I failed to reproduce this issue.

I'm starting the traffics of the testpmd1 and testpmd2 (i.e. txonly
fwd in testpmd1 and rxonly fwd in testpmd2) before launching the
testpmd in the VM.

And I'm using 2 forwarding lcores for the testpmd in the VM:

testpmd> show config fwd
mac packet forwarding - ports=2 - cores=2 - streams=2 - NUMA support enabled, MP over anonymous pages disabled
Logical Core 2 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
Logical Core 3 (socket 0) forwards packets on 1 streams:
  RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

I can see the traffics are forwarded once the fwd is started
in the testpmd in the VM:

testpmd> show port stats all

  ######################## NIC statistics for port 0  ########################
  RX-packets: 52865632   RX-missed: 0          RX-bytes:  3383400448
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  Throughput (since last show)
  Rx-pps:      6040178
  Tx-pps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 52865568   TX-errors: 0          TX-bytes:  3383396352

  Throughput (since last show)
  Rx-pps:            0
  Tx-pps:      6040177
  ############################################################################

The DPDK version I'm using is: 18.02-rc0

Do you see anything I missed? Or can you reproduce the issue with the
setup I'm using?

> I found that a way to fix this is:
> 
> diff --git a/drivers/net/virtio/virtqueue.c b/drivers/net/virtio/virtqueue.c
> index c3a536f..17dd87d 100644
> --- a/drivers/net/virtio/virtqueue.c
> +++ b/drivers/net/virtio/virtqueue.c
> @@ -80,7 +80,7 @@ virtqueue_flush(struct virtqueue *vq)
>                         rte_pktmbuf_free(dxp->cookie);
>                         dxp->cookie = NULL;
>                 }
> -               vq->vq_used_cons_idx++;
> +               //vq->vq_used_cons_idx++;
>                 vq_ring_free_chain(vq, desc_idx);
>         }
> 
> or alternatively by using a commit before 
> 
> commit d8227497ec5c3de75fe378e09fc9673ae097fa73
> Author: Tiwei Bie <tiwei.bie@intel.com>
> Date:   Fri Oct 20 10:09:28 2017 +0800
> 
>     net/virtio: flush Rx queues on start
> 
> 
> Other details of my testbench below.
> 
> - testpmd on the host has 2 forwarding lcores: #2 and #3 
> (otherwise the issue doesn't show up with 1 fwd lcore).
> 

I'm using two testpmd running on two cores on the host.
I'm not sure whether it matches the condition you mentioned.
It's strange that the number forwarding cores on the host
would affect the virtio driver in the VM..

Best regards,
Tiwei Bie

> - the VM (qemu) is using lcore #10
> 
> - the VM has 2 vhost ports, with 1 queue each
> 
> - in this case testpmd on the host uses vhostuser ports. 
> In the other testbench with OvS - see previous email - OvS 
> was using vhostuserclient ports and the VM was acting as a 
> vhost server. The issue happens in both cases.
> 
> For further info, please let me know.
> 
> 
> Thanks,
> Antonio
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fischetti, Antonio
> > Sent: Tuesday, November 14, 2017 5:39 PM
> > To: Bie, Tiwei <tiwei.bie@intel.com>; dev@dpdk.org
> > Cc: yliu@fridaylinux.org; maxime.coquelin@redhat.com;
> > jfreimann@redhat.com; stable@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior
> > of device stop/start
> > 
> > Hi Tiwei,
> > 
> > I'm doing some regression tests with v17.11-rc4. I ran
> > into a hitch with testpmd running into a guest VM. It happens
> > that no packet gets forwarded by testpmd.
> > The issue seems to appear after this patch was upstreamed.
> > 
> > I saw there's a way to make it work, ie by avoiding to
> > increment the last consumed descriptor:
> > 
> > --- a/drivers/net/virtio/virtqueue.c
> > +++ b/drivers/net/virtio/virtqueue.c
> > @@ -80,7 +80,7 @@ virtqueue_flush(struct virtqueue *vq)
> >                         rte_pktmbuf_free(dxp->cookie);
> >                         dxp->cookie = NULL;
> >                 }
> > -               vq->vq_used_cons_idx++;
> > +               //vq->vq_used_cons_idx++;
> >                 vq_ring_free_chain(vq, desc_idx);
> > 
> > Not quite sure if this change make any sense to you?
> > 
> > Some details below.
> > 
> > The issue appears only if the traffic generator is already
> > sending packets before I launch testpmd in the guest.
> > 
> > In my testbench I have Open-vSwitch (OvS-DPDK) which launches
> > a VM with 2 vhostuserclient ports (vhu0 and vhu1), each with
> > a single queue.
> > My OvS has 2 physical ports: dpdk0 and dpdk1.
> > dpdk0 forwards packets back and forth from/to the generator
> > to/from vhu0.
> > Similarly, dpdk1 forwards packets back and forth from/to the generator
> > to/from vhu1.
> > 
> > In OvS there are 2 different PMD threads serving the 2
> > vhostuserclient ports.
> > 
> > While the traffic generator is already sending packets, in the
> > guest VM I launch
> >   ./testpmd -c 0x3 -n 4 --socket-mem 512 -- --burst=64 -i --
> > txqflags=0xf00 --disable-hw-vlan
> > 
> > The issue is that I see no packet received on the traffic generator
> > and in fact testpmd shows
> > 
> > ---------------------- Forward statistics for port 0  ------------------
> > ----
> >   RX-packets: 0              RX-dropped: 0             RX-total: 0
> >   TX-packets: 0              TX-dropped: 0             TX-total: 0
> >   ----------------------------------------------------------------------
> > ------
> > 
> >   ---------------------- Forward statistics for port 1  ----------------
> > ------
> >   RX-packets: 0              RX-dropped: 0             RX-total: 0
> >   TX-packets: 0              TX-dropped: 0             TX-total: 0
> >   ----------------------------------------------------------------------
> > ------
> > 
> >   +++++++++++++++ Accumulated forward statistics for all
> > ports+++++++++++++++
> >   RX-packets: 0              RX-dropped: 0             RX-total: 0
> >   TX-packets: 0              TX-dropped: 0             TX-total: 0
> > 
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > ++++
> > 
> > Please let me know if I missed something or if you need
> > more info on my testbench.
> > 
> > 
> > Thanks,
> > Antonio
> > 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tiwei Bie
> > > Sent: Friday, October 20, 2017 3:09 AM
> > > To: dev@dpdk.org
> > > Cc: yliu@fridaylinux.org; maxime.coquelin@redhat.com;
> > > jfreimann@redhat.com; stable@dpdk.org
> > > Subject: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior
> > of
> > > device stop/start
> > >
> > > After starting a device, the driver shouldn't deliver the
> > > packets that already existed before the device is started
> > > to applications. Otherwise it will lead to incorrect packet
> > > collection for port state. This patch fixes this issue by
> > > flushing the Rx queues when starting the device.
> > >
> > > Fixes: a85786dc816f ("virtio: fix states handling during
> > > initialization")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > Reviewed-by: Jens Freimann <jfreimann@redhat.com>
> > > ---
> > > v2:
> > > - Use the existing `for` loop
> > > - Improve the commit log
> > >
> > >  drivers/net/virtio/virtio_ethdev.c |  2 ++
> > >  drivers/net/virtio/virtio_rxtx.c   |  2 +-
> > >  drivers/net/virtio/virtqueue.c     | 25 +++++++++++++++++++++++++
> > >  drivers/net/virtio/virtqueue.h     |  5 +++++
> > >  4 files changed, 33 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/virtio/virtio_ethdev.c
> > > b/drivers/net/virtio/virtio_ethdev.c
> > > index 42c2836..9ccb0f4 100644
> > > --- a/drivers/net/virtio/virtio_ethdev.c
> > > +++ b/drivers/net/virtio/virtio_ethdev.c
> > > @@ -1819,6 +1819,8 @@ virtio_dev_start(struct rte_eth_dev *dev)
> > >
> > >  	for (i = 0; i < dev->data->nb_rx_queues; i++) {
> > >  		rxvq = dev->data->rx_queues[i];
> > > +		/* Flush the old packets */
> > > +		virtqueue_flush(rxvq->vq);
> > >  		virtqueue_notify(rxvq->vq);
> > >  	}
> > >
> > > diff --git a/drivers/net/virtio/virtio_rxtx.c
> > > b/drivers/net/virtio/virtio_rxtx.c
> > > index 609b413..f5b6f94 100644
> > > --- a/drivers/net/virtio/virtio_rxtx.c
> > > +++ b/drivers/net/virtio/virtio_rxtx.c
> > > @@ -80,7 +80,7 @@ virtio_dev_rx_queue_done(void *rxq, uint16_t offset)
> > >  	return VIRTQUEUE_NUSED(vq) >= offset;
> > >  }
> > >
> > > -static void
> > > +void
> > >  vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
> > >  {
> > >  	struct vring_desc *dp, *dp_tail;
> > > diff --git a/drivers/net/virtio/virtqueue.c
> > > b/drivers/net/virtio/virtqueue.c
> > > index 9ad77b8..c3a536f 100644
> > > --- a/drivers/net/virtio/virtqueue.c
> > > +++ b/drivers/net/virtio/virtqueue.c
> > > @@ -59,3 +59,28 @@ virtqueue_detatch_unused(struct virtqueue *vq)
> > >  		}
> > >  	return NULL;
> > >  }
> > > +
> > > +/* Flush the elements in the used ring. */
> > > +void
> > > +virtqueue_flush(struct virtqueue *vq)
> > > +{
> > > +	struct vring_used_elem *uep;
> > > +	struct vq_desc_extra *dxp;
> > > +	uint16_t used_idx, desc_idx;
> > > +	uint16_t nb_used, i;
> > > +
> > > +	nb_used = VIRTQUEUE_NUSED(vq);
> > > +
> > > +	for (i = 0; i < nb_used; i++) {
> > > +		used_idx = vq->vq_used_cons_idx & (vq->vq_nentries - 1);
> > > +		uep = &vq->vq_ring.used->ring[used_idx];
> > > +		desc_idx = (uint16_t)uep->id;
> > > +		dxp = &vq->vq_descx[desc_idx];
> > > +		if (dxp->cookie != NULL) {
> > > +			rte_pktmbuf_free(dxp->cookie);
> > > +			dxp->cookie = NULL;
> > > +		}
> > > +		vq->vq_used_cons_idx++;
> > > +		vq_ring_free_chain(vq, desc_idx);
> > > +	}
> > > +}
> > > diff --git a/drivers/net/virtio/virtqueue.h
> > > b/drivers/net/virtio/virtqueue.h
> > > index 9c4f96d..11059fb 100644
> > > --- a/drivers/net/virtio/virtqueue.h
> > > +++ b/drivers/net/virtio/virtqueue.h
> > > @@ -304,6 +304,9 @@ void virtqueue_dump(struct virtqueue *vq);
> > >   */
> > >  struct rte_mbuf *virtqueue_detatch_unused(struct virtqueue *vq);
> > >
> > > +/* Flush the elements in the used ring. */
> > > +void virtqueue_flush(struct virtqueue *vq);
> > > +
> > >  static inline int
> > >  virtqueue_full(const struct virtqueue *vq)
> > >  {
> > > @@ -312,6 +315,8 @@ virtqueue_full(const struct virtqueue *vq)
> > >
> > >  #define VIRTQUEUE_NUSED(vq) ((uint16_t)((vq)->vq_ring.used->idx -
> > (vq)-
> > > >vq_used_cons_idx))
> > >
> > > +void vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx);
> > > +
> > >  static inline void
> > >  vq_update_avail_idx(struct virtqueue *vq)
> > >  {
> > > --
> > > 2.7.4
>
  
Tiwei Bie Dec. 4, 2017, 7:19 a.m. UTC | #2
On Sat, Dec 02, 2017 at 12:30:33PM +0800, Tiwei Bie wrote:
> Hi Antonio,
> 
> On Sat, Dec 02, 2017 at 01:17:58AM +0800, Fischetti, Antonio wrote:
> > Hi All,
> > I've got an update on this.
> > I could replicate the same issue by using testpmd + a VM (= Virtual Machine).
> > 
> > The test topology I'm using is:
> > 
> > 
> > [Traffic gen]----[NIC port #0]----[testpmd]----[vhost port #2]----+
> >                                                                   | 
> >                                                                   |
> >                                                              [testpmd in
> >                                                                 the VM]
> >                                                                   |
> >                                                                   |
> > [Traffic gen]----[NIC port #1]----[testpmd]----[vhost port #3]----+
> > 
> > 
> > So there's no OvS now in the picture: one testpmd running on the host
> > and one testpmd running on the VM.
> > 
> > The issue is that no packet goes through testpmd in the VM.
> > It seems this is happening after this patch was upstreamed.
> > 
> > Please note
> > -----------
> > To replicate this issue both the next 2 conditions must be met:
> >  - the traffic is already being sent before launching testpmd in the VM
> >  - there are at least 2 forwarding lcores.
> > 
> 
[...]
> 
> Do you see anything I missed? Or can you reproduce the issue with the
> setup I'm using?
> 

Hi Antonio,

Are you using vector Rx in your test? After some further
investigations, I found that the vector Rx could be broken
if backend has consumed all the avail descs before the
device is started. Because in current implementation, the
vector Rx will return immediately without refilling the
avail ring if the used ring is empty. So we have to refill
the avail ring after flushing the elements in the used ring.

Best regards,
Tiwei Bie
  
Antonio Fischetti Dec. 4, 2017, 11:46 a.m. UTC | #3
> -----Original Message-----

> From: Bie, Tiwei

> Sent: Monday, December 4, 2017 7:20 AM

> To: Fischetti, Antonio <antonio.fischetti@intel.com>

> Cc: dev@dpdk.org; yliu@fridaylinux.org; maxime.coquelin@redhat.com;

> jfreimann@redhat.com; stable@dpdk.org

> Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior

> of device stop/start

> 

> On Sat, Dec 02, 2017 at 12:30:33PM +0800, Tiwei Bie wrote:

> > Hi Antonio,

> >

> > On Sat, Dec 02, 2017 at 01:17:58AM +0800, Fischetti, Antonio wrote:

> > > Hi All,

> > > I've got an update on this.

> > > I could replicate the same issue by using testpmd + a VM (= Virtual

> Machine).

> > >

> > > The test topology I'm using is:

> > >

> > >

> > > [Traffic gen]----[NIC port #0]----[testpmd]----[vhost port #2]----+

> > >                                                                   |

> > >                                                                   |

> > >

> [testpmd in

> > >                                                                 the

> VM]

> > >                                                                   |

> > >                                                                   |

> > > [Traffic gen]----[NIC port #1]----[testpmd]----[vhost port #3]----+

> > >

> > >

> > > So there's no OvS now in the picture: one testpmd running on the

> host

> > > and one testpmd running on the VM.

> > >

> > > The issue is that no packet goes through testpmd in the VM.

> > > It seems this is happening after this patch was upstreamed.

> > >

> > > Please note

> > > -----------

> > > To replicate this issue both the next 2 conditions must be met:

> > >  - the traffic is already being sent before launching testpmd in the

> VM

> > >  - there are at least 2 forwarding lcores.

> > >

> >

> [...]

> >

> > Do you see anything I missed? Or can you reproduce the issue with the

> > setup I'm using?

> >

> 

> Hi Antonio,

> 

> Are you using vector Rx in your test? After some further


[Antonio] Hi Tiwei, yes I suppose so.

Below some more details on my testbench to explain why I'm
using this configuration.

With this topology I can replicate as much as possible
my initial OvS-DPDK testbench. In the ovs case I 
had 1 single process on the host (OvS) which spawns 2 
dequeuing threads running on 2 different lcores (please 
note: if they run on the same lcore there's no issue).

So in place of Ovs I'm using 1 testpmd only on the host like:

sudo ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 \
    --socket-mem=1024,1024 \
    --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' \
    --vdev 'eth_vhost1,iface=/tmp/sock1,queues=1' -- -i  

so the testpmd on the host is seeing 4 phy ports, 2 are real
phy ports and 2 are vdev.

Then I set the forwarding cores

 testpmd> set corelist 2,3

and the port order
 testpmd> set portlist 0,2,1,3
 testpmd> set fwd mac retry
 testpmd> start

I then launch the VM and inside of it I launch testpmd:

./testpmd -c 0x3 -n 4 --socket-mem 512 -- --burst=64 -i \
    --txqflags=0xf00 --disable-hw-vlan

on testpmd console:
 testpmd> set fwd mac retry
 testpmd> start

As the traffic is already running at the line-rate for 
some seconds, for sure all the descs have been consumed.

I'm currently using 1 GB hugepage sizes, but I was
seeing the same issue with OvS when I was using 2 MB 
hugepage sizes.



> investigations, I found that the vector Rx could be broken

> if backend has consumed all the avail descs before the

> device is started. Because in current implementation, the

> vector Rx will return immediately without refilling the

> avail ring if the used ring is empty. So we have to refill

> the avail ring after flushing the elements in the used ring.

> 

> Best regards,

> Tiwei Bie
  
Tiwei Bie Dec. 5, 2017, 3:11 a.m. UTC | #4
On Mon, Dec 04, 2017 at 07:46:07PM +0800, Fischetti, Antonio wrote:
> > -----Original Message-----
> > From: Bie, Tiwei
> > Sent: Monday, December 4, 2017 7:20 AM
> > To: Fischetti, Antonio <antonio.fischetti@intel.com>
> > Cc: dev@dpdk.org; yliu@fridaylinux.org; maxime.coquelin@redhat.com;
> > jfreimann@redhat.com; stable@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior
> > of device stop/start
> > 
> > On Sat, Dec 02, 2017 at 12:30:33PM +0800, Tiwei Bie wrote:
> > > Hi Antonio,
> > >
> > > On Sat, Dec 02, 2017 at 01:17:58AM +0800, Fischetti, Antonio wrote:
> > > > Hi All,
> > > > I've got an update on this.
> > > > I could replicate the same issue by using testpmd + a VM (= Virtual
> > Machine).
> > > >
> > > > The test topology I'm using is:
> > > >
> > > >
> > > > [Traffic gen]----[NIC port #0]----[testpmd]----[vhost port #2]----+
> > > >                                                                   |
> > > >                                                                   |
> > > >
> > [testpmd in
> > > >                                                                 the
> > VM]
> > > >                                                                   |
> > > >                                                                   |
> > > > [Traffic gen]----[NIC port #1]----[testpmd]----[vhost port #3]----+
> > > >
> > > >
> > > > So there's no OvS now in the picture: one testpmd running on the
> > host
> > > > and one testpmd running on the VM.
> > > >
> > > > The issue is that no packet goes through testpmd in the VM.
> > > > It seems this is happening after this patch was upstreamed.
> > > >
> > > > Please note
> > > > -----------
> > > > To replicate this issue both the next 2 conditions must be met:
> > > >  - the traffic is already being sent before launching testpmd in the
> > VM
> > > >  - there are at least 2 forwarding lcores.
> > > >
> > >
> > [...]
> > >
> > > Do you see anything I missed? Or can you reproduce the issue with the
> > > setup I'm using?
> > >
> > 
> > Hi Antonio,
> > 
> > Are you using vector Rx in your test? After some further
> 
> [Antonio] Hi Tiwei, yes I suppose so.
> 
> Below some more details on my testbench to explain why I'm
> using this configuration.
> 
> With this topology I can replicate as much as possible
> my initial OvS-DPDK testbench. In the ovs case I 
> had 1 single process on the host (OvS) which spawns 2 
> dequeuing threads running on 2 different lcores (please 
> note: if they run on the same lcore there's no issue).
> 
> So in place of Ovs I'm using 1 testpmd only on the host like:
> 
> sudo ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 \
>     --socket-mem=1024,1024 \
>     --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' \
>     --vdev 'eth_vhost1,iface=/tmp/sock1,queues=1' -- -i  
> 
> so the testpmd on the host is seeing 4 phy ports, 2 are real
> phy ports and 2 are vdev.
> 
> Then I set the forwarding cores
> 
>  testpmd> set corelist 2,3
> 
> and the port order
>  testpmd> set portlist 0,2,1,3
>  testpmd> set fwd mac retry
>  testpmd> start
> 
> I then launch the VM and inside of it I launch testpmd:
> 
> ./testpmd -c 0x3 -n 4 --socket-mem 512 -- --burst=64 -i \
>     --txqflags=0xf00 --disable-hw-vlan
> 
> on testpmd console:
>  testpmd> set fwd mac retry
>  testpmd> start
> 
> As the traffic is already running at the line-rate for 
> some seconds, for sure all the descs have been consumed.
> 
> I'm currently using 1 GB hugepage sizes, but I was
> seeing the same issue with OvS when I was using 2 MB 
> hugepage sizes.
> 

Hi Antonio,

Got it! Thank you for the detailed info!
I will send out a patch to fix this issue ASAP.

Best regards,
Tiwei Bie
  
Antonio Fischetti Dec. 5, 2017, 8:52 a.m. UTC | #5
> -----Original Message-----

> From: Bie, Tiwei

> Sent: Tuesday, December 5, 2017 3:12 AM

> To: Fischetti, Antonio <antonio.fischetti@intel.com>

> Cc: dev@dpdk.org; yliu@fridaylinux.org; maxime.coquelin@redhat.com;

> jfreimann@redhat.com; stable@dpdk.org

> Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior

> of device stop/start

> 

> On Mon, Dec 04, 2017 at 07:46:07PM +0800, Fischetti, Antonio wrote:

> > > -----Original Message-----

> > > From: Bie, Tiwei

> > > Sent: Monday, December 4, 2017 7:20 AM

> > > To: Fischetti, Antonio <antonio.fischetti@intel.com>

> > > Cc: dev@dpdk.org; yliu@fridaylinux.org; maxime.coquelin@redhat.com;

> > > jfreimann@redhat.com; stable@dpdk.org

> > > Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect

> behavior

> > > of device stop/start

> > >

> > > On Sat, Dec 02, 2017 at 12:30:33PM +0800, Tiwei Bie wrote:

> > > > Hi Antonio,

> > > >

> > > > On Sat, Dec 02, 2017 at 01:17:58AM +0800, Fischetti, Antonio

> wrote:

> > > > > Hi All,

> > > > > I've got an update on this.

> > > > > I could replicate the same issue by using testpmd + a VM (=

> Virtual

> > > Machine).

> > > > >

> > > > > The test topology I'm using is:

> > > > >

> > > > >

> > > > > [Traffic gen]----[NIC port #0]----[testpmd]----[vhost port #2]--

> --+

> > > > >

> |

> > > > >

> |

> > > > >

> > > [testpmd in

> > > > >

> the

> > > VM]

> > > > >

> |

> > > > >

> |

> > > > > [Traffic gen]----[NIC port #1]----[testpmd]----[vhost port #3]--

> --+

> > > > >

> > > > >

> > > > > So there's no OvS now in the picture: one testpmd running on the

> > > host

> > > > > and one testpmd running on the VM.

> > > > >

> > > > > The issue is that no packet goes through testpmd in the VM.

> > > > > It seems this is happening after this patch was upstreamed.

> > > > >

> > > > > Please note

> > > > > -----------

> > > > > To replicate this issue both the next 2 conditions must be met:

> > > > >  - the traffic is already being sent before launching testpmd in

> the

> > > VM

> > > > >  - there are at least 2 forwarding lcores.

> > > > >

> > > >

> > > [...]

> > > >

> > > > Do you see anything I missed? Or can you reproduce the issue with

> the

> > > > setup I'm using?

> > > >

> > >

> > > Hi Antonio,

> > >

> > > Are you using vector Rx in your test? After some further

> >

> > [Antonio] Hi Tiwei, yes I suppose so.

> >

> > Below some more details on my testbench to explain why I'm

> > using this configuration.

> >

> > With this topology I can replicate as much as possible

> > my initial OvS-DPDK testbench. In the ovs case I

> > had 1 single process on the host (OvS) which spawns 2

> > dequeuing threads running on 2 different lcores (please

> > note: if they run on the same lcore there's no issue).

> >

> > So in place of Ovs I'm using 1 testpmd only on the host like:

> >

> > sudo ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 \

> >     --socket-mem=1024,1024 \

> >     --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' \

> >     --vdev 'eth_vhost1,iface=/tmp/sock1,queues=1' -- -i

> >

> > so the testpmd on the host is seeing 4 phy ports, 2 are real

> > phy ports and 2 are vdev.

> >

> > Then I set the forwarding cores

> >

> >  testpmd> set corelist 2,3

> >

> > and the port order

> >  testpmd> set portlist 0,2,1,3

> >  testpmd> set fwd mac retry

> >  testpmd> start

> >

> > I then launch the VM and inside of it I launch testpmd:

> >

> > ./testpmd -c 0x3 -n 4 --socket-mem 512 -- --burst=64 -i \

> >     --txqflags=0xf00 --disable-hw-vlan

> >

> > on testpmd console:

> >  testpmd> set fwd mac retry

> >  testpmd> start

> >

> > As the traffic is already running at the line-rate for

> > some seconds, for sure all the descs have been consumed.

> >

> > I'm currently using 1 GB hugepage sizes, but I was

> > seeing the same issue with OvS when I was using 2 MB

> > hugepage sizes.

> >

> 

> Hi Antonio,

> 

> Got it! Thank you for the detailed info!

> I will send out a patch to fix this issue ASAP.


[Antonio] That's great Tiwei, thanks! Once it is ready I 
can give it a try as with testpmd as with OvS on the host.


> 

> Best regards,

> Tiwei Bie
  

Patch

diff --git a/drivers/net/virtio/virtqueue.c b/drivers/net/virtio/virtqueue.c
index c3a536f..17dd87d 100644
--- a/drivers/net/virtio/virtqueue.c
+++ b/drivers/net/virtio/virtqueue.c
@@ -80,7 +80,7 @@  virtqueue_flush(struct virtqueue *vq)
                        rte_pktmbuf_free(dxp->cookie);
                        dxp->cookie = NULL;
                }
-               vq->vq_used_cons_idx++;
+               //vq->vq_used_cons_idx++;
                vq_ring_free_chain(vq, desc_idx);
        }