[19.11,V2,08/12] virtio: fix rx stats with vectorized functions
Commit Message
From: Thibaut Collet <thibaut.collet@6wind.com>
With the vectorized Rx functions, only the packet counter of the Rx stats
is incremented.
Update the other statistics as well.
The performance impact is about 2%.
Fixes: fc3d66212fed ("virtio: add vector Rx")
Cc: stable@dpdk.org
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
---
drivers/net/virtio/virtio_rxtx.c | 2 +-
drivers/net/virtio/virtio_rxtx.h | 2 ++
drivers/net/virtio/virtio_rxtx_simple_neon.c | 6 ++++++
drivers/net/virtio/virtio_rxtx_simple_sse.c | 6 ++++++
4 files changed, 15 insertions(+), 1 deletion(-)
Comments
On Wed, Aug 07, 2019 at 05:09:17PM +0200, Thierry Herbelot wrote:
> From: Thibaut Collet <thibaut.collet@6wind.com>
>
> With vectorized functions, only the rx stats for number of packets is
> incremented.
> Update also the other statistics.
> Performance impact is about 2%
Could you share some details about your performance test?
> diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.c b/drivers/net/virtio/virtio_rxtx_simple_sse.c
> index af76708d66ae..c757e8c9d601 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple_sse.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple_sse.c
> @@ -48,6 +48,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> struct vring_used_elem *rused;
> struct rte_mbuf **sw_ring;
> struct rte_mbuf **sw_ring_end;
> + struct rte_mbuf **ref_rx_pkts;
> uint16_t nb_pkts_received = 0;
> __m128i shuf_msk1, shuf_msk2, len_adjust;
>
> @@ -107,6 +108,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> virtqueue_notify(vq);
> }
>
> + ref_rx_pkts = rx_pkts;
> for (nb_pkts_received = 0;
> nb_pkts_received < nb_used;) {
> __m128i desc[RTE_VIRTIO_DESC_PER_LOOP / 2];
> @@ -190,5 +192,9 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> vq->vq_used_cons_idx += nb_pkts_received;
> vq->vq_free_cnt += nb_pkts_received;
> rxvq->stats.packets += nb_pkts_received;
> + for (nb_used = 0; nb_used < nb_pkts_received; nb_used++) {
> + rxvq->stats.bytes += ref_rx_pkts[nb_used]->pkt_len;
> + virtio_update_packet_stats(&rxvq->stats, ref_rx_pkts[nb_used]);
stats.bytes is updated twice by the code above.
> + }
> return nb_pkts_received;
> }
> --
> 2.11.0
>
On Thu, Aug 8, 2019 at 7:17 AM Tiwei Bie <tiwei.bie@intel.com> wrote:
>
> On Wed, Aug 07, 2019 at 05:09:17PM +0200, Thierry Herbelot wrote:
> > From: Thibaut Collet <thibaut.collet@6wind.com>
> >
> > With vectorized functions, only the rx stats for number of packets is
> > incremented.
> > Update also the other statistics.
> > Performance impact is about 2%
>
> Could you share some details about your performance test?
The test was done with a 6WIND application based on DPDK, and I did not
keep the details. With testpmd the impact may be slightly higher.
>
>
> > diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.c b/drivers/net/virtio/virtio_rxtx_simple_sse.c
> > index af76708d66ae..c757e8c9d601 100644
> > --- a/drivers/net/virtio/virtio_rxtx_simple_sse.c
> > +++ b/drivers/net/virtio/virtio_rxtx_simple_sse.c
> > @@ -48,6 +48,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> > struct vring_used_elem *rused;
> > struct rte_mbuf **sw_ring;
> > struct rte_mbuf **sw_ring_end;
> > + struct rte_mbuf **ref_rx_pkts;
> > uint16_t nb_pkts_received = 0;
> > __m128i shuf_msk1, shuf_msk2, len_adjust;
> >
> > @@ -107,6 +108,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> > virtqueue_notify(vq);
> > }
> >
> > + ref_rx_pkts = rx_pkts;
> > for (nb_pkts_received = 0;
> > nb_pkts_received < nb_used;) {
> > __m128i desc[RTE_VIRTIO_DESC_PER_LOOP / 2];
> > @@ -190,5 +192,9 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> > vq->vq_used_cons_idx += nb_pkts_received;
> > vq->vq_free_cnt += nb_pkts_received;
> > rxvq->stats.packets += nb_pkts_received;
> > + for (nb_used = 0; nb_used < nb_pkts_received; nb_used++) {
> > + rxvq->stats.bytes += ref_rx_pkts[nb_used]->pkt_len;
> > + virtio_update_packet_stats(&rxvq->stats, ref_rx_pkts[nb_used]);
>
> The stats.bytes was updated twice by above code.
Agreed. My fix is an old one, written against DPDK 18.11, so it predates
the patch
https://git.dpdk.org/dpdk/commit/drivers/net/virtio/virtio_rxtx.c?id=81e5cdf19e583c742040d3be83b8cc79b451e243.
So the line
+ rxvq->stats.bytes += ref_rx_pkts[nb_used]->pkt_len;
must be removed to be consistent with Ilya Maximets' patch.
>
> > + }
> > return nb_pkts_received;
> > }
> > --
> > 2.11.0
> >
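The double count discussed in the thread above can be illustrated with a minimal, self-contained sketch. It does not use the DPDK headers: `mock_mbuf`, `mock_stats`, `mock_update_packet_stats`, `account_as_posted` and `account_fixed` are hypothetical names invented for this illustration. It assumes, as the thread states, that after commit 81e5cdf19e58 the stats helper itself adds the packet length to the byte counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct rte_mbuf and struct virtnet_stats;
 * only the fields needed for this illustration are kept. */
struct mock_mbuf {
	uint32_t pkt_len;
};

struct mock_stats {
	uint64_t packets;
	uint64_t bytes;
};

/* Assumption taken from the review thread: the helper already accounts
 * for the packet length in the byte counter. The size-bin, multicast and
 * broadcast counters of the real helper are omitted here. */
static void
mock_update_packet_stats(struct mock_stats *stats, const struct mock_mbuf *m)
{
	stats->bytes += m->pkt_len;
}

/* Post-loop accounting as posted in the patch: the byte counter is
 * incremented once here and once more inside the helper. */
static void
account_as_posted(struct mock_stats *stats, struct mock_mbuf **pkts,
		  uint16_t nb_pkts)
{
	uint16_t i;

	stats->packets += nb_pkts;
	for (i = 0; i < nb_pkts; i++) {
		stats->bytes += pkts[i]->pkt_len;
		mock_update_packet_stats(stats, pkts[i]);
	}
}

/* Same accounting with the redundant line removed, as agreed in the
 * review: the helper is the only place that touches the byte counter. */
static void
account_fixed(struct mock_stats *stats, struct mock_mbuf **pkts,
	      uint16_t nb_pkts)
{
	uint16_t i;

	stats->packets += nb_pkts;
	for (i = 0; i < nb_pkts; i++)
		mock_update_packet_stats(stats, pkts[i]);
}
```

Feeding the same burst to both functions should show the as-posted variant reporting twice the byte count of the fixed one, while the packet counts match.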
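diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -1083,7 +1083,7 @@ virtio_discard_rxbuf_inorder(struct virtqueue *vq, struct rte_mbuf *m)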
}
}
-static inline void
+void
virtio_update_packet_stats(struct virtnet_stats *stats, struct rte_mbuf *mbuf)
{
uint32_t s = mbuf->pkt_len;
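diff --git a/drivers/net/virtio/virtio_rxtx.h b/drivers/net/virtio/virtio_rxtx.h
--- a/drivers/net/virtio/virtio_rxtx.h
+++ b/drivers/net/virtio/virtio_rxtx.h
@@ -59,5 +59,7 @@ struct virtnet_ctl {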
};
int virtio_rxq_vec_setup(struct virtnet_rx *rxvq);
+void virtio_update_packet_stats(struct virtnet_stats *stats,
+ struct rte_mbuf *mbuf);
#endif /* _VIRTIO_RXTX_H_ */
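diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
--- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
@@ -47,6 +47,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,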
struct vring_used_elem *rused;
struct rte_mbuf **sw_ring;
struct rte_mbuf **sw_ring_end;
+ struct rte_mbuf **ref_rx_pkts;
uint16_t nb_pkts_received = 0;
uint8x16_t shuf_msk1 = {
@@ -105,6 +106,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
virtqueue_notify(vq);
}
+ ref_rx_pkts = rx_pkts;
for (nb_pkts_received = 0;
nb_pkts_received < nb_used;) {
uint64x2_t desc[RTE_VIRTIO_DESC_PER_LOOP / 2];
@@ -204,5 +206,9 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
vq->vq_used_cons_idx += nb_pkts_received;
vq->vq_free_cnt += nb_pkts_received;
rxvq->stats.packets += nb_pkts_received;
+ for (nb_used = 0; nb_used < nb_pkts_received; nb_used++) {
+ rxvq->stats.bytes += ref_rx_pkts[nb_used]->pkt_len;
+ virtio_update_packet_stats(&rxvq->stats, ref_rx_pkts[nb_used]);
+ }
return nb_pkts_received;
}
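diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.c b/drivers/net/virtio/virtio_rxtx_simple_sse.c
index af76708d66ae..c757e8c9d601 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_sse.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_sse.c
@@ -48,6 +48,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,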
struct vring_used_elem *rused;
struct rte_mbuf **sw_ring;
struct rte_mbuf **sw_ring_end;
+ struct rte_mbuf **ref_rx_pkts;
uint16_t nb_pkts_received = 0;
__m128i shuf_msk1, shuf_msk2, len_adjust;
@@ -107,6 +108,7 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
virtqueue_notify(vq);
}
+ ref_rx_pkts = rx_pkts;
for (nb_pkts_received = 0;
nb_pkts_received < nb_used;) {
__m128i desc[RTE_VIRTIO_DESC_PER_LOOP / 2];
@@ -190,5 +192,9 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
vq->vq_used_cons_idx += nb_pkts_received;
vq->vq_free_cnt += nb_pkts_received;
rxvq->stats.packets += nb_pkts_received;
+ for (nb_used = 0; nb_used < nb_pkts_received; nb_used++) {
+ rxvq->stats.bytes += ref_rx_pkts[nb_used]->pkt_len;
+ virtio_update_packet_stats(&rxvq->stats, ref_rx_pkts[nb_used]);
+ }
return nb_pkts_received;
}