[dpdk-dev,4/5] virtio: use any layout on transmit

Message ID 1445231772-17467-5-git-send-email-stephen@networkplumber.org (mailing list archive)
State Changes Requested, archived

Commit Message

Stephen Hemminger Oct. 19, 2015, 5:16 a.m. UTC
  Virtio supports a feature that allows the sender to prepend the transmit
header to the data.  It requires that the mbuf be writeable and correctly
aligned, and that the feature has been negotiated.  If all of these hold,
it is the optimal way to transmit a single-segment packet.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 drivers/net/virtio/virtio_ethdev.h |  3 +-
 drivers/net/virtio/virtio_rxtx.c   | 66 +++++++++++++++++++++++---------------
 2 files changed, 42 insertions(+), 27 deletions(-)
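
The conditions listed in the commit message can be sketched as a standalone check. This is a minimal illustration, not the driver code: `struct pkt_buf`, `can_push_header` and `tx_slots` are hypothetical names, and the struct stands in for the few `rte_mbuf` properties the patch reads through `rte_mbuf_refcnt_read()`, `nb_segs`, `rte_pktmbuf_headroom()` and `rte_pktmbuf_mtod()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the rte_mbuf properties the check looks at. */
struct pkt_buf {
	uint16_t  refcnt;    /* 1 means nobody else shares the buffer */
	uint16_t  nb_segs;   /* number of chained segments */
	uint16_t  headroom;  /* free bytes in front of the packet data */
	uintptr_t data_addr; /* address of the first byte of packet data */
};

/*
 * Return 1 if the transmit header may be written into the headroom,
 * 0 otherwise. any_layout is the negotiated state of
 * VIRTIO_F_ANY_LAYOUT; hdr_align must be a power of two.
 */
static int
can_push_header(const struct pkt_buf *m, int any_layout,
		uint16_t hdr_size, size_t hdr_align)
{
	assert(hdr_align != 0 && (hdr_align & (hdr_align - 1)) == 0);

	return any_layout &&
	       m->refcnt == 1 &&           /* writeable: not shared */
	       m->nb_segs == 1 &&          /* single segment only */
	       m->headroom >= hdr_size &&  /* room for the header */
	       (m->data_addr & (hdr_align - 1)) == 0;
}

/* Ring entries needed, mirroring the slot arithmetic in the patch. */
static int
tx_slots(const struct pkt_buf *m, int can_push, int use_indirect)
{
	return use_indirect ? 1 : !can_push + m->nb_segs;
}
```

With the header pushed, a single-segment packet needs one ring entry instead of two, which is where the saving comes from.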
  

Comments

Huawei Xie Oct. 19, 2015, 4:28 p.m. UTC | #1
On 10/19/2015 1:16 PM, Stephen Hemminger wrote:
> Virtio supports a feature that allows sender to put transmit
> header prepended to data.  It requires that the mbuf be writeable, correct
> alignment, and the feature has been negotiated.  If all this works out,
> then it will be the optimum way to transmit a single segment packet.
"When using legacy interfaces, transitional drivers which have not
negotiated VIRTIO_F_ANY_LAYOUT
MUST use a single descriptor for the struct virtio_net_hdr on both
transmit and receive, with the
network data in the following descriptors."

I think we shouldn't assume that the virtio header uses a separate
descriptor; it could share one with the data. Virtio RX (and DPDK vhost)
was already implemented this way, i.e., I thought this was inherent
behavior rather than a feature.
Is the current RX implementation wrong?
[...]
  
Stephen Hemminger Oct. 19, 2015, 4:43 p.m. UTC | #2
On Mon, 19 Oct 2015 16:28:30 +0000
"Xie, Huawei" <huawei.xie@intel.com> wrote:

> On 10/19/2015 1:16 PM, Stephen Hemminger wrote:
> > Virtio supports a feature that allows sender to put transmit
> > header prepended to data.  It requires that the mbuf be writeable, correct
> > alignment, and the feature has been negotiated.  If all this works out,
> > then it will be the optimum way to transmit a single segment packet.  
> "When using legacy interfaces, transitional drivers which have not
> negotiated VIRTIO_F_ANY_LAYOUT
> MUST use a single descriptor for the struct virtio_net_hdr on both
> transmit and receive, with the
> network data in the following descriptors."

The code checks for the any-layout feature; what is the problem?
  
Huawei Xie Oct. 19, 2015, 4:56 p.m. UTC | #3
On 10/20/2015 12:43 AM, Stephen Hemminger wrote:
> On Mon, 19 Oct 2015 16:28:30 +0000
> "Xie, Huawei" <huawei.xie@intel.com> wrote:
>
>> On 10/19/2015 1:16 PM, Stephen Hemminger wrote:
>>> Virtio supports a feature that allows sender to put transmit
>>> header prepended to data.  It requires that the mbuf be writeable, correct
>>> alignment, and the feature has been negotiated.  If all this works out,
>>> then it will be the optimum way to transmit a single segment packet.  
>> "When using legacy interfaces, transitional drivers which have not
>> negotiated VIRTIO_F_ANY_LAYOUT
>> MUST use a single descriptor for the struct virtio_net_hdr on both
>> transmit and receive, with the
>> network data in the following descriptors."
> The code checks for the any layout feature, what is the problem?
My reply was removed. I said that virtio RX already uses this layout by
default, without negotiation (the feature was not known at the time of
implementation). Is the RX implementation wrong?
  
Stephen Hemminger Oct. 19, 2015, 5:19 p.m. UTC | #4
On Mon, 19 Oct 2015 16:28:30 +0000
"Xie, Huawei" <huawei.xie@intel.com> wrote:

> "When using legacy interfaces, transitional drivers which have not
> negotiated VIRTIO_F_ANY_LAYOUT
> MUST use a single descriptor for the struct virtio_net_hdr on both
> transmit and receive, with the
> network data in the following descriptors."
> 
> I think we shouldn't assume that virtio header descriptor uses a
> separate descriptor. It could be with data. Virtio RX(and dpdk vhost)
> actually is implemented like this before, i.e, i thought this should be
> inherent but not a feature.
> Is the current RX implementation wrong?

I believe the current RX is OK; the any-layout feature refers more to
what is handed to the host on transmit. Rusty said something like
"any sane implementation would work with contiguous buffer"
but the standard couldn't assume sanity!
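
To make the two transmit layouts under discussion concrete, here is a minimal sketch of how a descriptor chain might be filled in each case. `struct desc` and `fill_tx_chain` are hypothetical; the field names mirror the `vring_desc` fields used in the patch, and `DESC_F_NEXT` is assumed to carry the same value as `VRING_DESC_F_NEXT`. The caller is assumed to have pre-linked the `next` fields into a free list, as the driver does.

```c
#include <stdint.h>

#define DESC_F_NEXT 1 /* assumed equal to VRING_DESC_F_NEXT */

/* Hypothetical descriptor, mirroring the vring_desc fields in the patch. */
struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/*
 * Fill a chain for a single-segment packet starting at ring[head].
 * pushed != 0: the header already sits immediately in front of the
 *              data, so one descriptor covers header plus data.
 * pushed == 0: legacy layout, the header gets its own descriptor and
 *              the data follows in the next one.
 * Returns the number of descriptors used.
 */
static uint16_t
fill_tx_chain(struct desc *ring, uint16_t head,
	      uint64_t hdr_addr, uint32_t hdr_len,
	      uint64_t data_addr, uint32_t data_len, int pushed)
{
	uint16_t idx = head;

	if (pushed) {
		ring[idx].addr  = data_addr - hdr_len;
		ring[idx].len   = hdr_len + data_len;
		ring[idx].flags = 0;
		return 1;
	}

	ring[idx].addr  = hdr_addr;
	ring[idx].len   = hdr_len;
	ring[idx].flags = DESC_F_NEXT;

	idx = ring[idx].next;
	ring[idx].addr  = data_addr;
	ring[idx].len   = data_len;
	ring[idx].flags = 0; /* last descriptor in the chain */
	return 2;
}
```

The legacy branch is the only one a device is required to accept when VIRTIO_F_ANY_LAYOUT has not been negotiated, which is why the patch gates the single-descriptor form on the feature bit.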
  
Stephen Hemminger Oct. 26, 2015, 11:47 p.m. UTC | #5
On Mon, 19 Oct 2015 16:56:02 +0000
"Xie, Huawei" <huawei.xie@intel.com> wrote:

> On 10/20/2015 12:43 AM, Stephen Hemminger wrote:
> > On Mon, 19 Oct 2015 16:28:30 +0000
> > "Xie, Huawei" <huawei.xie@intel.com> wrote:
> >
> >> On 10/19/2015 1:16 PM, Stephen Hemminger wrote:
> >>> Virtio supports a feature that allows sender to put transmit
> >>> header prepended to data.  It requires that the mbuf be writeable, correct
> >>> alignment, and the feature has been negotiated.  If all this works out,
> >>> then it will be the optimum way to transmit a single segment packet.  
> >> "When using legacy interfaces, transitional drivers which have not
> >> negotiated VIRTIO_F_ANY_LAYOUT
> >> MUST use a single descriptor for the struct virtio_net_hdr on both
> >> transmit and receive, with the
> >> network data in the following descriptors."
> > The code checks for the any layout feature, what is the problem?
> My reply is removed. I said virtio RX is already implemented using this
> feature by default without negotiation(at the time of implementation, no
> idea of this feature), is the RX implementation wrong?
> 

No, the receiver is fine. It is okay to handle that layout on receive as
long as the code doesn't assume it is the only possible layout.
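
One way to read "doesn't assume that is the only possible layout" is that a receiver should find the payload by counting header bytes rather than by counting descriptors. The following is a rough sketch of that idea only; `struct chain_seg` and `skip_header` are hypothetical and are not taken from the virtio or vhost code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one descriptor in a chain: only its length. */
struct chain_seg {
	uint32_t len;
};

/*
 * Locate the first payload byte after hdr_len header bytes without
 * assuming where descriptor boundaries fall. On success returns 0 and
 * stores the segment index and byte offset of the payload start;
 * returns -1 if the chain is shorter than the header.
 */
static int
skip_header(const struct chain_seg *segs, size_t nsegs, uint32_t hdr_len,
	    size_t *seg_idx, uint32_t *seg_off)
{
	size_t i;

	for (i = 0; i < nsegs; i++) {
		if (hdr_len < segs[i].len) {
			*seg_idx = i;
			*seg_off = hdr_len;
			return 0;
		}
		hdr_len -= segs[i].len;
	}

	/* The header consumed the whole chain exactly: empty payload. */
	if (hdr_len == 0) {
		*seg_idx = nsegs;
		*seg_off = 0;
		return 0;
	}
	return -1;
}
```

The same routine handles a header in its own descriptor, a header contiguous with the data, and a header split across several descriptors.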
  

Patch

diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index 07a9265..f260fbb 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -65,7 +65,8 @@ 
 	 1u << VIRTIO_NET_F_CTRL_RX	  |	\
 	 1u << VIRTIO_NET_F_CTRL_VLAN	  |	\
 	 1u << VIRTIO_NET_F_MRG_RXBUF     |	\
-	 1u << VIRTIO_RING_F_INDIRECT_DESC)
+	 1u << VIRTIO_RING_F_INDIRECT_DESC|	\
+	 1u << VIRTIO_F_ANY_LAYOUT)
 
 /*
  * CQ function prototype
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index f68ab8f..dbedcc3 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -200,13 +200,13 @@  virtqueue_enqueue_recv_refill(struct virtqueue *vq, struct rte_mbuf *cookie)
 
 static int
 virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
-		       int use_indirect)
+		       uint16_t needed, int use_indirect, int can_push)
 {
 	struct vq_desc_extra *dxp;
 	struct vring_desc *start_dp;
 	uint16_t seg_num = cookie->nb_segs;
-	uint16_t needed = use_indirect ? 1 : 1 + seg_num;
 	uint16_t head_idx, idx;
+	uint16_t head_size = txvq->hw->vtnet_hdr_size;
 	unsigned long offs;
 
 	if (unlikely(txvq->vq_free_cnt == 0))
@@ -223,7 +223,12 @@  virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
 	dxp->ndescs = needed;
 	start_dp = txvq->vq_ring.desc;
 
-	if (use_indirect) {
+	if (can_push) {
+		/* put on zero'd transmit header (no offloads) */
+		void *hdr = rte_pktmbuf_prepend(cookie, head_size);
+
+		memset(hdr, 0, head_size);
+	} else if (use_indirect) {
 		struct virtio_tx_region *txr
 			= txvq->virtio_net_hdr_mz->addr;
 
@@ -235,7 +240,7 @@  virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
 		start_dp[idx].flags = VRING_DESC_F_INDIRECT;
 
 		start_dp = txr[idx].tx_indir;
-		idx = 0;
+		idx = 1;
 	} else {
 		offs = idx * sizeof(struct virtio_tx_region)
 			+ offsetof(struct virtio_tx_region, tx_hdr);
@@ -243,22 +248,19 @@  virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
 		start_dp[idx].addr  = txvq->virtio_net_hdr_mem + offs;
 		start_dp[idx].len   = txvq->hw->vtnet_hdr_size;
 		start_dp[idx].flags = VRING_DESC_F_NEXT;
+		idx = start_dp[idx].next;
 	}
 
-	for (; ((seg_num > 0) && (cookie != NULL)); seg_num--) {
-		idx = start_dp[idx].next;
+	while (cookie != NULL) {
 		start_dp[idx].addr  = RTE_MBUF_DATA_DMA_ADDR(cookie);
 		start_dp[idx].len   = cookie->data_len;
-		start_dp[idx].flags = VRING_DESC_F_NEXT;
+		start_dp[idx].flags = cookie->next ? VRING_DESC_F_NEXT : 0;
 		cookie = cookie->next;
+		idx = start_dp[idx].next;
 	}
 
-	start_dp[idx].flags &= ~VRING_DESC_F_NEXT;
-
 	if (use_indirect)
 		idx = txvq->vq_ring.desc[head_idx].next;
-	else
-		idx = start_dp[idx].next;
 
 	txvq->vq_desc_head_idx = idx;
 	if (txvq->vq_desc_head_idx == VQ_RING_DESC_CHAIN_END)
@@ -761,10 +763,13 @@  virtio_recv_mergeable_pkts(void *rx_queue,
 	return nb_rx;
 }
 
+
 uint16_t
 virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
 	struct virtqueue *txvq = tx_queue;
+	struct virtio_hw *hw = txvq->hw;
+	uint16_t hdr_size = hw->vtnet_hdr_size;
 	uint16_t nb_used, nb_tx;
 	int error;
 
@@ -780,14 +785,31 @@  virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 
 	for (nb_tx = 0; nb_tx < nb_pkts; nb_tx++) {
 		struct rte_mbuf *txm = tx_pkts[nb_tx];
-		int use_indirect, slots, need;
+		int can_push = 0, use_indirect = 0, slots, need;
+
+		/* Do VLAN tag insertion */
+		if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+			error = rte_vlan_insert(&txm);
+			if (unlikely(error)) {
+				rte_pktmbuf_free(txm);
+				continue;
+			}
+		}
 
-		use_indirect = vtpci_with_feature(txvq->hw,
-						  VIRTIO_RING_F_INDIRECT_DESC)
-			&& (txm->nb_segs < VIRTIO_MAX_TX_INDIRECT);
+		/* optimize ring usage */
+		if (vtpci_with_feature(hw, VIRTIO_F_ANY_LAYOUT) &&
+		    rte_mbuf_refcnt_read(txm) == 1 &&
+		    txm->nb_segs == 1 &&
+		    rte_pktmbuf_headroom(txm) >= hdr_size &&
+		    rte_is_aligned(rte_pktmbuf_mtod(txm, char *),
+				   __alignof__(struct virtio_net_hdr_mrg_rxbuf)))
+			can_push = 1;
+		else if (vtpci_with_feature(hw, VIRTIO_RING_F_INDIRECT_DESC) &&
+			 txm->nb_segs < VIRTIO_MAX_TX_INDIRECT)
+			use_indirect = 1;
 
 		/* How many ring entries are needed to this Tx? */
-		slots = use_indirect ? 1 : 1 + txm->nb_segs;
+		slots = use_indirect ? 1 : !can_push + txm->nb_segs;
 		need = slots - txvq->vq_free_cnt;
 
 		/* Positive value indicates it need free vring descriptors */
@@ -805,17 +827,9 @@  virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			}
 		}
 
-		/* Do VLAN tag insertion */
-		if (txm->ol_flags & PKT_TX_VLAN_PKT) {
-			error = rte_vlan_insert(&txm);
-			if (unlikely(error)) {
-				rte_pktmbuf_free(txm);
-				continue;
-			}
-		}
-
 		/* Enqueue Packet buffers */
-		error = virtqueue_enqueue_xmit(txvq, txm, use_indirect);
+		error = virtqueue_enqueue_xmit(txvq, txm, slots,
+					       use_indirect, can_push);
 		if (unlikely(error)) {
 			if (error == ENOSPC)
 				PMD_TX_LOG(ERR, "virtqueue_enqueue Free count = 0");