From patchwork Fri Mar  6 00:10:30 2015
From: Stephen Hemminger
To: Yong Wang
Cc: dev@dpdk.org, Stephen Hemminger
Date: Thu, 5 Mar 2015 16:10:30 -0800
Message-Id: <1425600635-20628-6-git-send-email-stephen@networkplumber.org>
In-Reply-To: <1425600635-20628-1-git-send-email-stephen@networkplumber.org>
References: <1425600635-20628-1-git-send-email-stephen@networkplumber.org>
Subject: [dpdk-dev] [PATCH v3 05/10] vmxnet3: add support for multi-segment transmit

From: Stephen Hemminger

Change the sending loop to support multi-segment mbufs. The VMXNET3 API has
start-of-packet and end-of-packet flags, so sending multi-segment mbufs is
not hard. Also, update the descriptor as a 32-bit value rather than toggling
individual bitfields, which is slower and error prone. Based on code in the
earlier driver and on the Linux kernel driver.

Add a compiler barrier to make sure that updates to earlier descriptors are
completed before the generation bit of the start-of-packet descriptor is
updated.
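The barrier is the subtle part of this scheme: every descriptor belonging to
the packet must be fully written before the generation bit of the
start-of-packet (SOP) descriptor flips, because that flip is what makes the
packet visible to the device. Below is a simplified sketch of that publish
pattern, with a hypothetical descriptor layout and made-up bit positions
(the real code in the diff uses Vmxnet3_GenericDesc and the VMXNET3_TXD_*
definitions, and also handles ring wrap-around, which is ignored here):

#include <stdint.h>
#include <rte_atomic.h>		/* rte_compiler_barrier() */

/* Hypothetical flat descriptor, loosely mirroring the dword[2]/dword[3]
 * usage in the patch; the bit positions below are placeholders. */
struct txd_sketch {
	uint64_t addr;		/* segment physical address */
	uint32_t dword2;	/* GEN bit | segment length */
	uint32_t dword3;	/* EOP and CQ flags */
};

#define SKETCH_TXD_GEN	(1u << 31)
#define SKETCH_TXD_EOP	(1u << 12)
#define SKETCH_TXD_CQ	(1u << 13)

/* Fill one descriptor per segment, then publish the packet by flipping
 * the generation bit on the SOP descriptor. */
static void
publish_pkt(struct txd_sketch *ring, uint32_t first, uint32_t nsegs,
	    const uint64_t *addr, const uint32_t *len, uint32_t gen)
{
	uint32_t cur_gen = gen ? SKETCH_TXD_GEN : 0;
	uint32_t i;

	/* SOP gets the *previous* generation so the device ignores the
	 * packet while the remaining descriptors are still being written. */
	ring[first].addr = addr[0];
	ring[first].dword2 = (cur_gen ^ SKETCH_TXD_GEN) | len[0];
	ring[first].dword3 = 0;

	/* Non-SOP descriptors carry the current generation. */
	for (i = 1; i < nsegs; i++) {
		ring[first + i].addr = addr[i];
		ring[first + i].dword2 = cur_gen | len[i];
		ring[first + i].dword3 = 0;
	}

	/* Last descriptor: mark end-of-packet and request a completion. */
	ring[first + nsegs - 1].dword3 |= SKETCH_TXD_EOP | SKETCH_TXD_CQ;

	/* All descriptor stores must be visible before the SOP generation
	 * bit flips; that flip is what hands the packet to the device. */
	rte_compiler_barrier();
	ring[first].dword2 ^= SKETCH_TXD_GEN;
}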
Signed-off-by: Stephen Hemminger
---
v3 -- add back in small packet optimization

 lib/librte_pmd_vmxnet3/vmxnet3_ring.h |   1 +
 lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c | 155 +++++++++++++++++-----------------
 2 files changed, 79 insertions(+), 77 deletions(-)

diff --git a/lib/librte_pmd_vmxnet3/vmxnet3_ring.h b/lib/librte_pmd_vmxnet3/vmxnet3_ring.h
index ebe6268..612487e 100644
--- a/lib/librte_pmd_vmxnet3/vmxnet3_ring.h
+++ b/lib/librte_pmd_vmxnet3/vmxnet3_ring.h
@@ -125,6 +125,7 @@ struct vmxnet3_txq_stats {
     * the counters below track droppings due to
     * different reasons
     */
+   uint64_t    drop_too_many_segs;
    uint64_t    drop_tso;
    uint64_t    tx_ring_full;
 };
diff --git a/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c b/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c
index 38ac811..5d0f227 100644
--- a/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c
+++ b/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c
@@ -306,26 +306,24 @@ vmxnet3_tq_tx_complete(vmxnet3_tx_queue_t *txq)
        (comp_ring->base + comp_ring->next2proc);
 
    while (tcd->gen == comp_ring->gen) {
-       /* Release cmd_ring descriptor and free mbuf */
 #ifdef RTE_LIBRTE_VMXNET3_DEBUG_DRIVER
        VMXNET3_ASSERT(txq->cmd_ring.base[tcd->txdIdx].txd.eop == 1);
 #endif
-       mbuf = txq->cmd_ring.buf_info[tcd->txdIdx].m;
-       if (unlikely(mbuf == NULL))
-           rte_panic("EOP desc does not point to a valid mbuf");
-       else
-           rte_pktmbuf_free(mbuf);
+       while (txq->cmd_ring.next2comp != tcd->txdIdx) {
+           mbuf = txq->cmd_ring.buf_info[txq->cmd_ring.next2comp].m;
+           rte_pktmbuf_free_seg(mbuf);
+           txq->cmd_ring.buf_info[txq->cmd_ring.next2comp].m = NULL;
 
-       txq->cmd_ring.buf_info[tcd->txdIdx].m = NULL;
-       /* Mark the txd for which tcd was generated as completed */
-       vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
+           /* Mark the txd for which tcd was generated as completed */
+           vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
+           completed++;
+       }
 
        vmxnet3_comp_ring_adv_next2proc(comp_ring);
        tcd = (struct Vmxnet3_TxCompDesc *)(comp_ring->base +
                                            comp_ring->next2proc);
-       completed++;
    }
 
    PMD_TX_LOG(DEBUG, "Processed %d tx comps & command descs.", completed);
@@ -336,13 +334,8 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
          uint16_t nb_pkts)
 {
    uint16_t nb_tx;
-   Vmxnet3_TxDesc *txd = NULL;
-   vmxnet3_buf_info_t *tbi = NULL;
-   struct vmxnet3_hw *hw;
-   struct rte_mbuf *txm;
    vmxnet3_tx_queue_t *txq = tx_queue;
-
-   hw = txq->hw;
+   struct vmxnet3_hw *hw = txq->hw;
 
    if (unlikely(txq->stopped)) {
        PMD_TX_LOG(DEBUG, "Tx queue is stopped.");
@@ -354,75 +347,89 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 
    nb_tx = 0;
    while (nb_tx < nb_pkts) {
+       Vmxnet3_GenericDesc *gdesc;
+       vmxnet3_buf_info_t *tbi;
+       uint32_t first2fill, avail, dw2;
+       struct rte_mbuf *txm = tx_pkts[nb_tx];
+
+       /* Is this packet excessively fragmented, then drop */
+       if (unlikely(txm->nb_segs > VMXNET3_MAX_TXD_PER_PKT)) {
+           ++txq->stats.drop_too_many_segs;
+           ++txq->stats.drop_total;
+           rte_pktmbuf_free(txm);
+           ++nb_tx;
+           continue;
+       }
 
-       if (vmxnet3_cmd_ring_desc_avail(&txq->cmd_ring)) {
-           int copy_size = 0;
+       /* Is command ring full? */
+       avail = vmxnet3_cmd_ring_desc_avail(&txq->cmd_ring);
+       if (txm->nb_segs > avail) {
+           ++txq->stats.tx_ring_full;
+           break;
+       }
 
-           txm = tx_pkts[nb_tx];
-           /* Don't support scatter packets yet, free them if met */
-           if (txm->nb_segs != 1) {
-               PMD_TX_LOG(DEBUG, "Don't support scatter packets yet, drop!");
-               rte_pktmbuf_free(tx_pkts[nb_tx]);
-               txq->stats.drop_total++;
+       /* use the previous gen bit for the SOP desc */
+       dw2 = (txq->cmd_ring.gen ^ 0x1) << VMXNET3_TXD_GEN_SHIFT;
+       first2fill = txq->cmd_ring.next2fill;
 
-               nb_tx++;
-               continue;
-           }
+       /* special case for small packets */
+       if (txm->nb_segs == 1 && txm->pkt_len <= VMXNET3_HDR_COPY_SIZE) {
+           struct Vmxnet3_TxDataDesc *tdd
+               = txq->data_ring.base + first2fill;
 
-           txd = (Vmxnet3_TxDesc *)(txq->cmd_ring.base + txq->cmd_ring.next2fill);
-           if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
-               struct Vmxnet3_TxDataDesc *tdd;
+           rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *), txm->pkt_len);
 
-               tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
-               copy_size = rte_pktmbuf_pkt_len(txm);
-               rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *), copy_size);
-           }
+           tbi = txq->cmd_ring.buf_info + first2fill;
+           tbi->m = txm;
 
-           /* Fill the tx descriptor */
-           tbi = txq->cmd_ring.buf_info + txq->cmd_ring.next2fill;
-           tbi->bufPA = RTE_MBUF_DATA_DMA_ADDR(txm);
-           if (copy_size)
-               txd->addr = rte_cpu_to_le_64(txq->data_ring.basePA +
-                               txq->cmd_ring.next2fill *
-                               sizeof(struct Vmxnet3_TxDataDesc));
-           else
-               txd->addr = tbi->bufPA;
-           txd->len = txm->data_len;
+           gdesc = txq->cmd_ring.base + first2fill;
+           gdesc->txd.addr = rte_cpu_to_le_64(txq->data_ring.basePA +
+                   first2fill * sizeof(struct Vmxnet3_TxDataDesc));
+           gdesc->dword[2] = dw2 | txm->pkt_len;
+           gdesc->dword[3] = 0;
 
-           /* Mark the last descriptor as End of Packet. */
-           txd->cq = 1;
-           txd->eop = 1;
+           /* move to the next2fill descriptor */
+           vmxnet3_cmd_ring_adv_next2fill(&txq->cmd_ring);
+       } else {
+           struct rte_mbuf *m_seg = txm;
 
-           /* Add VLAN tag if requested */
-           if (txm->ol_flags & PKT_TX_VLAN_PKT) {
-               txd->ti = 1;
-               txd->tci = rte_cpu_to_le_16(txm->vlan_tci);
-           }
+           /* Multisegment and in/place transmit */
+           do {
+               /* Remember the transmit buffer for cleanup */
+               tbi = txq->cmd_ring.buf_info + txq->cmd_ring.next2fill;
+               tbi->m = m_seg;
 
-           /* Record current mbuf for freeing it later in tx complete */
-#ifdef RTE_LIBRTE_VMXNET3_DEBUG_DRIVER
-           VMXNET3_ASSERT(txm);
-#endif
-           tbi->m = txm;
+               gdesc = txq->cmd_ring.base + txq->cmd_ring.next2fill;
+               gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+               gdesc->dword[2] = dw2 | m_seg->data_len;
+               gdesc->dword[3] = 0;
 
-           /* Set the offloading mode to default */
-           txd->hlen = 0;
-           txd->om = VMXNET3_OM_NONE;
-           txd->msscof = 0;
+               /* move to the next2fill descriptor */
+               vmxnet3_cmd_ring_adv_next2fill(&txq->cmd_ring);
 
-           /* finally flip the GEN bit of the SOP desc */
-           txd->gen = txq->cmd_ring.gen;
-           txq->shared->ctrl.txNumDeferred++;
+               /* use the right gen for non-SOP desc */
+               dw2 = txq->cmd_ring.gen << VMXNET3_TXD_GEN_SHIFT;
+           } while ((m_seg = m_seg->next) != NULL);
+       }
 
-           /* move to the next2fill descriptor */
-           vmxnet3_cmd_ring_adv_next2fill(&txq->cmd_ring);
-           nb_tx++;
+       /* Update the EOP descriptor */
+       gdesc->dword[3] |= VMXNET3_TXD_EOP | VMXNET3_TXD_CQ;
 
-       } else {
-           PMD_TX_LOG(DEBUG, "No free tx cmd desc(s)");
-           txq->stats.drop_total += (nb_pkts - nb_tx);
-           break;
+       /* Add VLAN tag if present */
+       gdesc = txq->cmd_ring.base + first2fill;
+       if (txm->ol_flags & PKT_TX_VLAN_PKT) {
+           gdesc->txd.ti = 1;
+           gdesc->txd.tci = txm->vlan_tci;
        }
+
+       /* TODO: Add transmit checksum offload here */
+
+       /* flip the GEN bit on the SOP */
+       rte_compiler_barrier();
+       gdesc->dword[2] ^= VMXNET3_TXD_GEN;
+
+       txq->shared->ctrl.txNumDeferred++;
+       nb_tx++;
    }
 
    PMD_TX_LOG(DEBUG, "vmxnet3 txThreshold: %u",
           txq->shared->ctrl.txThreshold);
@@ -722,12 +729,6 @@ vmxnet3_dev_tx_queue_setup(struct rte_eth_dev *dev,
 
    PMD_INIT_FUNC_TRACE();
 
-   if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOMULTSEGS) !=
-       ETH_TXQ_FLAGS_NOMULTSEGS) {
-       PMD_INIT_LOG(ERR, "TX Multi segment not support yet");
-       return -EINVAL;
-   }
-
    if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS) !=
        ETH_TXQ_FLAGS_NOXSUMS) {
        PMD_INIT_LOG(ERR, "TX no support for checksum offload yet");
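With the ETH_TXQ_FLAGS_NOMULTSEGS restriction removed by the hunk above, an
application only has to keep ETH_TXQ_FLAGS_NOXSUMS set (checksum offload is
still not supported) when configuring a transmit queue. A rough usage sketch
against the ethdev API of this DPDK generation; the port, queue and ring-size
values are made up for the example:

#include <rte_ethdev.h>

/* Set up TX queue 0 so that multi-segment mbufs can be transmitted. */
static int
setup_multiseg_txq(uint8_t port_id)
{
	struct rte_eth_dev_info dev_info;
	struct rte_eth_txconf tx_conf;

	/* Start from the driver's default TX configuration... */
	rte_eth_dev_info_get(port_id, &dev_info);
	tx_conf = dev_info.default_txconf;

	/* ...but only insist on "no checksum offload"; multi-segment
	 * transmit no longer has to be disabled. */
	tx_conf.txq_flags = ETH_TXQ_FLAGS_NOXSUMS;

	return rte_eth_tx_queue_setup(port_id, 0, 512,
				      rte_eth_dev_socket_id(port_id),
				      &tx_conf);
}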