From patchwork Tue Oct 27 20:56:40 2015
X-Patchwork-Submitter: "Mike A. Polehn"
X-Patchwork-Id: 8109
From: "Polehn, Mike A"
To: "dev@dpdk.org"
Date: Tue, 27 Oct 2015 20:56:40 +0000
Message-ID: <745DB4B8861F8E4B9849C970520ABBF14974C1F4@ORSMSX102.amr.corp.intel.com>
Subject:
[dpdk-dev] [Patch 2/2] i40e rx Bulk Alloc: Larger list size (33 to 128) throughput optimization

Added a check that the bulk-allocation count is at least 2, which removes
the extra overhead the prefetch code otherwise needs to handle the case of
only one packet being allocated into the queue at a time.

Used standard-width variable types to avoid the overhead of non-native
variable sizes.

Added a second-level prefetch to bring the packet address into L0 cache
earlier, and hoisted the calculation of the prefetch-loop end point out of
the loop.

Applied classic C optimization techniques: walking pointers instead of
indexing arrays, and narrowing the scope of some variables to improve the
chance of them living in registers instead of on the stack.

Signed-off-by: Mike A. Polehn

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index ec62f75..2032e06 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -64,6 +64,7 @@
 #define DEFAULT_TX_FREE_THRESH 32
 #define I40E_MAX_PKT_TYPE 256
 #define I40E_RX_INPUT_BUF_MAX 256
+#define I40E_RX_FREE_THRESH_MIN 2

 #define I40E_TX_MAX_BURST 32
@@ -942,6 +943,12 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct i40e_rx_queue *rxq)
 			     "rxq->rx_free_thresh=%d",
 			     rxq->nb_rx_desc, rxq->rx_free_thresh);
 		ret = -EINVAL;
+	} else if (rxq->rx_free_thresh < I40E_RX_FREE_THRESH_MIN) {
+		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
+			     "rxq->rx_free_thresh=%d, "
+			     "I40E_RX_FREE_THRESH_MIN=%d",
+			     rxq->rx_free_thresh, I40E_RX_FREE_THRESH_MIN);
+		ret = -EINVAL;
 	} else if (!(rxq->nb_rx_desc < (I40E_MAX_RING_DESC -
 				RTE_PMD_I40E_RX_MAX_BURST))) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
@@ -1058,9 +1065,8 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
 {
 	volatile union i40e_rx_desc *rxdp;
 	struct i40e_rx_entry *rxep;
-	struct rte_mbuf *mb;
-	unsigned alloc_idx, i;
-	uint64_t dma_addr;
+	struct rte_mbuf *pk, *npk;
+	unsigned alloc_idx, i, l;
 	int diag;

 	/* Allocate buffers in bulk */
@@ -1076,22 +1082,36 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
 		return -ENOMEM;
 	}

+	pk = rxep->mbuf;
+	rte_prefetch0(pk);
+	rxep++;
+	npk = rxep->mbuf;
+	rte_prefetch0(npk);
+	rxep++;
+	l = rxq->rx_free_thresh - 2;
+
 	rxdp = &rxq->rx_ring[alloc_idx];
 	for (i = 0; i < rxq->rx_free_thresh; i++) {
-		if (likely(i < (rxq->rx_free_thresh - 1)))
+		struct rte_mbuf *mb = pk;
+		pk = npk;
+		if (likely(i < l)) {
 			/* Prefetch next mbuf */
-			rte_prefetch0(rxep[i + 1].mbuf);
-
-		mb = rxep[i].mbuf;
-		rte_mbuf_refcnt_set(mb, 1);
-		mb->next = NULL;
+			npk = rxep->mbuf;
+			rte_prefetch0(npk);
+			rxep++;
+		}
 		mb->data_off = RTE_PKTMBUF_HEADROOM;
+		rte_mbuf_refcnt_set(mb, 1);
 		mb->nb_segs = 1;
 		mb->port = rxq->port_id;
-		dma_addr = rte_cpu_to_le_64(\
-			RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
-		rxdp[i].read.hdr_addr = 0;
-		rxdp[i].read.pkt_addr = dma_addr;
+		mb->next = NULL;
+		{
+			uint64_t dma_addr = rte_cpu_to_le_64(
+				RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
+			rxdp->read.hdr_addr = dma_addr;
+			rxdp->read.pkt_addr = dma_addr;
+		}
+		rxdp++;
 	}

 	rxq->rx_last_pos = alloc_idx + rxq->rx_free_thresh - 1;
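
The loop transformation above can be sketched in isolation. This is a minimal, self-contained illustration of the same pattern — two prefetches kept in flight ahead of the buffer being initialized, a pointer walk instead of array indexing, and the prefetch-loop end point (`l = n - 2`) hoisted out of the loop. The names `init_bufs` and `prefetch0` are hypothetical stand-ins, not DPDK APIs; `prefetch0` mirrors what `rte_prefetch0` does on GCC-compatible compilers.

```c
/* Portable stand-ins for the DPDK helpers used in the patch. */
#if defined(__GNUC__)
#define prefetch0(p) __builtin_prefetch((p), 0, 3)
#define likely(x)    __builtin_expect(!!(x), 1)
#else
#define prefetch0(p) ((void)(p))
#define likely(x)    (x)
#endif

/* Initialize a batch of n buffers (n must be >= 2, matching the new
 * I40E_RX_FREE_THRESH_MIN precondition), keeping two prefetches in
 * flight: 'pk' is this iteration's buffer, 'npk' is the next one. */
static void init_bufs(int **bufs, unsigned n, int value)
{
	int **p = bufs;        /* pointer walk instead of bufs[i] indexing */
	int *pk, *npk;
	unsigned i;
	unsigned l = n - 2;    /* last index that still issues a prefetch,
	                        * computed once, outside the loop */

	pk = *p++;             /* prime the two-deep prefetch pipeline */
	prefetch0(pk);
	npk = *p++;
	prefetch0(npk);

	for (i = 0; i < n; i++) {
		int *mb = pk;
		pk = npk;
		if (likely(i < l)) {   /* more buffers left to prefetch */
			npk = *p++;
			prefetch0(npk);
		}
		*mb = value;           /* stands in for the mbuf field setup */
	}
}
```

With `n = 4`, the pipeline primes on the first two buffers, the loop prefetches the remaining two on iterations 0 and 1, and all four buffers get initialized.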