From patchwork Fri Oct 2 11:16:50 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rahul Lakkireddy
X-Patchwork-Id: 7361
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [IPv6:::1])
 by dpdk.org (Postfix) with ESMTP id ACA548E70;
 Fri, 2 Oct 2015 13:17:32 +0200 (CEST)
Received: from stargate3.asicdesigners.com (stargate.chelsio.com [67.207.112.58])
 by dpdk.org (Postfix) with ESMTP id 6511C8E6E;
 Fri, 2 Oct 2015 13:17:31 +0200 (CEST)
Received: from localhost (scalar.blr.asicdesigners.com [10.193.185.94])
 by stargate3.asicdesigners.com (8.13.8/8.13.8) with ESMTP id t92BHS3r005241;
 Fri, 2 Oct 2015 04:17:29 -0700
From: Rahul Lakkireddy
To: dev@dpdk.org
Date: Fri, 2 Oct 2015 16:46:50 +0530
Message-Id: <318fc8559675b1157e7f049a6a955a6a2059bac7.1443704150.git.rahul.lakkireddy@chelsio.com>
X-Mailer: git-send-email 2.5.3
Cc: Kumar Sanghvi, Felix Marti, Nirranjan Kirubaharan
Subject: [dpdk-dev] [PATCH 1/6] cxgbe: Optimize forwarding performance for 40G
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Update sge initialization with respect to free-list manager configuration
and ingress arbiter. Also update refill logic to refill mbufs only after
a certain threshold for rx. Optimize tx packet prefetch and free.

Approx. 4 MPPS improvement seen in forwarding performance after the
optimization.

Signed-off-by: Rahul Lakkireddy
Signed-off-by: Kumar Sanghvi
---
 drivers/net/cxgbe/base/t4_regs.h | 16 ++++++++++++++++
 drivers/net/cxgbe/cxgbe_main.c   |  7 +++++++
 drivers/net/cxgbe/sge.c          | 17 ++++++++++++-----
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/net/cxgbe/base/t4_regs.h b/drivers/net/cxgbe/base/t4_regs.h
index cd28b59..9057e40 100644
--- a/drivers/net/cxgbe/base/t4_regs.h
+++ b/drivers/net/cxgbe/base/t4_regs.h
@@ -266,6 +266,18 @@
 #define A_SGE_FL_BUFFER_SIZE2 0x104c
 #define A_SGE_FL_BUFFER_SIZE3 0x1050
 
+#define A_SGE_FLM_CFG 0x1090
+
+#define S_CREDITCNT 4
+#define M_CREDITCNT 0x3U
+#define V_CREDITCNT(x) ((x) << S_CREDITCNT)
+#define G_CREDITCNT(x) (((x) >> S_CREDITCNT) & M_CREDITCNT)
+
+#define S_CREDITCNTPACKING 2
+#define M_CREDITCNTPACKING 0x3U
+#define V_CREDITCNTPACKING(x) ((x) << S_CREDITCNTPACKING)
+#define G_CREDITCNTPACKING(x) (((x) >> S_CREDITCNTPACKING) & M_CREDITCNTPACKING)
+
 #define A_SGE_CONM_CTRL 0x1094
 
 #define S_EGRTHRESHOLD 8
@@ -361,6 +373,10 @@
 
 #define A_SGE_CONTROL2 0x1124
 
+#define S_IDMAARBROUNDROBIN 19
+#define V_IDMAARBROUNDROBIN(x) ((x) << S_IDMAARBROUNDROBIN)
+#define F_IDMAARBROUNDROBIN V_IDMAARBROUNDROBIN(1U)
+
 #define S_INGPACKBOUNDARY 16
 #define M_INGPACKBOUNDARY 0x7U
 #define V_INGPACKBOUNDARY(x) ((x) << S_INGPACKBOUNDARY)
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index 3755444..316b87d 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -422,6 +422,13 @@ static int adap_init0_tweaks(struct adapter *adapter)
 	t4_set_reg_field(adapter, A_SGE_CONTROL, V_PKTSHIFT(M_PKTSHIFT),
 			 V_PKTSHIFT(rx_dma_offset));
 
+	t4_set_reg_field(adapter, A_SGE_FLM_CFG,
+			 V_CREDITCNT(M_CREDITCNT) | M_CREDITCNTPACKING,
+			 V_CREDITCNT(3) | V_CREDITCNTPACKING(1));
+
+	t4_set_reg_field(adapter, A_SGE_CONTROL2, V_IDMAARBROUNDROBIN(1U),
+			 V_IDMAARBROUNDROBIN(1U));
+
 	/*
 	 * Don't include the "IP Pseudo Header" in CPL_RX_PKT checksums: Linux
 	 * adds the pseudo header itself.
diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 6eb1244..e540881 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -286,8 +286,7 @@ static void unmap_rx_buf(struct sge_fl *q)
 
 static inline void ring_fl_db(struct adapter *adap, struct sge_fl *q)
 {
-	/* see if we have exceeded q->size / 4 */
-	if (q->pend_cred >= (q->size / 4)) {
+	if (q->pend_cred >= 64) {
 		u32 val = adap->params.arch.sge_fl_db;
 
 		if (is_t4(adap->params.chip))
@@ -995,7 +994,14 @@ static inline int tx_do_packet_coalesce(struct sge_eth_txq *txq,
 			int i;
 
 			for (i = 0; i < sd->coalesce.idx; i++) {
-				rte_pktmbuf_free(sd->coalesce.mbuf[i]);
+				struct rte_mbuf *tmp = sd->coalesce.mbuf[i];
+
+				do {
+					struct rte_mbuf *next = tmp->next;
+
+					rte_pktmbuf_free_seg(tmp);
+					tmp = next;
+				} while (tmp);
 				sd->coalesce.mbuf[i] = NULL;
 			}
 		}
@@ -1054,7 +1060,6 @@ out_free:
 		return 0;
 	}
 
-	rte_prefetch0(&((&txq->q)->sdesc->mbuf->pool));
 	pi = (struct port_info *)txq->eth_dev->data->dev_private;
 	adap = pi->adapter;
 
@@ -1070,6 +1075,7 @@ out_free:
 			txq->stats.mapping_err++;
 			goto out_free;
 		}
+		rte_prefetch0((volatile void *)addr);
 		return tx_do_packet_coalesce(txq, mbuf, cflits, adap,
 					     pi, addr);
 	} else {
@@ -1454,7 +1460,8 @@ static int process_responses(struct sge_rspq *q, int budget,
 		unsigned int params;
 		u32 val;
 
-		__refill_fl(q->adapter, &rxq->fl);
+		if (fl_cap(&rxq->fl) - rxq->fl.avail >= 64)
+			__refill_fl(q->adapter, &rxq->fl);
 		params = V_QINTR_TIMER_IDX(X_TIMERREG_UPDATE_CIDX);
 		q->next_intr_params = params;
 		val = V_CIDXINC(cidx_inc) | V_SEINTARM(params);
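
For reviewers following the A_SGE_FLM_CFG update in adap_init0_tweaks(), here is a minimal host-side sketch of how the S_/M_/V_/G_ field macros combine with a read-modify-write. The macros are copied from the t4_regs.h hunk. model_set_reg_field() is a hypothetical stand-in that works on a plain integer; it assumes t4_set_reg_field() clears the bits in its mask argument and then ORs in the value, which this patch does not show. The three flm_cfg_*() helpers are illustrative names only and are not part of the driver.

```c
#include <stdint.h>

/* Field macros copied from the t4_regs.h hunk of this patch. */
#define S_CREDITCNT 4
#define M_CREDITCNT 0x3U
#define V_CREDITCNT(x) ((x) << S_CREDITCNT)
#define G_CREDITCNT(x) (((x) >> S_CREDITCNT) & M_CREDITCNT)

#define S_CREDITCNTPACKING 2
#define M_CREDITCNTPACKING 0x3U
#define V_CREDITCNTPACKING(x) ((x) << S_CREDITCNTPACKING)
#define G_CREDITCNTPACKING(x) (((x) >> S_CREDITCNTPACKING) & M_CREDITCNTPACKING)

/*
 * Hypothetical model of t4_set_reg_field(): ASSUMED to clear the bits
 * selected by mask and then OR in val. It operates on a plain value,
 * not on device registers.
 */
static inline uint32_t model_set_reg_field(uint32_t reg, uint32_t mask,
					   uint32_t val)
{
	return (reg & ~mask) | val;
}

/* Mask expression exactly as written in adap_init0_tweaks() in this patch. */
static inline uint32_t flm_cfg_mask_as_posted(void)
{
	return V_CREDITCNT(M_CREDITCNT) | M_CREDITCNTPACKING;
}

/* Mask with both field masks shifted to their bit positions, for comparison. */
static inline uint32_t flm_cfg_mask_shifted(void)
{
	return V_CREDITCNT(M_CREDITCNT) |
	       V_CREDITCNTPACKING(M_CREDITCNTPACKING);
}

/* Value written by the patch: CREDITCNT = 3, CREDITCNTPACKING = 1. */
static inline uint32_t flm_cfg_val(void)
{
	return V_CREDITCNT(3) | V_CREDITCNTPACKING(1);
}
```

Under these assumptions, model_set_reg_field(0, flm_cfg_mask_as_posted(), flm_cfg_val()) yields a value from which G_CREDITCNT() extracts 3 and G_CREDITCNTPACKING() extracts 1. The sketch also shows that the mask as posted evaluates to 0x33, while shifting both field masks gives 0x3c. The two masks produce different results only when bits 0-3 of the register are already nonzero, and the patch does not say which behavior is intended.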