From patchwork Fri Jun 24 13:17:55 2016
X-Patchwork-Submitter: Nélio Laranjeiro
X-Patchwork-Id: 14356
X-Patchwork-Delegate: bruce.richardson@intel.com
From: Nelio Laranjeiro
To: dev@dpdk.org
Cc: Bruce Richardson, Ferruh Yigit, Adrien Mazarguil, Vasily Philipov
Date: Fri, 24 Jun 2016 15:17:55 +0200
Message-Id: <1466774284-20932-17-git-send-email-nelio.laranjeiro@6wind.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1466774284-20932-1-git-send-email-nelio.laranjeiro@6wind.com>
References: <1466758261-25986-1-git-send-email-nelio.laranjeiro@6wind.com>
 <1466774284-20932-1-git-send-email-nelio.laranjeiro@6wind.com>
Subject: [dpdk-dev] [PATCH v7 16/25] mlx5: replace countdown with threshold for Tx completions

From: Adrien Mazarguil

Replacing the variable countdown (which depends on the number of descriptors)
with a fixed relative threshold known at compile time improves performance by
reducing the TX queue structure footprint and the amount of code to manage
completions during a burst. Completions are now requested at most once per
burst after the threshold is reached.
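For illustration only (not part of the patch), the following standalone sketch
shows the burst-level accounting this change introduces: a counter of
descriptors sent since the last completion request is carried across bursts,
and a completion is requested on the last WQE of a burst only once the
threshold is crossed. All names below (struct tx_queue, struct fake_wqe,
burst_update_completion, BURST_COMP_THRESH) are made up for the example and do
not exist in the mlx5 PMD.

/* Standalone sketch, assuming a 32-descriptor threshold as in the patch. */
#include <stdint.h>
#include <stdio.h>

#define BURST_COMP_THRESH 32 /* stand-in for MLX5_TX_COMP_THRESH */

struct fake_wqe {
	uint32_t ctrl_flags; /* placeholder for the WQE completion-request bit */
};

struct tx_queue {
	unsigned int comp_pending; /* descriptors since last completion request */
};

/* Called once per burst after 'sent' descriptors have been posted. */
static void
burst_update_completion(struct tx_queue *txq, unsigned int sent,
			struct fake_wqe *last_wqe)
{
	unsigned int comp = txq->comp_pending + sent;

	if (comp >= BURST_COMP_THRESH) {
		/* Request a completion on the last WQE of this burst only. */
		last_wqe->ctrl_flags |= 0x8;
		txq->comp_pending = 0;
	} else {
		/* Defer the request and carry the count to the next burst. */
		txq->comp_pending = comp;
	}
}

int main(void)
{
	struct tx_queue txq = { .comp_pending = 0 };
	unsigned int bursts[] = { 12, 12, 12 }; /* 36 descriptors in total */

	for (unsigned int i = 0; i < 3; i++) {
		struct fake_wqe last_wqe = { .ctrl_flags = 0 };

		burst_update_completion(&txq, bursts[i], &last_wqe);
		printf("burst %u: completion %s\n", i,
		       last_wqe.ctrl_flags ? "requested" : "deferred");
	}
	return 0;
}

With this 32-descriptor threshold and three bursts of 12 packets, only the
third burst requests a completion: at most one request per burst, and none at
all while fewer than the threshold's worth of descriptors are outstanding
since the previous request.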
Signed-off-by: Adrien Mazarguil
Signed-off-by: Nelio Laranjeiro
Signed-off-by: Vasily Philipov
---
 drivers/net/mlx5/mlx5_defs.h |  7 +++++--
 drivers/net/mlx5/mlx5_rxtx.c | 44 +++++++++++++++++++++++++-------------------
 drivers/net/mlx5/mlx5_rxtx.h |  5 ++---
 drivers/net/mlx5/mlx5_txq.c  | 21 ++++++++++++---------
 4 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 8d2ec7a..cc2a6f3 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -48,8 +48,11 @@
 /* Maximum number of special flows. */
 #define MLX5_MAX_SPECIAL_FLOWS 4
 
-/* Request send completion once in every 64 sends, might be less. */
-#define MLX5_PMD_TX_PER_COMP_REQ 64
+/*
+ * Request TX completion every time descriptors reach this threshold since
+ * the previous request. Must be a power of two for performance reasons.
+ */
+#define MLX5_TX_COMP_THRESH 32
 
 /* RSS Indirection table size. */
 #define RSS_INDIRECTION_TABLE_SIZE 256
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 43236f5..9d992c3 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -156,9 +156,6 @@ check_cqe64(volatile struct mlx5_cqe64 *cqe,
  * Manage TX completions.
  *
  * When sending a burst, mlx5_tx_burst() posts several WRs.
- * To improve performance, a completion event is only required once every
- * MLX5_PMD_TX_PER_COMP_REQ sends. Doing so discards completion information
- * for other WRs, but this information would not be used anyway.
  *
  * @param txq
  *   Pointer to TX queue structure.
@@ -172,14 +169,16 @@ txq_complete(struct txq *txq)
 	uint16_t elts_free = txq->elts_tail;
 	uint16_t elts_tail;
 	uint16_t cq_ci = txq->cq_ci;
-	unsigned int wqe_ci = (unsigned int)-1;
+	volatile struct mlx5_cqe64 *cqe = NULL;
+	volatile union mlx5_wqe *wqe;
 
 	do {
-		unsigned int idx = cq_ci & cqe_cnt;
-		volatile struct mlx5_cqe64 *cqe = &(*txq->cqes)[idx].cqe64;
+		volatile struct mlx5_cqe64 *tmp;
 
-		if (check_cqe64(cqe, cqe_n, cq_ci) == 1)
+		tmp = &(*txq->cqes)[cq_ci & cqe_cnt].cqe64;
+		if (check_cqe64(tmp, cqe_n, cq_ci))
 			break;
+		cqe = tmp;
 #ifndef NDEBUG
 		if (MLX5_CQE_FORMAT(cqe->op_own) == MLX5_COMPRESSED) {
 			if (!check_cqe64_seen(cqe))
@@ -193,14 +192,15 @@ txq_complete(struct txq *txq)
 			return;
 		}
 #endif /* NDEBUG */
-		wqe_ci = ntohs(cqe->wqe_counter);
 		++cq_ci;
 	} while (1);
-	if (unlikely(wqe_ci == (unsigned int)-1))
+	if (unlikely(cqe == NULL))
 		return;
+	wqe = &(*txq->wqes)[htons(cqe->wqe_counter) & (txq->wqe_n - 1)];
+	elts_tail = wqe->wqe.ctrl.data[3];
+	assert(elts_tail < txq->wqe_n);
 	/* Free buffers. */
-	elts_tail = (wqe_ci + 1) & (elts_n - 1);
-	do {
+	while (elts_free != elts_tail) {
 		struct rte_mbuf *elt = (*txq->elts)[elts_free];
 		unsigned int elts_free_next =
 			(elts_free + 1) & (elts_n - 1);
@@ -216,7 +216,7 @@ txq_complete(struct txq *txq)
 		/* Only one segment needs to be freed. */
 		rte_pktmbuf_free_seg(elt);
 		elts_free = elts_free_next;
-	} while (elts_free != elts_tail);
+	}
 	txq->cq_ci = cq_ci;
 	txq->elts_tail = elts_tail;
 	/* Update the consumer index. */
@@ -437,6 +437,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 	const unsigned int elts_n = txq->elts_n;
 	unsigned int i;
 	unsigned int max;
+	unsigned int comp;
 	volatile union mlx5_wqe *wqe;
 	struct rte_mbuf *buf;
 
@@ -486,13 +487,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 					       buf->vlan_tci);
 		else
 			mlx5_wqe_write(txq, wqe, addr, length, lkey);
-		/* Request completion if needed. */
-		if (unlikely(--txq->elts_comp == 0)) {
-			wqe->wqe.ctrl.data[2] = htonl(8);
-			txq->elts_comp = txq->elts_comp_cd_init;
-		} else {
-			wqe->wqe.ctrl.data[2] = 0;
-		}
+		wqe->wqe.ctrl.data[2] = 0;
 		/* Should we enable HW CKSUM offload */
 		if (buf->ol_flags &
 		    (PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
@@ -512,6 +507,17 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 	/* Take a shortcut if nothing must be sent. */
 	if (unlikely(i == 0))
 		return 0;
+	/* Check whether completion threshold has been reached. */
+	comp = txq->elts_comp + i;
+	if (comp >= MLX5_TX_COMP_THRESH) {
+		/* Request completion on last WQE. */
+		wqe->wqe.ctrl.data[2] = htonl(8);
+		/* Save elts_head in unused "immediate" field of WQE. */
+		wqe->wqe.ctrl.data[3] = elts_head;
+		txq->elts_comp = 0;
+	} else {
+		txq->elts_comp = comp;
+	}
 #ifdef MLX5_PMD_SOFT_COUNTERS
 	/* Increment sent packets counter. */
 	txq->stats.opackets += i;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 77b0fde..f900e65 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -238,8 +238,7 @@ struct hash_rxq {
 struct txq {
 	uint16_t elts_head; /* Current index in (*elts)[]. */
 	uint16_t elts_tail; /* First element awaiting completion. */
-	uint16_t elts_comp_cd_init; /* Initial value for countdown. */
-	uint16_t elts_comp; /* Elements before asking a completion. */
+	uint16_t elts_comp; /* Counter since last completion request. */
 	uint16_t elts_n; /* (*elts)[] length. */
 	uint16_t cq_ci; /* Consumer index for completion queue. */
 	uint16_t cqe_n; /* Number of CQ elements. */
@@ -247,6 +246,7 @@ struct txq {
 	uint16_t wqe_n; /* Number of WQ elements. */
 	uint16_t bf_offset; /* Blueflame offset. */
 	uint16_t bf_buf_size; /* Blueflame size. */
+	uint32_t qp_num_8s; /* QP number shifted by 8. */
 	volatile struct mlx5_cqe (*cqes)[]; /* Completion queue. */
 	volatile union mlx5_wqe (*wqes)[]; /* Work queue. */
 	volatile uint32_t *qp_db; /* Work queue doorbell. */
@@ -259,7 +259,6 @@ struct txq {
 	} mp2mr[MLX5_PMD_TX_MP_CACHE]; /* MP to MR translation table. */
 	struct rte_mbuf *(*elts)[]; /* TX elements. */
 	struct mlx5_txq_stats stats; /* TX queue counters. */
-	uint32_t qp_num_8s; /* QP number shifted by 8. */
 } __rte_cache_aligned;
 
 /* TX queue control descriptor. */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 22e9bae..7b2dc7c 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -89,6 +89,7 @@ txq_alloc_elts(struct txq_ctrl *txq_ctrl, unsigned int elts_n)
 	DEBUG("%p: allocated and configured %u WRs", (void *)txq_ctrl, elts_n);
 	txq_ctrl->txq.elts_head = 0;
 	txq_ctrl->txq.elts_tail = 0;
+	txq_ctrl->txq.elts_comp = 0;
 }
 
 /**
@@ -108,6 +109,7 @@ txq_free_elts(struct txq_ctrl *txq_ctrl)
 	DEBUG("%p: freeing WRs", (void *)txq_ctrl);
 	txq_ctrl->txq.elts_head = 0;
 	txq_ctrl->txq.elts_tail = 0;
+	txq_ctrl->txq.elts_comp = 0;
 
 	while (elts_tail != elts_head) {
 		struct rte_mbuf *elt = (*elts)[elts_tail];
@@ -274,15 +276,8 @@ txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl *txq_ctrl,
 		goto error;
 	}
 	(void)conf; /* Thresholds configuration (ignored). */
+	assert(desc > MLX5_TX_COMP_THRESH);
 	tmpl.txq.elts_n = desc;
-	/*
-	 * Request send completion every MLX5_PMD_TX_PER_COMP_REQ packets or
-	 * at least 4 times per ring.
-	 */
-	tmpl.txq.elts_comp_cd_init =
-		((MLX5_PMD_TX_PER_COMP_REQ < (desc / 4)) ?
-		 MLX5_PMD_TX_PER_COMP_REQ : (desc / 4));
-	tmpl.txq.elts_comp = tmpl.txq.elts_comp_cd_init;
 	/* MRs will be registered in mp2mr[] later. */
 	attr.rd = (struct ibv_exp_res_domain_init_attr){
 		.comp_mask = (IBV_EXP_RES_DOMAIN_THREAD_MODEL |
@@ -302,7 +297,8 @@ txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl *txq_ctrl,
 		.res_domain = tmpl.rd,
 	};
 	tmpl.cq = ibv_exp_create_cq(priv->ctx,
-				    (desc / tmpl.txq.elts_comp_cd_init) - 1,
+				    (((desc / MLX5_TX_COMP_THRESH) - 1) ?
+				     ((desc / MLX5_TX_COMP_THRESH) - 1) : 1),
 				    NULL, NULL, 0, &attr.cq);
 	if (tmpl.cq == NULL) {
 		ret = ENOMEM;
@@ -454,6 +450,13 @@ mlx5_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 		return -E_RTE_SECONDARY;
 
 	priv_lock(priv);
+	if (desc <= MLX5_TX_COMP_THRESH) {
+		WARN("%p: number of descriptors requested for TX queue %u"
+		     " must be higher than MLX5_TX_COMP_THRESH, using"
+		     " %u instead of %u",
+		     (void *)dev, idx, MLX5_TX_COMP_THRESH + 1, desc);
+		desc = MLX5_TX_COMP_THRESH + 1;
+	}
 	if (!rte_is_power_of_2(desc)) {
 		desc = 1 << log2above(desc);
 		WARN("%p: increased number of descriptors in TX queue %u"