From patchwork Tue Sep 20 08:53:50 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?N=C3=A9lio_Laranjeiro?= X-Patchwork-Id: 15944 X-Patchwork-Delegate: bruce.richardson@intel.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id 4CC8C5922; Tue, 20 Sep 2016 10:54:27 +0200 (CEST) Received: from mail-wm0-f48.google.com (mail-wm0-f48.google.com [74.125.82.48]) by dpdk.org (Postfix) with ESMTP id 24D4658F7 for ; Tue, 20 Sep 2016 10:54:16 +0200 (CEST) Received: by mail-wm0-f48.google.com with SMTP id a25so9486866wmi.1 for ; Tue, 20 Sep 2016 01:54:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ITjj93VYyVft9Xank3nuYZ7s9O7ukwDBwi3ywoc4Hyg=; b=pBXUCxhqFv4ezm4Q6K+GbS94qRmskzuhEDs5zF5rPbPy3bXK+q/gGayPI1isTR2cVQ ig2g3YzKA40N0RrOz6Ci82JTbMTTENnF2BAyK7MD1sEnagyK61plURizVKhPcYTxkh0+ FxDO47W/hbGldfU6o43Okb+dOA1Oo9ycaynKUpd9TMdHVFdQg9MqnTESKa2Vl0YKGIva 1FVLRMBQjtLclOostevsEX2YVDYulKMtbk7jLKT15lG7+1UwjoCuFaWw7fAVr8//kqvp MCJoEUygRD9sgB/vj34ewxyTXiw2PnIBEaHx8A7XoonhUS2FBoTCl+ckl0w1SSjz7VZH DePQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ITjj93VYyVft9Xank3nuYZ7s9O7ukwDBwi3ywoc4Hyg=; b=jUT5aHV8cbMfoRqYWykP8teUVA5Nss2Ln/sXfhni+C2dq4UeJLbeVtwniGABvnYcC3 /pK5GYxCblVKoa4QAzlSI51Gok+t112GX+iXwdW1Rrgj348JYUC6ipueE9LGeU+9cjIM 2focMTy0JTJbDSa/V/tekjFpQ6qi1+lF2RigxfrjctbKO5XGOcIA7HzV1Nte2yMFQjPL ISs154Om4pwfKp7li7S5YpiAfa0a5PEdZ+04dHetfdOg5YCrbqUXNxBkjD2yql7EoUqP ZVA0v967G0mAVtPM632d4YjewWVqHLmFbHb6z6N7i0//8WqI3sqF5AjTE9hLKi5OLM86 YaRQ== X-Gm-Message-State: AE9vXwO3davmGNwneEw2unJvblG83wUqQWfwJn7aO0bYjkHO7+kElwXzXFq12jVzxe62YcrV X-Received: by 10.194.8.226 with SMTP id u2mr30853741wja.153.1474361655552; Tue, 20 Sep 2016 01:54:15 -0700 (PDT) Received: from ping.vm.6wind.com (guy78-3-82-239-227-177.fbx.proxad.net. [82.239.227.177]) by smtp.gmail.com with ESMTPSA id p13sm26252793wmd.1.2016.09.20.01.54.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 20 Sep 2016 01:54:14 -0700 (PDT) From: Nelio Laranjeiro To: dev@dpdk.org Cc: Adrien Mazarguil , Bruce Richardson Date: Tue, 20 Sep 2016 10:53:50 +0200 Message-Id: <0e58bd2798d879d106ef6e486083b0e5c764de8e.1474360134.git.nelio.laranjeiro@6wind.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 5/6] net/mlx5: reduce memory overhead for WQE handling X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" PMD uses only power of two number of Work Queue Elements (aka WQE), storing the number of elements in log2 helps to reduce the size of the container to store it. Signed-off-by: Nelio Laranjeiro --- drivers/net/mlx5/mlx5_rxtx.c | 23 ++++++++++++----------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- drivers/net/mlx5/mlx5_txq.c | 4 ++-- 3 files changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index 214922b..9d00ddc 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -199,9 +199,10 @@ txq_complete(struct txq *txq) } while (1); if (unlikely(cqe == NULL)) return; - wqe = &(*txq->wqes)[htons(cqe->wqe_counter) & (txq->wqe_n - 1)].hdr; + wqe = &(*txq->wqes)[htons(cqe->wqe_counter) & + ((1 << txq->wqe_n) - 1)].hdr; elts_tail = wqe->ctrl[3]; - assert(elts_tail < txq->wqe_n); + assert(elts_tail < (1 << txq->wqe_n)); /* Free buffers. */ while (elts_free != elts_tail) { struct rte_mbuf *elt = (*txq->elts)[elts_free]; @@ -335,7 +336,7 @@ mlx5_wqe_write(struct txq *txq, volatile struct mlx5_wqe *wqe, } /* Inline if enough room. */ if (txq->max_inline != 0) { - uintptr_t end = (uintptr_t)&(*txq->wqes)[txq->wqe_n]; + uintptr_t end = (uintptr_t)&(*txq->wqes)[1 << txq->wqe_n]; uint16_t max_inline = txq->max_inline * RTE_CACHE_LINE_SIZE; uint16_t room; @@ -446,7 +447,7 @@ tx_prefetch_wqe(struct txq *txq, uint16_t ci) { volatile struct mlx5_wqe64 *wqe; - wqe = &(*txq->wqes)[ci & (txq->wqe_n - 1)]; + wqe = &(*txq->wqes)[ci & ((1 << txq->wqe_n) - 1)]; rte_prefetch0(wqe); } @@ -504,7 +505,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n) max -= segs_n; --pkts_n; elts_head_next = (elts_head + 1) & (elts_n - 1); - wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)].hdr; + wqe = &(*txq->wqes)[txq->wqe_ci & ((1 << txq->wqe_n) - 1)].hdr; tx_prefetch_wqe(txq, txq->wqe_ci); tx_prefetch_wqe(txq, txq->wqe_ci + 1); if (pkts_n) @@ -540,7 +541,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n) if (!(ds % (MLX5_WQE_SIZE / MLX5_WQE_DWORD_SIZE))) dseg = (volatile void *) &(*txq->wqes)[txq->wqe_ci++ & - (txq->wqe_n - 1)]; + ((1 << txq->wqe_n) - 1)]; else ++dseg; ++ds; @@ -607,10 +608,10 @@ skip_segs: static inline void mlx5_mpw_new(struct txq *txq, struct mlx5_mpw *mpw, uint32_t length) { - uint16_t idx = txq->wqe_ci & (txq->wqe_n - 1); + uint16_t idx = txq->wqe_ci & ((1 << txq->wqe_n) - 1); volatile struct mlx5_wqe_data_seg (*dseg)[MLX5_MPW_DSEG_MAX] = (volatile struct mlx5_wqe_data_seg (*)[]) - (uintptr_t)&(*txq->wqes)[(idx + 1) & (txq->wqe_n - 1)]; + (uintptr_t)&(*txq->wqes)[(idx + 1) & ((1 << txq->wqe_n) - 1)]; mpw->state = MLX5_MPW_STATE_OPENED; mpw->pkts_n = 0; @@ -815,7 +816,7 @@ mlx5_tx_burst_mpw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n) static inline void mlx5_mpw_inline_new(struct txq *txq, struct mlx5_mpw *mpw, uint32_t length) { - uint16_t idx = txq->wqe_ci & (txq->wqe_n - 1); + uint16_t idx = txq->wqe_ci & ((1 << txq->wqe_n) - 1); struct mlx5_wqe_inl_small *inl; mpw->state = MLX5_MPW_INL_STATE_OPENED; @@ -1000,7 +1001,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts, addr = rte_pktmbuf_mtod(buf, uintptr_t); (*txq->elts)[elts_head] = buf; /* Maximum number of bytes before wrapping. */ - max = ((uintptr_t)&(*txq->wqes)[txq->wqe_n] - + max = ((uintptr_t)&(*txq->wqes)[1 << txq->wqe_n] - (uintptr_t)mpw.data.raw); if (length > max) { rte_memcpy((void *)(uintptr_t)mpw.data.raw, @@ -1019,7 +1020,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts, mpw.data.raw += length; } if ((uintptr_t)mpw.data.raw == - (uintptr_t)&(*txq->wqes)[txq->wqe_n]) + (uintptr_t)&(*txq->wqes)[1 << txq->wqe_n]) mpw.data.raw = (volatile void *)&(*txq->wqes)[0]; ++mpw.pkts_n; diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 3dca8ca..9828aef 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -246,9 +246,9 @@ struct txq { uint16_t elts_comp; /* Counter since last completion request. */ uint16_t cq_ci; /* Consumer index for completion queue. */ uint16_t wqe_ci; /* Consumer index for work queue. */ - uint16_t wqe_n; /* Number of WQ elements. */ uint16_t elts_n:4; /* (*elts)[] length (in log2). */ uint16_t cqe_n:4; /* Number of CQ elements (in log2). */ + uint16_t wqe_n:4; /* Number of of WQ elements (in log2). */ uint16_t bf_buf_size:4; /* Log2 Blueflame size. */ uint16_t bf_offset; /* Blueflame offset. */ uint16_t max_inline; /* Multiple of RTE_CACHE_LINE_SIZE to inline. */ diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c index 9919e37..3d2d132 100644 --- a/drivers/net/mlx5/mlx5_txq.c +++ b/drivers/net/mlx5/mlx5_txq.c @@ -81,7 +81,7 @@ txq_alloc_elts(struct txq_ctrl *txq_ctrl, unsigned int elts_n) for (i = 0; (i != elts_n); ++i) (*txq_ctrl->txq.elts)[i] = NULL; - for (i = 0; (i != txq_ctrl->txq.wqe_n); ++i) { + for (i = 0; (i != (1u << txq_ctrl->txq.wqe_n)); ++i) { volatile struct mlx5_wqe64 *wqe = &(*txq_ctrl->txq.wqes)[i]; memset((void *)(uintptr_t)wqe, 0x0, sizeof(*wqe)); @@ -217,7 +217,7 @@ txq_setup(struct txq_ctrl *tmpl, struct txq_ctrl *txq_ctrl) tmpl->txq.wqes = (volatile struct mlx5_wqe64 (*)[]) (uintptr_t)qp->gen_data.sqstart; - tmpl->txq.wqe_n = qp->sq.wqe_cnt; + tmpl->txq.wqe_n = log2above(qp->sq.wqe_cnt); tmpl->txq.qp_db = &qp->gen_data.db[MLX5_SND_DBR]; tmpl->txq.bf_reg = qp->gen_data.bf->reg; tmpl->txq.bf_offset = qp->gen_data.bf->offset;