From patchwork Wed Feb 22 16:10:00 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shahaf Shuler X-Patchwork-Id: 20672 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id 045035A3E; Wed, 22 Feb 2017 17:10:57 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 8D37D4A65 for ; Wed, 22 Feb 2017 17:10:22 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from shahafs@mellanox.com) with ESMTPS (AES256-SHA encrypted); 22 Feb 2017 18:10:20 +0200 Received: from arch009.mtl.labs.mlnx (arch009.mtl.labs.mlnx [10.7.12.209]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id v1MGAK6u008268; Wed, 22 Feb 2017 18:10:20 +0200 Received: from arch009.mtl.labs.mlnx (localhost [127.0.0.1]) by arch009.mtl.labs.mlnx (8.14.7/8.14.7) with ESMTP id v1MGAKP5046838; Wed, 22 Feb 2017 18:10:20 +0200 Received: (from root@localhost) by arch009.mtl.labs.mlnx (8.14.7/8.14.7/Submit) id v1MGAKw2046837; Wed, 22 Feb 2017 18:10:20 +0200 From: Shahaf Shuler To: adrien.mazarguil@6wind.com, nelio.laranjeiro@6wind.com, thomas.monjalon@6wind.com, jingjing.wu@intel.com Cc: dev@dpdk.org Date: Wed, 22 Feb 2017 18:10:00 +0200 Message-Id: <1487779800-46491-5-git-send-email-shahafs@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1487779800-46491-1-git-send-email-shahafs@mellanox.com> References: <1487779800-46491-1-git-send-email-shahafs@mellanox.com> Subject: [dpdk-dev] [PATCH 4/4] net/mlx5: add hardware TSO support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Implement support for hardware TSO. Signed-off-by: Shahaf Shuler --- Performance impact on tx function due to tso logic insertion is estimated ~4 clocks. --- doc/guides/nics/features/mlx5.ini | 1 + drivers/net/mlx5/mlx5.c | 8 +++ drivers/net/mlx5/mlx5.h | 2 + drivers/net/mlx5/mlx5_defs.h | 3 + drivers/net/mlx5/mlx5_ethdev.c | 5 ++ drivers/net/mlx5/mlx5_rxtx.c | 130 ++++++++++++++++++++++++++++++++++---- drivers/net/mlx5/mlx5_rxtx.h | 2 + drivers/net/mlx5/mlx5_txq.c | 10 +++ 8 files changed, 147 insertions(+), 14 deletions(-) diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini index f20d214..c6948cb 100644 --- a/doc/guides/nics/features/mlx5.ini +++ b/doc/guides/nics/features/mlx5.ini @@ -19,6 +19,7 @@ RSS hash = Y RSS key update = Y RSS reta update = Y SR-IOV = Y +TSO = Y VLAN filter = Y Flow director = Y Flow API = Y diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index d4bd469..74cae6a 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -479,6 +479,7 @@ IBV_EXP_DEVICE_ATTR_RX_HASH | IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS | IBV_EXP_DEVICE_ATTR_RX_PAD_END_ALIGN | + IBV_EXP_DEVICE_ATTR_TSO_CAPS | 0; DEBUG("using port %u (%08" PRIx32 ")", port, test); @@ -586,6 +587,13 @@ err = ENOTSUP; goto port_error; } + priv->tso = ((!priv->mps) && + (exp_device_attr.tso_caps.max_tso > 0) && + (exp_device_attr.tso_caps.supported_qpts & + (1 << IBV_QPT_RAW_ETH))); + if (priv->tso) + priv->max_tso_payload_sz = + exp_device_attr.tso_caps.max_tso; /* Allocate and register default RSS hash keys. */ priv->rss_conf = rte_calloc(__func__, hash_rxq_init_n, sizeof((*priv->rss_conf)[0]), 0); diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 2b4345a..d2bb835 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -126,6 +126,8 @@ struct priv { unsigned int mps:1; /* Whether multi-packet send is supported. */ unsigned int cqe_comp:1; /* Whether CQE compression is enabled. */ unsigned int pending_alarm:1; /* An alarm is pending. */ + unsigned int tso:1; /* Whether TSO is supported. */ + unsigned int max_tso_payload_sz; /* Maximum TCP payload for TSO. */ unsigned int txq_inline; /* Maximum packet size for inlining. */ unsigned int txqs_inline; /* Queue number threshold for inlining. */ /* RX/TX queues. */ diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index e91d245..eecb908 100644 --- a/drivers/net/mlx5/mlx5_defs.h +++ b/drivers/net/mlx5/mlx5_defs.h @@ -79,4 +79,7 @@ /* Maximum number of extended statistics counters. */ #define MLX5_MAX_XSTATS 32 +/* Maximum Packet headers size (L2+L3+L4) for TSO. */ +#define MLX5_MAX_TSO_HEADER 128 + #endif /* RTE_PMD_MLX5_DEFS_H_ */ diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c index 6b64f44..5fc9511 100644 --- a/drivers/net/mlx5/mlx5_ethdev.c +++ b/drivers/net/mlx5/mlx5_ethdev.c @@ -693,6 +693,11 @@ struct priv * (DEV_TX_OFFLOAD_IPV4_CKSUM | DEV_TX_OFFLOAD_UDP_CKSUM | DEV_TX_OFFLOAD_TCP_CKSUM); + if (priv->tso) { + info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO; + info->tx_off_lim.max_tso_payload_sz = priv->max_tso_payload_sz; + info->tx_off_lim.max_tso_headers_sz = MLX5_MAX_TSO_HEADER; + } if (priv_get_ifname(priv, &ifname) == 0) info->if_index = if_nametoindex(ifname); /* FIXME: RETA update/query API expects the callee to know the size of diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index b2b7223..30b0096 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -365,6 +365,7 @@ const unsigned int elts_n = 1 << txq->elts_n; unsigned int i = 0; unsigned int j = 0; + unsigned int k = 0; unsigned int max; uint16_t max_wqe; unsigned int comp; @@ -392,8 +393,10 @@ uintptr_t addr; uint64_t naddr; uint16_t pkt_inline_sz = MLX5_WQE_DWORD_SIZE + 2; + uint16_t tso_header_sz = 0; uint16_t ehdr; uint8_t cs_flags = 0; + uint64_t tso = 0; #ifdef MLX5_PMD_SOFT_COUNTERS uint32_t total_length = 0; #endif @@ -465,6 +468,88 @@ length -= pkt_inline_sz; addr += pkt_inline_sz; } + if (txq->max_tso_inline) { + tso = buf->ol_flags & PKT_TX_TCP_SEG; + if (tso) { + uintptr_t end = (uintptr_t) + (((uintptr_t)txq->wqes) + + (1 << txq->wqe_n) * + MLX5_WQE_SIZE); + unsigned int max_tso_inline = + txq->max_tso_inline * + RTE_CACHE_LINE_SIZE - + MLX5_WQE_DWORD_SIZE; + unsigned int max_native_inline = + txq->max_inline ? + txq->max_inline * + RTE_CACHE_LINE_SIZE - + MLX5_WQE_DWORD_SIZE : + 0; + unsigned int max_inline = + RTE_MAX(max_tso_inline, + max_native_inline); + uintptr_t addr_end = (addr + max_inline) & + ~(RTE_CACHE_LINE_SIZE - 1); + unsigned int copy_b; + uint16_t n; + + tso_header_sz = buf->l2_len + buf->l3_len + + buf->l4_len; + copy_b = tso_header_sz - pkt_inline_sz; + /* First seg must contain all headers. */ + assert(copy_b <= length); + raw += MLX5_WQE_DWORD_SIZE; + if (copy_b && + ((end - (uintptr_t)raw) > copy_b)) { +tso_inline: + n = (MLX5_WQE_DS(copy_b) - 1 + 3) / 4; + if (unlikely(max_wqe < n)) + break; + max_wqe -= n; + rte_memcpy((void *)raw, + (void *)addr, copy_b); + addr += copy_b; + length -= copy_b; + pkt_inline_sz += copy_b; + } else { + /* NOP WQE. */ + wqe->ctrl = (rte_v128u32_t){ + htonl(txq->wqe_ci << 8), + htonl(txq->qp_num_8s | 1), + 0, + 0, + }; + ds = 1; + total_length = 0; + pkts--; + pkts_n++; + elts_head = (elts_head - 1) & + (elts_n - 1); + k++; + goto next_wqe; + } + raw += MLX5_WQE_DS(copy_b) * + MLX5_WQE_DWORD_SIZE; + copy_b = (addr_end > addr) ? + RTE_MIN((addr_end - addr), length) : + 0; + if (copy_b) { + uint32_t inl = + htonl(copy_b | MLX5_INLINE_SEG); + + pkt_inline_sz = + MLX5_WQE_DS(tso_header_sz) * + MLX5_WQE_DWORD_SIZE; + rte_memcpy((void *)raw, + (void *)&inl, sizeof(inl)); + raw += sizeof(inl); + pkt_inline_sz += sizeof(inl); + goto tso_inline; + } else { + goto check_ptr; + } + } + } /* Inline if enough room. */ if (txq->max_inline) { uintptr_t end = (uintptr_t) @@ -496,6 +581,7 @@ length -= copy_b; pkt_inline_sz += copy_b; } +check_ptr: /* * 2 DWORDs consumed by the WQE header + ETH segment + * the size of the inline part of the packet. @@ -591,18 +677,34 @@ next_pkt: ++i; /* Initialize known and common part of the WQE structure. */ - wqe->ctrl = (rte_v128u32_t){ - htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND), - htonl(txq->qp_num_8s | ds), - 0, - 0, - }; - wqe->eseg = (rte_v128u32_t){ - 0, - cs_flags, - 0, - (ehdr << 16) | htons(pkt_inline_sz), - }; + if (tso) { + wqe->ctrl = (rte_v128u32_t){ + htonl((txq->wqe_ci << 8) | MLX5_OPCODE_TSO), + htonl(txq->qp_num_8s | ds), + 0, + 0, + }; + wqe->eseg = (rte_v128u32_t){ + 0, + cs_flags | (htons(buf->tso_segsz) << 16), + 0, + (ehdr << 16) | htons(tso_header_sz), + }; + } else { + wqe->ctrl = (rte_v128u32_t){ + htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND), + htonl(txq->qp_num_8s | ds), + 0, + 0, + }; + wqe->eseg = (rte_v128u32_t){ + 0, + cs_flags, + 0, + (ehdr << 16) | htons(pkt_inline_sz), + }; + } +next_wqe: txq->wqe_ci += (ds + 3) / 4; #ifdef MLX5_PMD_SOFT_COUNTERS /* Increment sent bytes counter. */ @@ -610,10 +712,10 @@ #endif } while (pkts_n); /* Take a shortcut if nothing must be sent. */ - if (unlikely(i == 0)) + if (unlikely((i + k) == 0)) return 0; /* Check whether completion threshold has been reached. */ - comp = txq->elts_comp + i + j; + comp = txq->elts_comp + i + j + k; if (comp >= MLX5_TX_COMP_THRESH) { volatile struct mlx5_wqe_ctrl *w = (volatile struct mlx5_wqe_ctrl *)wqe; diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 41a34d7..a6b8bdd 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -254,6 +254,8 @@ struct txq { uint16_t cqe_n:4; /* Number of CQ elements (in log2). */ uint16_t wqe_n:4; /* Number of of WQ elements (in log2). */ uint16_t max_inline; /* Multiple of RTE_CACHE_LINE_SIZE to inline. */ + uint16_t max_tso_inline; + /* Multiple of RTE_CACHE_LINE_SIZE to inline. */ uint32_t qp_num_8s; /* QP number shifted by 8. */ volatile struct mlx5_cqe (*cqes)[]; /* Completion queue. */ volatile void *wqes; /* Work queue (use volatile to write into). */ diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c index 949035b..8fe6328 100644 --- a/drivers/net/mlx5/mlx5_txq.c +++ b/drivers/net/mlx5/mlx5_txq.c @@ -343,6 +343,16 @@ attr.init.cap.max_inline_data = tmpl.txq.max_inline * RTE_CACHE_LINE_SIZE; } + if (priv->tso && !(conf->txq_flags & ETH_TXQ_FLAGS_NOTSOOFFL)) { + uint16_t max_tso_inline = ((MLX5_MAX_TSO_HEADER + + (RTE_CACHE_LINE_SIZE - 1)) / + RTE_CACHE_LINE_SIZE); + + attr.init.max_tso_header = + max_tso_inline * RTE_CACHE_LINE_SIZE; + attr.init.comp_mask |= IBV_EXP_QP_INIT_ATTR_MAX_TSO_HEADER; + tmpl.txq.max_tso_inline = max_tso_inline; + } tmpl.qp = ibv_exp_create_qp(priv->ctx, &attr.init); if (tmpl.qp == NULL) { ret = (errno ? errno : EINVAL);