From patchwork Thu Oct 13 17:36:57 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tomasz Kulasek X-Patchwork-Id: 16557 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id 83EAA5934; Thu, 13 Oct 2016 19:37:29 +0200 (CEST) Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id 0E385590B for ; Thu, 13 Oct 2016 19:37:26 +0200 (CEST) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP; 13 Oct 2016 10:37:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos; i="5.31,340,1473145200"; d="scan'208"; a="1070049042" Received: from unknown (HELO Sent) ([10.103.102.234]) by fmsmga002.fm.intel.com with SMTP; 13 Oct 2016 10:37:24 -0700 Received: by Sent (sSMTP sendmail emulation); Thu, 13 Oct 2016 19:37:23 +0200 From: Tomasz Kulasek To: dev@dpdk.org Cc: konstantin.ananyev@intel.com, thomas.monjalon@6wind.com Date: Thu, 13 Oct 2016 19:36:57 +0200 Message-Id: <1476380222-9900-2-git-send-email-tomaszx.kulasek@intel.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1476380222-9900-1-git-send-email-tomaszx.kulasek@intel.com> References: <20160930090039.10164-1-tomaszx.kulasek@intel.com> <1476380222-9900-1-git-send-email-tomaszx.kulasek@intel.com> Subject: [dpdk-dev] [PATCH v5 1/6] ethdev: add Tx preparation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Added API for `rte_eth_tx_prep` uint16_t rte_eth_tx_prep(uint8_t port_id, uint16_t queue_id, struct rte_mbuf **tx_pkts, uint16_t nb_pkts) Added fields to the `struct rte_eth_desc_lim`: uint16_t nb_seg_max; /**< Max number of segments per whole packet. */ uint16_t nb_mtu_seg_max; /**< Max number of segments per one MTU */ Created `rte_pkt.h` header with common used functions: int rte_validate_tx_offload(struct rte_mbuf *m) to validate general requirements for tx offload in packet such a flag completness. In current implementation this function is called optionaly when RTE_LIBRTE_ETHDEV_DEBUG is enabled. int rte_phdr_cksum_fix(struct rte_mbuf *m) to fix pseudo header checksum for TSO and non-TSO tcp/udp packets before hardware tx checksum offload. - for non-TSO tcp/udp packets full pseudo-header checksum is counted and set. - for TSO the IP payload length is not included. Signed-off-by: Tomasz Kulasek --- config/common_base | 1 + lib/librte_ether/rte_ethdev.h | 85 +++++++++++++++++++++++++ lib/librte_mbuf/rte_mbuf.h | 9 +++ lib/librte_net/Makefile | 3 +- lib/librte_net/rte_pkt.h | 139 +++++++++++++++++++++++++++++++++++++++++ 5 files changed, 236 insertions(+), 1 deletion(-) create mode 100644 lib/librte_net/rte_pkt.h diff --git a/config/common_base b/config/common_base index f5d2eff..0af6481 100644 --- a/config/common_base +++ b/config/common_base @@ -120,6 +120,7 @@ CONFIG_RTE_MAX_QUEUES_PER_PORT=1024 CONFIG_RTE_LIBRTE_IEEE1588=n CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y +CONFIG_RTE_ETHDEV_TX_PREP=y # # Support NIC bypass logic diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 0a32ebb..db5dc93 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -182,6 +182,7 @@ extern "C" { #include #include #include +#include #include "rte_ether.h" #include "rte_eth_ctrl.h" #include "rte_dev_info.h" @@ -699,6 +700,8 @@ struct rte_eth_desc_lim { uint16_t nb_max; /**< Max allowed number of descriptors. */ uint16_t nb_min; /**< Min allowed number of descriptors. */ uint16_t nb_align; /**< Number of descriptors should be aligned to. */ + uint16_t nb_seg_max; /**< Max number of segments per whole packet. */ + uint16_t nb_mtu_seg_max; /**< Max number of segments per one MTU */ }; /** @@ -1188,6 +1191,11 @@ typedef uint16_t (*eth_tx_burst_t)(void *txq, uint16_t nb_pkts); /**< @internal Send output packets on a transmit queue of an Ethernet device. */ +typedef uint16_t (*eth_tx_prep_t)(void *txq, + struct rte_mbuf **tx_pkts, + uint16_t nb_pkts); +/**< @internal Prepare output packets on a transmit queue of an Ethernet device. */ + typedef int (*flow_ctrl_get_t)(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf); /**< @internal Get current flow control parameter on an Ethernet device */ @@ -1622,6 +1630,7 @@ struct rte_eth_rxtx_callback { struct rte_eth_dev { eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ struct rte_eth_dev_data *data; /**< Pointer to device data */ const struct eth_driver *driver;/**< Driver for this device */ const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ @@ -2816,6 +2825,82 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id, return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, nb_pkts); } +/** + * Process a burst of output packets on a transmit queue of an Ethernet device. + * + * The rte_eth_tx_prep() function is invoked to prepare output packets to be + * transmitted on the output queue *queue_id* of the Ethernet device designated + * by its *port_id*. + * The *nb_pkts* parameter is the number of packets to be prepared which are + * supplied in the *tx_pkts* array of *rte_mbuf* structures, each of them + * allocated from a pool created with rte_pktmbuf_pool_create(). + * For each packet to send, the rte_eth_tx_prep() function performs + * the following operations: + * + * - Check if packet meets devices requirements for tx offloads. + * + * - Check limitations about number of segments. + * + * - Check additional requirements when debug is enabled. + * + * - Update and/or reset required checksums when tx offload is set for packet. + * + * The rte_eth_tx_prep() function returns the number of packets ready to be + * sent. A return value equal to *nb_pkts* means that all packets are valid and + * ready to be sent. + * + * @param port_id + * The port identifier of the Ethernet device. + * @param queue_id + * The index of the transmit queue through which output packets must be + * sent. + * The value must be in the range [0, nb_tx_queue - 1] previously supplied + * to rte_eth_dev_configure(). + * @param tx_pkts + * The address of an array of *nb_pkts* pointers to *rte_mbuf* structures + * which contain the output packets. + * @param nb_pkts + * The maximum number of packets to process. + * @return + * The number of packets correct and ready to be sent. The return value can be + * less than the value of the *tx_pkts* parameter when some packet doesn't + * meet devices requirements with rte_errno set appropriately. + */ + +#ifdef RTE_ETHDEV_TX_PREP + +static inline uint16_t +rte_eth_tx_prep(uint8_t port_id, uint16_t queue_id, struct rte_mbuf **tx_pkts, + uint16_t nb_pkts) +{ + struct rte_eth_dev *dev = &rte_eth_devices[port_id]; + + if (!dev->tx_pkt_prep) + return nb_pkts; + +#ifdef RTE_LIBRTE_ETHDEV_DEBUG + if (queue_id >= dev->data->nb_tx_queues) { + RTE_PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id); + rte_errno = -EINVAL; + return 0; + } +#endif + + return (*dev->tx_pkt_prep)(dev->data->tx_queues[queue_id], + tx_pkts, nb_pkts); +} + +#else + +static inline uint16_t +rte_eth_tx_prep(uint8_t port_id __rte_unused, uint16_t queue_id __rte_unused, + struct rte_mbuf **tx_pkts __rte_unused, uint16_t nb_pkts) +{ + return nb_pkts; +} + +#endif + typedef void (*buffer_tx_error_fn)(struct rte_mbuf **unsent, uint16_t count, void *userdata); diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index 7541070..bc8fd87 100644 --- a/lib/librte_mbuf/rte_mbuf.h +++ b/lib/librte_mbuf/rte_mbuf.h @@ -224,6 +224,15 @@ extern "C" { */ #define PKT_TX_OUTER_IPV4 (1ULL << 59) +#define PKT_TX_OFFLOAD_MASK ( \ + PKT_TX_IP_CKSUM | \ + PKT_TX_L4_MASK | \ + PKT_TX_OUTER_IP_CKSUM | \ + PKT_TX_TCP_SEG | \ + PKT_TX_QINQ_PKT | \ + PKT_TX_VLAN_PKT | \ + PKT_TX_TUNNEL_MASK) + /** * Packet outer header is IPv6. This flag must be set when using any * outer offload feature (L4 checksum) to tell the NIC that the outer diff --git a/lib/librte_net/Makefile b/lib/librte_net/Makefile index e5758ce..cc69bc0 100644 --- a/lib/librte_net/Makefile +++ b/lib/librte_net/Makefile @@ -1,6 +1,6 @@ # BSD LICENSE # -# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# Copyright(c) 2010-2016 Intel Corporation. All rights reserved. # All rights reserved. # # Redistribution and use in source and binary forms, with or without @@ -44,6 +44,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_NET) := rte_net.c SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h rte_udp.h SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_sctp.h rte_icmp.h rte_arp.h SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_ether.h rte_gre.h rte_net.h +SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_pkt.h DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_eal lib/librte_mempool DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_mbuf diff --git a/lib/librte_net/rte_pkt.h b/lib/librte_net/rte_pkt.h new file mode 100644 index 0000000..8f53c0b --- /dev/null +++ b/lib/librte_net/rte_pkt.h @@ -0,0 +1,139 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2016 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_PKT_H_ +#define _RTE_PKT_H_ + +#include +#include +#include +#include + +/** + * Validate general requirements for tx offload in packet. + */ +static inline int +rte_validate_tx_offload(struct rte_mbuf *m) +{ + uint64_t ol_flags = m->ol_flags; + + /* Does packet set any of available offloads? */ + if (!(ol_flags & PKT_TX_OFFLOAD_MASK)) + return 0; + + /* IP checksum can be counted only for IPv4 packet */ + if ((ol_flags & PKT_TX_IP_CKSUM) && (ol_flags & PKT_TX_IPV6)) + return -EINVAL; + + if (ol_flags & (PKT_TX_L4_MASK | PKT_TX_TCP_SEG)) + /* IP type not set */ + if (!(ol_flags & (PKT_TX_IPV4 | PKT_TX_IPV6))) + return -EINVAL; + + if (ol_flags & PKT_TX_TCP_SEG) + /* PKT_TX_IP_CKSUM offload not set for IPv4 TSO packet */ + if ((m->tso_segsz == 0) || + ((ol_flags & PKT_TX_IPV4) && !(ol_flags & PKT_TX_IP_CKSUM))) + return -EINVAL; + + /* PKT_TX_OUTER_IP_CKSUM set for non outer IPv4 packet. */ + if ((ol_flags & PKT_TX_OUTER_IP_CKSUM) && !(ol_flags & PKT_TX_OUTER_IPV4)) + return -EINVAL; + + return 0; +} + +/** + * Fix pseudo header checksum for TSO and non-TSO tcp/udp packets before + * hardware tx checksum. + * For non-TSO tcp/udp packets full pseudo-header checksum is counted and set. + * For TSO the IP payload length is not included. + */ +static inline int +rte_phdr_cksum_fix(struct rte_mbuf *m) +{ + struct ipv4_hdr *ipv4_hdr; + struct ipv6_hdr *ipv6_hdr; + struct tcp_hdr *tcp_hdr; + struct udp_hdr *udp_hdr; + uint64_t ol_flags = m->ol_flags; + uint64_t inner_l3_offset = m->l2_len; + + if (ol_flags & PKT_TX_OUTER_IP_CKSUM) + inner_l3_offset += m->outer_l2_len + m->outer_l3_len; + + if (ol_flags & PKT_TX_IPV4) { + if ((ol_flags & PKT_TX_UDP_CKSUM) == PKT_TX_UDP_CKSUM) { + /* non-TSO udp */ + ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, + inner_l3_offset); + + if (ol_flags & PKT_TX_IP_CKSUM) + ipv4_hdr->hdr_checksum = 0; + + udp_hdr = (struct udp_hdr *)((char *)ipv4_hdr + m->l3_len); + udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(ipv4_hdr, ol_flags); + } else if ((ol_flags & PKT_TX_TCP_CKSUM) || + (ol_flags & PKT_TX_TCP_SEG)) { + ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, + inner_l3_offset); + + if (ol_flags & PKT_TX_IP_CKSUM) + ipv4_hdr->hdr_checksum = 0; + + /* non-TSO tcp or TSO */ + tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + m->l3_len); + tcp_hdr->cksum = rte_ipv4_phdr_cksum(ipv4_hdr, ol_flags); + } + } else if (m->ol_flags & PKT_TX_IPV6) { + if ((m->ol_flags & PKT_TX_UDP_CKSUM) == PKT_TX_UDP_CKSUM) { + ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *, + inner_l3_offset); + /* non-TSO udp */ + udp_hdr = rte_pktmbuf_mtod_offset(m, struct udp_hdr *, + inner_l3_offset + m->l3_len); + udp_hdr->dgram_cksum = rte_ipv6_phdr_cksum(ipv6_hdr, ol_flags); + } else if ((ol_flags & PKT_TX_TCP_CKSUM) || + (ol_flags & PKT_TX_TCP_SEG)) { + ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *, + inner_l3_offset); + /* non-TSO tcp or TSO */ + tcp_hdr = rte_pktmbuf_mtod_offset(m, struct tcp_hdr *, + inner_l3_offset + m->l3_len); + tcp_hdr->cksum = rte_ipv6_phdr_cksum(ipv6_hdr, ol_flags); + } + } + return 0; +} + +#endif /* _RTE_PKT_H_ */