From patchwork Fri Mar 1 08:09:42 2019
X-Patchwork-Submitter: Xiaolong Ye
X-Patchwork-Id: 50704
From: Xiaolong Ye
To: dev@dpdk.org
Cc: Qi Zhang , Xiaolong Ye
Date: Fri, 1 Mar 2019 16:09:42 +0800
Message-Id: <20190301080947.91086-2-xiaolong.ye@intel.com>
In-Reply-To: <20190301080947.91086-1-xiaolong.ye@intel.com>
References: <20190301080947.91086-1-xiaolong.ye@intel.com>
Subject: [dpdk-dev] [PATCH v1 1/6] net/af_xdp: introduce AF_XDP PMD driver

Add a new PMD driver for AF_XDP, a proposed faster version of the
AF_PACKET interface in Linux. For more information about AF_XDP,
please refer to [1] and [2].

This is the vanilla version of the PMD, which just uses a raw buffer
registered as the umem.

[1] https://fosdem.org/2018/schedule/event/af_xdp/
[2] https://lwn.net/Articles/745934/

Signed-off-by: Xiaolong Ye
---
 MAINTAINERS                                   |   6 +
 config/common_base                            |   5 +
 doc/guides/nics/af_xdp.rst                    |  43 +
 doc/guides/rel_notes/release_18_11.rst        |   7 +
 drivers/net/Makefile                          |   1 +
 drivers/net/af_xdp/Makefile                   |  31 +
 drivers/net/af_xdp/meson.build                |   7 +
 drivers/net/af_xdp/rte_eth_af_xdp.c           | 903 ++++++++++++++++++
 drivers/net/af_xdp/rte_pmd_af_xdp_version.map |   4 +
 mk/rte.app.mk                                 |   1 +
 10 files changed, 1008 insertions(+)
 create mode 100644 doc/guides/nics/af_xdp.rst
 create mode 100644 drivers/net/af_xdp/Makefile
 create mode 100644 drivers/net/af_xdp/meson.build
 create mode 100644 drivers/net/af_xdp/rte_eth_af_xdp.c
 create mode 100644 drivers/net/af_xdp/rte_pmd_af_xdp_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 15c53888c..baa92a732 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -469,6 +469,12 @@ M: John W. Linville
 F: drivers/net/af_packet/
 F: doc/guides/nics/features/afpacket.ini

+Linux AF_XDP
+M: Xiaolong Ye
+M: Qi Zhang
+F: drivers/net/af_xdp/
+F: doc/guides/nics/features/af_xdp.rst
+
 Amazon ENA
 M: Marcin Wojtas
 M: Michal Krawczyk

diff --git a/config/common_base b/config/common_base
index 7c6da5165..c45d2dad1 100644
--- a/config/common_base
+++ b/config/common_base
@@ -416,6 +416,11 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n

+#
+# Compile software PMD backed by AF_XDP sockets (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_AF_XDP=n
+
 #
 # Compile link bonding PMD library
 #

diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
new file mode 100644
index 000000000..126d9df3c
--- /dev/null
+++ b/doc/guides/nics/af_xdp.rst
@@ -0,0 +1,43 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2018 Intel Corporation.
+
+AF_XDP Poll Mode Driver
+=======================
+
+AF_XDP is an address family that is optimized for high performance
+packet processing. AF_XDP sockets allow an XDP program to redirect
+packets to a memory buffer in userspace.
+
+For full details of the AF_XDP socket, refer to the
+`AF_XDP documentation in the Kernel
+`_.
+
+This Linux-specific PMD creates the AF_XDP socket and binds it to a
+specific netdev queue. It allows a DPDK application to send and receive
+raw packets through the socket, bypassing the kernel network stack.
+
+Options
+-------
+
+The following options can be provided to set up an af_xdp port in DPDK.
+
+*  ``iface`` - name of the Kernel interface to attach to (required);
+*  ``queue`` - netdev queue id (optional, default 0).
+
+Prerequisites
+-------------
+
+This is a Linux-specific PMD, thus the following prerequisites apply:
+
+*  A Linux Kernel with XDP sockets configuration enabled;
+*  libbpf with the latest AF_XDP support installed;
+*  A Kernel bound interface to attach to.
+
+Set up an af_xdp interface
+--------------------------
+
+The following example will set up an af_xdp interface in DPDK:
+
+.. code-block:: console
+
+    --vdev eth_af_xdp,iface=ens786f1,queue=0

diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 65bab557d..e0918441a 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -229,6 +229,13 @@ New Features
   The AESNI MB PMD has been updated with additional support for the
   AES-GCM algorithm.

+* **Added the AF_XDP PMD.**
+
+  Added a Linux-specific PMD for AF_XDP. It can create the AF_XDP socket
+  and bind it to a specific netdev queue, allowing a DPDK application to
+  send and receive raw packets through the socket, bypassing the kernel
+  network stack, to achieve high performance packet processing.
+
 * **Added NXP CAAM JR PMD.**

   Added the new caam job ring driver for NXP platforms.
See the diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 670d7f75a..93cccd2a8 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -9,6 +9,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD),d) endif DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet +DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_XDP) += af_xdp DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark DIRS-$(CONFIG_RTE_LIBRTE_ATLANTIC_PMD) += atlantic DIRS-$(CONFIG_RTE_LIBRTE_AVF_PMD) += avf diff --git a/drivers/net/af_xdp/Makefile b/drivers/net/af_xdp/Makefile new file mode 100644 index 000000000..e3755fff2 --- /dev/null +++ b/drivers/net/af_xdp/Makefile @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 Intel Corporation + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_pmd_af_xdp.a + +EXPORT_MAP := rte_pmd_af_xdp_version.map + +LIBABIVER := 1 + + +CFLAGS += -O3 +# below line should be removed +CFLAGS += -I/root/yexl/shared_mks0/linux/tools/include +CFLAGS += -I/root/yexl/shared_mks0/linux/tools/lib/bpf + +CFLAGS += $(WERROR_FLAGS) +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring +LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_bus_vdev + +# +# all source are stored in SRCS-y +# +SRCS-$(CONFIG_RTE_LIBRTE_PMD_AF_XDP) += rte_eth_af_xdp.c + +include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build new file mode 100644 index 000000000..4b6652685 --- /dev/null +++ b/drivers/net/af_xdp/meson.build @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 Intel Corporation + +if host_machine.system() != 'linux' + build = false +endif +sources = files('rte_eth_af_xdp.c') diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c new file mode 100644 index 000000000..6de769650 --- /dev/null +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -0,0 +1,903 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation. 
+ */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifndef SOL_XDP +#define SOL_XDP 283 +#endif + +#ifndef AF_XDP +#define AF_XDP 44 +#endif + +#ifndef PF_XDP +#define PF_XDP AF_XDP +#endif + +#define ETH_AF_XDP_IFACE_ARG "iface" +#define ETH_AF_XDP_QUEUE_IDX_ARG "queue" + +#define ETH_AF_XDP_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE +#define ETH_AF_XDP_NUM_BUFFERS 4096 +#define ETH_AF_XDP_DATA_HEADROOM 0 +#define ETH_AF_XDP_DFLT_NUM_DESCS XSK_RING_CONS__DEFAULT_NUM_DESCS +#define ETH_AF_XDP_DFLT_QUEUE_IDX 0 + +#define ETH_AF_XDP_RX_BATCH_SIZE 32 +#define ETH_AF_XDP_TX_BATCH_SIZE 32 + +#define ETH_AF_XDP_MAX_QUEUE_PAIRS 16 + +struct xsk_umem_info { + struct xsk_ring_prod fq; + struct xsk_ring_cons cq; + struct xsk_umem *umem; + struct rte_ring *buf_ring; + void *buffer; +}; + +struct pkt_rx_queue { + struct xsk_ring_cons rx; + struct xsk_umem_info *umem; + struct xsk_socket *xsk; + struct rte_mempool *mb_pool; + + unsigned long rx_pkts; + unsigned long rx_bytes; + unsigned long rx_dropped; + + struct pkt_tx_queue *pair; + uint16_t queue_idx; +}; + +struct pkt_tx_queue { + struct xsk_ring_prod tx; + + unsigned long tx_pkts; + unsigned long err_pkts; + unsigned long tx_bytes; + + struct pkt_rx_queue *pair; + uint16_t queue_idx; +}; + +struct pmd_internals { + int if_index; + char if_name[IFNAMSIZ]; + uint16_t queue_idx; + struct ether_addr eth_addr; + struct xsk_umem_info *umem; + struct rte_mempool *mb_pool_share; + + struct pkt_rx_queue rx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS]; + struct pkt_tx_queue tx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS]; +}; + +static const char * const valid_arguments[] = { + ETH_AF_XDP_IFACE_ARG, + ETH_AF_XDP_QUEUE_IDX_ARG, + NULL +}; + +static struct rte_eth_link pmd_link = { + .link_speed = ETH_SPEED_NUM_10G, + .link_duplex = ETH_LINK_FULL_DUPLEX, + .link_status = ETH_LINK_DOWN, + .link_autoneg = ETH_LINK_AUTONEG +}; + +static inline int +reserve_fill_queue(struct xsk_umem_info *umem, int reserve_size) +{ + struct xsk_ring_prod *fq = &umem->fq; + uint32_t idx; + void *addr = NULL; + int i, ret = 0; + + ret = xsk_ring_prod__reserve(fq, reserve_size, &idx); + if (!ret) { + RTE_LOG(ERR, PMD, "Failed to reserve enough fq descs.\n"); + return ret; + } + + for (i = 0; i < reserve_size; i++) { + rte_ring_dequeue(umem->buf_ring, &addr); + *xsk_ring_prod__fill_addr(fq, idx++) = (uint64_t)addr; + } + + xsk_ring_prod__submit(fq, reserve_size); + + return 0; +} + +static uint16_t +eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) +{ + struct pkt_rx_queue *rxq = queue; + struct xsk_ring_cons *rx = &rxq->rx; + struct xsk_umem_info *umem = rxq->umem; + struct xsk_ring_prod *fq = &umem->fq; + uint32_t idx_rx; + uint32_t free_thresh = fq->size >> 1; + struct rte_mbuf *mbuf; + unsigned long dropped = 0; + unsigned long rx_bytes = 0; + uint16_t count = 0; + int rcvd, i; + + nb_pkts = nb_pkts < ETH_AF_XDP_RX_BATCH_SIZE ? 
+		  nb_pkts : ETH_AF_XDP_RX_BATCH_SIZE;
+
+	rcvd = xsk_ring_cons__peek(rx, nb_pkts, &idx_rx);
+	if (!rcvd)
+		return 0;
+
+	if (xsk_prod_nb_free(fq, free_thresh) >= free_thresh)
+		(void)reserve_fill_queue(umem, ETH_AF_XDP_RX_BATCH_SIZE);
+
+	for (i = 0; i < rcvd; i++) {
+		uint64_t addr = xsk_ring_cons__rx_desc(rx, idx_rx)->addr;
+		uint32_t len = xsk_ring_cons__rx_desc(rx, idx_rx++)->len;
+		char *pkt = xsk_umem__get_data(rxq->umem->buffer, addr);
+
+		mbuf = rte_pktmbuf_alloc(rxq->mb_pool);
+		if (mbuf) {
+			memcpy(rte_pktmbuf_mtod(mbuf, void*), pkt, len);
+			rte_pktmbuf_pkt_len(mbuf) =
+				rte_pktmbuf_data_len(mbuf) = len;
+			rx_bytes += len;
+			bufs[count++] = mbuf;
+		} else {
+			dropped++;
+		}
+		rte_ring_enqueue(umem->buf_ring, (void *)addr);
+	}
+
+	xsk_ring_cons__release(rx, rcvd);
+
+	/* statistics */
+	rxq->rx_pkts += (rcvd - dropped);
+	rxq->rx_bytes += rx_bytes;
+	rxq->rx_dropped += dropped;
+
+	return count;
+}
+
+static void pull_umem_cq(struct xsk_umem_info *umem, int size)
+{
+	struct xsk_ring_cons *cq = &umem->cq;
+	int i, n;
+	uint32_t idx_cq;
+	uint64_t addr;
+
+	n = xsk_ring_cons__peek(cq, size, &idx_cq);
+	if (n > 0) {
+		for (i = 0; i < n; i++) {
+			addr = *xsk_ring_cons__comp_addr(cq,
+							 idx_cq++);
+			rte_ring_enqueue(umem->buf_ring, (void *)addr);
+		}
+
+		xsk_ring_cons__release(cq, n);
+	}
+}
+
+static void kick_tx(struct pkt_tx_queue *txq)
+{
+	struct xsk_umem_info *umem = txq->pair->umem;
+	int ret;
+
+	while (1) {
+		ret = sendto(xsk_socket__fd(txq->pair->xsk), NULL, 0,
+			     MSG_DONTWAIT, NULL, 0);
+
+		/* everything is ok */
+		if (ret >= 0)
+			break;
+
+		/* something unexpected */
+		if (errno != EBUSY && errno != EAGAIN)
+			break;
+
+		/* pull from completion queue to leave more space */
+		if (errno == EAGAIN)
+			pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
+	}
+}
+
+static uint16_t
+eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct pkt_tx_queue *txq = queue;
+	struct xsk_umem_info *umem = txq->pair->umem;
+	struct rte_mbuf *mbuf;
+	void *addrs[ETH_AF_XDP_TX_BATCH_SIZE];
+	unsigned long tx_bytes = 0;
+	int i, valid = 0;
+	uint32_t idx_tx;
+
+	nb_pkts = nb_pkts < ETH_AF_XDP_TX_BATCH_SIZE ?
+		  nb_pkts : ETH_AF_XDP_TX_BATCH_SIZE;
+
+	pull_umem_cq(umem, nb_pkts);
+
+	nb_pkts = rte_ring_dequeue_bulk(umem->buf_ring, addrs,
+					nb_pkts, NULL);
+	if (!nb_pkts)
+		return 0;
+
+	if (xsk_ring_prod__reserve(&txq->tx, nb_pkts, &idx_tx)
+	    != nb_pkts)
+		return 0;
+
+	for (i = 0; i < nb_pkts; i++) {
+		struct xdp_desc *desc;
+		char *pkt;
+		unsigned int buf_len = ETH_AF_XDP_FRAME_SIZE
+					- ETH_AF_XDP_DATA_HEADROOM;
+		desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i);
+		mbuf = bufs[i];
+		if (mbuf->pkt_len <= buf_len) {
+			desc->addr = (uint64_t)addrs[valid];
+			desc->len = mbuf->pkt_len;
+			pkt = xsk_umem__get_data(umem->buffer,
+						 desc->addr);
+			memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *),
+			       desc->len);
+			valid++;
+			tx_bytes += mbuf->pkt_len;
+		}
+		rte_pktmbuf_free(mbuf);
+	}
+
+	xsk_ring_prod__submit(&txq->tx, nb_pkts);
+
+	kick_tx(txq);
+
+	if (valid < nb_pkts)
+		rte_ring_enqueue_bulk(umem->buf_ring, &addrs[valid],
+				      nb_pkts - valid, NULL);
+
+	txq->err_pkts += nb_pkts - valid;
+	txq->tx_pkts += valid;
+	txq->tx_bytes += tx_bytes;
+
+	return nb_pkts;
+}
+
+static int
+eth_dev_start(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+
+	return 0;
+}
+
+/* This function gets called when the current port gets stopped.
*/ +static void +eth_dev_stop(struct rte_eth_dev *dev) +{ + dev->data->dev_link.link_status = ETH_LINK_DOWN; +} + +static int +eth_dev_configure(struct rte_eth_dev *dev __rte_unused) +{ + /* rx/tx must be paired */ + if (dev->data->nb_rx_queues != dev->data->nb_tx_queues) + return -EINVAL; + + return 0; +} + +static void +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) +{ + struct pmd_internals *internals = dev->data->dev_private; + + dev_info->if_index = internals->if_index; + dev_info->max_mac_addrs = 1; + dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN; + dev_info->max_rx_queues = 1; + dev_info->max_tx_queues = 1; + dev_info->min_rx_bufsize = 0; + + dev_info->default_rxportconf.nb_queues = 1; + dev_info->default_txportconf.nb_queues = 1; + dev_info->default_rxportconf.ring_size = ETH_AF_XDP_DFLT_NUM_DESCS; + dev_info->default_txportconf.ring_size = ETH_AF_XDP_DFLT_NUM_DESCS; +} + +static int +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) +{ + struct pmd_internals *internals = dev->data->dev_private; + struct xdp_statistics xdp_stats; + struct pkt_rx_queue *rxq; + socklen_t optlen; + int i; + + optlen = sizeof(struct xdp_statistics); + for (i = 0; i < dev->data->nb_rx_queues; i++) { + rxq = &internals->rx_queues[i]; + stats->q_ipackets[i] = internals->rx_queues[i].rx_pkts; + stats->q_ibytes[i] = internals->rx_queues[i].rx_bytes; + + stats->q_opackets[i] = internals->tx_queues[i].tx_pkts; + stats->q_errors[i] = internals->tx_queues[i].err_pkts; + stats->q_obytes[i] = internals->tx_queues[i].tx_bytes; + + stats->ipackets += stats->q_ipackets[i]; + stats->ibytes += stats->q_ibytes[i]; + stats->imissed += internals->rx_queues[i].rx_dropped; + getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, XDP_STATISTICS, + &xdp_stats, &optlen); + stats->imissed += xdp_stats.rx_dropped; + + stats->opackets += stats->q_opackets[i]; + stats->oerrors += stats->q_errors[i]; + stats->obytes += stats->q_obytes[i]; + } + + return 0; +} + +static void +eth_stats_reset(struct rte_eth_dev *dev) +{ + struct pmd_internals *internals = dev->data->dev_private; + int i; + + for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { + internals->rx_queues[i].rx_pkts = 0; + internals->rx_queues[i].rx_bytes = 0; + internals->rx_queues[i].rx_dropped = 0; + + internals->tx_queues[i].tx_pkts = 0; + internals->tx_queues[i].err_pkts = 0; + internals->tx_queues[i].tx_bytes = 0; + } +} + +static void remove_xdp_program(struct pmd_internals *internals) +{ + uint32_t curr_prog_id = 0; + + if (bpf_get_link_xdp_id(internals->if_index, &curr_prog_id, + XDP_FLAGS_UPDATE_IF_NOEXIST)) { + RTE_LOG(ERR, PMD, "bpf_get_link_xdp_id failed\n"); + return; + } + bpf_set_link_xdp_fd(internals->if_index, -1, + XDP_FLAGS_UPDATE_IF_NOEXIST); +} + +static void +eth_dev_close(struct rte_eth_dev *dev) +{ + struct pmd_internals *internals = dev->data->dev_private; + struct pkt_rx_queue *rxq; + int i; + + RTE_LOG(INFO, PMD, "Closing AF_XDP ethdev on numa socket %u\n", + rte_socket_id()); + + for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) { + rxq = &internals->rx_queues[i]; + if (!rxq->umem) + break; + xsk_socket__delete(rxq->xsk); + } + + (void)xsk_umem__delete(internals->umem->umem); + remove_xdp_program(internals); +} + +static void +eth_queue_release(void *q __rte_unused) +{ +} + +static int +eth_link_update(struct rte_eth_dev *dev __rte_unused, + int wait_to_complete __rte_unused) +{ + return 0; +} + +static void xdp_umem_destroy(struct xsk_umem_info *umem) +{ + if (umem->buffer) + free(umem->buffer); + + 
free(umem); +} + +static struct xsk_umem_info *xdp_umem_configure(void) +{ + struct xsk_umem_info *umem; + struct xsk_umem_config usr_config = { + .fill_size = ETH_AF_XDP_DFLT_NUM_DESCS, + .comp_size = ETH_AF_XDP_DFLT_NUM_DESCS, + .frame_size = ETH_AF_XDP_FRAME_SIZE, + .frame_headroom = ETH_AF_XDP_DATA_HEADROOM }; + void *bufs = NULL; + char ring_name[0x100]; + int ret; + uint64_t i; + + umem = calloc(1, sizeof(*umem)); + if (!umem) { + RTE_LOG(ERR, PMD, "Failed to allocate umem info"); + return NULL; + } + + snprintf(ring_name, 0x100, "af_xdp_ring"); + umem->buf_ring = rte_ring_create(ring_name, + ETH_AF_XDP_NUM_BUFFERS, + SOCKET_ID_ANY, + 0x0); + if (!umem->buf_ring) { + RTE_LOG(ERR, PMD, + "Failed to create rte_ring\n"); + goto err; + } + + for (i = 0; i < ETH_AF_XDP_NUM_BUFFERS; i++) + rte_ring_enqueue(umem->buf_ring, + (void *)(i * ETH_AF_XDP_FRAME_SIZE + + ETH_AF_XDP_DATA_HEADROOM)); + + if (posix_memalign(&bufs, getpagesize(), + ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE)) { + RTE_LOG(ERR, PMD, "Failed to allocate memory pool.\n"); + goto err; + } + ret = xsk_umem__create(&umem->umem, bufs, + ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE, + &umem->fq, &umem->cq, + &usr_config); + + if (ret) { + RTE_LOG(ERR, PMD, "Failed to create umem"); + goto err; + } + umem->buffer = bufs; + + return umem; + +err: + xdp_umem_destroy(umem); + return NULL; +} + +static int +xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq, + int ring_size) +{ + struct xsk_socket_config cfg; + struct pkt_tx_queue *txq = rxq->pair; + int ret = 0; + int reserve_size; + + rxq->umem = xdp_umem_configure(); + if (!rxq->umem) { + ret = -ENOMEM; + goto err; + } + + cfg.rx_size = ring_size; + cfg.tx_size = ring_size; + cfg.libbpf_flags = 0; + cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; + cfg.bind_flags = 0; + ret = xsk_socket__create(&rxq->xsk, internals->if_name, + internals->queue_idx, rxq->umem->umem, &rxq->rx, + &txq->tx, &cfg); + if (ret) { + RTE_LOG(ERR, PMD, "Failed to create xsk socket.\n"); + goto err; + } + + reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS / 2; + ret = reserve_fill_queue(rxq->umem, reserve_size); + if (ret) { + RTE_LOG(ERR, PMD, "Failed to reserve fill queue.\n"); + goto err; + } + + return 0; + +err: + xdp_umem_destroy(rxq->umem); + + return ret; +} + +static void +queue_reset(struct pmd_internals *internals, uint16_t queue_idx) +{ + struct pkt_rx_queue *rxq = &internals->rx_queues[queue_idx]; + struct pkt_tx_queue *txq = rxq->pair; + int xsk_fd = xsk_socket__fd(rxq->xsk); + + if (xsk_fd) { + close(xsk_fd); + if (internals->umem) { + xdp_umem_destroy(internals->umem); + internals->umem = NULL; + } + } + memset(rxq, 0, sizeof(*rxq)); + memset(txq, 0, sizeof(*txq)); + rxq->pair = txq; + txq->pair = rxq; + rxq->queue_idx = queue_idx; + txq->queue_idx = queue_idx; +} + +static int +eth_rx_queue_setup(struct rte_eth_dev *dev, + uint16_t rx_queue_id, + uint16_t nb_rx_desc, + unsigned int socket_id __rte_unused, + const struct rte_eth_rxconf *rx_conf __rte_unused, + struct rte_mempool *mb_pool) +{ + struct pmd_internals *internals = dev->data->dev_private; + unsigned int buf_size, data_size; + struct pkt_rx_queue *rxq; + int ret = 0; + + if (mb_pool == NULL) { + RTE_LOG(ERR, PMD, + "Invalid mb_pool\n"); + ret = -EINVAL; + goto err; + } + + if (dev->data->nb_rx_queues <= rx_queue_id) { + RTE_LOG(ERR, PMD, + "Invalid rx queue id: %d\n", rx_queue_id); + ret = -EINVAL; + goto err; + } + + rxq = &internals->rx_queues[rx_queue_id]; + queue_reset(internals, rx_queue_id); + + /* Now get 
the space available for data in the mbuf */ + buf_size = rte_pktmbuf_data_room_size(mb_pool) - + RTE_PKTMBUF_HEADROOM; + data_size = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM; + + if (data_size > buf_size) { + RTE_LOG(ERR, PMD, + "%s: %d bytes will not fit in mbuf (%d bytes)\n", + dev->device->name, data_size, buf_size); + ret = -ENOMEM; + goto err; + } + + rxq->mb_pool = mb_pool; + + if (xsk_configure(internals, rxq, nb_rx_desc)) { + RTE_LOG(ERR, PMD, + "Failed to configure xdp socket\n"); + ret = -EINVAL; + goto err; + } + + internals->umem = rxq->umem; + + dev->data->rx_queues[rx_queue_id] = rxq; + return 0; + +err: + queue_reset(internals, rx_queue_id); + return ret; +} + +static int +eth_tx_queue_setup(struct rte_eth_dev *dev, + uint16_t tx_queue_id, + uint16_t nb_tx_desc, + unsigned int socket_id __rte_unused, + const struct rte_eth_txconf *tx_conf __rte_unused) +{ + struct pmd_internals *internals = dev->data->dev_private; + struct pkt_tx_queue *txq; + + if (dev->data->nb_tx_queues <= tx_queue_id) { + RTE_LOG(ERR, PMD, "Invalid tx queue id: %d\n", tx_queue_id); + return -EINVAL; + } + + RTE_LOG(WARNING, PMD, "tx queue setup size=%d will be skipped\n", + nb_tx_desc); + txq = &internals->tx_queues[tx_queue_id]; + + dev->data->tx_queues[tx_queue_id] = txq; + return 0; +} + +static int +eth_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) +{ + struct pmd_internals *internals = dev->data->dev_private; + struct ifreq ifr = { .ifr_mtu = mtu }; + int ret; + int s; + + s = socket(PF_INET, SOCK_DGRAM, 0); + if (s < 0) + return -EINVAL; + + snprintf(ifr.ifr_name, IFNAMSIZ, "%s", internals->if_name); + ret = ioctl(s, SIOCSIFMTU, &ifr); + close(s); + + if (ret < 0) + return -EINVAL; + + return 0; +} + +static void +eth_dev_change_flags(char *if_name, uint32_t flags, uint32_t mask) +{ + struct ifreq ifr; + int s; + + s = socket(PF_INET, SOCK_DGRAM, 0); + if (s < 0) + return; + + snprintf(ifr.ifr_name, IFNAMSIZ, "%s", if_name); + if (ioctl(s, SIOCGIFFLAGS, &ifr) < 0) + goto out; + ifr.ifr_flags &= mask; + ifr.ifr_flags |= flags; + if (ioctl(s, SIOCSIFFLAGS, &ifr) < 0) + goto out; +out: + close(s); +} + +static void +eth_dev_promiscuous_enable(struct rte_eth_dev *dev) +{ + struct pmd_internals *internals = dev->data->dev_private; + + eth_dev_change_flags(internals->if_name, IFF_PROMISC, ~0); +} + +static void +eth_dev_promiscuous_disable(struct rte_eth_dev *dev) +{ + struct pmd_internals *internals = dev->data->dev_private; + + eth_dev_change_flags(internals->if_name, 0, ~IFF_PROMISC); +} + +static const struct eth_dev_ops ops = { + .dev_start = eth_dev_start, + .dev_stop = eth_dev_stop, + .dev_close = eth_dev_close, + .dev_configure = eth_dev_configure, + .dev_infos_get = eth_dev_info, + .mtu_set = eth_dev_mtu_set, + .promiscuous_enable = eth_dev_promiscuous_enable, + .promiscuous_disable = eth_dev_promiscuous_disable, + .rx_queue_setup = eth_rx_queue_setup, + .tx_queue_setup = eth_tx_queue_setup, + .rx_queue_release = eth_queue_release, + .tx_queue_release = eth_queue_release, + .link_update = eth_link_update, + .stats_get = eth_stats_get, + .stats_reset = eth_stats_reset, +}; + +static struct rte_vdev_driver pmd_af_xdp_drv; + +static void +parse_parameters(struct rte_kvargs *kvlist, + char **if_name, + int *queue_idx) +{ + struct rte_kvargs_pair *pair = NULL; + unsigned int k_idx; + + for (k_idx = 0; k_idx < kvlist->count; k_idx++) { + pair = &kvlist->pairs[k_idx]; + if (strstr(pair->key, ETH_AF_XDP_IFACE_ARG)) + *if_name = pair->value; + else if (strstr(pair->key, 
ETH_AF_XDP_QUEUE_IDX_ARG))
+			*queue_idx = atoi(pair->value);
+	}
+}
+
+static int
+get_iface_info(const char *if_name,
+	       struct ether_addr *eth_addr,
+	       int *if_index)
+{
+	struct ifreq ifr;
+	int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_IP);
+
+	if (sock < 0)
+		return -1;
+
+	strcpy(ifr.ifr_name, if_name);
+	if (ioctl(sock, SIOCGIFINDEX, &ifr))
+		goto error;
+
+	if (ioctl(sock, SIOCGIFHWADDR, &ifr))
+		goto error;
+
+	memcpy(eth_addr, ifr.ifr_hwaddr.sa_data, 6);
+
+	close(sock);
+	*if_index = if_nametoindex(if_name);
+	return 0;
+
+error:
+	close(sock);
+	return -1;
+}
+
+static int
+init_internals(struct rte_vdev_device *dev,
+	       const char *if_name,
+	       int queue_idx)
+{
+	const char *name = rte_vdev_device_name(dev);
+	struct rte_eth_dev *eth_dev = NULL;
+	const unsigned int numa_node = dev->device.numa_node;
+	struct pmd_internals *internals = NULL;
+	int ret;
+	int i;
+
+	internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
+	if (!internals)
+		return -ENOMEM;
+
+	internals->queue_idx = queue_idx;
+	strcpy(internals->if_name, if_name);
+
+	for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) {
+		internals->tx_queues[i].pair = &internals->rx_queues[i];
+		internals->rx_queues[i].pair = &internals->tx_queues[i];
+	}
+
+	ret = get_iface_info(if_name, &internals->eth_addr,
+			     &internals->if_index);
+	if (ret)
+		goto err;
+
+	eth_dev = rte_eth_vdev_allocate(dev, 0);
+	if (!eth_dev)
+		goto err;
+
+	eth_dev->data->dev_private = internals;
+	eth_dev->data->dev_link = pmd_link;
+	eth_dev->data->mac_addrs = &internals->eth_addr;
+	eth_dev->dev_ops = &ops;
+	eth_dev->rx_pkt_burst = eth_af_xdp_rx;
+	eth_dev->tx_pkt_burst = eth_af_xdp_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+	return 0;
+
+err:
+	rte_free(internals);
+	return -1;
+}
+
+static int
+rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
+{
+	struct rte_kvargs *kvlist;
+	char *if_name = NULL;
+	int queue_idx = ETH_AF_XDP_DFLT_QUEUE_IDX;
+	struct rte_eth_dev *eth_dev;
+	const char *name;
+	int ret;
+
+	RTE_LOG(INFO, PMD, "Initializing pmd_af_xdp for %s\n",
+		rte_vdev_device_name(dev));
+
+	name = rte_vdev_device_name(dev);
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+	    strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (!eth_dev) {
+			RTE_LOG(ERR, PMD, "Failed to probe %s\n", name);
+			return -EINVAL;
+		}
+		eth_dev->dev_ops = &ops;
+		rte_eth_dev_probing_finish(eth_dev);
+		return 0;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
+	if (!kvlist) {
+		RTE_LOG(ERR, PMD,
+			"Invalid kvargs\n");
+		return -EINVAL;
+	}
+
+	if (dev->device.numa_node == SOCKET_ID_ANY)
+		dev->device.numa_node = rte_socket_id();
+
+	parse_parameters(kvlist, &if_name,
+			 &queue_idx);
+
+	ret = init_internals(dev, if_name, queue_idx);
+
+	rte_kvargs_free(kvlist);
+
+	return ret;
+}
+
+static int
+rte_pmd_af_xdp_remove(struct rte_vdev_device *dev)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+	struct pmd_internals *internals;
+
+	RTE_LOG(INFO, PMD, "Removing AF_XDP ethdev on numa socket %u\n",
+		rte_socket_id());
+
+	if (!dev)
+		return -1;
+
+	/* find the ethdev entry */
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+	if (!eth_dev)
+		return -1;
+
+	internals = eth_dev->data->dev_private;
+
+	rte_ring_free(internals->umem->buf_ring);
+	rte_free(internals->umem->buffer);
+	rte_free(internals->umem);
+	rte_free(internals);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_af_xdp_drv = {
+	.probe = rte_pmd_af_xdp_probe,
+	.remove = rte_pmd_af_xdp_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_af_xdp, pmd_af_xdp_drv);
+RTE_PMD_REGISTER_ALIAS(net_af_xdp, eth_af_xdp);
+RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
+			      "iface= "
+			      "queue= ");

diff --git a/drivers/net/af_xdp/rte_pmd_af_xdp_version.map b/drivers/net/af_xdp/rte_pmd_af_xdp_version.map
new file mode 100644
index 000000000..ef3539840
--- /dev/null
+++ b/drivers/net/af_xdp/rte_pmd_af_xdp_version.map
@@ -0,0 +1,4 @@
+DPDK_2.0 {
+
+	local: *;
+};

diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index d0ab942d5..db3271c7b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -143,6 +143,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += -lrte_mempool_dpaa2
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += -lrte_pmd_af_packet
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_XDP) += -lrte_pmd_af_xdp -lelf -lbpf
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += -lrte_pmd_ark
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ATLANTIC_PMD) += -lrte_pmd_atlantic
 _LDLIBS-$(CONFIG_RTE_LIBRTE_AVF_PMD) += -lrte_pmd_avf

From patchwork Fri Mar 1 08:09:43 2019
X-Patchwork-Submitter: Xiaolong Ye
X-Patchwork-Id: 50705
From: Xiaolong Ye
To: dev@dpdk.org
Cc: Qi Zhang , Xiaolong Ye
Date: Fri, 1 Mar 2019 16:09:43 +0800
Message-Id: <20190301080947.91086-3-xiaolong.ye@intel.com>
In-Reply-To: <20190301080947.91086-1-xiaolong.ye@intel.com>
References: <20190301080947.91086-1-xiaolong.ye@intel.com>
Subject: [dpdk-dev] [PATCH v1 2/6] lib/mbuf: enable parse flags when create mempool

This gives the application the option to configure each memory chunk's
size precisely (by MEMPOOL_F_NO_SPREAD).
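As a usage sketch (not part of the patch; the pool name and sizes below
are illustrative), an application can pass mempool flags through the new
helper like this:

    #include <rte_mbuf.h>
    #include <rte_mempool.h>

    /* Create an mbuf pool whose objects are laid out back to back
     * (MEMPOOL_F_NO_SPREAD), using the helper added by this patch. */
    static struct rte_mempool *
    create_no_spread_pool(void)
    {
            return rte_pktmbuf_pool_create_with_flags("no_spread_pool",
                            4096,   /* number of mbufs */
                            250,    /* per-lcore cache size */
                            0,      /* application private area size */
                            RTE_MBUF_DEFAULT_BUF_SIZE,
                            MEMPOOL_F_NO_SPREAD,
                            SOCKET_ID_ANY);
    }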
Signed-off-by: Qi Zhang
Signed-off-by: Xiaolong Ye
---
 lib/librte_mbuf/rte_mbuf.c | 15 ++++++++++++---
 lib/librte_mbuf/rte_mbuf.h |  8 +++++++-
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 21f6f7404..0f6fcff28 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -110,7 +110,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
-	int socket_id, const char *ops_name)
+	unsigned int flags, int socket_id, const char *ops_name)
 {
 	struct rte_mempool *mp;
 	struct rte_pktmbuf_pool_private mbp_priv;
@@ -130,7 +130,7 @@ rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
 	mbp_priv.mbuf_priv_size = priv_size;

 	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
-		 sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+		 sizeof(struct rte_pktmbuf_pool_private), socket_id, flags);
 	if (mp == NULL)
 		return NULL;

@@ -164,9 +164,18 @@ rte_pktmbuf_pool_create(const char *name, unsigned int n,
 	int socket_id)
 {
 	return rte_pktmbuf_pool_create_by_ops(name, n, cache_size, priv_size,
-			data_room_size, socket_id, NULL);
+			data_room_size, 0, socket_id, NULL);
 }

+/* helper to create a mbuf pool with NO_SPREAD */
+struct rte_mempool *
+rte_pktmbuf_pool_create_with_flags(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
+	unsigned int flags, int socket_id)
+{
+	return rte_pktmbuf_pool_create_by_ops(name, n, cache_size, priv_size,
+			data_room_size, flags, socket_id, NULL);
+}

 /* do some sanity checks on a mbuf: panic if it fails */
 void
 rte_mbuf_sanity_check(const struct rte_mbuf *m, int is_header)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index d961ccaf6..7a3faf11c 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1266,6 +1266,12 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
 	unsigned cache_size, uint16_t priv_size, uint16_t data_room_size,
 	int socket_id);

+struct rte_mempool *
+rte_pktmbuf_pool_create_with_flags(const char *name, unsigned int n,
+	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
+	unsigned int flags, int socket_id);
+
+
 /**
  * Create a mbuf pool with a given mempool ops name
  *
@@ -1306,7 +1312,7 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
 struct rte_mempool *
 rte_pktmbuf_pool_create_by_ops(const char *name, unsigned int n,
 	unsigned int cache_size, uint16_t priv_size, uint16_t data_room_size,
-	int socket_id, const char *ops_name);
+	unsigned int flags, int socket_id, const char *ops_name);

 /**
  * Get the data room size of mbufs stored in a pktmbuf_pool

From patchwork Fri Mar 1 08:09:44 2019
X-Patchwork-Submitter: Xiaolong Ye
X-Patchwork-Id: 50706
From: Xiaolong Ye
To: dev@dpdk.org
Cc: Qi Zhang , Xiaolong Ye
Date: Fri, 1 Mar 2019 16:09:44 +0800
Message-Id: <20190301080947.91086-4-xiaolong.ye@intel.com>
In-Reply-To: <20190301080947.91086-1-xiaolong.ye@intel.com>
References: <20190301080947.91086-1-xiaolong.ye@intel.com>
Subject: [dpdk-dev] [PATCH v1 3/6] lib/mempool: allow page size aligned mempool

Allow creating a mempool with a page-size-aligned base address.

Signed-off-by: Qi Zhang
Signed-off-by: Xiaolong Ye
---
 lib/librte_mempool/rte_mempool.c | 3 +++
 lib/librte_mempool/rte_mempool.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 683b216f9..33ab6a2b4 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -543,6 +543,9 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 		if (try_contig)
 			flags |= RTE_MEMZONE_IOVA_CONTIG;

+		if (mp->flags & MEMPOOL_F_PAGE_ALIGN)
+			align = getpagesize();
+
 		mz = rte_memzone_reserve_aligned(mz_name, mem_size,
 				mp->socket_id, flags, align);

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 7c9cd9a2f..75553b36f 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -264,6 +264,7 @@ struct rte_mempool {
 #define MEMPOOL_F_POOL_CREATED	0x0010 /**< Internal: pool is created. */
 #define MEMPOOL_F_NO_IOVA_CONTIG 0x0020 /**< Don't need IOVA contiguous objs. */
 #define MEMPOOL_F_NO_PHYS_CONTIG MEMPOOL_F_NO_IOVA_CONTIG /* deprecated */
+#define MEMPOOL_F_PAGE_ALIGN	0x0040 /**< Chunk's base address is page aligned */

 /**
  * @internal When debug is enabled, store some statistics.
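A minimal sketch of what the new flag guarantees (not part of the patch;
the helper below is illustrative and mirrors the get_base_addr-style
lookup used later in this series): with MEMPOOL_F_PAGE_ALIGN set, the
base address of the pool's first memory chunk starts on a page boundary.

    #include <stdint.h>
    #include <unistd.h>
    #include <rte_mempool.h>

    /* Illustrative check: the first chunk of a pool created with
     * MEMPOOL_F_PAGE_ALIGN should start on a page boundary. */
    static int
    chunk_is_page_aligned(const struct rte_mempool *mp)
    {
            struct rte_mempool_memhdr *memhdr = STAILQ_FIRST(&mp->mem_list);

            return ((uintptr_t)memhdr->addr % (uintptr_t)getpagesize()) == 0;
    }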
From patchwork Fri Mar 1 08:09:45 2019
X-Patchwork-Submitter: Xiaolong Ye
X-Patchwork-Id: 50707
From: Xiaolong Ye
To: dev@dpdk.org
Cc: Qi Zhang , Xiaolong Ye
Date: Fri, 1 Mar 2019 16:09:45 +0800
Message-Id: <20190301080947.91086-5-xiaolong.ye@intel.com>
In-Reply-To: <20190301080947.91086-1-xiaolong.ye@intel.com>
References: <20190301080947.91086-1-xiaolong.ye@intel.com>
Subject: [dpdk-dev] [PATCH v1 4/6] net/af_xdp: use mbuf mempool for buffer management

Now the af_xdp registered memory buffer is managed by rte_mempool. An
mbuf allocated from rte_mempool can be converted to an xdp_desc's
address and vice versa.
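The conversion relies on a fixed per-frame layout (numbers from the
patch: a 64-byte mempool object header plus a 128-byte struct rte_mbuf,
giving ETH_AF_XDP_MBUF_OVERHEAD of 192). A sketch of the round-trip
property the driver depends on, using the helpers added below (the
assertion wrapper is illustrative, not part of the patch):

    /* Every umem frame holds: [64 B mempool obj header][128 B rte_mbuf]
     * [headroom][packet data]. mbuf_to_addr() returns the data address
     * relative to the umem base; addr_to_mbuf() rounds back down to the
     * frame start and steps over the object header to recover the mbuf. */
    static void
    assert_round_trip(struct xsk_umem_info *umem, struct rte_mbuf *mbuf)
    {
            uint64_t addr = mbuf_to_addr(umem, mbuf);

            RTE_ASSERT(addr_to_mbuf(umem, addr) == mbuf);
    }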
Signed-off-by: Xiaolong Ye --- drivers/net/af_xdp/rte_eth_af_xdp.c | 121 +++++++++++++++++----------- 1 file changed, 75 insertions(+), 46 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 6de769650..e8270360c 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -41,7 +41,11 @@ #define ETH_AF_XDP_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE #define ETH_AF_XDP_NUM_BUFFERS 4096 -#define ETH_AF_XDP_DATA_HEADROOM 0 +/* mempool hdrobj size (64 bytes) + sizeof(struct rte_mbuf) (128 bytes) */ +#define ETH_AF_XDP_MBUF_OVERHEAD 192 +/* data start from offset 320 (192 + 128) bytes */ +#define ETH_AF_XDP_DATA_HEADROOM \ + (ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM) #define ETH_AF_XDP_DFLT_NUM_DESCS XSK_RING_CONS__DEFAULT_NUM_DESCS #define ETH_AF_XDP_DFLT_QUEUE_IDX 0 @@ -54,7 +58,7 @@ struct xsk_umem_info { struct xsk_ring_prod fq; struct xsk_ring_cons cq; struct xsk_umem *umem; - struct rte_ring *buf_ring; + struct rte_mempool *mb_pool; void *buffer; }; @@ -108,12 +112,32 @@ static struct rte_eth_link pmd_link = { .link_autoneg = ETH_LINK_AUTONEG }; +static inline struct rte_mbuf * +addr_to_mbuf(struct xsk_umem_info *umem, uint64_t addr) +{ + uint64_t offset = (addr / ETH_AF_XDP_FRAME_SIZE * + ETH_AF_XDP_FRAME_SIZE); + struct rte_mbuf *mbuf = (struct rte_mbuf *)((uint64_t)umem->buffer + + offset + ETH_AF_XDP_MBUF_OVERHEAD - + sizeof(struct rte_mbuf)); + mbuf->data_off = addr - offset - ETH_AF_XDP_MBUF_OVERHEAD; + return mbuf; +} + +static inline uint64_t +mbuf_to_addr(struct xsk_umem_info *umem, struct rte_mbuf *mbuf) +{ + return (uint64_t)mbuf->buf_addr + mbuf->data_off - + (uint64_t)umem->buffer; +} + static inline int reserve_fill_queue(struct xsk_umem_info *umem, int reserve_size) { struct xsk_ring_prod *fq = &umem->fq; + struct rte_mbuf *mbuf; uint32_t idx; - void *addr = NULL; + uint64_t addr; int i, ret = 0; ret = xsk_ring_prod__reserve(fq, reserve_size, &idx); @@ -123,8 +147,9 @@ reserve_fill_queue(struct xsk_umem_info *umem, int reserve_size) } for (i = 0; i < reserve_size; i++) { - rte_ring_dequeue(umem->buf_ring, &addr); - *xsk_ring_prod__fill_addr(fq, idx++) = (uint64_t)addr; + mbuf = rte_pktmbuf_alloc(umem->mb_pool); + addr = mbuf_to_addr(umem, mbuf); + *xsk_ring_prod__fill_addr(fq, idx++) = addr; } xsk_ring_prod__submit(fq, reserve_size); @@ -172,7 +197,7 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) } else { dropped++; } - rte_ring_enqueue(umem->buf_ring, (void *)addr); + rte_pktmbuf_free(addr_to_mbuf(umem, addr)); } xsk_ring_cons__release(rx, rcvd); @@ -195,9 +220,8 @@ static void pull_umem_cq(struct xsk_umem_info *umem, int size) n = xsk_ring_cons__peek(cq, size, &idx_cq); if (n > 0) { for (i = 0; i < n; i++) { - addr = *xsk_ring_cons__comp_addr(cq, - idx_cq++); - rte_ring_enqueue(umem->buf_ring, (void *)addr); + addr = *xsk_ring_cons__comp_addr(cq, idx_cq++); + rte_pktmbuf_free(addr_to_mbuf(umem, addr)); } xsk_ring_cons__release(cq, n); @@ -233,7 +257,7 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) struct pkt_tx_queue *txq = queue; struct xsk_umem_info *umem = txq->pair->umem; struct rte_mbuf *mbuf; - void *addrs[ETH_AF_XDP_TX_BATCH_SIZE]; + struct rte_mbuf *mbuf_to_tx; unsigned long tx_bytes = 0; int i, valid = 0; uint32_t idx_tx; @@ -243,11 +267,6 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) pull_umem_cq(umem, nb_pkts); - nb_pkts = rte_ring_dequeue_bulk(umem->buf_ring, addrs, - nb_pkts, NULL); - if 
(!nb_pkts) - return 0; - if (xsk_ring_prod__reserve(&txq->tx, nb_pkts, &idx_tx) != nb_pkts) return 0; @@ -260,7 +279,12 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i); mbuf = bufs[i]; if (mbuf->pkt_len <= buf_len) { - desc->addr = (uint64_t)addrs[valid]; + mbuf_to_tx = rte_pktmbuf_alloc(umem->mb_pool); + if (!mbuf_to_tx) { + rte_pktmbuf_free(mbuf); + continue; + } + desc->addr = mbuf_to_addr(umem, mbuf_to_tx); desc->len = mbuf->pkt_len; pkt = xsk_umem__get_data(umem->buffer, desc->addr); @@ -276,10 +300,6 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) kick_tx(txq); - if (valid < nb_pkts) - rte_ring_enqueue_bulk(umem->buf_ring, &addrs[valid], - nb_pkts - valid, NULL); - txq->err_pkts += nb_pkts - valid; txq->tx_pkts += valid; txq->tx_bytes += tx_bytes; @@ -429,12 +449,28 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused, static void xdp_umem_destroy(struct xsk_umem_info *umem) { - if (umem->buffer) - free(umem->buffer); + if (umem->mb_pool) + rte_mempool_free(umem->mb_pool); free(umem); } +static inline uint64_t get_base_addr(struct rte_mempool *mp) +{ + struct rte_mempool_memhdr *memhdr; + + memhdr = STAILQ_FIRST(&mp->mem_list); + return (uint64_t)(memhdr->addr); +} + +static inline uint64_t get_len(struct rte_mempool *mp) +{ + struct rte_mempool_memhdr *memhdr; + + memhdr = STAILQ_FIRST(&mp->mem_list); + return (uint64_t)(memhdr->len); +} + static struct xsk_umem_info *xdp_umem_configure(void) { struct xsk_umem_info *umem; @@ -443,10 +479,9 @@ static struct xsk_umem_info *xdp_umem_configure(void) .comp_size = ETH_AF_XDP_DFLT_NUM_DESCS, .frame_size = ETH_AF_XDP_FRAME_SIZE, .frame_headroom = ETH_AF_XDP_DATA_HEADROOM }; - void *bufs = NULL; - char ring_name[0x100]; + void *base_addr = NULL; + char pool_name[0x100]; int ret; - uint64_t i; umem = calloc(1, sizeof(*umem)); if (!umem) { @@ -454,28 +489,23 @@ static struct xsk_umem_info *xdp_umem_configure(void) return NULL; } - snprintf(ring_name, 0x100, "af_xdp_ring"); - umem->buf_ring = rte_ring_create(ring_name, - ETH_AF_XDP_NUM_BUFFERS, - SOCKET_ID_ANY, - 0x0); - if (!umem->buf_ring) { + snprintf(pool_name, 0x100, "af_xdp_ring"); + umem->mb_pool = rte_pktmbuf_pool_create_with_flags(pool_name, + ETH_AF_XDP_NUM_BUFFERS, + 250, 0, + ETH_AF_XDP_FRAME_SIZE - + ETH_AF_XDP_MBUF_OVERHEAD, + MEMPOOL_F_NO_SPREAD | MEMPOOL_F_PAGE_ALIGN, + SOCKET_ID_ANY); + + if (!umem->mb_pool || umem->mb_pool->nb_mem_chunks != 1) { RTE_LOG(ERR, PMD, - "Failed to create rte_ring\n"); + "Failed to create rte_mempool\n"); goto err; } + base_addr = (void *)get_base_addr(umem->mb_pool); - for (i = 0; i < ETH_AF_XDP_NUM_BUFFERS; i++) - rte_ring_enqueue(umem->buf_ring, - (void *)(i * ETH_AF_XDP_FRAME_SIZE + - ETH_AF_XDP_DATA_HEADROOM)); - - if (posix_memalign(&bufs, getpagesize(), - ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE)) { - RTE_LOG(ERR, PMD, "Failed to allocate memory pool.\n"); - goto err; - } - ret = xsk_umem__create(&umem->umem, bufs, + ret = xsk_umem__create(&umem->umem, base_addr, ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE, &umem->fq, &umem->cq, &usr_config); @@ -484,7 +514,7 @@ static struct xsk_umem_info *xdp_umem_configure(void) RTE_LOG(ERR, PMD, "Failed to create umem"); goto err; } - umem->buffer = bufs; + umem->buffer = base_addr; return umem; @@ -880,8 +910,7 @@ rte_pmd_af_xdp_remove(struct rte_vdev_device *dev) internals = eth_dev->data->dev_private; - rte_ring_free(internals->umem->buf_ring); - rte_free(internals->umem->buffer); + 
rte_mempool_free(internals->umem->mb_pool);
 	rte_free(internals->umem);
 	rte_free(internals);

From patchwork Fri Mar 1 08:09:46 2019
X-Patchwork-Submitter: Xiaolong Ye
X-Patchwork-Id: 50708
From: Xiaolong Ye
To: dev@dpdk.org
Cc: Qi Zhang , Xiaolong Ye
Date: Fri, 1 Mar 2019 16:09:46 +0800
Message-Id: <20190301080947.91086-6-xiaolong.ye@intel.com>
In-Reply-To: <20190301080947.91086-1-xiaolong.ye@intel.com>
References: <20190301080947.91086-1-xiaolong.ye@intel.com>
Subject: [dpdk-dev] [PATCH v1 5/6] net/af_xdp: enable zero copy

Try to check whether the external mempool (from rx_queue_setup) is fit
for af_xdp. If it is, it will be registered to the af_xdp socket
directly, and there will be no packet data copy on Rx and Tx.
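As a sketch of a pool that would pass the driver's zero-copy checks (not
part of the patch; the name and counts are illustrative, and the values
mirror the driver's internal pool, where 2048 is
XSK_UMEM__DEFAULT_FRAME_SIZE and 192 is the mbuf overhead defined
earlier in the series):

    #include <rte_mbuf.h>
    #include <rte_mempool.h>

    /* A pool the driver can map as umem directly: a single, page-aligned
     * memory chunk with frame-sized elements. */
    static struct rte_mempool *
    create_zc_friendly_pool(void)
    {
            return rte_pktmbuf_pool_create_with_flags("af_xdp_zc_pool",
                            4096, 250, 0,
                            2048 - 192, /* data room: frame minus overhead */
                            MEMPOOL_F_NO_SPREAD | MEMPOOL_F_PAGE_ALIGN,
                            SOCKET_ID_ANY);
    }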
Signed-off-by: Xiaolong Ye --- drivers/net/af_xdp/rte_eth_af_xdp.c | 126 ++++++++++++++++++++-------- 1 file changed, 91 insertions(+), 35 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index e8270360c..bfb93f50d 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -60,6 +60,7 @@ struct xsk_umem_info { struct xsk_umem *umem; struct rte_mempool *mb_pool; void *buffer; + uint8_t zc; }; struct pkt_rx_queue { @@ -74,6 +75,7 @@ struct pkt_rx_queue { struct pkt_tx_queue *pair; uint16_t queue_idx; + uint8_t zc; }; struct pkt_tx_queue { @@ -187,17 +189,24 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) uint32_t len = xsk_ring_cons__rx_desc(rx, idx_rx++)->len; char *pkt = xsk_umem__get_data(rxq->umem->buffer, addr); - mbuf = rte_pktmbuf_alloc(rxq->mb_pool); - if (mbuf) { - memcpy(rte_pktmbuf_mtod(mbuf, void*), pkt, len); + if (rxq->zc) { + mbuf = addr_to_mbuf(rxq->umem, addr); rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf) = len; - rx_bytes += len; bufs[count++] = mbuf; } else { - dropped++; + mbuf = rte_pktmbuf_alloc(rxq->mb_pool); + if (mbuf) { + memcpy(rte_pktmbuf_mtod(mbuf, void*), pkt, len); + rte_pktmbuf_pkt_len(mbuf) = + rte_pktmbuf_data_len(mbuf) = len; + rx_bytes += len; + bufs[count++] = mbuf; + } else { + dropped++; + } + rte_pktmbuf_free(addr_to_mbuf(umem, addr)); } - rte_pktmbuf_free(addr_to_mbuf(umem, addr)); } xsk_ring_cons__release(rx, rcvd); @@ -278,22 +287,29 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) - ETH_AF_XDP_DATA_HEADROOM; desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i); mbuf = bufs[i]; - if (mbuf->pkt_len <= buf_len) { - mbuf_to_tx = rte_pktmbuf_alloc(umem->mb_pool); - if (!mbuf_to_tx) { - rte_pktmbuf_free(mbuf); - continue; - } - desc->addr = mbuf_to_addr(umem, mbuf_to_tx); + if (txq->pair->zc && mbuf->pool == umem->mb_pool) { + desc->addr = mbuf_to_addr(umem, mbuf); desc->len = mbuf->pkt_len; - pkt = xsk_umem__get_data(umem->buffer, - desc->addr); - memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), - desc->len); valid++; tx_bytes += mbuf->pkt_len; + } else { + if (mbuf->pkt_len <= buf_len) { + mbuf_to_tx = rte_pktmbuf_alloc(umem->mb_pool); + if (!mbuf_to_tx) { + rte_pktmbuf_free(mbuf); + continue; + } + desc->addr = mbuf_to_addr(umem, mbuf_to_tx); + desc->len = mbuf->pkt_len; + pkt = xsk_umem__get_data(umem->buffer, + desc->addr); + memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), + desc->len); + valid++; + tx_bytes += mbuf->pkt_len; + } + rte_pktmbuf_free(mbuf); } - rte_pktmbuf_free(mbuf); } xsk_ring_prod__submit(&txq->tx, nb_pkts); @@ -471,7 +487,7 @@ static inline uint64_t get_len(struct rte_mempool *mp) return (uint64_t)(memhdr->len); } -static struct xsk_umem_info *xdp_umem_configure(void) +static struct xsk_umem_info *xdp_umem_configure(struct rte_mempool *mb_pool) { struct xsk_umem_info *umem; struct xsk_umem_config usr_config = { @@ -489,20 +505,26 @@ static struct xsk_umem_info *xdp_umem_configure(void) return NULL; } - snprintf(pool_name, 0x100, "af_xdp_ring"); - umem->mb_pool = rte_pktmbuf_pool_create_with_flags(pool_name, - ETH_AF_XDP_NUM_BUFFERS, - 250, 0, - ETH_AF_XDP_FRAME_SIZE - - ETH_AF_XDP_MBUF_OVERHEAD, - MEMPOOL_F_NO_SPREAD | MEMPOOL_F_PAGE_ALIGN, - SOCKET_ID_ANY); - - if (!umem->mb_pool || umem->mb_pool->nb_mem_chunks != 1) { - RTE_LOG(ERR, PMD, - "Failed to create rte_mempool\n"); - goto err; + if (!mb_pool) { + snprintf(pool_name, 0x100, "af_xdp_ring"); + umem->mb_pool = 
rte_pktmbuf_pool_create_with_flags(pool_name,
+				ETH_AF_XDP_NUM_BUFFERS,
+				250, 0,
+				ETH_AF_XDP_FRAME_SIZE -
+				ETH_AF_XDP_MBUF_OVERHEAD,
+				MEMPOOL_F_NO_SPREAD | MEMPOOL_F_PAGE_ALIGN,
+				SOCKET_ID_ANY);
+
+		if (!umem->mb_pool || umem->mb_pool->nb_mem_chunks != 1) {
+			RTE_LOG(ERR, PMD,
+				"Failed to create rte_mempool\n");
+			goto err;
+		}
+	} else {
+		umem->mb_pool = mb_pool;
+		umem->zc = 1;
 	}
+
 	base_addr = (void *)get_base_addr(umem->mb_pool);

 	ret = xsk_umem__create(&umem->umem, base_addr,
@@ -523,16 +545,43 @@
 	return NULL;
 }

+static uint8_t
+check_mempool_zc(struct rte_mempool *mp)
+{
+	RTE_ASSERT(mp);
+
+	/* must be contiguous (a single memory chunk) */
+	if (mp->nb_mem_chunks > 1)
+		return 0;
+
+	/* check header size */
+	if (mp->header_size != RTE_CACHE_LINE_SIZE)
+		return 0;
+
+	/* check base address */
+	if ((uint64_t)get_base_addr(mp) % getpagesize() != 0)
+		return 0;
+
+	/* check chunk size */
+	if ((mp->elt_size + mp->header_size + mp->trailer_size) %
+	    ETH_AF_XDP_FRAME_SIZE != 0)
+		return 0;
+
+	return 1;
+}
+
 static int
 xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
-	      int ring_size)
+	      int ring_size, struct rte_mempool *mb_pool)
 {
 	struct xsk_socket_config cfg;
 	struct pkt_tx_queue *txq = rxq->pair;
+	struct rte_mempool *mp;
 	int ret = 0;
 	int reserve_size;

-	rxq->umem = xdp_umem_configure();
+	mp = check_mempool_zc(mb_pool) ? mb_pool : NULL;
+	rxq->umem = xdp_umem_configure(mp);
 	if (!rxq->umem) {
 		ret = -ENOMEM;
 		goto err;
@@ -633,7 +682,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,

 	rxq->mb_pool = mb_pool;

-	if (xsk_configure(internals, rxq, nb_rx_desc)) {
+	if (xsk_configure(internals, rxq, nb_rx_desc, mb_pool)) {
 		RTE_LOG(ERR, PMD,
 			"Failed to configure xdp socket\n");
 		ret = -EINVAL;
@@ -642,6 +691,13 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,

 	internals->umem = rxq->umem;

+	if (mb_pool == internals->umem->mb_pool)
+		rxq->zc = internals->umem->zc;
+
+	if (rxq->zc)
+		RTE_LOG(INFO, PMD,
+			"zero copy enabled on rx queue %d\n", rx_queue_id);
+
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	return 0;

From patchwork Fri Mar 1 08:09:47 2019
X-Patchwork-Submitter: Xiaolong Ye
X-Patchwork-Id: 50709
From: Xiaolong Ye
To: dev@dpdk.org
Cc: Qi Zhang , Xiaolong Ye
Date: Fri, 1 Mar 2019 16:09:47 +0800
Message-Id: <20190301080947.91086-7-xiaolong.ye@intel.com>
In-Reply-To: <20190301080947.91086-1-xiaolong.ye@intel.com>
References: <20190301080947.91086-1-xiaolong.ye@intel.com>
Subject: [dpdk-dev] [PATCH v1 6/6] app/testpmd: add mempool flags parameter

When creating an rte_mempool, flags can now be parsed from the command
line. This makes it possible for testpmd to create an af_xdp friendly
mempool (which enables zero copy).

Signed-off-by: Qi Zhang
Signed-off-by: Xiaolong Ye
---
 app/test-pmd/parameters.c | 12 ++++++++++++
 app/test-pmd/testpmd.c    | 17 ++++++++++-------
 app/test-pmd/testpmd.h    |  1 +
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 38b419767..9d5be0007 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -61,6 +61,7 @@ usage(char* progname)
 	       "--tx-first | --stats-period=PERIOD | "
 	       "--coremask=COREMASK --portmask=PORTMASK --numa "
 	       "--mbuf-size= | --total-num-mbufs= | "
+	       "--mp-flags= | "
 	       "--nb-cores= | --nb-ports= | "
 #ifdef RTE_LIBRTE_CMDLINE
 	       "--eth-peers-configfile= | "
@@ -105,6 +106,7 @@ usage(char* progname)
 	printf("  --socket-num=N: set socket from which all memory is allocated "
 	       "in NUMA mode.\n");
 	printf("  --mbuf-size=N: set the data size of mbuf to N bytes.\n");
+	printf("  --mp-flags=N: set the flags used when creating the mbuf memory pool.\n");
 	printf("  --total-num-mbufs=N: set the number of mbufs to be allocated "
 	       "in mbuf pools.\n");
 	printf("  --max-pkt-len=N: set the maximum size of packet to N bytes.\n");
@@ -585,6 +587,7 @@ launch_args_parse(int argc, char** argv)
 		{ "ring-numa-config",		1, 0, 0 },
 		{ "socket-num",			1, 0, 0 },
 		{ "mbuf-size",			1, 0, 0 },
+		{ "mp-flags",			1, 0, 0 },
 		{ "total-num-mbufs",		1, 0, 0 },
 		{ "max-pkt-len",		1, 0, 0 },
 		{ "pkt-filter-mode",		1, 0, 0 },
@@ -811,6 +814,15 @@ launch_args_parse(int argc, char** argv)
 					rte_exit(EXIT_FAILURE,
 						 "mbuf-size should be > 0 and < 65536\n");
 			}
+			if (!strcmp(lgopts[opt_idx].name, "mp-flags")) {
+				n = atoi(optarg);
+				if (n > 0 && n <= 0xFFFF)
+					mp_flags = (uint16_t)n;
+				else
+					rte_exit(EXIT_FAILURE,
+						 "mp-flags should be > 0 and < 65536\n");
+			}
+
 			if (!strcmp(lgopts[opt_idx].name, "total-num-mbufs")) {
 				n = atoi(optarg);
 				if (n > 1024)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 98c1baa8b..e0519be3c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -195,6 +195,7 @@
 uint32_t burst_tx_delay_time = BURST_TX_WAIT_US;
 uint32_t burst_tx_retry_num = BURST_TX_RETRIES;
 uint16_t mbuf_data_size = DEFAULT_MBUF_DATA_SIZE; /**< Mbuf data space size. */
+uint16_t mp_flags = 0; /**< flags used when creating the mempool */
 uint32_t param_total_num_mbufs = 0;  /**< number of mbufs in all pools - if
                                       * specified on command-line.
*/ uint16_t stats_period; /**< Period to show statistics (disabled by default) */ @@ -834,6 +835,7 @@ setup_extmem(uint32_t nb_mbufs, uint32_t mbuf_sz, bool huge) */ static void mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf, + unsigned int flags, unsigned int socket_id) { char pool_name[RTE_MEMPOOL_NAMESIZE]; @@ -853,8 +855,8 @@ mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf, /* wrapper to rte_mempool_create() */ TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n", rte_mbuf_best_mempool_ops()); - rte_mp = rte_pktmbuf_pool_create(pool_name, nb_mbuf, - mb_mempool_cache, 0, mbuf_seg_size, socket_id); + rte_mp = rte_pktmbuf_pool_create_with_flags(pool_name, nb_mbuf, + mb_mempool_cache, 0, mbuf_seg_size, flags, socket_id); break; } case MP_ALLOC_ANON: @@ -891,8 +893,8 @@ mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf, TESTPMD_LOG(INFO, "preferred mempool ops selected: %s\n", rte_mbuf_best_mempool_ops()); - rte_mp = rte_pktmbuf_pool_create(pool_name, nb_mbuf, - mb_mempool_cache, 0, mbuf_seg_size, + rte_mp = rte_pktmbuf_pool_create_with_flags(pool_name, nb_mbuf, + mb_mempool_cache, 0, mbuf_seg_size, flags, heap_socket); break; } @@ -1128,13 +1130,14 @@ init_config(void) for (i = 0; i < num_sockets; i++) mbuf_pool_create(mbuf_data_size, nb_mbuf_per_pool, - socket_ids[i]); + mp_flags, socket_ids[i]); } else { if (socket_num == UMA_NO_CONFIG) - mbuf_pool_create(mbuf_data_size, nb_mbuf_per_pool, 0); + mbuf_pool_create(mbuf_data_size, nb_mbuf_per_pool, + mp_flags, 0); else mbuf_pool_create(mbuf_data_size, nb_mbuf_per_pool, - socket_num); + mp_flags, socket_num); } init_port_config(); diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index fa4887853..3ddb70e3e 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -408,6 +408,7 @@ extern uint8_t dcb_config; extern uint8_t dcb_test; extern uint16_t mbuf_data_size; /**< Mbuf data space size. */ +extern uint16_t mp_flags; /**< flags for mempool creation. */ extern uint32_t param_total_num_mbufs; extern uint16_t stats_period;
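As an end-to-end usage sketch (not part of the series; the core list and
interface name are illustrative), testpmd can now be asked for a
zero-copy friendly pool, where 65 = MEMPOOL_F_NO_SPREAD (0x0001) |
MEMPOOL_F_PAGE_ALIGN (0x0040):

    ./testpmd -l 0-1 --vdev eth_af_xdp,iface=ens786f1,queue=0 -- --mp-flags=65 ...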