From patchwork Fri Apr 21 09:51:37 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Van Haaren, Harry"
X-Patchwork-Id: 23799
X-Patchwork-Delegate: jerinj@marvell.com
Return-Path: 
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id C437547CE; Fri, 21 Apr 2017 11:52:28 +0200 (CEST)
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 0D9B8A2F for ; Fri, 21 Apr 2017 11:52:26 +0200 (CEST)
Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Apr 2017 02:52:19 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.37,229,1488873600"; d="scan'208";a="90572709"
Received: from silpixa00398672.ir.intel.com ([10.237.223.128]) by orsmga005.jf.intel.com with ESMTP; 21 Apr 2017 02:52:18 -0700
From: Harry van Haaren 
To: dev@dpdk.org
Cc: jerin.jacob@caviumnetworks.com, Harry van Haaren , Gage Eads , Bruce Richardson 
Date: Fri, 21 Apr 2017 10:51:37 +0100
Message-Id: <1492768299-84016-2-git-send-email-harry.van.haaren@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1492768299-84016-1-git-send-email-harry.van.haaren@intel.com>
References: <1492768299-84016-1-git-send-email-harry.van.haaren@intel.com>
Subject: [dpdk-dev] [PATCH 1/3] examples/eventdev_pipeline: added sample app
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This commit adds a sample app for the eventdev library. The app has been
tested with DPDK 17.05-rc2, hence this release (or later) is recommended.

The sample app showcases a pipeline processing use-case, with event scheduling
and processing defined per stage. The application receives traffic as normal,
with each packet traversing the pipeline. Once the packet has been processed
by each of the pipeline stages, it is transmitted again.

The app provides a framework to utilize cores for a single role or multiple
roles. Examples of roles are the RX core, TX core, Scheduling core (in the
case of the event/sw PMD), and worker cores.

Various flags are available to configure numbers of stages, cycles of work at
each stage, type of scheduling, number of worker cores, queue depths etc. For
a full explanation, please refer to the documentation.

Signed-off-by: Gage Eads 
Signed-off-by: Bruce Richardson 
Signed-off-by: Harry van Haaren 
---
 examples/eventdev_pipeline/Makefile | 49 ++
 examples/eventdev_pipeline/main.c | 975 ++++++++++++++++++++++++++++++++++++
 2 files changed, 1024 insertions(+)
 create mode 100644 examples/eventdev_pipeline/Makefile
 create mode 100644 examples/eventdev_pipeline/main.c

diff --git a/examples/eventdev_pipeline/Makefile b/examples/eventdev_pipeline/Makefile
new file mode 100644
index 0000000..bab8916
--- /dev/null
+++ b/examples/eventdev_pipeline/Makefile
@@ -0,0 +1,49 @@
+# BSD LICENSE
+#
+# Copyright(c) 2016 Intel Corporation. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = eventdev_pipeline
+
+# all sources are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/eventdev_pipeline/main.c b/examples/eventdev_pipeline/main.c
new file mode 100644
index 0000000..618e078
--- /dev/null
+++ b/examples/eventdev_pipeline/main.c
@@ -0,0 +1,975 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MAX_NUM_STAGES 8 +#define BATCH_SIZE 16 +#define MAX_NUM_CORE 64 + +static unsigned int active_cores; +static unsigned int num_workers; +static unsigned long num_packets = (1L << 25); /* do ~32M packets */ +static unsigned int num_fids = 512; +static unsigned int num_priorities = 1; +static unsigned int num_stages = 1; +static unsigned int worker_cq_depth = 16; +static int queue_type = RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY; +static int16_t next_qid[MAX_NUM_STAGES+1] = {-1}; +static int16_t qid[MAX_NUM_STAGES] = {-1}; +static int worker_cycles; +static int enable_queue_priorities; + +struct prod_data { + uint8_t dev_id; + uint8_t port_id; + int32_t qid; + unsigned num_nic_ports; +}; + +struct cons_data { + uint8_t dev_id; + uint8_t port_id; +}; + +static struct prod_data prod_data; +static struct cons_data cons_data; + +struct worker_data { + uint8_t dev_id; + uint8_t port_id; +}; + +static unsigned *enqueue_cnt; +static unsigned *dequeue_cnt; + +static volatile int done; +static volatile int prod_stop; +static int quiet; +static int dump_dev; +static int dump_dev_signal; + +static uint32_t rx_lock; +static uint32_t tx_lock; +static uint32_t sched_lock; +static bool rx_single; +static bool tx_single; +static bool sched_single; + +static unsigned rx_core[MAX_NUM_CORE]; +static unsigned tx_core[MAX_NUM_CORE]; +static unsigned sched_core[MAX_NUM_CORE]; +static unsigned worker_core[MAX_NUM_CORE]; + +static bool +core_in_use(unsigned lcore_id) { + return (rx_core[lcore_id] || sched_core[lcore_id] || + tx_core[lcore_id] || worker_core[lcore_id]); +} + +static struct rte_eth_dev_tx_buffer *tx_buf[RTE_MAX_ETHPORTS]; + +static void +rte_eth_tx_buffer_retry(struct rte_mbuf **pkts, uint16_t unsent, + void *userdata) +{ + int port_id = (uintptr_t) userdata; + unsigned _sent = 0; + + do { + /* Note: hard-coded TX queue */ + _sent += rte_eth_tx_burst(port_id, 0, &pkts[_sent], + unsent - _sent); + } while (_sent != unsent); +} + +static int +consumer(void) +{ + const uint64_t freq_khz = rte_get_timer_hz() / 1000; + struct rte_event packets[BATCH_SIZE]; + + static uint64_t npackets; + static uint64_t received; + static uint64_t received_printed; + static uint64_t time_printed; + static uint64_t start_time; + unsigned i, j; + uint8_t dev_id = cons_data.dev_id; + uint8_t port_id = cons_data.port_id; + + if (!npackets) + npackets = num_packets; + + do { + uint16_t n = rte_event_dequeue_burst(dev_id, port_id, + packets, RTE_DIM(packets), 0); + + if (n == 0) { + for (j = 0; j < rte_eth_dev_count(); j++) + rte_eth_tx_buffer_flush(j, 0, tx_buf[j]); + return 0; + } + if (start_time == 0) + time_printed = start_time = rte_get_timer_cycles(); + + received += n; + for (i = 0; i < n; i++) { + uint8_t outport = packets[i].mbuf->port; + rte_eth_tx_buffer(outport, 0, tx_buf[outport], + packets[i].mbuf); + } + + if (!quiet && received >= received_printed + (1<<22)) { + const uint64_t now = rte_get_timer_cycles(); + const uint64_t delta_cycles = now - start_time; + const uint64_t elapsed_ms = delta_cycles / freq_khz; + const uint64_t interval_ms = + (now - time_printed) / freq_khz; + + uint64_t rx_noprint = received - received_printed; + printf("# consumer RX=%"PRIu64", time %"PRIu64 + "ms, avg %.3f mpps [current %.3f mpps]\n", + received, elapsed_ms, + (received) / (elapsed_ms * 1000.0), + rx_noprint / (interval_ms * 1000.0)); + received_printed = received; + 
time_printed = now; + } + + dequeue_cnt[0] += n; + + if (num_packets > 0 && npackets > 0) { + npackets -= n; + if (npackets == 0 || npackets > num_packets) + done = 1; + } + } while (0); + + return 0; +} + +static int +producer(void) +{ + static uint8_t eth_port; + struct rte_mbuf *mbufs[BATCH_SIZE]; + struct rte_event ev[BATCH_SIZE]; + uint32_t i, num_ports = prod_data.num_nic_ports; + int32_t qid = prod_data.qid; + uint8_t dev_id = prod_data.dev_id; + uint8_t port_id = prod_data.port_id; + uint32_t prio_idx = 0; + + const uint16_t nb_rx = rte_eth_rx_burst(eth_port, 0, mbufs, BATCH_SIZE); + if (++eth_port == num_ports) + eth_port = 0; + if (nb_rx == 0) { + rte_pause(); + return 0; + } + + for (i = 0; i < nb_rx; i++) { + ev[i].flow_id = mbufs[i]->hash.rss; + ev[i].op = RTE_EVENT_OP_NEW; + ev[i].sched_type = queue_type; + ev[i].queue_id = qid; + ev[i].event_type = RTE_EVENT_TYPE_CPU; + ev[i].sub_event_type = 0; + ev[i].priority = RTE_EVENT_DEV_PRIORITY_NORMAL; + ev[i].mbuf = mbufs[i]; + RTE_SET_USED(prio_idx); + } + + const int nb_tx = rte_event_enqueue_burst(dev_id, port_id, ev, nb_rx); + if (nb_tx != nb_rx) { + for (i = nb_tx; i < nb_rx; i++) + rte_pktmbuf_free(mbufs[i]); + } + enqueue_cnt[0] += nb_tx; + + if (unlikely(prod_stop)) + done = 1; + + return 0; +} + +static inline void +schedule_devices(uint8_t dev_id, unsigned lcore_id) +{ + if (rx_core[lcore_id] && (rx_single || + rte_atomic32_cmpset(&rx_lock, 0, 1))) { + producer(); + rte_atomic32_clear((rte_atomic32_t *)&rx_lock); + } + + if (sched_core[lcore_id] && (sched_single || + rte_atomic32_cmpset(&sched_lock, 0, 1))) { + rte_event_schedule(dev_id); + if (dump_dev_signal) { + rte_event_dev_dump(0, stdout); + dump_dev_signal = 0; + } + rte_atomic32_clear((rte_atomic32_t *)&sched_lock); + } + + if (tx_core[lcore_id] && (tx_single || + rte_atomic32_cmpset(&tx_lock, 0, 1))) { + consumer(); + rte_atomic32_clear((rte_atomic32_t *)&tx_lock); + } +} + +static int +worker(void *arg) +{ + struct rte_event events[BATCH_SIZE]; + + struct worker_data *data = (struct worker_data *)arg; + uint8_t dev_id = data->dev_id; + uint8_t port_id = data->port_id; + size_t sent = 0, received = 0; + unsigned lcore_id = rte_lcore_id(); + + while (!done) { + uint16_t i; + + schedule_devices(dev_id, lcore_id); + + if (!worker_core[lcore_id]) { + rte_pause(); + continue; + } + + uint16_t nb_rx = rte_event_dequeue_burst(dev_id, port_id, + events, RTE_DIM(events), 0); + + if (nb_rx == 0) { + rte_pause(); + continue; + } + received += nb_rx; + + for (i = 0; i < nb_rx; i++) { + struct ether_hdr *eth; + struct ether_addr addr; + struct rte_mbuf *m = events[i].mbuf; + + /* The first worker stage does classification */ + if (events[i].queue_id == qid[0]) + events[i].flow_id = m->hash.rss % num_fids; + + events[i].queue_id = next_qid[events[i].queue_id]; + events[i].op = RTE_EVENT_OP_FORWARD; + + /* change mac addresses on packet (to use mbuf data) */ + eth = rte_pktmbuf_mtod(m, struct ether_hdr *); + ether_addr_copy(ð->d_addr, &addr); + ether_addr_copy(ð->s_addr, ð->d_addr); + ether_addr_copy(&addr, ð->s_addr); + + /* do a number of cycles of work per packet */ + volatile uint64_t start_tsc = rte_rdtsc(); + while (rte_rdtsc() < start_tsc + worker_cycles) + rte_pause(); + } + uint16_t nb_tx = rte_event_enqueue_burst(dev_id, port_id, + events, nb_rx); + while (nb_tx < nb_rx && !done) + nb_tx += rte_event_enqueue_burst(dev_id, port_id, + events + nb_tx, + nb_rx - nb_tx); + sent += nb_tx; + } + + if (!quiet) + printf(" worker %u thread done. 
RX=%zu TX=%zu\n",
+ rte_lcore_id(), received, sent);
+
+ return 0;
+}
+
+/*
+ * Parse the coremask given as argument (hexadecimal string) and fill
+ * the global configuration (core role and core count) with the parsed
+ * value.
+ */
+static int xdigit2val(unsigned char c)
+{
+ int val;
+
+ if (isdigit(c))
+ val = c - '0';
+ else if (isupper(c))
+ val = c - 'A' + 10;
+ else
+ val = c - 'a' + 10;
+ return val;
+}
+
+static uint64_t
+parse_coremask(const char *coremask)
+{
+ int i, j, idx = 0;
+ unsigned count = 0;
+ char c;
+ int val;
+ uint64_t mask = 0;
+ const int32_t BITS_HEX = 4;
+
+ if (coremask == NULL)
+ return -1;
+ /* Remove all blank characters ahead and after.
+ * Remove 0x/0X if exists.
+ */
+ while (isblank(*coremask))
+ coremask++;
+ if (coremask[0] == '0' && ((coremask[1] == 'x')
+ || (coremask[1] == 'X')))
+ coremask += 2;
+ i = strlen(coremask);
+ while ((i > 0) && isblank(coremask[i - 1]))
+ i--;
+ if (i == 0)
+ return -1;
+
+ for (i = i - 1; i >= 0 && idx < MAX_NUM_CORE; i--) {
+ c = coremask[i];
+ if (isxdigit(c) == 0) {
+ /* invalid characters */
+ return -1;
+ }
+ val = xdigit2val(c);
+ for (j = 0; j < BITS_HEX && idx < MAX_NUM_CORE; j++, idx++) {
+ if ((1 << j) & val) {
+ mask |= (1UL << idx);
+ count++;
+ }
+ }
+ }
+ for (; i >= 0; i--)
+ if (coremask[i] != '0')
+ return -1;
+ if (count == 0)
+ return -1;
+ return mask;
+}
+
+static struct option long_options[] = {
+ {"workers", required_argument, 0, 'w'},
+ {"packets", required_argument, 0, 'n'},
+ {"atomic-flows", required_argument, 0, 'f'},
+ {"num_stages", required_argument, 0, 's'},
+ {"rx-mask", required_argument, 0, 'r'},
+ {"tx-mask", required_argument, 0, 't'},
+ {"sched-mask", required_argument, 0, 'e'},
+ {"cq-depth", required_argument, 0, 'c'},
+ {"work-cycles", required_argument, 0, 'W'},
+ {"queue-priority", no_argument, 0, 'P'},
+ {"parallel", no_argument, 0, 'p'},
+ {"ordered", no_argument, 0, 'o'},
+ {"quiet", no_argument, 0, 'q'},
+ {"dump", no_argument, 0, 'D'},
+ {0, 0, 0, 0}
+};
+
+static void
+usage(void)
+{
+ const char *usage_str =
+ " Usage: eventdev_pipeline [options]\n"
+ " Options:\n"
+ " -n, --packets=N Send N packets (default ~32M), 0 implies no limit\n"
+ " -f, --atomic-flows=N Use N random flows from 1 to N (default 512)\n"
+ " -s, --num_stages=N Use N atomic stages (default 1)\n"
+ " -r, --rx-mask=core mask Run NIC rx on CPUs in core mask\n"
+ " -w, --workers=core mask Run worker on CPUs in core mask\n"
+ " -t, --tx-mask=core mask Run NIC tx on CPUs in core mask\n"
+ " -e --sched-mask=core mask Run scheduler on CPUs in core mask\n"
+ " -c --cq-depth=N Worker CQ depth (default 16)\n"
+ " -W --work-cycles=N Worker cycles (default 0)\n"
+ " -P --queue-priority Enable scheduler queue prioritization\n"
+ " -o, --ordered Use ordered scheduling\n"
+ " -p, --parallel Use parallel scheduling\n"
+ " -q, --quiet Minimize printed output\n"
+ " -D, --dump Print detailed statistics before exit"
+ "\n";
+ fprintf(stderr, "%s", usage_str);
+ exit(1);
+}
+
+static void
+parse_app_args(int argc, char **argv)
+{
+ /* Parse cli options */
+ int option_index;
+ int c;
+ opterr = 0;
+ uint64_t rx_lcore_mask = 0;
+ uint64_t tx_lcore_mask = 0;
+ uint64_t sched_lcore_mask = 0;
+ uint64_t worker_lcore_mask = 0;
+ int i;
+
+ for (;;) {
+ c = getopt_long(argc, argv, "r:t:e:c:w:n:f:s:poPqDW:",
+ long_options, &option_index);
+ if (c == -1)
+ break;
+
+ int popcnt = 0;
+ switch (c) {
+ case 'n':
+ num_packets = (unsigned long)atol(optarg);
+ break;
+ case 'f':
+ num_fids = (unsigned int)atoi(optarg);
+ break;
+ 
case 's': + num_stages = (unsigned int)atoi(optarg); + break; + case 'c': + worker_cq_depth = (unsigned int)atoi(optarg); + break; + case 'W': + worker_cycles = (unsigned int)atoi(optarg); + break; + case 'P': + enable_queue_priorities = 1; + break; + case 'o': + queue_type = RTE_EVENT_QUEUE_CFG_ORDERED_ONLY; + break; + case 'p': + queue_type = RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY; + break; + case 'q': + quiet = 1; + break; + case 'D': + dump_dev = 1; + break; + case 'w': + worker_lcore_mask = parse_coremask(optarg); + break; + case 'r': + rx_lcore_mask = parse_coremask(optarg); + popcnt = __builtin_popcountll(rx_lcore_mask); + rx_single = (popcnt == 1); + break; + case 't': + tx_lcore_mask = parse_coremask(optarg); + popcnt = __builtin_popcountll(tx_lcore_mask); + tx_single = (popcnt == 1); + break; + case 'e': + sched_lcore_mask = parse_coremask(optarg); + popcnt = __builtin_popcountll(sched_lcore_mask); + sched_single = (popcnt == 1); + break; + default: + usage(); + } + } + + if (worker_lcore_mask == 0 || rx_lcore_mask == 0 || + sched_lcore_mask == 0 || tx_lcore_mask == 0) { + printf("Core part of pipeline was not assigned any cores. " + "This will stall the pipeline, please check core masks " + "(use -h for details on setting core masks):\n" + "\trx: %"PRIu64"\n\ttx: %"PRIu64"\n\tsched: %"PRIu64 + "\n\tworkers: %"PRIu64"\n", + rx_lcore_mask, tx_lcore_mask, sched_lcore_mask, + worker_lcore_mask); + rte_exit(-1, "Fix core masks\n"); + } + if (num_stages == 0 || num_stages > MAX_NUM_STAGES) + usage(); + + for (i = 0; i < MAX_NUM_CORE; i++) { + rx_core[i] = !!(rx_lcore_mask & (1UL << i)); + tx_core[i] = !!(tx_lcore_mask & (1UL << i)); + sched_core[i] = !!(sched_lcore_mask & (1UL << i)); + worker_core[i] = !!(worker_lcore_mask & (1UL << i)); + + if (worker_core[i]) + num_workers++; + if (core_in_use(i)) + active_cores++; + } +} + +/* + * Initializes a given port using global settings and with the RX buffers + * coming from the mbuf_pool passed as a parameter. + */ +static inline int +port_init(uint8_t port, struct rte_mempool *mbuf_pool) +{ + static const struct rte_eth_conf port_conf_default = { + .rxmode = { + .mq_mode = ETH_MQ_RX_RSS, + .max_rx_pkt_len = ETHER_MAX_LEN + }, + .rx_adv_conf = { + .rss_conf = { + .rss_hf = ETH_RSS_IP | + ETH_RSS_TCP | + ETH_RSS_UDP, + } + } + }; + const uint16_t rx_rings = 1, tx_rings = 1; + const uint16_t rx_ring_size = 512, tx_ring_size = 512; + struct rte_eth_conf port_conf = port_conf_default; + int retval; + uint16_t q; + + if (port >= rte_eth_dev_count()) + return -1; + + /* Configure the Ethernet device. */ + retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf); + if (retval != 0) + return retval; + + /* Allocate and set up 1 RX queue per Ethernet port. */ + for (q = 0; q < rx_rings; q++) { + retval = rte_eth_rx_queue_setup(port, q, rx_ring_size, + rte_eth_dev_socket_id(port), NULL, mbuf_pool); + if (retval < 0) + return retval; + } + + /* Allocate and set up 1 TX queue per Ethernet port. */ + for (q = 0; q < tx_rings; q++) { + retval = rte_eth_tx_queue_setup(port, q, tx_ring_size, + rte_eth_dev_socket_id(port), NULL); + if (retval < 0) + return retval; + } + + /* Start the Ethernet port. */ + retval = rte_eth_dev_start(port); + if (retval < 0) + return retval; + + /* Display the port MAC address. 
+ */
+ struct ether_addr addr;
+ rte_eth_macaddr_get(port, &addr);
+ printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
+ " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
+ (unsigned)port,
+ addr.addr_bytes[0], addr.addr_bytes[1],
+ addr.addr_bytes[2], addr.addr_bytes[3],
+ addr.addr_bytes[4], addr.addr_bytes[5]);
+
+ /* Enable RX in promiscuous mode for the Ethernet device. */
+ rte_eth_promiscuous_enable(port);
+
+ return 0;
+}
+
+static int
+init_ports(unsigned num_ports)
+{
+ uint8_t portid;
+ unsigned i;
+
+ struct rte_mempool *mp = rte_pktmbuf_pool_create("packet_pool",
+ /* mbufs */ 16384 * num_ports,
+ /* cache_size */ 512,
+ /* priv_size */ 0,
+ /* data_room_size */ RTE_MBUF_DEFAULT_BUF_SIZE,
+ rte_socket_id());
+
+ for (portid = 0; portid < num_ports; portid++)
+ if (port_init(portid, mp) != 0)
+ rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+ portid);
+
+ for (i = 0; i < num_ports; i++) {
+ void *userdata = (void *)(uintptr_t) i;
+ tx_buf[i] = rte_malloc(NULL, RTE_ETH_TX_BUFFER_SIZE(32), 0);
+ if (tx_buf[i] == NULL)
+ rte_panic("Out of memory\n");
+ rte_eth_tx_buffer_init(tx_buf[i], 32);
+ rte_eth_tx_buffer_set_err_callback(tx_buf[i],
+ rte_eth_tx_buffer_retry,
+ userdata);
+ }
+
+ return 0;
+}
+
+struct port_link {
+ uint8_t queue_id;
+ uint8_t priority;
+};
+
+static int
+setup_eventdev(struct prod_data *prod_data,
+ struct cons_data *cons_data,
+ struct worker_data *worker_data)
+{
+ const uint8_t dev_id = 0;
+ /* the +1 queue is for a SINGLE_LINK TX stage */
+ const uint8_t nb_queues = num_stages + 1;
+ /* + 2 is one port for producer and one for consumer */
+ const uint8_t nb_ports = num_workers + 2;
+ const struct rte_event_dev_config config = {
+ .nb_event_queues = nb_queues,
+ .nb_event_ports = nb_ports,
+ .nb_events_limit = 4096,
+ .nb_event_queue_flows = 1024,
+ .nb_event_port_dequeue_depth = 128,
+ .nb_event_port_enqueue_depth = 128,
+ };
+ const struct rte_event_port_conf wkr_p_conf = {
+ .dequeue_depth = worker_cq_depth,
+ .enqueue_depth = 64,
+ .new_event_threshold = 4096,
+ };
+ struct rte_event_queue_conf wkr_q_conf = {
+ .event_queue_cfg = queue_type,
+ .priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
+ .nb_atomic_flows = 1024,
+ .nb_atomic_order_sequences = 1024,
+ };
+ const struct rte_event_port_conf tx_p_conf = {
+ .dequeue_depth = 128,
+ .enqueue_depth = 128,
+ .new_event_threshold = 4096,
+ };
+ const struct rte_event_queue_conf tx_q_conf = {
+ .priority = RTE_EVENT_DEV_PRIORITY_HIGHEST,
+ .event_queue_cfg =
+ RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY |
+ RTE_EVENT_QUEUE_CFG_SINGLE_LINK,
+ .nb_atomic_flows = 1024,
+ .nb_atomic_order_sequences = 1024,
+ };
+
+ struct port_link worker_queues[MAX_NUM_STAGES];
+ struct port_link tx_queue;
+ unsigned i;
+
+ int ret, ndev = rte_event_dev_count();
+ if (ndev < 1) {
+ printf("%d: No Eventdev Devices Found\n", __LINE__);
+ return -1;
+ }
+
+ struct rte_event_dev_info dev_info;
+ ret = rte_event_dev_info_get(dev_id, &dev_info);
+ printf("\tEventdev %d: %s\n", dev_id, dev_info.driver_name);
+
+ ret = rte_event_dev_configure(dev_id, &config);
+ if (ret < 0)
+ printf("%d: Error configuring device\n", __LINE__);
+
+ /* Q creation - one load balanced per pipeline stage */
+ printf(" Stages:\n");
+ for (i = 0; i < num_stages; i++) {
+ if (rte_event_queue_setup(dev_id, i, &wkr_q_conf) < 0) {
+ printf("%d: error creating qid %d\n", __LINE__, i);
+ return -1;
+ }
+ qid[i] = i;
+ next_qid[i] = i+1;
+ worker_queues[i].queue_id = i;
+ if (enable_queue_priorities) {
+ /* calculate priority stepping for each stage, leaving
+ * headroom of 1 for the SINGLE_LINK TX below
+ */
+ const uint32_t prio_delta =
+ (RTE_EVENT_DEV_PRIORITY_LOWEST-1) / nb_queues;
+
+ /* higher priority for queues closer to tx */
+ wkr_q_conf.priority =
+ RTE_EVENT_DEV_PRIORITY_LOWEST - prio_delta * i;
+ }
+
+ const char *type_str = "Atomic";
+ switch (wkr_q_conf.event_queue_cfg) {
+ case RTE_EVENT_QUEUE_CFG_ORDERED_ONLY:
+ type_str = "Ordered";
+ break;
+ case RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY:
+ type_str = "Parallel";
+ break;
+ }
+ printf("\tStage %d, Type %s\tPriority = %d\n", i, type_str,
+ wkr_q_conf.priority);
+ }
+ printf("\n");
+
+ /* final queue for sending to TX core */
+ if (rte_event_queue_setup(dev_id, i, &tx_q_conf) < 0) {
+ printf("%d: error creating qid %d\n", __LINE__, i);
+ return -1;
+ }
+ tx_queue.queue_id = i;
+ tx_queue.priority = RTE_EVENT_DEV_PRIORITY_HIGHEST;
+
+ /* set up one port per worker, linking to all stage queues */
+ for (i = 0; i < num_workers; i++) {
+ struct worker_data *w = &worker_data[i];
+ w->dev_id = dev_id;
+ if (rte_event_port_setup(dev_id, i, &wkr_p_conf) < 0) {
+ printf("Error setting up port %d\n", i);
+ return -1;
+ }
+
+ uint32_t s;
+ for (s = 0; s < num_stages; s++) {
+ if (rte_event_port_link(dev_id, i,
+ &worker_queues[s].queue_id,
+ &worker_queues[s].priority,
+ 1) != 1) {
+ printf("%d: error creating link for port %d\n",
+ __LINE__, i);
+ return -1;
+ }
+ }
+ w->port_id = i;
+ }
+ /* port for consumer, linked to TX queue */
+ if (rte_event_port_setup(dev_id, i, &tx_p_conf) < 0) {
+ printf("Error setting up port %d\n", i);
+ return -1;
+ }
+ if (rte_event_port_link(dev_id, i, &tx_queue.queue_id,
+ &tx_queue.priority, 1) != 1) {
+ printf("%d: error creating link for port %d\n",
+ __LINE__, i);
+ return -1;
+ }
+ /* port for producer, no links */
+ const struct rte_event_port_conf rx_p_conf = {
+ .dequeue_depth = 8,
+ .enqueue_depth = 8,
+ .new_event_threshold = 1200,
+ };
+ if (rte_event_port_setup(dev_id, i + 1, &rx_p_conf) < 0) {
+ printf("Error setting up port %d\n", i + 1);
+ return -1;
+ }
+
+ *prod_data = (struct prod_data){.dev_id = dev_id,
+ .port_id = i + 1,
+ .qid = qid[0] };
+ *cons_data = (struct cons_data){.dev_id = dev_id,
+ .port_id = i };
+
+ enqueue_cnt = rte_calloc(0,
+ RTE_CACHE_LINE_SIZE/(sizeof(enqueue_cnt[0])),
+ sizeof(enqueue_cnt[0]), 0);
+ dequeue_cnt = rte_calloc(0,
+ RTE_CACHE_LINE_SIZE/(sizeof(dequeue_cnt[0])),
+ sizeof(dequeue_cnt[0]), 0);
+
+ if (rte_event_dev_start(dev_id) < 0) {
+ printf("Error starting eventdev\n");
+ return -1;
+ }
+
+ return dev_id;
+}
+
+static void
+signal_handler(int signum)
+{
+ if (done || prod_stop)
+ rte_exit(1, "Exiting on signal %d\n", signum);
+ if (signum == SIGINT || signum == SIGTERM) {
+ printf("\n\nSignal %d received, preparing to exit...\n",
+ signum);
+ done = 1;
+ }
+ if (signum == SIGTSTP)
+ rte_event_dev_dump(0, stdout);
+}
+
+int
+main(int argc, char **argv)
+{
+ struct worker_data *worker_data;
+ unsigned num_ports;
+ int lcore_id;
+ int err;
+
+ signal(SIGINT, signal_handler);
+ signal(SIGTERM, signal_handler);
+ signal(SIGTSTP, signal_handler);
+
+ err = rte_eal_init(argc, argv);
+ if (err < 0)
+ rte_panic("Invalid EAL arguments\n");
+
+ argc -= err;
+ argv += err;
+
+ /* Parse cli options */
+ parse_app_args(argc, argv);
+
+ num_ports = rte_eth_dev_count();
+ if (num_ports == 0)
+ rte_panic("No ethernet ports found\n");
+
+ const unsigned cores_needed = active_cores;
+
+ if (!quiet) {
+ printf(" Config:\n");
+ printf("\tports: %u\n", num_ports);
+ printf("\tworkers: %u\n", num_workers);
+ printf("\tpackets: %lu\n", num_packets);
printf("\tpriorities: %u\n", num_priorities); + printf("\tQueue-prio: %u\n", enable_queue_priorities); + if (queue_type == RTE_EVENT_QUEUE_CFG_ORDERED_ONLY) + printf("\tqid0 type: ordered\n"); + if (queue_type == RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY) + printf("\tqid0 type: atomic\n"); + printf("\tCores available: %u\n", rte_lcore_count()); + printf("\tCores used: %u\n", cores_needed); + } + + if (rte_lcore_count() < cores_needed) + rte_panic("Too few cores (%d < %d)\n", rte_lcore_count(), + cores_needed); + + const unsigned ndevs = rte_event_dev_count(); + if (ndevs == 0) + rte_panic("No dev_id devs found. Pasl in a --vdev eventdev.\n"); + if (ndevs > 1) + fprintf(stderr, "Warning: More than one eventdev, using idx 0"); + + worker_data = rte_calloc(0, num_workers, sizeof(worker_data[0]), 0); + if (worker_data == NULL) + rte_panic("rte_calloc failed\n"); + + int dev_id = setup_eventdev(&prod_data, &cons_data, worker_data); + if (dev_id < 0) + rte_exit(EXIT_FAILURE, "Error setting up eventdev\n"); + + prod_data.num_nic_ports = num_ports; + init_ports(num_ports); + + int worker_idx = 0; + RTE_LCORE_FOREACH_SLAVE(lcore_id) { + if (lcore_id >= MAX_NUM_CORE) + break; + + if (!rx_core[lcore_id] && !worker_core[lcore_id] && + !tx_core[lcore_id] && !sched_core[lcore_id]) + continue; + + if (rx_core[lcore_id]) + printf( + "[%s()] lcore %d executing NIC Rx, and using eventdev port %u\n", + __func__, lcore_id, prod_data.port_id); + + if (tx_core[lcore_id]) + printf( + "[%s()] lcore %d executing NIC Tx, and using eventdev port %u\n", + __func__, lcore_id, cons_data.port_id); + + if (sched_core[lcore_id]) + printf("[%s()] lcore %d executing scheduler\n", + __func__, lcore_id); + + if (worker_core[lcore_id]) + printf( + "[%s()] lcore %d executing worker, using eventdev port %u\n", + __func__, lcore_id, + worker_data[worker_idx].port_id); + + err = rte_eal_remote_launch(worker, &worker_data[worker_idx], + lcore_id); + if (err) { + rte_panic("Failed to launch worker on core %d\n", + lcore_id); + continue; + } + if (worker_core[lcore_id]) + worker_idx++; + } + + lcore_id = rte_lcore_id(); + + if (core_in_use(lcore_id)) + worker(&worker_data[worker_idx++]); + + rte_eal_mp_wait_lcore(); + + if (dump_dev) + rte_event_dev_dump(dev_id, stdout); + + if (!quiet) { + printf("\nPort Workload distribution:\n"); + uint32_t i; + uint64_t tot_pkts = 0; + uint64_t pkts_per_wkr[RTE_MAX_LCORE] = {0}; + for (i = 0; i < num_workers; i++) { + char statname[64]; + snprintf(statname, sizeof(statname), "port_%u_rx", + worker_data[i].port_id); + pkts_per_wkr[i] = rte_event_dev_xstats_by_name_get( + dev_id, statname, NULL); + tot_pkts += pkts_per_wkr[i]; + } + for (i = 0; i < num_workers; i++) { + float pc = pkts_per_wkr[i] * 100 / + ((float)tot_pkts); + printf("worker %i :\t%.1f %% (%"PRIu64" pkts)\n", + i, pc, pkts_per_wkr[i]); + } + + } + + return 0; +} From patchwork Fri Apr 21 09:51:38 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 23800 X-Patchwork-Delegate: jerinj@marvell.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id 08F4B58CE; Fri, 21 Apr 2017 11:52:33 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id E879E58CD for ; Fri, 21 Apr 2017 11:52:30 +0200 (CEST) Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com 
with ESMTP; 21 Apr 2017 02:52:29 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.37,229,1488873600"; d="scan'208";a="90572723"
Received: from silpixa00398672.ir.intel.com ([10.237.223.128]) by orsmga005.jf.intel.com with ESMTP; 21 Apr 2017 02:52:28 -0700
From: Harry van Haaren 
To: dev@dpdk.org
Cc: jerin.jacob@caviumnetworks.com, Harry van Haaren 
Date: Fri, 21 Apr 2017 10:51:38 +0100
Message-Id: <1492768299-84016-3-git-send-email-harry.van.haaren@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1492768299-84016-1-git-send-email-harry.van.haaren@intel.com>
References: <1492768299-84016-1-git-send-email-harry.van.haaren@intel.com>
Subject: [dpdk-dev] [PATCH 2/3] doc: add eventdev pipeline to sample app ug
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Add a new entry in the sample app user-guides, which details the working of
the eventdev_pipeline.

Signed-off-by: Harry van Haaren 
---
 doc/guides/sample_app_ug/eventdev_pipeline.rst | 188 +++++++++++++++++++++++++
 doc/guides/sample_app_ug/index.rst | 1 +
 2 files changed, 189 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/eventdev_pipeline.rst

diff --git a/doc/guides/sample_app_ug/eventdev_pipeline.rst b/doc/guides/sample_app_ug/eventdev_pipeline.rst
new file mode 100644
index 0000000..31d1006
--- /dev/null
+++ b/doc/guides/sample_app_ug/eventdev_pipeline.rst
@@ -0,0 +1,188 @@
+
+.. BSD LICENSE
+ Copyright(c) 2017 Intel Corporation. All rights reserved.
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in
+ the documentation and/or other materials provided with the
+ distribution.
+ * Neither the name of Intel Corporation nor the names of its
+ contributors may be used to endorse or promote products derived
+ from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Eventdev Pipeline Sample Application
+====================================
+
+The eventdev pipeline sample application demonstrates the usage of the
+eventdev API. It shows how an application can configure a pipeline and
+assign a set of worker cores to perform the processing required.
+
+The application has a range of command line arguments allowing it to be
+configured for various numbers of worker cores, stages, queue depths and
+cycles of work per stage. This is useful for performance testing, as well
+as for quickly testing a particular pipeline configuration.
+
+
+Compiling the Application
+-------------------------
+
+To compile the application:
+
+#. Go to the sample application directory:
+
+ .. code-block:: console
+
+ export RTE_SDK=/path/to/rte_sdk
+ cd ${RTE_SDK}/examples/eventdev_pipeline
+
+#. Set the target (a default target is used if not specified). For example:
+
+ .. code-block:: console
+
+ export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+ See the *DPDK Getting Started Guide* for possible RTE_TARGET values.
+
+#. Build the application:
+
+ .. code-block:: console
+
+ make
+
+Running the Application
+-----------------------
+
+The application has many command line options. These allow specification of
+the eventdev PMD to use, and a number of attributes of the processing
+pipeline.
+
+An example eventdev pipeline running with the software eventdev PMD using
+these settings is shown below:
+
+ * ``-r1``: core mask 0x1 for RX
+ * ``-t1``: core mask 0x1 for TX
+ * ``-e4``: core mask 0x4 for the software scheduler
+ * ``-w FF00``: core mask for worker cores, 8 cores (lcores 8 to 15)
+ * ``-s4``: 4 atomic stages
+ * ``-n0``: process infinite packets (run forever)
+ * ``-c32``: worker dequeue depth of 32
+ * ``-W1000``: do 1000 cycles of work per packet in each stage
+ * ``-D``: dump statistics on exit
+
+.. code-block:: console
+
+ ./build/eventdev_pipeline --vdev event_sw0 -- -r1 -t1 -e4 -w FF00 -s4 -n0 -c32 -W1000 -D
+
+The application has some sanity checking built-in, so if there is a function
+(e.g. the RX core) which doesn't have a CPU core mask assigned, the
+application will print an error message:
+
+.. code-block:: console
+
+ Core part of pipeline was not assigned any cores. This will stall the
+ pipeline, please check core masks (use -h for details on setting core masks):
+ rx: 0
+ tx: 1
+
+Configuration of the eventdev is covered in detail in the programmer's guide,
+see the Event Device Library section.
+
+
+Observing the Application
+-------------------------
+
+At runtime the eventdev pipeline application prints out a summary of the
+configuration, and some runtime statistics like packets per second. On exit the
+worker statistics are printed, along with a full dump of the PMD statistics if
+required. The following sections show sample output for each of the output
+types.
+
+Configuration
+~~~~~~~~~~~~~
+
+This provides an overview of the pipeline, the scheduling type at each stage,
+and the parameters of options such as how many flows to use and what eventdev
+PMD is in use. See the following sample output for details:
+
+.. code-block:: console
+
+ Config:
+ ports: 2
+ workers: 8
+ packets: 0
+ priorities: 1
+ Queue-prio: 0
+ qid0 type: atomic
+ Cores available: 44
+ Cores used: 10
+ Eventdev 0: event_sw
+ Stages:
+ Stage 0, Type Atomic Priority = 128
+ Stage 1, Type Atomic Priority = 128
+ Stage 2, Type Atomic Priority = 128
+ Stage 3, Type Atomic Priority = 128
+
+Runtime
+~~~~~~~
+
+At runtime, the statistics of the consumer are printed, stating the number of
+packets received, runtime in milliseconds, average mpps, and current mpps.
+
+.. code-block:: console
+
+ # consumer RX= xxxxxxx, time yyyy ms, avg z.zzz mpps [current w.www mpps]
+
+Shutdown
+~~~~~~~~
+
+At shutdown, the application prints the number of packets received and
+transmitted, and an overview of the distribution of work across worker cores.
+
+.. code-block:: console
+
+ Signal 2 received, preparing to exit...
+ worker 12 thread done. RX=4966581 TX=4966581
+ worker 13 thread done. RX=4963329 TX=4963329
+ worker 14 thread done. RX=4953614 TX=4953614
+ worker 0 thread done. RX=0 TX=0
+ worker 11 thread done. RX=4970549 TX=4970549
+ worker 10 thread done. RX=4986391 TX=4986391
+ worker 9 thread done. RX=4970528 TX=4970528
+ worker 15 thread done. RX=4974087 TX=4974087
+ worker 8 thread done. RX=4979908 TX=4979908
+ worker 2 thread done. RX=0 TX=0
+
+ Port Workload distribution:
+ worker 0 : 12.5 % (4979876 pkts)
+ worker 1 : 12.5 % (4970497 pkts)
+ worker 2 : 12.5 % (4986359 pkts)
+ worker 3 : 12.5 % (4970517 pkts)
+ worker 4 : 12.5 % (4966566 pkts)
+ worker 5 : 12.5 % (4963297 pkts)
+ worker 6 : 12.5 % (4953598 pkts)
+ worker 7 : 12.5 % (4974055 pkts)
+
+To get a full dump of the state of the eventdev PMD, pass the ``-D`` flag to
+this application. When the app is terminated using ``Ctrl+C``, the
+``rte_event_dev_dump()`` function is called, resulting in a dump of the
+statistics that the PMD provides. The statistics provided depend on the PMD
+used, see the Event Device Drivers section for a list of eventdev PMDs.
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 02611ef..11f5781 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -69,6 +69,7 @@ Sample Applications User Guides
 netmap_compatibility
 ip_pipeline
 test_pipeline
+ eventdev_pipeline
 dist_app
 vm_power_management
 tep_termination
From patchwork Fri Apr 21 09:51:39 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Van Haaren, Harry"
X-Patchwork-Id: 23801
X-Patchwork-Delegate: jerinj@marvell.com
Return-Path: 
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id F1D7858F6; Fri, 21 Apr 2017 11:52:37 +0200 (CEST)
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id EDACC58D1 for ; Fri, 21 Apr 2017 11:52:35 +0200 (CEST)
Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Apr 2017 02:52:34 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.37,229,1488873600"; d="scan'208";a="90572738"
Received: from silpixa00398672.ir.intel.com ([10.237.223.128]) by orsmga005.jf.intel.com with ESMTP; 21 Apr 2017 02:52:33 -0700
From: Harry van Haaren 
To: dev@dpdk.org
Cc: jerin.jacob@caviumnetworks.com, Harry van Haaren 
Date: Fri, 21 Apr 2017 10:51:39 +0100
Message-Id: <1492768299-84016-4-git-send-email-harry.van.haaren@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1492768299-84016-1-git-send-email-harry.van.haaren@intel.com>
References: <1492768299-84016-1-git-send-email-harry.van.haaren@intel.com>
Subject: [dpdk-dev] [PATCH 3/3] doc: add eventdev library to programmers guide
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This commit adds an entry in the programmer's guide
explaining the eventdev library. The rte_event struct, queues and ports are
explained. An API walkthrough of a simple two-stage atomic pipeline provides
the reader with a step-by-step overview of the expected usage of the
Eventdev API.

Signed-off-by: Harry van Haaren 
---
 doc/guides/prog_guide/eventdev.rst | 365 ++++++++++
 doc/guides/prog_guide/img/eventdev_usage.svg | 994 +++++++++++++++++++++++++++
 doc/guides/prog_guide/index.rst | 1 +
 3 files changed, 1360 insertions(+)
 create mode 100644 doc/guides/prog_guide/eventdev.rst
 create mode 100644 doc/guides/prog_guide/img/eventdev_usage.svg

diff --git a/doc/guides/prog_guide/eventdev.rst b/doc/guides/prog_guide/eventdev.rst
new file mode 100644
index 0000000..4f6088e
--- /dev/null
+++ b/doc/guides/prog_guide/eventdev.rst
@@ -0,0 +1,365 @@
+.. BSD LICENSE
+ Copyright(c) 2017 Intel Corporation. All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in
+ the documentation and/or other materials provided with the
+ distribution.
+ * Neither the name of Intel Corporation nor the names of its
+ contributors may be used to endorse or promote products derived
+ from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Event Device Library
+====================
+
+The DPDK Event device library is an abstraction that provides the application
+with features to schedule events. This is achieved using the PMD architecture
+similar to the ethdev or cryptodev APIs, which may already be familiar to the
+reader. The eventdev framework is provided as a DPDK library, allowing
+applications to use it if they wish, without requiring its usage.
+
+The goal of this library is to enable applications to build processing
+pipelines where the load balancing and scheduling is handled by the eventdev.
+A step-by-step walk through of the eventdev design is available in the `API
+Walkthrough`_ section later in this document.
+
+Event struct
+------------
+
+The eventdev API represents each event with a generic struct, which contains a
+payload and metadata required for scheduling by an eventdev. The
+``rte_event`` struct is a 16 byte C structure, defined in
+``lib/librte_eventdev/rte_eventdev.h``.
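+
+A simplified sketch of the structure is shown below. This is illustrative
+only: the field order and bit widths are abbreviated from the header file,
+which remains the authoritative definition.
+
+.. code-block:: c
+
+    struct rte_event {
+        /* WORD0: scheduling metadata */
+        uint32_t flow_id : 20;
+        uint32_t sub_event_type : 8;
+        uint32_t event_type : 4;
+        uint8_t op : 2;
+        uint8_t rsvd : 4;
+        uint8_t sched_type : 2;
+        uint8_t queue_id;
+        uint8_t priority;
+        uint8_t impl_opaque;
+        /* WORD1: the 64-bit payload union (see Event Payload below) */
+        union {
+            uint64_t u64;
+            void *event_ptr;
+            struct rte_mbuf *mbuf;
+        };
+    };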
+
+Event Metadata
+~~~~~~~~~~~~~~
+
+The rte_event structure contains the following metadata fields, which the
+application fills in to have the event scheduled as required:
+
+* ``flow_id`` - The targeted flow identifier for the enq/deq operation.
+* ``event_type`` - The source of this event, e.g. RTE_EVENT_TYPE_ETHDEV or CPU.
+* ``sub_event_type`` - Distinguishes events inside the application that have
+ the same event_type (see above).
+* ``op`` - This field takes one of the RTE_EVENT_OP_* values, and tells the
+ eventdev about the status of the event - valid values are NEW, FORWARD or
+ RELEASE.
+* ``sched_type`` - Represents the type of scheduling that should be performed
+ on this event, valid values are the RTE_SCHED_TYPE_ORDERED, ATOMIC and
+ PARALLEL.
+* ``queue_id`` - The identifier for the event queue that the event is sent to.
+* ``priority`` - The priority of this event, see RTE_EVENT_DEV_PRIORITY.
+
+Event Payload
+~~~~~~~~~~~~~
+
+The rte_event struct contains a union for payload, allowing flexibility in what
+the actual event being scheduled is. The payload is a union of the following:
+
+* ``uint64_t u64``
+* ``void *event_ptr``
+* ``struct rte_mbuf *mbuf``
+
+These three items in a union occupy the same 64 bits at the end of the rte_event
+structure. The application can utilize the 64 bits directly by accessing the
+u64 variable, while the event_ptr and mbuf are provided as convenience
+variables. For example, the mbuf pointer in the union can be used to schedule
+a DPDK packet.
+
+Queues
+~~~~~~
+
+A queue is a logical "stage" of a packet processing graph, where each stage
+has a specified scheduling type. The application configures each queue for a
+specific type of scheduling, and just enqueues all events to the eventdev.
+The Eventdev API supports the following scheduling types per queue:
+
+* Atomic
+* Ordered
+* Parallel
+
+Atomic, Ordered and Parallel are load-balanced scheduling types: the output
+of the queue can be spread out over multiple CPU cores.
+
+Atomic scheduling on a queue ensures that a single flow is not present on two
+different CPU cores at the same time. Ordered allows sending all flows to any
+core, but the scheduler must ensure that on egress the packets are returned to
+ingress order. Parallel allows sending all flows to all CPU cores, without any
+re-ordering guarantees.
+
+Single Link Flag
+^^^^^^^^^^^^^^^^
+
+There is a SINGLE_LINK flag which allows an application to indicate that only
+one port will be connected to a queue. Queues configured with the single-link
+flag follow a FIFO-like structure, maintaining ordering, but they can only be
+linked to a single port (see below for port and queue linking details).
+
+
+Ports
+~~~~~
+
+Ports are the points of contact between worker cores and the eventdev. The
+general use-case will see one CPU core using one port to enqueue and dequeue
+events from an eventdev. Ports are linked to queues in order to retrieve events
+from those queues (more details in `Linking Queues and Ports`_ below).
+
+
+API Walkthrough
+---------------
+
+This section will introduce the reader to the eventdev API, showing how to
+create and configure an eventdev and use it for a two-stage atomic pipeline
+with a single core for TX. The diagram below shows the final state of the
+application after this walkthrough:
+
+.. _figure_eventdev-usage1:
+
+.. figure:: img/eventdev_usage.*
+
+   Sample eventdev usage, with RX, two atomic stages and a single-link to TX.
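+
+Condensed into one place, the setup sequence used in this walkthrough looks
+roughly as follows. This is a sketch only: error handling is omitted, and
+``config``, ``queue_conf``, ``port_conf``, ``queue_ids`` and ``priorities``
+are placeholders for the values that the following sections define step by
+step.
+
+.. code-block:: c
+
+    uint8_t dev_id = 0;
+    /* configure the device, then each queue and each port */
+    rte_event_dev_configure(dev_id, &config);
+    for (uint8_t q = 0; q < config.nb_event_queues; q++)
+        rte_event_queue_setup(dev_id, q, &queue_conf[q]);
+    for (uint8_t p = 0; p < config.nb_event_ports; p++)
+        rte_event_port_setup(dev_id, p, &port_conf[p]);
+    /* link each worker/TX port to the queue(s) it will service */
+    rte_event_port_link(dev_id, port_id, queue_ids, priorities, nb_links);
+    /* start scheduling events */
+    rte_event_dev_start(dev_id);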
+
+
+A high-level overview of the setup steps:
+
+* rte_event_dev_configure()
+* rte_event_queue_setup()
+* rte_event_port_setup()
+* rte_event_port_link()
+* rte_event_dev_start()
+
+
+Init and Config
+~~~~~~~~~~~~~~~
+
+The eventdev library uses vdev options to add devices to the DPDK application.
+The ``--vdev`` EAL option allows adding eventdev instances to your DPDK
+application, using the name of the eventdev PMD as an argument.
+
+For example, to create an instance of the software eventdev scheduler, the
+following vdev arguments should be provided to the application EAL command line:
+
+.. code-block:: console
+
+ ./dpdk_application --vdev="event_sw0"
+
+In the following code, we configure an eventdev instance with 3 queues
+and 6 ports as follows. The 3 queues consist of 2 Atomic and 1 Single-Link,
+while the 6 ports consist of 4 workers, 1 RX and 1 TX.
+
+.. code-block:: c
+
+ const struct rte_event_dev_config config = {
+ .nb_event_queues = 3,
+ .nb_event_ports = 6,
+ .nb_events_limit = 4096,
+ .nb_event_queue_flows = 1024,
+ .nb_event_port_dequeue_depth = 128,
+ .nb_event_port_enqueue_depth = 128,
+ };
+ int err = rte_event_dev_configure(dev_id, &config);
+
+The remainder of this walkthrough assumes that dev_id is 0.
+
+Setting up Queues
+~~~~~~~~~~~~~~~~~
+
+Once the eventdev itself is configured, the next step is to configure queues.
+This is done by setting the appropriate values in a queue_conf structure, and
+calling the setup function. Repeat this step for each queue, starting from
+0 and ending at ``nb_event_queues - 1`` from the event_dev config above.
+
+.. code-block:: c
+
+ struct rte_event_queue_conf atomic_conf = {
+ .event_queue_cfg = RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY,
+ .priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
+ .nb_atomic_flows = 1024,
+ .nb_atomic_order_sequences = 1024,
+ };
+ int dev_id = 0;
+ int queue_id = 0;
+ int err = rte_event_queue_setup(dev_id, queue_id, &atomic_conf);
+
+The remainder of this walkthrough assumes that the queues are configured as
+follows:
+
+ * id 0, atomic queue #1
+ * id 1, atomic queue #2
+ * id 2, single-link queue
+
+Setting up Ports
+~~~~~~~~~~~~~~~~
+
+Once queues are set up successfully, create the ports as required. Each port
+should be set up with its corresponding port_conf type: worker for worker
+cores, rx and tx for the RX and TX cores:
+
+.. code-block:: c
+
+ struct rte_event_port_conf rx_conf = {
+ .dequeue_depth = 128,
+ .enqueue_depth = 128,
+ .new_event_threshold = 1024,
+ };
+ struct rte_event_port_conf worker_conf = {
+ .dequeue_depth = 16,
+ .enqueue_depth = 64,
+ .new_event_threshold = 4096,
+ };
+ struct rte_event_port_conf tx_conf = {
+ .dequeue_depth = 128,
+ .enqueue_depth = 128,
+ .new_event_threshold = 4096,
+ };
+ int dev_id = 0;
+ int port_id = 0;
+ int err = rte_event_port_setup(dev_id, port_id, &CORE_FUNCTION_conf);
+
+It is now assumed that:
+
+ * port 0: RX core
+ * ports 1,2,3,4: Workers
+ * port 5: TX core
+
+Linking Queues and Ports
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The final step is to "wire up" the ports to the queues. After this, the
+eventdev is capable of scheduling events, and when cores request work to do,
+the correct events are provided to that core. Note that the RX core takes input
+from e.g. a NIC, so it is not linked to any eventdev queues.
+
+Linking all workers to atomic queues, and the TX core to the single-link queue
+can be achieved like this:
+
+.. code-block:: c
+
+ uint8_t port_id = 0;
+ uint8_t atomic_qs[] = {0, 1};
+ uint8_t single_link_q = 2;
+ uint8_t tx_port_id = 5;
+ uint8_t priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
+
+ for(int i = 0; i < 4; i++) {
+ int worker_port = i + 1;
+ int links_made = rte_event_port_link(dev_id, worker_port, atomic_qs, NULL, 2);
+ }
+ int links_made = rte_event_port_link(dev_id, tx_port_id, &single_link_q, &priority, 1);
+
+Starting the EventDev
+~~~~~~~~~~~~~~~~~~~~~
+
+A single function call tells the eventdev instance to start processing
+events. Note that all queues must be linked before the instance will start:
+if any queue is left unlinked, enqueuing to it will cause the application to
+back-pressure and eventually stall due to no space in the eventdev.
+
+.. code-block:: c
+
+ int err = rte_event_dev_start(dev_id);
+
+Ingress of New Events
+~~~~~~~~~~~~~~~~~~~~~
+
+Now that the eventdev is set up, and ready to receive events, the RX core must
+enqueue some events into the system for it to schedule. The events to be
+scheduled are ordinary DPDK packets, received from ``rte_eth_rx_burst()`` as
+normal. The following code shows how those packets can be enqueued into the
+eventdev:
+
+.. code-block:: c
+
+ const uint16_t nb_rx = rte_eth_rx_burst(eth_port, 0, mbufs, BATCH_SIZE);
+
+ for (i = 0; i < nb_rx; i++) {
+ ev[i].flow_id = mbufs[i]->hash.rss;
+ ev[i].op = RTE_EVENT_OP_NEW;
+ ev[i].sched_type = RTE_SCHED_TYPE_ATOMIC;
+ ev[i].queue_id = 0;
+ ev[i].event_type = RTE_EVENT_TYPE_CPU;
+ ev[i].sub_event_type = 0;
+ ev[i].priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
+ ev[i].mbuf = mbufs[i];
+ }
+
+ const int nb_tx = rte_event_enqueue_burst(dev_id, port_id, ev, nb_rx);
+ if (nb_tx != nb_rx) {
+ for(i = nb_tx; i < nb_rx; i++)
+ rte_pktmbuf_free(mbufs[i]);
+ }
+
+Forwarding of Events
+~~~~~~~~~~~~~~~~~~~~
+
+Now that the RX core has injected events, there is work to be done by the
+workers. Note that each worker will dequeue as many events as it can in a burst,
+process each one individually, and then burst the packets back into the
+eventdev.
+
+The worker can look up the event's source from ``event.queue_id``, which should
+indicate to the worker what workload needs to be performed on the event.
+Once done, the worker can update the ``event.queue_id`` to a new value, to send
+the event to the next stage in the pipeline.
+
+.. code-block:: c
+
+ int timeout = 0;
+ struct rte_event events[BATCH_SIZE];
+ uint16_t nb_rx = rte_event_dequeue_burst(dev_id, worker_port_id, events, BATCH_SIZE, timeout);
+
+ for (i = 0; i < nb_rx; i++) {
+ /* process mbuf using events[i].queue_id as pipeline stage */
+ struct rte_mbuf *mbuf = events[i].mbuf;
+ /* Send event to next stage in pipeline */
+ events[i].queue_id++;
+ events[i].op = RTE_EVENT_OP_FORWARD;
+ }
+
+ uint16_t nb_tx = rte_event_enqueue_burst(dev_id, port_id, events, nb_rx);
+
+
+Egress of Events
+~~~~~~~~~~~~~~~~
+
+Finally, when the packet is ready for egress or needs to be dropped, we need
+to inform the eventdev that the packet is no longer being handled by the
+application. This can be done by calling dequeue() or dequeue_burst() again,
+which indicates that the previous burst of packets is no longer in use by the
+application.
+
+.. code-block:: c
+
+ struct rte_event events[BATCH_SIZE];
+ uint16_t n = rte_event_dequeue_burst(dev_id, port_id, events, BATCH_SIZE, 0);
+ /* burst #1 : now tx or use the packets */
+ n = rte_event_dequeue_burst(dev_id, port_id, events, BATCH_SIZE, 0);
+ /* burst #1 is now no longer valid to use in the application, as
+ the eventdev has dropped any locks or released re-ordered packets */
+
+Summary
+-------
+
+The eventdev library allows an application to easily schedule events as it
+requires, either using a run-to-completion or pipeline processing model. The
+queues and ports abstract the logical functionality of an eventdev, providing
+the application with a generic method to schedule events. With the flexible
+PMD infrastructure, applications benefit from improvements to existing
+eventdevs and from the addition of new ones, without modification.
diff --git a/doc/guides/prog_guide/img/eventdev_usage.svg b/doc/guides/prog_guide/img/eventdev_usage.svg
new file mode 100644
index 0000000..7765649
--- /dev/null
+++ b/doc/guides/prog_guide/img/eventdev_usage.svg
@@ -0,0 +1,994 @@
[994 lines of SVG source omitted: the figure's recoverable labels are Atomic Queue #1, Atomic Queue #2, Single Link Queue #1, the RX and TX circles, and worker circles W1, W.., WN connected by dynamic connectors.]
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index ef5a02a..7578395 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -57,6 +57,7 @@ Programmer's Guide
 multi_proc_support
 kernel_nic_interface
 thread_safety_dpdk_functions
+ eventdev
 qos_framework
 power_man
 packet_classif_access_ctrl