From patchwork Fri Jun 14 14:43:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cian Ferriter X-Patchwork-Id: 54807 X-Patchwork-Delegate: ferruh.yigit@amd.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5BC5A1D55D; Fri, 14 Jun 2019 16:47:17 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 304DB1D519 for ; Fri, 14 Jun 2019 16:47:15 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 14 Jun 2019 07:47:13 -0700 X-ExtLoop1: 1 Received: from silpixa00393943.ir.intel.com (HELO silpixa00393943.ger.corp.intel.com) ([10.237.222.31]) by fmsmga005.fm.intel.com with ESMTP; 14 Jun 2019 07:47:12 -0700 From: Cian Ferriter To: Ferruh Yigit , Bruce Richardson , John McNamara , Marko Kovacevic Cc: dev@dpdk.org, Cian Ferriter Date: Fri, 14 Jun 2019 15:43:37 +0100 Message-Id: <20190614144337.17177-2-cian.ferriter@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190614144337.17177-1-cian.ferriter@intel.com> References: <20190614144337.17177-1-cian.ferriter@intel.com> Subject: [dpdk-dev] [PATCH 19.08 v3 2/2] net/pcap: enable infinitely rxing a pcap file X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" It can be useful to use pcap files for some rudimental performance At a high level, this works by creaing a ring of sufficient size to store the packets in the pcap file passed to the application. When the rx function for this mode is called, packets are dequeued from the ring for use by the application and also enqueued back on to the ring to be "received" again. A tx_drop mode is also added since transmitting to a tx_pcap file isn't desirable at a high traffic rate. Jumbo frames are not supported in this mode. When filling the ring at rx queue setup time, the presence of multi segment mbufs is checked for. The PMD will exit on detection of these multi segment mbufs. Signed-off-by: Cian Ferriter --- v3 changes: * Update PCAP docs: * State that 'infinite_rx' should only be provided once per device. * Drop use of tx_drop and mention its use through not providing a txq. * Remove the tx_drop option and related args processing. * Notify user when tx_drop is being used. * Change infinite_rx from an 'int' to 'unsigned int' in pmd_internals struct. * Add pmd_devargs_all struct to pass args from pmd_probe to ethdev creation. * Improve args parsing of infinite_rx so -1 doesn't enable feature. * Change order of tx ops assignment so tx_drop is default case. * Move args parsing of infinite_rx inside 'is_rx_pcap' check. * Only notify user of 'infinite_rx' state if arg has been passed on cmdline. * Call eth_dev_close for 'infinite_rx' cleanup rather than duplicating code. doc/guides/nics/pcap_ring.rst | 20 +++ drivers/net/pcap/rte_eth_pcap.c | 234 ++++++++++++++++++++++++++++++-- 2 files changed, 246 insertions(+), 8 deletions(-) diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst index c1ef9196b..a2e9523b3 100644 --- a/doc/guides/nics/pcap_ring.rst +++ b/doc/guides/nics/pcap_ring.rst @@ -106,6 +106,26 @@ Runtime Config Options --vdev 'net_pcap0,iface=eth0,phy_mac=1' +- Use the RX PCAP file to infinitely receive packets + + In case ``rx_pcap=`` configuration is set, user may want to use the selected PCAP file for rudimental + performance testing. This can be done with a ``devarg`` ``infinite_rx``, for example:: + + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,infinite_rx=1,tx_drop=1' + + When this mode is used, it is recommended to drop all packets on transmit by not providing a tx_pcap or tx_iface. + + This option is device wide, so all queues on a device will either have this enabled or disabled. + This option should only be provided once per device. + +- Drop all packets on transmit + + The user may want to drop all packets on tx for a device. This can be done by not providing a tx_pcap or tx_iface, for example:: + + --vdev 'net_pcap0,rx_pcap=file_rx.pcap' + + In this case, one tx drop queue is created for each rxq on that device. + Examples of Usage ^^^^^^^^^^^^^^^^^ diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c index c35f501cf..b5bd800f0 100644 --- a/drivers/net/pcap/rte_eth_pcap.c +++ b/drivers/net/pcap/rte_eth_pcap.c @@ -39,6 +39,7 @@ #define ETH_PCAP_TX_IFACE_ARG "tx_iface" #define ETH_PCAP_IFACE_ARG "iface" #define ETH_PCAP_PHY_MAC_ARG "phy_mac" +#define ETH_PCAP_INFINITE_RX_ARG "infinite_rx" #define ETH_PCAP_ARG_MAXLEN 64 @@ -64,6 +65,9 @@ struct pcap_rx_queue { struct queue_stat rx_stat; char name[PATH_MAX]; char type[ETH_PCAP_ARG_MAXLEN]; + + /* Contains pre-generated packets to be looped through */ + struct rte_ring *pkts; }; struct pcap_tx_queue { @@ -82,6 +86,7 @@ struct pmd_internals { int if_index; int single_iface; int phy_mac; + unsigned int infinite_rx; }; struct pmd_process_private { @@ -107,6 +112,7 @@ struct pmd_devargs_all { int single_iface; unsigned int is_tx_pcap; unsigned int is_tx_iface; + unsigned int infinite_rx; }; static const char *valid_arguments[] = { @@ -117,6 +123,7 @@ static const char *valid_arguments[] = { ETH_PCAP_TX_IFACE_ARG, ETH_PCAP_IFACE_ARG, ETH_PCAP_PHY_MAC_ARG, + ETH_PCAP_INFINITE_RX_ARG, NULL }; @@ -186,6 +193,43 @@ eth_pcap_gather_data(unsigned char *data, struct rte_mbuf *mbuf) } } +static uint16_t +eth_pcap_rx_infinite(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) +{ + int i; + struct pcap_rx_queue *pcap_q = queue; + uint32_t rx_bytes = 0; + + if (unlikely(nb_pkts == 0)) + return 0; + + if (rte_pktmbuf_alloc_bulk(pcap_q->mb_pool, bufs, nb_pkts) != 0) + return 0; + + for (i = 0; i < nb_pkts; i++) { + struct rte_mbuf *pcap_buf; + int err = rte_ring_dequeue(pcap_q->pkts, (void **)&pcap_buf); + if (err) + return i; + + rte_memcpy(rte_pktmbuf_mtod(bufs[i], void *), + rte_pktmbuf_mtod(pcap_buf, void *), + pcap_buf->data_len); + bufs[i]->data_len = pcap_buf->data_len; + bufs[i]->pkt_len = pcap_buf->pkt_len; + bufs[i]->port = pcap_q->port_id; + rx_bytes += pcap_buf->data_len; + + /* Enqueue packet back on ring to allow infinite rx. */ + rte_ring_enqueue(pcap_q->pkts, pcap_buf); + } + + pcap_q->rx_stat.pkts += i; + pcap_q->rx_stat.bytes += rx_bytes; + + return i; +} + static uint16_t eth_pcap_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -328,6 +372,30 @@ eth_pcap_tx_dumper(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) return num_tx; } +/* + * Callback to handle dropping packets in the infinite rx case. + */ +static uint16_t +eth_tx_drop(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) +{ + unsigned int i; + uint32_t tx_bytes = 0; + struct pcap_tx_queue *tx_queue = queue; + + if (unlikely(nb_pkts == 0)) + return 0; + + for (i = 0; i < nb_pkts; i++) { + tx_bytes += bufs[i]->data_len; + rte_pktmbuf_free(bufs[i]); + } + + tx_queue->tx_stat.pkts += nb_pkts; + tx_queue->tx_stat.bytes += tx_bytes; + + return i; +} + /* * Callback to handle sending packets through a real NIC. */ @@ -455,6 +523,24 @@ open_single_rx_pcap(const char *pcap_filename, pcap_t **pcap) return 0; } +static uint64_t +count_packets_in_pcap(pcap_t **pcap, struct pcap_rx_queue *pcap_q) +{ + const u_char *packet; + struct pcap_pkthdr header; + uint64_t pcap_pkt_count = 0; + + while ((packet = pcap_next(*pcap, &header))) + pcap_pkt_count++; + + /* The pcap is reopened so it can be used as normal later. */ + pcap_close(*pcap); + *pcap = NULL; + open_single_rx_pcap(pcap_q->name, pcap); + + return pcap_pkt_count; +} + static int eth_dev_start(struct rte_eth_dev *dev) { @@ -647,8 +733,25 @@ eth_stats_reset(struct rte_eth_dev *dev) } static void -eth_dev_close(struct rte_eth_dev *dev __rte_unused) +eth_dev_close(struct rte_eth_dev *dev) { + unsigned int i; + struct pmd_internals *internals = dev->data->dev_private; + + /* Device wide flag, but cleanup must be performed per queue. */ + if (internals->infinite_rx) { + for (i = 0; i < dev->data->nb_rx_queues; i++) { + struct pcap_rx_queue *pcap_q = &internals->rx_queue[i]; + struct rte_mbuf *pcap_buf; + + while (!rte_ring_dequeue(pcap_q->pkts, + (void **)&pcap_buf)) + rte_pktmbuf_free(pcap_buf); + + rte_ring_free(pcap_q->pkts); + } + } + } static void @@ -679,6 +782,59 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, pcap_q->queue_id = rx_queue_id; dev->data->rx_queues[rx_queue_id] = pcap_q; + if (internals->infinite_rx) { + struct pmd_process_private *pp; + char ring_name[NAME_MAX]; + static uint32_t ring_number; + uint64_t pcap_pkt_count = 0; + struct rte_mbuf *bufs[1]; + pcap_t **pcap; + + pp = rte_eth_devices[pcap_q->port_id].process_private; + pcap = &pp->rx_pcap[pcap_q->queue_id]; + + if (unlikely(*pcap == NULL)) + return -ENOENT; + + pcap_pkt_count = count_packets_in_pcap(pcap, pcap_q); + + snprintf(ring_name, sizeof(ring_name), "PCAP_RING%" PRIu16, + ring_number); + + pcap_q->pkts = rte_ring_create(ring_name, + rte_align64pow2(pcap_pkt_count + 1), 0, + RING_F_SP_ENQ | RING_F_SC_DEQ); + ring_number++; + if (!pcap_q->pkts) + return -ENOENT; + + /* Fill ring with packets from PCAP file one by one. */ + while (eth_pcap_rx(pcap_q, bufs, 1)) { + /* Check for multiseg mbufs. */ + if (bufs[0]->nb_segs != 1) { + rte_pktmbuf_free(*bufs); + + while (!rte_ring_dequeue(pcap_q->pkts, + (void **)bufs)) + rte_pktmbuf_free(*bufs); + + rte_ring_free(pcap_q->pkts); + PMD_LOG(ERR, "Multiseg mbufs are not supported in infinite_rx " + "mode."); + return -EINVAL; + } + + rte_ring_enqueue_bulk(pcap_q->pkts, + (void * const *)bufs, 1, NULL); + } + /* + * Reset the stats for this queue since eth_pcap_rx calls above + * didn't result in the application receiving packets. + */ + pcap_q->rx_stat.pkts = 0; + pcap_q->rx_stat.bytes = 0; + } + return 0; } @@ -916,6 +1072,20 @@ select_phy_mac(const char *key __rte_unused, const char *value, return 0; } +static int +get_infinite_rx_arg(const char *key __rte_unused, + const char *value, void *extra_args) +{ + if (extra_args) { + const int infinite_rx = atoi(value); + int *enable_infinite_rx = extra_args; + + if (infinite_rx > 0) + *enable_infinite_rx = 1; + } + return 0; +} + static struct rte_vdev_driver pmd_pcap_drv; static int @@ -1120,6 +1290,7 @@ eth_from_pcaps(struct rte_vdev_device *vdev, struct rte_eth_dev *eth_dev = NULL; struct pmd_devargs *rx_queues = &devargs_all->rx_queues; int single_iface = devargs_all->single_iface; + unsigned int infinite_rx = devargs_all->infinite_rx; int ret; ret = eth_from_pcaps_common(vdev, devargs_all, &internals, ð_dev); @@ -1142,13 +1313,20 @@ eth_from_pcaps(struct rte_vdev_device *vdev, } } - eth_dev->rx_pkt_burst = eth_pcap_rx; + internals->infinite_rx = infinite_rx; + /* Assign rx ops. */ + if (infinite_rx) + eth_dev->rx_pkt_burst = eth_pcap_rx_infinite; + else + eth_dev->rx_pkt_burst = eth_pcap_rx; /* Assign tx ops. */ if (devargs_all->is_tx_pcap) eth_dev->tx_pkt_burst = eth_pcap_tx_dumper; - else + else if (devargs_all->is_tx_iface) eth_dev->tx_pkt_burst = eth_pcap_tx; + else + eth_dev->tx_pkt_burst = eth_tx_drop; rte_eth_dev_probing_finish(eth_dev); return 0; @@ -1170,6 +1348,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev) .single_iface = 0, .is_tx_pcap = 0, .is_tx_iface = 0, + .infinite_rx = 0, }; name = rte_vdev_device_name(dev); @@ -1233,6 +1412,29 @@ pmd_pcap_probe(struct rte_vdev_device *dev) pcaps.num_of_queue = 0; if (is_rx_pcap) { + /* + * We check whether we want to infinitely rx the pcap file. + */ + unsigned int infinite_rx_arg_cnt = rte_kvargs_count(kvlist, + ETH_PCAP_INFINITE_RX_ARG); + + if (infinite_rx_arg_cnt == 1) { + ret = rte_kvargs_process(kvlist, + ETH_PCAP_INFINITE_RX_ARG, + &get_infinite_rx_arg, + &devargs_all.infinite_rx); + if (ret < 0) + goto free_kvlist; + PMD_LOG(INFO, "infinite_rx has been %s for %s", + devargs_all.infinite_rx ? "enabled" : "disabled", + name); + + } else if (infinite_rx_arg_cnt > 1) { + PMD_LOG(WARNING, "infinite_rx has not been enabled since the " + "argument has been provided more than once " + "for %s", name); + } + ret = rte_kvargs_process(kvlist, ETH_PCAP_RX_PCAP_ARG, &open_rx_pcap, &pcaps); } else { @@ -1244,19 +1446,31 @@ pmd_pcap_probe(struct rte_vdev_device *dev) goto free_kvlist; /* - * We check whether we want to open a TX stream to a real NIC or a - * pcap file + * We check whether we want to open a TX stream to a real NIC, + * a pcap file, or drop packets on tx */ devargs_all.is_tx_pcap = rte_kvargs_count(kvlist, ETH_PCAP_TX_PCAP_ARG) ? 1 : 0; + devargs_all.is_tx_iface = + rte_kvargs_count(kvlist, ETH_PCAP_TX_IFACE_ARG) ? 1 : 0; dumpers.num_of_queue = 0; - if (devargs_all.is_tx_pcap) + if (devargs_all.is_tx_pcap) { ret = rte_kvargs_process(kvlist, ETH_PCAP_TX_PCAP_ARG, &open_tx_pcap, &dumpers); - else + } else if (devargs_all.is_tx_iface) { ret = rte_kvargs_process(kvlist, ETH_PCAP_TX_IFACE_ARG, &open_tx_iface, &dumpers); + } else { + unsigned int i; + + PMD_LOG(INFO, "Dropping packets on tx since no tx queues were provided."); + + /* Add 1 dummy queue per rxq which counts and drops packets. */ + for (i = 0; i < pcaps.num_of_queue; i++) + ret = add_queue(&dumpers, "dummy", "tx_drop", NULL, + NULL); + } if (ret < 0) goto free_kvlist; @@ -1337,6 +1551,9 @@ pmd_pcap_remove(struct rte_vdev_device *dev) eth_dev->data->mac_addrs = NULL; } + if (internals->infinite_rx) + eth_dev_close(eth_dev); + rte_free(eth_dev->process_private); rte_eth_dev_release_port(eth_dev); @@ -1357,7 +1574,8 @@ RTE_PMD_REGISTER_PARAM_STRING(net_pcap, ETH_PCAP_RX_IFACE_IN_ARG "= " ETH_PCAP_TX_IFACE_ARG "= " ETH_PCAP_IFACE_ARG "= " - ETH_PCAP_PHY_MAC_ARG "="); + ETH_PCAP_PHY_MAC_ARG "=" + ETH_PCAP_INFINITE_RX_ARG "=<0|1>"); RTE_INIT(eth_pcap_init_log) {