[dpdk-dev,memnic,1/7] guest: memnic-tester: PMD benchmark in guest

Message ID 7F861DC0615E0C47A872E6F3C5FCDDBD011A98C5@BPXM14GP.gisp.nec.co.jp (mailing list archive)
State Superseded, archived

Commit Message

Hiroshi Shimamoto Sept. 11, 2014, 7:46 a.m. UTC
  From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>

Introduce memnic-tester which benchmarks MEMNIC PMD performance in guest.

It starts two threads: one thread produces and consumes packets, and the
other thread receives packets and directly transmits the received
packets back. This evaluates the running cost of the MEMNIC PMD.

The master thread does rx_burst and tx_burst through MEMNIC PMD.
        +---------+
        | master  |
        +---------+
 rx_burst ^     | tx_burst
          |     V
      +------+------+
      |  up  | down | MEMNIC shared memory
      +------+------+
 set flag ^     | unset flag
          |     V
        +---------+
        |  slave  |
        +---------+
The slave thread emulates packet-in/out by setting flag on/off.

 master |<- put packets ->|     |<- get packets ->|
 slave  |   |<- rx packets ->|<- tx packets ->|   |
        |<----------------- set ----------------->|

Measuring how many sets in the certain period, that represents
the MEMNIC PMD performance. The master workload must be very low.

It reports the throughput for each of the following frame sizes:
  64, 128, 256, 512, 1024, 1280, 1518
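
As a rough illustration of how the reported number is derived, the sketch
below converts a count of completed sets into packets per second, using the
same burst size and duration as the patch. The helper names and the
conversion to a wire rate (20 bytes of preamble, SFD and inter-frame gap per
frame) are assumptions added for illustration; the tool itself only prints
pps.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirroring the patch: packets per set and test length in seconds. */
#define PKTS_BURST_SIZE	32
#define TEST_DURATION	10

/*
 * Assumed per-frame wire overhead in bytes: preamble + SFD (8) and
 * inter-frame gap (12). Not part of the patch, which reports pps only.
 */
#define WIRE_OVERHEAD	20

/* Hypothetical helper: packets per second from the number of completed sets. */
static uint64_t sets_to_pps(uint64_t sets)
{
	return (sets * PKTS_BURST_SIZE) / TEST_DURATION;
}

/*
 * Hypothetical helper: equivalent Ethernet wire rate in bits per second.
 * frame_size includes the 4-byte FCS, as the sizes in the list above do.
 */
static uint64_t pps_to_wire_bps(uint64_t pps, uint32_t frame_size)
{
	assert(frame_size >= 64);
	return pps * (frame_size + WIRE_OVERHEAD) * 8;
}
```

For example, 3125000 sets in 10 seconds correspond to 10000000 pps, which for
64-byte frames would be 6.72 Gbit/s on the wire under the stated assumption.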

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Reviewed-by: Hayato Momma <h-momma@ce.jp.nec.com>
---
 guest/Makefile        |  20 ++++
 guest/README.rst      |  94 +++++++++++++++++
 guest/memnic-tester.c | 281 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 395 insertions(+)
 create mode 100644 guest/Makefile
 create mode 100644 guest/README.rst
 create mode 100644 guest/memnic-tester.c
  

Comments

Thomas Monjalon Sept. 24, 2014, 3:10 p.m. UTC | #1
Hi Hiroshi,

2014-09-11 07:46, Hiroshi Shimamoto:
>  master |<- put packets ->|     |<- get packets ->|
>  slave  |   |<- rx packets ->|<- tx packets ->|   |
>         |<----------------- set ----------------->|
> 
> Measuring how many sets in the certain period, that represents
> the MEMNIC PMD performance. The master workload must be very low.

Sorry, I don't really understand this diagram and the associated explanation.
Could you try to reword it?

Thanks
  
Hiroshi Shimamoto Sept. 24, 2014, 11:54 p.m. UTC | #2
Hi,
> Subject: Re: [dpdk-dev] [memnic PATCH 1/7] guest: memnic-tester: PMD benchmark in guest
> 
> Hi Hiroshi,
> 
> 2014-09-11 07:46, Hiroshi Shimamoto:
> >  master |<- put packets ->|     |<- get packets ->|
> >  slave  |   |<- rx packets ->|<- tx packets ->|   |
> >         |<----------------- set ----------------->|
> >
> > Measuring how many sets in the certain period, that represents
> > the MEMNIC PMD performance. The master workload must be very low.
> 
> Sorry, I don't really understand this diagram and the associated explanation.
> Could you try to reword it?

Sure, I will write a more understandable description.
Could you please help me with that?

The purpose of this program is to measure the performance of the MEMNIC PMD itself.
That is, we'd like to know how much time the PMD spends in the rx and tx APIs.
The program does rx and tx in the slave thread, and the PMD performance can
be measured by how many packets are handled in a certain period. To make the
PMD rx/tx work in the slave thread, we need to fill and clear the MEMNIC packet
buffer. So I made the master thread fill and clear the MEMNIC packet buffer
in the lightest way, and it should do so with the least jitter.
If we sent real packets out of the VM, that could increase jitter
outside of the MEMNIC PMD, and we would not see the precise performance
of the MEMNIC PMD itself.

Does the above make the concept of this benchmark clear?

thanks,
Hiroshi

> 
> Thanks
> --
> Thomas
  
Thomas Monjalon Sept. 25, 2014, 9:02 a.m. UTC | #3
> > >  master |<- put packets ->|     |<- get packets ->|
> > >  slave  |   |<- rx packets ->|<- tx packets ->|   |
> > >         |<----------------- set ----------------->|
> > >
> > > Measuring how many sets in the certain period, that represents
> > > the MEMNIC PMD performance. The master workload must be very low.
> > 
> > Sorry, I don't really understand this diagram and the associated explanation.
> > Could you try to reword it?
> 
> Sure, I will write a more understandable description.
> Could you please help me with that?
> 
> The purpose of this program is to measure the performance of the MEMNIC PMD itself.
> That is, we'd like to know how much time the PMD spends in the rx and tx APIs.
> The program does rx and tx in the slave thread, and the PMD performance can
> be measured by how many packets are handled in a certain period. To make the
> PMD rx/tx work in the slave thread, we need to fill and clear the MEMNIC packet
> buffer. So I made the master thread fill and clear the MEMNIC packet buffer
> in the lightest way, and it should do so with the least jitter.
> If we sent real packets out of the VM, that could increase jitter
> outside of the MEMNIC PMD, and we would not see the precise performance
> of the MEMNIC PMD itself.
> 
> Does the above make the concept of this benchmark clear?

Yes. But I think the master and slave roles are confused.
Let me try to reword it with fewer words:

memnic-tester is a benchmark tool to measure the performance of the MEMNIC PMD itself.
The master thread forwards packets with Rx and Tx bursts.
The slave thread fills and clears packets in the lightest way. It doesn't send
packets out of the VM because that would increase jitter and hide the PMD performance.
Throughput (number of forwarded packets per second) is given for each frame size.
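
The division of work described above can be sketched as a small
single-threaded simulation. This is only an illustration under stated
assumptions: the real tool runs the two roles on separate lcores that
busy-poll shared memory, and the ring size, burst size, type names and
function names below are hypothetical, not taken from memnic.h. The filler
marks "up" slots as filled and clears "down" slots, while the forwarder
moves each filled "up" slot to a free "down" slot, as rx_burst followed by
tx_burst would.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PACKET	8	/* hypothetical ring size */
#define BURST		4	/* hypothetical burst size */

enum slot_status { ST_FREE = 0, ST_FILLED = 1 };

struct ring {
	enum slot_status status[NR_PACKET];
	int len[NR_PACKET];
};

struct sim {
	struct ring up, down;
	size_t fill_idx;	/* filler: next "up" slot to fill */
	size_t drain_idx;	/* filler: next "down" slot to clear */
	size_t rx_idx;		/* forwarder: next "up" slot to receive */
	size_t tx_idx;		/* forwarder: next "down" slot to fill */
};

/* Filler: emulate packet-in by marking free "up" slots as filled. */
static int filler_put(struct sim *s, int len)
{
	int n = 0;

	while (n < BURST && s->up.status[s->fill_idx] == ST_FREE) {
		s->up.len[s->fill_idx] = len;
		s->up.status[s->fill_idx] = ST_FILLED;
		s->fill_idx = (s->fill_idx + 1) % NR_PACKET;
		n++;
	}
	return n;
}

/* Forwarder: receive from "up" and transmit into "down". */
static int forwarder_poll(struct sim *s)
{
	int n = 0;

	while (n < BURST && s->up.status[s->rx_idx] == ST_FILLED &&
	       s->down.status[s->tx_idx] == ST_FREE) {
		s->down.len[s->tx_idx] = s->up.len[s->rx_idx];
		s->up.status[s->rx_idx] = ST_FREE;
		s->down.status[s->tx_idx] = ST_FILLED;
		s->rx_idx = (s->rx_idx + 1) % NR_PACKET;
		s->tx_idx = (s->tx_idx + 1) % NR_PACKET;
		n++;
	}
	return n;
}

/* Filler: emulate packet-out by clearing filled "down" slots. */
static int filler_get(struct sim *s)
{
	int n = 0;

	while (n < BURST && s->down.status[s->drain_idx] == ST_FILLED) {
		s->down.status[s->drain_idx] = ST_FREE;
		s->drain_idx = (s->drain_idx + 1) % NR_PACKET;
		n++;
	}
	return n;
}

/* One "set": put a burst, forward it, get it back. Returns packets done. */
static int run_set(struct sim *s, int len)
{
	int put = filler_put(s, len);
	int fwd = forwarder_poll(s);
	int got = filler_get(s);

	assert(put == fwd && fwd == got);
	return got;
}
```

Counting how many times run_set() completes in a fixed period, multiplied by
the burst size, gives the forwarded-packet count from which throughput is
derived.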
  

Patch

diff --git a/guest/Makefile b/guest/Makefile
new file mode 100644
index 0000000..3c90350
--- /dev/null
+++ b/guest/Makefile
@@ -0,0 +1,20 @@ 
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# RTE_TARGET has no default; set it on the command line or in the environment
+ifeq ($(RTE_TARGET),)
+$(error "Please define RTE_TARGET environment variable")
+endif
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+COMMON_INC_OPT = -I $(PWD)/../common
+
+APP = memnic-tester
+
+CFLAGS += -Wall -g -O3 $(COMMON_INC_OPT)
+
+SRCS-y := memnic-tester.c
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/guest/README.rst b/guest/README.rst
new file mode 100644
index 0000000..760014e
--- /dev/null
+++ b/guest/README.rst
@@ -0,0 +1,94 @@ 
+.. Copyright 2014 NEC
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions
+   are met:
+   - Redistributions of source code must retain the above copyright
+     notice, this list of conditions and the following disclaimer.
+   - Redistributions in binary form must reproduce the above copyright
+     notice, this list of conditions and the following disclaimer in
+     the documentation and/or other materials provided with the
+     distribution.
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+   FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+   COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+   INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+   (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+   SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+   HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+   STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+   OF THE POSSIBILITY OF SUCH DAMAGE.
+
+MEMNIC TESTER
+=============
+
+DESCRIPTION
+-----------
+
+It is a simple benchmark test of MEMNIC PMD in guest.
+
+It has two threads: one thread produces and consumes packets, and the
+other thread receives packets and directly transmits the received
+packets back to the MEMNIC interface. This evaluates MEMNIC PMD running cost.
+
+The master thread does rx_burst and tx_burst through MEMNIC PMD.
+            +---------+
+            | master  |
+            +---------+
+     rx_burst ^     | tx_burst
+              |     V
+          +------+------+
+          |  up  | down | MEMNIC shared memory
+          +------+------+
+     set flag ^     | unset flag
+              |     V
+            +---------+
+            |  slave  |
+            +---------+
+The slave thread emulates packet-in/out by setting flag on/off.
+
+The number of sets completed in a fixed period represents the MEMNIC PMD
+performance. The slave workload must be very low.
+
+     slave  |<- put packets ->|     |<- get packets ->|
+     master |   |<- rx packets ->|<- tx packets ->|   |
+            |<----------------- set ----------------->|
+
+As in RFC 2544, the evaluation is run with packets of the following frame sizes:
+  64, 128, 256, 512, 1024, 1280, 1518
+
+It reports the result in packets per second for each frame size.
+
+HOW TO BUILD
+------------
+
+DPDK and the DPDK MEMNIC PMD must be built first, as shown below::
+
+  cd /path/to/dpdk
+  make install T=x86_64-native-linuxapp-gcc
+  cd /path/to/memnic/pmd
+  make RTE_INCLUDE=/path/to/dpdk/x86_64-native-linuxapp-gcc/include
+  cd /path/to/memnic/guest
+  make RTE_SDK=/path/to/dpdk RTE_TARGET=x86_64-native-linuxapp-gcc
+
+The file ``memnic-tester`` is generated under the ``build`` directory.
+
+HOW TO RUN
+----------
+
+On the host, the MEMNIC device must first be initialized by a suitable program.
+``memnic-host-sim`` takes care of this::
+
+  [host]# ./memnic-host-sim /dev/shm/ivshm
+
+Then stop ``memnic-host-sim`` with CTRL-C.
+
+Run ``memnic-tester`` in the guest::
+
+  [guest]# echo 64 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+  [guest]# mount -t hugetlbfs nodev /mnt/huge
+  [guest]# ./build/memnic-tester -c 0x6 -n 4 -d /path/to/librte_pmd_memnic_copy.so
+
+The result shows how many packets are handled by MEMNIC PMD per second.
diff --git a/guest/memnic-tester.c b/guest/memnic-tester.c
new file mode 100644
index 0000000..10e304b
--- /dev/null
+++ b/guest/memnic-tester.c
@@ -0,0 +1,281 @@ 
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2014 NEC All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#include <rte_eal.h>
+#include <rte_ethdev.h>
+#include <rte_pci.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+#include <rte_cycles.h>
+#include <rte_byteorder.h>
+#include <rte_ether.h>
+#include <rte_ring.h>
+
+#include "memnic.h"
+
+#define PKTS_BURST_SIZE	(32)
+#define TEST_DURATION	(10)
+
+static const struct rte_eth_conf port_conf = {
+	.rxmode = {
+		.jumbo_frame = 0,
+	},
+	.txmode = {
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+};
+
+static const struct rte_eth_rxconf rx_conf = {
+	.rx_thresh = {
+		.pthresh = 8,
+		.hthresh = 8,
+		.wthresh = 4,
+	},
+};
+
+static const struct rte_eth_txconf tx_conf = {
+	.tx_thresh = {
+		.pthresh = 36,
+		.hthresh = 0,
+		.wthresh = 0,
+	},
+};
+
+static const unsigned nr_rxdesc = 128;
+static const unsigned nr_txdesc = 512;
+
+static struct rte_mempool *mempool;
+
+static struct memnic_area *memnic;
+
+static void init_port(unsigned portid)
+{
+	if (rte_eth_dev_configure(portid, 1, 1, &port_conf) < 0) {
+		rte_exit(EXIT_FAILURE, "failed to configure port %u\n",
+			portid);
+	}
+	if (rte_eth_rx_queue_setup(portid, 0, nr_rxdesc,
+			rte_eth_dev_socket_id(portid), &rx_conf,
+			mempool) < 0) {
+		rte_exit(EXIT_FAILURE,
+			"failed to configure rx queue port %u\n",
+			portid);
+	}
+	if (rte_eth_tx_queue_setup(portid, 0, nr_txdesc,
+			rte_eth_dev_socket_id(portid), &tx_conf) < 0) {
+		rte_exit(EXIT_FAILURE,
+			"failed to configure tx queue port %u\n",
+			portid);
+	}
+
+	rte_eth_promiscuous_enable(portid);
+}
+
+static void reset_memnic(void)
+{
+	struct memnic_header *hdr = &memnic->hdr;
+	struct memnic_data *up = &memnic->up;
+	struct memnic_data *down = &memnic->down;
+	int i;
+
+	/* prepare packet data */
+	for (i = 0; i < MEMNIC_NR_PACKET; i++) {
+		struct memnic_packet *p = &up->packets[i];
+
+		p->status = MEMNIC_PKT_ST_FREE;
+		p->len = 60; /* short packet */
+
+		/* don't care about content */
+	}
+
+	/* clear packet data */
+	for (i = 0; i < MEMNIC_NR_PACKET; i++) {
+		struct memnic_packet *p = &down->packets[i];
+
+		p->status = MEMNIC_PKT_ST_FREE;
+	}
+
+	/* use default framesz */
+	hdr->framesz = MEMNIC_MAX_FRAME_LEN;
+
+	rte_compiler_barrier();
+
+	hdr->reset = 0;
+	hdr->valid = 1;
+}
+
+static void slave(void)
+{
+	struct memnic_header *hdr = &memnic->hdr;
+	int up_idx, down_idx;
+	uint64_t hz = rte_get_tsc_hz(), next;
+	uint64_t count;
+	/* RFC2544 like */
+	uint32_t testset[] = {64, 128, 256, 512, 1024, 1280, 1518};
+	int n, nr_tests = sizeof(testset) / sizeof(uint32_t);
+
+	/* wait for the reset flag to be turned on */
+	while (ACCESS_ONCE(hdr->reset) == 0)
+		rte_pause();
+
+	/* wait a second to confirm that no one on the host handles this MEMNIC */
+	next = rte_rdtsc() + hz;
+	while (next > rte_rdtsc()) {
+		if (ACCESS_ONCE(hdr->valid))
+			rte_exit(EXIT_FAILURE, "MEMNIC is active\n");
+	}
+
+	reset_memnic();
+
+	up_idx = down_idx = 0;
+
+	for (n = 0; n < nr_tests; n++) {
+		struct memnic_data *up = &memnic->up;
+		struct memnic_data *down = &memnic->down;
+		struct memnic_packet *p;
+		int i;
+
+		/* prepare incoming packet */
+		for (i = 0; i < MEMNIC_NR_PACKET; i++) {
+			p = &up->packets[i];
+			p->len = testset[n] - 4; /* remove FCS */
+		}
+
+		count = 0;
+		next = rte_rdtsc() + hz * TEST_DURATION;
+		while (next > rte_rdtsc()) {
+			/* put packets */
+			for (i = 0; i < PKTS_BURST_SIZE; i++) {
+xmit_retry:
+				p = &up->packets[up_idx];
+				if (ACCESS_ONCE(p->status) != MEMNIC_PKT_ST_FREE)
+					goto xmit_retry;
+				if (++up_idx >= MEMNIC_NR_PACKET)
+					up_idx = 0;
+				p->status = MEMNIC_PKT_ST_FILLED;
+			}
+			/* get packets */
+			for (i = 0; i < PKTS_BURST_SIZE; i++) {
+recv_retry:
+				p = &down->packets[down_idx];
+				if (ACCESS_ONCE(p->status) != MEMNIC_PKT_ST_FILLED)
+					goto recv_retry;
+				if (++down_idx >= MEMNIC_NR_PACKET)
+					down_idx = 0;
+				p->status = MEMNIC_PKT_ST_FREE;
+			}
+			++count;
+		}
+		printf("frame size %u throughput %lu pps\n",
+			testset[n], (count * PKTS_BURST_SIZE) / TEST_DURATION);
+	}
+
+	/* finish the test */
+	rte_exit(EXIT_SUCCESS, "Test done\n");
+}
+
+static void master(void)
+{
+	if (rte_eth_dev_start(0) < 0)
+		rte_exit(EXIT_FAILURE, "failed to start device\n");
+
+	/* loop back forever */
+	for (;;) {
+		struct rte_mbuf *bufs[PKTS_BURST_SIZE];
+		int rx, tx;
+
+		rx = rte_eth_rx_burst(0, 0, bufs, PKTS_BURST_SIZE);
+		tx = 0;
+		while (rx != tx)
+			tx += rte_eth_tx_burst(0, 0, &bufs[tx], rx - tx);
+	}
+}
+
+static int lcore_main(void *p)
+{
+	if (rte_lcore_id() == rte_get_master_lcore())
+		master();
+	else
+		slave();
+
+	/* never reach here */
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	struct rte_eth_dev *dev;
+	struct memnic_adapter {
+		struct memnic_area *nic;
+	} *adapter;
+	int ret;
+	unsigned lcore_id;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		exit(1);
+
+	argc -= ret;
+	argv += ret;
+
+	if (rte_lcore_count() != 2)
+		rte_exit(EXIT_FAILURE, "Need exactly 2 lcores\n");
+
+	/* alloc mempool */
+	mempool = rte_mempool_create("pkt_mempool", 8192, 2048, 32,
+			sizeof(struct rte_pktmbuf_pool_private),
+			rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL,
+			rte_socket_id(), 0);
+	if (mempool == NULL)
+		rte_exit(EXIT_FAILURE, "failed to create mempool\n");
+	if (rte_eal_pci_probe() < 0)
+		rte_exit(EXIT_FAILURE, "failed to probe PCI\n");
+
+	/* get MEMNIC data from ether device */
+	dev = &rte_eth_devices[0];
+	adapter = (struct memnic_adapter *)(dev->data->dev_private);
+
+	memnic = adapter->nic;
+
+	if (memnic->hdr.magic != MEMNIC_MAGIC)
+		rte_exit(EXIT_FAILURE, "Not a MEMNIC device\n");
+
+	/* port 0 must initialize MEMNIC */
+	init_port(0);
+
+	rte_eal_mp_remote_launch(lcore_main, NULL, CALL_MASTER);
+	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+		if (rte_eal_wait_lcore(lcore_id) < 0)
+			return -1;
+	}
+
+	return 0;
+}