From patchwork Tue Dec 6 12:24:40 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 17704 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id B4B3E2C2D; Tue, 6 Dec 2016 13:24:53 +0100 (CET) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 97C572B8F for ; Tue, 6 Dec 2016 13:24:51 +0100 (CET) Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C5E56C0567B5; Tue, 6 Dec 2016 12:24:50 +0000 (UTC) Received: from max-t460s.redhat.com ([10.36.126.15]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id uB6COjbV022322; Tue, 6 Dec 2016 07:24:46 -0500 From: Maxime Coquelin To: yuanhan.liu@linux.intel.com, thomas.monjalon@6wind.com, john.mcnamara@intel.com, zhiyong.yang@intel.com, ktraynor@redhat.com, dev@dpdk.org Cc: Maxime Coquelin Date: Tue, 6 Dec 2016 13:24:40 +0100 Message-Id: <20161206122440.12039-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Tue, 06 Dec 2016 12:24:51 +0000 (UTC) Subject: [dpdk-dev] [PATCH v2] doc: introduce PVP reference benchmark X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Having reference benchmarks is important in order to obtain reproducible performance figures. This patch describes required steps to configure a PVP setup using testpmd in both host and guest. Not relying on external vSwitch ease integration in a CI loop by not being impacted by DPDK API changes. Signed-off-by: Maxime Coquelin Reviewed-by: Yuanhan Liu Acked-by: Kevin Traynor Acked-by: John McNamara --- Thanks to all the reviewers, this v2 should take all remarks into account. -- Maxime doc/guides/howto/img/pvp_2nics.svg | 556 +++++++++++++++++++++++++++ doc/guides/howto/index.rst | 1 + doc/guides/howto/pvp_reference_benchmark.rst | 395 +++++++++++++++++++ 3 files changed, 952 insertions(+) create mode 100644 doc/guides/howto/img/pvp_2nics.svg create mode 100644 doc/guides/howto/pvp_reference_benchmark.rst diff --git a/doc/guides/howto/img/pvp_2nics.svg b/doc/guides/howto/img/pvp_2nics.svg new file mode 100644 index 0000000..517a800 --- /dev/null +++ b/doc/guides/howto/img/pvp_2nics.svg @@ -0,0 +1,556 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + TE + + 10G NIC + Moongen + + DUT + + VM + + 10G NIC + + TestPMD(macswap) + + TestPMD (io) + + + + + + + + 10G NIC + + 10G NIC + + + + + + + + + + diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst index 5575b27..712a9f3 100644 --- a/doc/guides/howto/index.rst +++ b/doc/guides/howto/index.rst @@ -38,3 +38,4 @@ HowTo Guides lm_bond_virtio_sriov lm_virtio_vhost_user flow_bifurcation + pvp_reference_benchmark diff --git a/doc/guides/howto/pvp_reference_benchmark.rst b/doc/guides/howto/pvp_reference_benchmark.rst new file mode 100644 index 0000000..c5bbd32 --- /dev/null +++ b/doc/guides/howto/pvp_reference_benchmark.rst @@ -0,0 +1,395 @@ +.. BSD LICENSE + Copyright(c) 2016 Red Hat, Inc. All rights reserved. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Intel Corporation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +PVP reference benchmark setup using testpmd +=========================================== + +This guide lists the steps required to setup a PVP benchmark using testpmd as +a simple forwarder between NICs and Vhost interfaces. The goal of this setup +is to have a reference PVP benchmark without using external vSwitches (OVS, +VPP, ...) to make it easier to obtain reproducible results and to facilitate +continuous integration testing. + +The guide covers two ways of launching the VM, either by directly calling the +QEMU command line, or by relying on libvirt. It has been tested with DPDK +v16.11 using RHEL7 for both host and guest. + + +Setup overview +-------------- + +.. _figure_pvp_2nics: + +.. figure:: img/pvp_2nics.* + + PVP setup using 2 NICs + +In this diagram, each red arrow represents one logical core. This use-case +requires 6 dedicated logical cores. A forwarding configuration with a single +NIC is also possible, requiring 3 logical cores. + + +Host setup +---------- + +In this setup, we isolate 6 cores (from CPU2 to CPU7) on the same NUMA +node. Two cores are assigned to the VM vCPUs running testpmd and four are +assigned to testpmd on the host. + + +Host tuning +~~~~~~~~~~~ + +#. On BIOS, disable turbo-boost and hyper-threads. + +#. Append these options to Kernel command line: + + .. code-block:: console + + intel_pstate=disable mce=ignore_ce default_hugepagesz=1G hugepagesz=1G hugepages=6 isolcpus=2-7 rcu_nocbs=2-7 nohz_full=2-7 iommu=pt intel_iommu=on + +#. Disable hyper-threads at runtime if necessary or if BIOS is not accessible: + + .. code-block:: console + + cat /sys/devices/system/cpu/cpu*[0-9]/topology/thread_siblings_list \ + | sort | uniq \ + | awk -F, '{system("echo 0 > /sys/devices/system/cpu/cpu"$2"/online")}' + +#. Disable NMIs: + + .. code-block:: console + + echo 0 > /proc/sys/kernel/nmi_watchdog + +#. Exclude isolated CPUs from the writeback cpumask: + + .. code-block:: console + + echo ffffff03 > /sys/bus/workqueue/devices/writeback/cpumask + +#. Isolate CPUs from IRQs: + + .. code-block:: console + + clear_mask=0xfc #Isolate CPU2 to CPU7 from IRQs + for i in /proc/irq/*/smp_affinity + do + echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i + done + + +Qemu build +~~~~~~~~~~ + +Build Qemu: + + .. code-block:: console + + git clone git://git.qemu.org/qemu.git + cd qemu + mkdir bin + cd bin + ../configure --target-list=x86_64-softmmu + + +DPDK build +~~~~~~~~~~ + +Build DPDK: + + .. code-block:: console + + git clone git://dpdk.org/dpdk + cd dpdk + export RTE_SDK=$PWD + make install T=x86_64-native-linuxapp-gcc DESTDIR=install + + +Testpmd launch +~~~~~~~~~~~~~~ + +#. Assign NICs to DPDK: + + .. code-block:: console + + modprobe vfio-pci + $RTE_SDK/install/sbin/dpdk-devbind -b vfio-pci 0000:11:00.0 0000:11:00.1 + + .. Note:: + + The Sandy Bridge family seems to have some IOMMU limitations giving poor + performance results. To achieve good performance on these machines + consider using UIO instead. + +#. Launch the testpmd application: + + .. code-block:: console + + $RTE_SDK/install/bin/testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4 \ + --vdev 'net_vhost0,iface=/tmp/vhost-user1' \ + --vdev 'net_vhost1,iface=/tmp/vhost-user2' -- \ + --portmask=f --disable-hw-vlan -i --rxq=1 --txq=1 + --nb-cores=4 --forward-mode=io + +With this command, isolated CPUs 2 to 5 will be used as lcores for PMD threads. + +#. In testpmd interactive mode, set the portlist to obtain the correct port + chaining: + + .. code-block:: console + + set portlist 0,2,1,3 + start + + +VM launch +~~~~~~~~~ + +The VM may be launched either by calling QEMU directly, or by using libvirt. + +#. Qemu way: + +Launch QEMU with two Virtio-net devices paired to the vhost-user sockets +created by testpmd. Below example uses default Virtio-net options, but options +may be specified, for example to disable mergeable buffers or indirect +descriptors. + + .. code-block:: console + + /bin/x86_64-softmmu/qemu-system-x86_64 \ + -enable-kvm -cpu host -m 3072 -smp 3 \ + -chardev socket,id=char0,path=/tmp/vhost-user1 \ + -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \ + -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,addr=0x10 \ + -chardev socket,id=char1,path=/tmp/vhost-user2 \ + -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \ + -device virtio-net-pci,netdev=mynet2,mac=52:54:00:02:d9:02,addr=0x11 \ + -object memory-backend-file,id=mem,size=3072M,mem-path=/dev/hugepages,share=on \ + -numa node,memdev=mem -mem-prealloc \ + -net user,hostfwd=tcp::1002$1-:22 -net nic \ + -qmp unix:/tmp/qmp.socket,server,nowait \ + -monitor stdio .qcow2 + +You can use this `qmp-vcpu-pin `_ +script to pin vCPUs. + +It can be used as follows, for example to pin 3 vCPUs to CPUs 1, 6 and 7, +where isolated CPUs 6 and 7 will be used as lcores for Virtio PMDs: + + .. code-block:: console + + export PYTHONPATH=$PYTHONPATH:/scripts/qmp + ./qmp-vcpu-pin -s /tmp/qmp.socket 1 6 7 + +#. Libvirt way: + +Some initial steps are required for libvirt to be able to connect to testpmd's +sockets. + +First, SELinux policy needs to be set to permissive, since testpmd is +generally run as root (note, as reboot is required): + + .. code-block:: console + + cat /etc/selinux/config + + # This file controls the state of SELinux on the system. + # SELINUX= can take one of these three values: + # enforcing - SELinux security policy is enforced. + # permissive - SELinux prints warnings instead of enforcing. + # disabled - No SELinux policy is loaded. + SELINUX=permissive + + # SELINUXTYPE= can take one of three two values: + # targeted - Targeted processes are protected, + # minimum - Modification of targeted policy. + # Only selected processes are protected. + # mls - Multi Level Security protection. + SELINUXTYPE=targeted + + +Also, Qemu needs to be run as root, which has to be specified in +``/etc/libvirt/qemu.conf``: + + .. code-block:: console + + user = "root" + +Once the domain created, the following snippet is an extract of he most +important information (hugepages, vCPU pinning, Virtio PCI devices): + + .. code-block:: xml + + + 3145728 + 3145728 + + + + + + + 3 + + + + + + + + + + + hvm + + + + + + + + + + + + + + +
+ + + + + + +
+ + + + + +Guest setup +----------- + + +Guest tuning +~~~~~~~~~~~~ + +#. Append these options to the Kernel command line: + + .. code-block:: console + + default_hugepagesz=1G hugepagesz=1G hugepages=1 intel_iommu=on iommu=pt isolcpus=1,2 rcu_nocbs=1,2 nohz_full=1,2 + +#. Disable NMIs: + + .. code-block:: console + + echo 0 > /proc/sys/kernel/nmi_watchdog + +#. Exclude isolated CPU1 and CPU2 from the writeback cpumask: + + .. code-block:: console + + echo 1 > /sys/bus/workqueue/devices/writeback/cpumask + +#. Isolate CPUs from IRQs: + + .. code-block:: console + + clear_mask=0x6 #Isolate CPU1 and CPU2 from IRQs + for i in /proc/irq/*/smp_affinity + do + echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i + done + + +DPDK build +~~~~~~~~~~ + +Build DPDK: + + .. code-block:: console + + git clone git://dpdk.org/dpdk + cd dpdk + export RTE_SDK=$PWD + make install T=x86_64-native-linuxapp-gcc DESTDIR=install + + +Testpmd launch +~~~~~~~~~~~~~~ + +Probe vfio module without iommu: + + .. code-block:: console + + modprobe -r vfio_iommu_type1 + modprobe -r vfio + modprobe vfio enable_unsafe_noiommu_mode=1 + cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode + modprobe vfio-pci + +Bind the virtio-net devices to DPDK: + + .. code-block:: console + + $RTE_SDK/tools/dpdk-devbind.py -b vfio-pci 0000:00:10.0 0000:00:11.0 + +Start testpmd: + + .. code-block:: console + + $RTE_SDK/install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 \ + --proc-type auto --file-prefix pg -- \ + --portmask=3 --forward-mode=macswap --port-topology=chained \ + --disable-hw-vlan --disable-rss -i --rxq=1 --txq=1 \ + --rxd=256 --txd=256 --nb-cores=2 --auto-start + +Results template +---------------- + +Below template should be used when sharing results: + + .. code-block:: none + + Traffic Generator: + Acceptable Loss: % + Validation run time: min + Host DPDK version/commit: + Guest DPDK version/commit: + Patches applied: + QEMU version/commit: + Virtio features: + CPU: , + NIC: + Result: Mpps