From patchwork Wed Nov 23 21:00:06 2016
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 17208
X-Patchwork-Delegate: thomas@monjalon.net
From: Maxime Coquelin
To: yuanhan.liu@linux.intel.com, thomas.monjalon@6wind.com,
 john.mcnamara@intel.com, zhiyong.yang@intel.com, dev@dpdk.org
Cc: fbaudin@redhat.com, Maxime Coquelin
Date: Wed, 23 Nov 2016 22:00:06 +0100
Message-Id: <20161123210006.7113-1-maxime.coquelin@redhat.com>
Subject: [dpdk-dev] [PATCH] doc: introduce PVP reference benchmark

Having reference benchmarks is important in order to obtain reproducible
performance figures. This patch describes the steps required to configure a
PVP setup using testpmd in both host and guest. Not relying on an external
vSwitch eases integration in a CI loop, as the setup is not impacted by DPDK
API changes.
Signed-off-by: Maxime Coquelin
---
 doc/guides/howto/img/pvp_2nics.svg           | 556 +++++++++++++++++++++++++++
 doc/guides/howto/index.rst                   |   1 +
 doc/guides/howto/pvp_reference_benchmark.rst | 389 +++++++++++++++++++
 3 files changed, 946 insertions(+)
 create mode 100644 doc/guides/howto/img/pvp_2nics.svg
 create mode 100644 doc/guides/howto/pvp_reference_benchmark.rst

diff --git a/doc/guides/howto/img/pvp_2nics.svg b/doc/guides/howto/img/pvp_2nics.svg
new file mode 100644
index 0000000..517a800
--- /dev/null
+++ b/doc/guides/howto/img/pvp_2nics.svg
@@ -0,0 +1,556 @@
[SVG markup not reproduced: diagram of the PVP setup, showing the TE (Moongen
with two 10G NICs) connected to the DUT, which runs TestPMD (io) on the host
and TestPMD (macswap) in the VM.]

diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst
index 5575b27..712a9f3 100644
--- a/doc/guides/howto/index.rst
+++ b/doc/guides/howto/index.rst
@@ -38,3 +38,4 @@ HowTo Guides
     lm_bond_virtio_sriov
     lm_virtio_vhost_user
     flow_bifurcation
+    pvp_reference_benchmark

diff --git a/doc/guides/howto/pvp_reference_benchmark.rst b/doc/guides/howto/pvp_reference_benchmark.rst
new file mode 100644
index 0000000..042c6aa
--- /dev/null
+++ b/doc/guides/howto/pvp_reference_benchmark.rst
@@ -0,0 +1,389 @@
.. BSD LICENSE
   Copyright(c) 2016 Red Hat, Inc. All rights reserved.
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:

   * Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.
   * Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in
     the documentation and/or other materials provided with the
     distribution.
   * Neither the name of Intel Corporation nor the names of its
     contributors may be used to endorse or promote products derived
     from this software without specific prior written permission.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

PVP reference benchmark setup using testpmd
===========================================

This guide lists the steps required to set up a PVP benchmark using testpmd as
a simple forwarder between NICs and Vhost interfaces. The goal of this setup
is to have a reference PVP benchmark that does not use an external vSwitch
(OVS, VPP, ...), making it easier to obtain reproducible results and easing
continuous integration testing.

The guide covers two ways of launching the VM, either by calling QEMU directly
on the command line, or by relying on libvirt. It has been tested with DPDK
v16.11, using RHEL7 for both host and guest.

Setup overview
..............

.. figure:: img/pvp_2nics.svg

   PVP setup using 2 NICs

In this diagram, each red arrow represents one logical core. This use case
requires 6 dedicated logical cores. A setup doing the forwarding with a single
NIC is also possible, requiring 3 logical cores.

Host setup
..........

In this setup, we isolate 6 cores (from CPU2 to CPU7) on the same NUMA node.
Two cores are assigned to the VM vCPUs running testpmd and four are assigned
to testpmd on the host.

Host tuning
~~~~~~~~~~~

#. Append these options to the Kernel command line (a quick way to check them
   after reboot is shown after this list):

   .. code-block:: console

      intel_pstate=disable mce=ignore_ce default_hugepagesz=1G hugepagesz=1G hugepages=6 isolcpus=2-7 rcu_nocbs=2-7 nohz_full=2-7 iommu=pt intel_iommu=on

#. Disable hyper-threads at runtime, if necessary and the BIOS is not
   accessible:

   .. code-block:: console

      cat /sys/devices/system/cpu/cpu*[0-9]/topology/thread_siblings_list \
          | sort | uniq \
          | awk -F, '{system("echo 0 > /sys/devices/system/cpu/cpu"$2"/online")}'

#. Disable NMIs:

   .. code-block:: console

      echo 0 > /proc/sys/kernel/nmi_watchdog

#. Exclude isolated CPUs from the writeback cpumask:

   .. code-block:: console

      echo ffffff03 > /sys/bus/workqueue/devices/writeback/cpumask

#. Isolate CPUs from IRQs:

   .. code-block:: console

      clear_mask=0xfc # Isolate CPU2 to CPU7 from IRQs
      for i in /proc/irq/*/smp_affinity
      do
          echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
      done
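After rebooting, it is worth confirming that the options from the first step
took effect; a minimal check of the kernel command line and of the 1G hugepage
pool:

   .. code-block:: console

      cat /proc/cmdline
      grep -i huge /proc/meminfo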
Qemu build
~~~~~~~~~~

   .. code-block:: console

      git clone git://git.qemu.org/qemu.git
      cd qemu
      mkdir bin
      cd bin
      ../configure --target-list=x86_64-softmmu
      make

DPDK build
~~~~~~~~~~

   .. code-block:: console

      git clone git://dpdk.org/dpdk
      cd dpdk
      export RTE_SDK=$PWD
      make install T=x86_64-native-linuxapp-gcc DESTDIR=install

Testpmd launch
~~~~~~~~~~~~~~

#. Assign NICs to DPDK:

   .. code-block:: console

      modprobe vfio-pci
      $RTE_SDK/install/sbin/dpdk-devbind -b vfio-pci 0000:11:00.0 0000:11:00.1

*Note: the Sandy Bridge family seems to have some limitations with regard to
its IOMMU, giving poor performance results. To achieve good performance on
these machines, consider using UIO instead.*

#. Launch the testpmd application:

   .. code-block:: console

      $RTE_SDK/install/bin/testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4 \
          --vdev 'net_vhost0,iface=/tmp/vhost-user1' \
          --vdev 'net_vhost1,iface=/tmp/vhost-user2' -- \
          --portmask=f --disable-hw-vlan -i --rxq=1 --txq=1 \
          --nb-cores=4 --forward-mode=io

#. In testpmd interactive mode, set the portlist to obtain the right chaining:

   .. code-block:: console

      set portlist 0,2,1,3
      start
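The portlist above pairs each physical port with a vhost-user port, so that
packets received on a NIC are forwarded to the VM and back. The resulting
forwarding streams can be double checked from the testpmd prompt:

   .. code-block:: console

      show config fwd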
VM launch
~~~~~~~~~

The VM may be launched either by calling QEMU directly, or by using libvirt.

#. Qemu way:

Launch QEMU with two Virtio-net devices paired to the vhost-user sockets
created by testpmd:

   .. code-block:: console

      <QEMU path>/bin/x86_64-softmmu/qemu-system-x86_64 \
          -enable-kvm -cpu host -m 3072 -smp 3 \
          -chardev socket,id=char0,path=/tmp/vhost-user1 \
          -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
          -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,addr=0x10 \
          -chardev socket,id=char1,path=/tmp/vhost-user2 \
          -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
          -device virtio-net-pci,netdev=mynet2,mac=52:54:00:02:d9:02,addr=0x11 \
          -object memory-backend-file,id=mem,size=3072M,mem-path=/dev/hugepages,share=on \
          -numa node,memdev=mem -mem-prealloc \
          -net user,hostfwd=tcp::1002$1-:22 -net nic \
          -qmp unix:/tmp/qmp.socket,server,nowait \
          -monitor stdio <vm_image>.qcow2

You can use this qmp-vcpu-pin script to pin vCPUs:

   .. code-block:: python

      #!/usr/bin/python
      # QEMU vCPU pinning tool
      #
      # Copyright (C) 2016 Red Hat Inc.
      #
      # Authors:
      #  Maxime Coquelin <maxime.coquelin@redhat.com>
      #
      # This work is licensed under the terms of the GNU GPL, version 2. See
      # the COPYING file in the top-level directory
      import argparse
      import json
      import os

      from subprocess import call
      from qmp import QEMUMonitorProtocol

      pinned = []

      parser = argparse.ArgumentParser(description='Pin QEMU vCPUs to physical CPUs')
      parser.add_argument('-s', '--server', type=str, required=True,
                          help='QMP server path or address:port')
      parser.add_argument('cpu', type=int, nargs='+',
                          help='Physical CPUs IDs')
      args = parser.parse_args()

      devnull = open(os.devnull, 'w')

      srv = QEMUMonitorProtocol(args.server)
      srv.connect()

      for vcpu in srv.command('query-cpus'):
          vcpuid = vcpu['CPU']
          tid = vcpu['thread_id']
          if tid in pinned:
              print 'vCPU{}\'s tid {} already pinned, skipping'.format(vcpuid, tid)
              continue

          cpuid = args.cpu[vcpuid % len(args.cpu)]
          print 'Pin vCPU {} (tid {}) to physical CPU {}'.format(vcpuid, tid, cpuid)
          try:
              call(['taskset', '-pc', str(cpuid), str(tid)], stdout=devnull)
              pinned.append(tid)
          except OSError:
              print 'Failed to pin vCPU{} to CPU{}'.format(vcpuid, cpuid)

The script can be used this way, for example to pin 3 vCPUs to CPUs 1, 6 and 7:

   .. code-block:: console

      export PYTHONPATH=$PYTHONPATH:<QEMU path>/scripts/qmp
      ./qmp-vcpu-pin -s /tmp/qmp.socket 1 6 7

#. Libvirt way:

Some initial steps are required for libvirt to be able to connect to testpmd's
sockets.

First, the SELinux policy needs to be set to permissive, as testpmd is run as
root (reboot required):

   .. code-block:: console

      cat /etc/selinux/config

      # This file controls the state of SELinux on the system.
      # SELINUX= can take one of these three values:
      #     enforcing - SELinux security policy is enforced.
      #     permissive - SELinux prints warnings instead of enforcing.
      #     disabled - No SELinux policy is loaded.
      SELINUX=permissive

      # SELINUXTYPE= can take one of three two values:
      #     targeted - Targeted processes are protected,
      #     minimum - Modification of targeted policy. Only selected processes are protected.
      #     mls - Multi Level Security protection.
      SELINUXTYPE=targeted

Also, Qemu needs to be run as root, which has to be specified in
/etc/libvirt/qemu.conf:

   .. code-block:: console

      user = "root"

Once the domain has been created, the most important elements to check are the
hugepage memory backing, the vCPU pinning and the Virtio PCI devices; a sketch
of such a domain definition is shown below.
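This is a minimal sketch of such a domain, not a complete definition: the
memory size and vCPU count match the Qemu command line above, the vCPUs are
pinned to host CPUs 1, 6 and 7 as in the qmp-vcpu-pin example, and the two
vhost-user interfaces point to the sockets created by testpmd. The domain name
is a placeholder, and the usual elements (disk, console, ...) still need to be
added:

   .. code-block:: xml

      <domain type='kvm'>
        <name>pvp-vm</name> <!-- placeholder domain name -->
        <memory unit='KiB'>3145728</memory>
        <currentMemory unit='KiB'>3145728</currentMemory>
        <memoryBacking>
          <hugepages>
            <page size='1048576' unit='KiB'/>
          </hugepages>
        </memoryBacking>
        <vcpu placement='static'>3</vcpu>
        <cputune>
          <vcpupin vcpu='0' cpuset='1'/>
          <vcpupin vcpu='1' cpuset='6'/>
          <vcpupin vcpu='2' cpuset='7'/>
        </cputune>
        <os>
          <type arch='x86_64' machine='pc'>hvm</type>
        </os>
        <cpu mode='host-passthrough'>
          <numa>
            <!-- shared memory access is required by vhost-user -->
            <cell id='0' cpus='0-2' memory='3145728' unit='KiB' memAccess='shared'/>
          </numa>
        </cpu>
        <devices>
          <interface type='vhostuser'>
            <mac address='52:54:00:02:d9:01'/>
            <source type='unix' path='/tmp/vhost-user1' mode='client'/>
            <model type='virtio'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x10' function='0x0'/>
          </interface>
          <interface type='vhostuser'>
            <mac address='52:54:00:02:d9:02'/>
            <source type='unix' path='/tmp/vhost-user2' mode='client'/>
            <model type='virtio'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x11' function='0x0'/>
          </interface>
        </devices>
      </domain>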
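Once the XML is complete, the domain can be defined and started with virsh,
and the vCPU pinning verified from the host (``pvp-vm`` being the placeholder
name used in the sketch above):

   .. code-block:: console

      virsh define pvp-vm.xml
      virsh start pvp-vm
      virsh vcpupin pvp-vm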
Guest setup
...........

Guest tuning
~~~~~~~~~~~~

#. Append these options to the Kernel command line:

   .. code-block:: console

      default_hugepagesz=1G hugepagesz=1G hugepages=1 intel_iommu=on iommu=pt isolcpus=1,2 rcu_nocbs=1,2 nohz_full=1,2

#. Disable NMIs:

   .. code-block:: console

      echo 0 > /proc/sys/kernel/nmi_watchdog

#. Exclude isolated CPU1 and CPU2 from the writeback workqueue cpumask:

   .. code-block:: console

      echo 1 > /sys/bus/workqueue/devices/writeback/cpumask

#. Isolate CPUs from IRQs:

   .. code-block:: console

      clear_mask=0x6 # Isolate CPU1 and CPU2 from IRQs
      for i in /proc/irq/*/smp_affinity
      do
          echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
      done

DPDK build
~~~~~~~~~~

   .. code-block:: console

      git clone git://dpdk.org/dpdk
      cd dpdk
      export RTE_SDK=$PWD
      make install T=x86_64-native-linuxapp-gcc DESTDIR=install

Testpmd launch
~~~~~~~~~~~~~~

Probe the vfio module without IOMMU:

   .. code-block:: console

      modprobe -r vfio_iommu_type1
      modprobe -r vfio
      modprobe vfio enable_unsafe_noiommu_mode=1
      cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
      modprobe vfio-pci

Bind the virtio-net devices to DPDK:

   .. code-block:: console

      $RTE_SDK/tools/dpdk-devbind.py -b vfio-pci 0000:00:10.0 0000:00:11.0

Start testpmd:

   .. code-block:: console

      $RTE_SDK/install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 \
          --proc-type auto --file-prefix pg -- \
          --portmask=3 --forward-mode=macswap --port-topology=chained \
          --disable-hw-vlan --disable-rss -i --rxq=1 --txq=1 \
          --rxd=256 --txd=256 --nb-cores=2 --auto-start
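At this point the host testpmd forwards in ``io`` mode between the NICs and
the vhost-user ports, and the guest testpmd forwards in ``macswap`` mode
between the two Virtio devices. Once the traffic generator (Moongen on the TE
in the figure above) is sending packets, the forwarding can be checked from
the host testpmd prompt:

   .. code-block:: console

      show port stats all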