From patchwork Thu Jan 21 11:07:58 2016
X-Patchwork-Submitter: Tetsuya Mukawa
X-Patchwork-Id: 10035
From: Tetsuya Mukawa <mukawa@igel.co.jp>
To: dev@dpdk.org, yuanhan.liu@linux.intel.com, jianfeng.tan@intel.com
Date: Thu, 21 Jan 2016 20:07:58 +0900
Message-Id: <1453374478-30996-6-git-send-email-mukawa@igel.co.jp>
In-Reply-To: <1453374478-30996-1-git-send-email-mukawa@igel.co.jp>
References: <1453374478-30996-1-git-send-email-mukawa@igel.co.jp>
Subject: [dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

virtio: Extend virtio-net PMD to support container environment

This patch adds a new virtio-net PMD configuration that allows the PMD
to run on the host as if it were running inside a VM:

 - CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE

To use this mode, EAL needs physically contiguous memory. To allocate
such memory, add the "--shm" option to the application command line.

To prepare a virtio-net device on the host, the user needs to invoke a
QEMU process in the special qtest mode. This mode is mainly used for
testing QEMU devices from an external process; no guest runs in this
mode. Here is the QEMU command line.
$ qemu-system-x86_64 \
     -machine pc-i440fx-1.4,accel=qtest \
     -display none -qtest-log /dev/null \
     -qtest unix:/tmp/socket,server \
     -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
     -device virtio-net-pci,netdev=net0,mq=on \
     -chardev socket,id=chr1,path=/tmp/ivshmem,server \
     -device ivshmem,size=1G,chardev=chr1,vectors=1

* One QEMU process is needed per port.
* Virtio-1.0 devices are supported.
* In most cases, just using the above command is enough.
* Vhost backends such as vhost-net and vhost-user can be specified.
* Only the "pc-i440fx-1.4" machine has been checked, but other machines
  may work as long as they have a PIIX3 south bridge. Without it, the
  virtio-net PMD cannot receive status change interrupts.
* Do not add "--enable-kvm" to the QEMU command line.

After invoking QEMU, the PMD can connect to the QEMU process using unix
domain sockets. Over these sockets, the virtio-net, ivshmem and piix3
devices in QEMU are probed by the PMD. Here is an example command line.

$ testpmd -c f -n 1 -m 1024 --shm \
     --vdev="eth_virtio_net0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem" \
     -- --disable-hw-vlan --txqflags=0xf00 -i

Please specify the same unix domain sockets and memory size in both the
QEMU and DPDK command lines, as above. The shared memory size should be
a power of 2, because ivshmem only accepts such memory sizes.
Signed-off-by: Tetsuya Mukawa --- config/common_linuxapp | 1 + drivers/net/virtio/Makefile | 4 + drivers/net/virtio/qtest.c | 1237 ++++++++++++++++++++++++++++++++++++ drivers/net/virtio/virtio_ethdev.c | 450 ++++++++++--- drivers/net/virtio/virtio_ethdev.h | 12 + drivers/net/virtio/virtio_pci.c | 190 +++++- drivers/net/virtio/virtio_pci.h | 16 + drivers/net/virtio/virtio_rxtx.c | 3 +- drivers/net/virtio/virtqueue.h | 9 +- 9 files changed, 1845 insertions(+), 77 deletions(-) create mode 100644 drivers/net/virtio/qtest.c diff --git a/config/common_linuxapp b/config/common_linuxapp index 74bc515..04682f6 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -269,6 +269,7 @@ CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n # Compile burst-oriented VIRTIO PMD driver # CONFIG_RTE_LIBRTE_VIRTIO_PMD=y +CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE=y CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_INIT=n CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile index 43835ba..697e629 100644 --- a/drivers/net/virtio/Makefile +++ b/drivers/net/virtio/Makefile @@ -52,6 +52,10 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx.c SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c +ifeq ($(CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE),y) + SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += qtest.c +endif + # this lib depends upon: DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_mempool lib/librte_mbuf diff --git a/drivers/net/virtio/qtest.c b/drivers/net/virtio/qtest.c new file mode 100644 index 0000000..717bee9 --- /dev/null +++ b/drivers/net/virtio/qtest.c @@ -0,0 +1,1237 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2015 IGEL Co., Ltd. All rights reserved. + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of IGEL Co., Ltd. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <signal.h>
+#include <pthread.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/queue.h>
+
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_atomic.h>
+#include <rte_interrupts.h>
+
+#include "virtio_pci.h"
+#include "virtio_logs.h"
+#include "virtio_ethdev.h"
+
+#define NB_BUS 256
+#define NB_DEVICE 32
+#define NB_BAR 6
+
+/* PCI common configuration registers */
+#define REG_ADDR_VENDOR_ID 0x0
+#define REG_ADDR_DEVICE_ID 0x2
+#define REG_ADDR_COMMAND 0x4
+#define REG_ADDR_STATUS 0x6
+#define REG_ADDR_REVISION_ID 0x8
+#define REG_ADDR_CLASS_CODE 0x9
+#define REG_ADDR_CACHE_LINE_S 0xc
+#define REG_ADDR_LAT_TIMER 0xd
+#define REG_ADDR_HEADER_TYPE 0xe
+#define REG_ADDR_BIST 0xf
+#define REG_ADDR_BAR0 0x10
+#define REG_ADDR_BAR1 0x14
+#define REG_ADDR_BAR2 0x18
+#define REG_ADDR_BAR3 0x1c
+#define REG_ADDR_BAR4 0x20
+#define REG_ADDR_BAR5 0x24
+
+/* PCI common configuration register values */
+#define REG_VAL_COMMAND_IO 0x1
+#define REG_VAL_COMMAND_MEMORY 0x2
+#define REG_VAL_COMMAND_MASTER 0x4
+#define REG_VAL_HEADER_TYPE_ENDPOINT 0x0
+#define REG_VAL_BAR_MEMORY 0x0
+#define REG_VAL_BAR_IO 0x1
+#define REG_VAL_BAR_LOCATE_32 0x0
+#define REG_VAL_BAR_LOCATE_UNDER_1MB 0x2
+#define REG_VAL_BAR_LOCATE_64 0x4
+
+/* PIIX3 configuration registers */
+#define PIIX3_REG_ADDR_PIRQA 0x60
+#define PIIX3_REG_ADDR_PIRQB 0x61
+#define PIIX3_REG_ADDR_PIRQC 0x62
+#define PIIX3_REG_ADDR_PIRQD 0x63
+
+/* Device information */
+#define VIRTIO_NET_DEVICE_ID 0x1000
+#define VIRTIO_NET_VENDOR_ID 0x1af4
+#define VIRTIO_NET_IO_START 0xc000
+#define VIRTIO_NET_MEMORY1_START 0x1000000000
+#define VIRTIO_NET_MEMORY2_START 0x2000000000
+#define VIRTIO_NET_IRQ_NUM 10
+#define IVSHMEM_DEVICE_ID 0x1110
+#define IVSHMEM_VENDOR_ID 0x1af4
+#define IVSHMEM_MEMORY_START 0x3000000000
+#define IVSHMEM_PROTOCOL_VERSION 0
+#define PIIX3_DEVICE_ID 0x7000
+#define PIIX3_VENDOR_ID 0x8086
+
+#define PCI_CONFIG_ADDR(_bus, _device, _function, _offset) ( \
+	(1 << 31) | ((_bus) & 0xff) << 16 | ((_device) & 0x1f) << 11 | \
+	
((_function) & 0xf) << 8 | ((_offset) & 0xfc)) + +static char interrupt_message[32]; + +enum qtest_pci_bar_type { + QTEST_PCI_BAR_DISABLE = 0, + QTEST_PCI_BAR_IO, + QTEST_PCI_BAR_MEMORY_UNDER_1MB, + QTEST_PCI_BAR_MEMORY_32, + QTEST_PCI_BAR_MEMORY_64 +}; + +struct qtest_pci_bar { + enum qtest_pci_bar_type type; + uint8_t addr; + uint64_t region_start; + uint64_t region_size; +}; + +struct qtest_session; +TAILQ_HEAD(qtest_pci_device_list, qtest_pci_device); +struct qtest_pci_device { + TAILQ_ENTRY(qtest_pci_device) next; + const char *name; + uint16_t device_id; + uint16_t vendor_id; + uint8_t bus_addr; + uint8_t device_addr; + struct qtest_pci_bar bar[NB_BAR]; + int (*init)(struct qtest_session *s, struct qtest_pci_device *dev); +}; + +union qtest_pipefds { + struct { + int pipefd[2]; + }; + struct { + int readfd; + int writefd; + }; +}; + +struct qtest_session { + int qtest_socket; + pthread_mutex_t qtest_session_lock; + + struct qtest_pci_device_list head; + int ivshmem_socket; + + pthread_t event_th; + union qtest_pipefds msgfds; + + pthread_t intr_th; + union qtest_pipefds irqfds; + rte_atomic16_t enable_intr; + rte_intr_callback_fn cb; + void *cb_arg; +}; + +static int +qtest_raw_send(int fd, char *buf, size_t count) +{ + size_t len = count; + size_t total_len = 0; + int ret = 0; + + while (len > 0) { + ret = write(fd, buf, len); + if (ret == (int)len) + break; + if (ret == -1) { + if (errno == EINTR) + continue; + return ret; + } + total_len += ret; + buf += ret; + len -= ret; + } + return total_len + ret; +} + +static int +qtest_raw_recv(int fd, char *buf, size_t count) +{ + size_t len = count; + size_t total_len = 0; + int ret = 0; + + while (len > 0) { + ret = read(fd, buf, len); + if (ret == (int)len) + break; + if (*(buf + ret - 1) == '\n') + break; + if (ret == -1) { + if (errno == EINTR) + continue; + return ret; + } + total_len += ret; + buf += ret; + len -= ret; + } + return total_len + ret; +} + +/* + * To know QTest protocol specification, see below 
QEMU source code. + * - qemu/qtest.c + * If qtest socket is closed, qtest_raw_in and qtest_raw_read will return 0. + */ +static uint32_t +qtest_raw_in(struct qtest_session *s, uint16_t addr, char type) +{ + char buf[1024]; + int ret; + + if ((type != 'l') && (type != 'w') && (type != 'b')) + rte_panic("Invalid value\n"); + + snprintf(buf, sizeof(buf), "in%c 0x%x\n", type, addr); + /* write to qtest socket */ + ret = qtest_raw_send(s->qtest_socket, buf, strlen(buf)); + /* read reply from event handler */ + ret = qtest_raw_recv(s->msgfds.readfd, buf, sizeof(buf)); + if (ret < 0) + return 0; + + buf[ret] = '\0'; + return strtoul(buf + strlen("OK "), NULL, 16); +} + +static void +qtest_raw_out(struct qtest_session *s, uint16_t addr, uint32_t val, char type) +{ + char buf[1024]; + int ret __rte_unused; + + if ((type != 'l') && (type != 'w') && (type != 'b')) + rte_panic("Invalid value\n"); + + snprintf(buf, sizeof(buf), "out%c 0x%x 0x%x\n", type, addr, val); + /* write to qtest socket */ + ret = qtest_raw_send(s->qtest_socket, buf, strlen(buf)); + /* read reply from event handler */ + ret = qtest_raw_recv(s->msgfds.readfd, buf, sizeof(buf)); +} + +static uint32_t +qtest_raw_read(struct qtest_session *s, uint64_t addr, char type) +{ + char buf[1024]; + int ret; + + if ((type != 'l') && (type != 'w') && (type != 'b')) + rte_panic("Invalid value\n"); + + snprintf(buf, sizeof(buf), "read%c 0x%lx\n", type, addr); + /* write to qtest socket */ + ret = qtest_raw_send(s->qtest_socket, buf, strlen(buf)); + /* read reply from event handler */ + ret = qtest_raw_recv(s->msgfds.readfd, buf, sizeof(buf)); + if (ret < 0) + return 0; + + buf[ret] = '\0'; + return strtoul(buf + strlen("OK "), NULL, 16); +} + +static void +qtest_raw_write(struct qtest_session *s, uint64_t addr, uint32_t val, char type) +{ + char buf[1024]; + int ret __rte_unused; + + if ((type != 'l') && (type != 'w') && (type != 'b')) + rte_panic("Invalid value\n"); + + snprintf(buf, sizeof(buf), "write%c 0x%lx 0x%x\n", 
		type, addr, val);
+	/* write to qtest socket */
+	ret = qtest_raw_send(s->qtest_socket, buf, strlen(buf));
+	/* read reply from event handler */
+	ret = qtest_raw_recv(s->msgfds.readfd, buf, sizeof(buf));
+}
+
+/*
+ * qtest_pci_inX/outX are used for accessing PCI configuration space.
+ * The functions are implemented based on the PCI configuration space
+ * specification.
+ * According to the spec, access size of read()/write() should be 4 bytes.
+ */
+static int
+qtest_pci_inb(struct qtest_session *s, uint8_t bus, uint8_t device,
+		uint8_t function, uint8_t offset)
+{
+	uint32_t tmp;
+
+	tmp = PCI_CONFIG_ADDR(bus, device, function, offset);
+
+	if (pthread_mutex_lock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot lock mutex\n");
+
+	qtest_raw_out(s, 0xcf8, tmp, 'l');
+	tmp = qtest_raw_in(s, 0xcfc, 'l');
+
+	if (pthread_mutex_unlock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot unlock mutex\n");
+
+	return (tmp >> ((offset & 0x3) * 8)) & 0xff;
+}
+
+static void
+qtest_pci_outb(struct qtest_session *s, uint8_t bus, uint8_t device,
+		uint8_t function, uint8_t offset, uint8_t value)
+{
+	uint32_t addr, tmp, pos;
+
+	addr = PCI_CONFIG_ADDR(bus, device, function, offset);
+	pos = (offset % 4) * 8;
+
+	if (pthread_mutex_lock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot lock mutex\n");
+
+	qtest_raw_out(s, 0xcf8, addr, 'l');
+	tmp = qtest_raw_in(s, 0xcfc, 'l');
+	tmp = (tmp & ~(0xff << pos)) | (value << pos);
+
+	qtest_raw_out(s, 0xcf8, addr, 'l');
+	qtest_raw_out(s, 0xcfc, tmp, 'l');
+
+	if (pthread_mutex_unlock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot unlock mutex\n");
+}
+
+static uint32_t
+qtest_pci_inl(struct qtest_session *s, uint8_t bus, uint8_t device,
+		uint8_t function, uint8_t offset)
+{
+	uint32_t tmp;
+
+	tmp = PCI_CONFIG_ADDR(bus, device, function, offset);
+
+	if (pthread_mutex_lock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot lock mutex\n");
+
+	qtest_raw_out(s, 0xcf8, tmp, 'l');
+	tmp = qtest_raw_in(s, 0xcfc, 'l');
+
+	if
(pthread_mutex_unlock(&s->qtest_session_lock) < 0) + rte_panic("Cannot unlock mutex\n"); + + return tmp; +} + +static void +qtest_pci_outl(struct qtest_session *s, uint8_t bus, uint8_t device, + uint8_t function, uint8_t offset, uint32_t value) +{ + uint32_t tmp; + + tmp = PCI_CONFIG_ADDR(bus, device, function, offset); + + if (pthread_mutex_lock(&s->qtest_session_lock) < 0) + rte_panic("Cannot lock mutex\n"); + + qtest_raw_out(s, 0xcf8, tmp, 'l'); + qtest_raw_out(s, 0xcfc, value, 'l'); + + if (pthread_mutex_unlock(&s->qtest_session_lock) < 0) + rte_panic("Cannot unlock mutex\n"); +} + +static uint64_t +qtest_pci_inq(struct qtest_session *s, uint8_t bus, uint8_t device, + uint8_t function, uint8_t offset) +{ + uint32_t tmp; + uint64_t val; + + tmp = PCI_CONFIG_ADDR(bus, device, function, offset); + + if (pthread_mutex_lock(&s->qtest_session_lock) < 0) + rte_panic("Cannot lock mutex\n"); + + qtest_raw_out(s, 0xcf8, tmp, 'l'); + val = (uint64_t)qtest_raw_in(s, 0xcfc, 'l'); + + tmp = PCI_CONFIG_ADDR(bus, device, function, offset + 4); + + qtest_raw_out(s, 0xcf8, tmp, 'l'); + val |= (uint64_t)qtest_raw_in(s, 0xcfc, 'l') << 32; + + if (pthread_mutex_unlock(&s->qtest_session_lock) < 0) + rte_panic("Cannot unlock mutex\n"); + + return val; +} + +static void +qtest_pci_outq(struct qtest_session *s, uint8_t bus, uint8_t device, + uint8_t function, uint8_t offset, uint64_t value) +{ + uint32_t tmp; + + tmp = PCI_CONFIG_ADDR(bus, device, function, offset); + + if (pthread_mutex_lock(&s->qtest_session_lock) < 0) + rte_panic("Cannot lock mutex\n"); + + qtest_raw_out(s, 0xcf8, tmp, 'l'); + qtest_raw_out(s, 0xcfc, (uint32_t)(value & 0xffffffff), 'l'); + + tmp = PCI_CONFIG_ADDR(bus, device, function, offset + 4); + + qtest_raw_out(s, 0xcf8, tmp, 'l'); + qtest_raw_out(s, 0xcfc, (uint32_t)(value >> 32), 'l'); + + if (pthread_mutex_unlock(&s->qtest_session_lock) < 0) + rte_panic("Cannot unlock mutex\n"); +} + +/* + * qtest_in/out are used for accessing ioport of qemu guest. 
+ * qtest_read/write are used for accessing memory of qemu guest.
+ */
+uint32_t
+qtest_in(struct virtio_hw *hw, uint16_t addr, char type)
+{
+	struct qtest_session *s = (struct qtest_session *)hw->qsession;
+	uint32_t val;
+
+	if (pthread_mutex_lock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot lock mutex\n");
+
+	val = qtest_raw_in(s, addr, type);
+
+	if (pthread_mutex_unlock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot unlock mutex\n");
+
+	return val;
+}
+
+void
+qtest_out(struct virtio_hw *hw, uint16_t addr, uint64_t val, char type)
+{
+	struct qtest_session *s = (struct qtest_session *)hw->qsession;
+
+	if (pthread_mutex_lock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot lock mutex\n");
+
+	qtest_raw_out(s, addr, val, type);
+
+	if (pthread_mutex_unlock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot unlock mutex\n");
+}
+
+uint32_t
+qtest_read(struct virtio_hw *hw, uint64_t addr, char type)
+{
+	struct qtest_session *s = (struct qtest_session *)hw->qsession;
+	uint32_t val;
+
+	if (pthread_mutex_lock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot lock mutex\n");
+
+	val = qtest_raw_read(s, addr, type);
+
+	if (pthread_mutex_unlock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot unlock mutex\n");
+
+	return val;
+}
+
+void
+qtest_write(struct virtio_hw *hw, uint64_t addr, uint64_t val, char type)
+{
+	struct qtest_session *s = (struct qtest_session *)hw->qsession;
+
+	if (pthread_mutex_lock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot lock mutex\n");
+
+	qtest_raw_write(s, addr, val, type);
+
+	if (pthread_mutex_unlock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot unlock mutex\n");
+}
+
+static struct qtest_pci_device *
+qtest_find_device(struct qtest_session *s, const char *name)
+{
+	struct qtest_pci_device *dev;
+
+	TAILQ_FOREACH(dev, &s->head, next) {
+		if (strcmp(dev->name, name) == 0)
+			return dev;
+	}
+	return NULL;
+}
+
+/*
+ * The function is used for reading pci configuration space of specified device.
+ */ +int +qtest_read_pci_cfg(struct virtio_hw *hw, const char *name, + void *buf, size_t len, off_t offset) +{ + struct qtest_session *s = (struct qtest_session *)hw->qsession; + struct qtest_pci_device *dev; + uint32_t i; + uint8_t *p = buf; + + dev = qtest_find_device(s, name); + if (dev == NULL) { + PMD_DRV_LOG(ERR, "Cannot find specified device: %s\n", name); + return -1; + } + + for (i = 0; i < len; i++) { + *(p + i) = qtest_pci_inb(s, + dev->bus_addr, dev->device_addr, 0, offset + i); + } + + return 0; +} + +static struct qtest_pci_bar * +qtest_get_bar(struct virtio_hw *hw, const char *name, uint8_t bar) +{ + struct qtest_session *s = (struct qtest_session *)hw->qsession; + struct qtest_pci_device *dev; + + if (bar >= NB_BAR) { + PMD_DRV_LOG(ERR, "Invalid bar is specified: %u\n", bar); + return NULL; + } + + dev = qtest_find_device(s, name); + if (dev == NULL) { + PMD_DRV_LOG(ERR, "Cannot find specified device: %s\n", name); + return NULL; + } + + if (dev->bar[bar].type == QTEST_PCI_BAR_DISABLE) { + PMD_DRV_LOG(ERR, "Cannot find valid BAR(%s): %u\n", name, bar); + return NULL; + } + + return &dev->bar[bar]; +} + +int +qtest_get_bar_addr(struct virtio_hw *hw, const char *name, + uint8_t bar, uint64_t *addr) +{ + struct qtest_pci_bar *bar_ptr; + + bar_ptr = qtest_get_bar(hw, name, bar); + if (bar_ptr == NULL) + return -1; + + *addr = bar_ptr->region_start; + return 0; +} + +int +qtest_get_bar_size(struct virtio_hw *hw, const char *name, + uint8_t bar, uint64_t *size) +{ + struct qtest_pci_bar *bar_ptr; + + bar_ptr = qtest_get_bar(hw, name, bar); + if (bar_ptr == NULL) + return -1; + + *size = bar_ptr->region_size; + return 0; +} + +int +qtest_intr_enable(void *data) +{ + struct virtio_hw *hw = ((struct rte_eth_dev_data *)data)->dev_private; + struct qtest_session *s; + + s = (struct qtest_session *)hw->qsession; + rte_atomic16_set(&s->enable_intr, 1); + + return 0; +} + +int +qtest_intr_disable(void *data) +{ + struct virtio_hw *hw = ((struct rte_eth_dev_data 
*)data)->dev_private; + struct qtest_session *s; + + s = (struct qtest_session *)hw->qsession; + rte_atomic16_set(&s->enable_intr, 0); + + return 0; +} + +void +qtest_intr_callback_register(void *data, + rte_intr_callback_fn cb, void *cb_arg) +{ + struct virtio_hw *hw = ((struct rte_eth_dev_data *)data)->dev_private; + struct qtest_session *s; + + s = (struct qtest_session *)hw->qsession; + s->cb = cb; + s->cb_arg = cb_arg; + rte_atomic16_set(&s->enable_intr, 1); +} + +void +qtest_intr_callback_unregister(void *data, + rte_intr_callback_fn cb __rte_unused, + void *cb_arg __rte_unused) +{ + struct virtio_hw *hw = ((struct rte_eth_dev_data *)data)->dev_private; + struct qtest_session *s; + + s = (struct qtest_session *)hw->qsession; + rte_atomic16_set(&s->enable_intr, 0); + s->cb = NULL; + s->cb_arg = NULL; +} + +static void * +qtest_intr_handler(void *data) { + struct qtest_session *s = (struct qtest_session *)data; + char buf[1]; + int ret; + + for (;;) { + ret = qtest_raw_recv(s->irqfds.readfd, buf, sizeof(buf)); + if (ret < 0) + return NULL; + s->cb(NULL, s->cb_arg); + } + return NULL; +} + +static int +qtest_intr_initialize(void *data) +{ + struct virtio_hw *hw = ((struct rte_eth_dev_data *)data)->dev_private; + struct qtest_session *s; + char buf[1024]; + int ret; + + s = (struct qtest_session *)hw->qsession; + + /* This message will come when interrupt occurs */ + snprintf(interrupt_message, sizeof(interrupt_message), + "IRQ raise %d", VIRTIO_NET_IRQ_NUM); + + snprintf(buf, sizeof(buf), "irq_intercept_in ioapic\n"); + + if (pthread_mutex_lock(&s->qtest_session_lock) < 0) + rte_panic("Cannot lock mutex\n"); + + /* To enable interrupt, send "irq_intercept_in" message to QEMU */ + ret = qtest_raw_send(s->qtest_socket, buf, strlen(buf)); + if (ret < 0) { + pthread_mutex_unlock(&s->qtest_session_lock); + return -1; + } + + /* just ignore QEMU response */ + ret = qtest_raw_recv(s->msgfds.readfd, buf, sizeof(buf)); + if (ret < 0) { + 
pthread_mutex_unlock(&s->qtest_session_lock);
+		return -1;
+	}
+
+	if (pthread_mutex_unlock(&s->qtest_session_lock) < 0)
+		rte_panic("Cannot unlock mutex\n");
+
+	return 0;
+}
+
+static void
+qtest_handle_one_message(struct qtest_session *s, char *buf)
+{
+	int ret;
+
+	if (strncmp(buf, interrupt_message, strlen(interrupt_message)) == 0) {
+		if (rte_atomic16_read(&s->enable_intr) == 0)
+			return;
+
+		/* relay interrupt to pipe */
+		ret = write(s->irqfds.writefd, "1", 1);
+		if (ret < 0)
+			rte_panic("cannot relay interrupt\n");
+	} else {
+		/* relay normal message to pipe */
+		ret = qtest_raw_send(s->msgfds.writefd, buf, strlen(buf));
+		if (ret < 0)
+			rte_panic("cannot relay normal message\n");
+	}
+}
+
+static char *
+qtest_get_next_message(char *p)
+{
+	p = strchr(p, '\n');
+	if ((p == NULL) || (*(p + 1) == '\0'))
+		return NULL;
+	return p + 1;
+}
+
+static void
+qtest_close_one_socket(int *fd)
+{
+	if (*fd > 0) {
+		close(*fd);
+		*fd = -1;
+	}
+}
+
+static void
+qtest_close_sockets(struct qtest_session *s)
+{
+	qtest_close_one_socket(&s->qtest_socket);
+	qtest_close_one_socket(&s->msgfds.readfd);
+	qtest_close_one_socket(&s->msgfds.writefd);
+	qtest_close_one_socket(&s->irqfds.readfd);
+	qtest_close_one_socket(&s->irqfds.writefd);
+	qtest_close_one_socket(&s->ivshmem_socket);
+}
+
+/*
+ * This thread relays QTest responses over pipes.
+ * It is needed because IRQ messages must be separated from other messages.
+ */ +static void * +qtest_event_handler(void *data) { + struct qtest_session *s = (struct qtest_session *)data; + char buf[1024]; + char *p; + int ret; + + for (;;) { + memset(buf, 0, sizeof(buf)); + ret = qtest_raw_recv(s->qtest_socket, buf, sizeof(buf)); + if (ret < 0) { + qtest_close_sockets(s); + return NULL; + } + + /* may receive multiple messages at the same time */ + p = buf; + do { + qtest_handle_one_message(s, p); + } while ((p = qtest_get_next_message(p)) != NULL); + } + return NULL; +} + +static int +qtest_init_piix3_device(struct qtest_session *s, struct qtest_pci_device *dev) +{ + uint8_t bus, device, virtio_net_slot = 0; + struct qtest_pci_device *tmpdev; + uint8_t pcislot2regaddr[] = { 0xff, + 0xff, + 0xff, + PIIX3_REG_ADDR_PIRQC, + PIIX3_REG_ADDR_PIRQD, + PIIX3_REG_ADDR_PIRQA, + PIIX3_REG_ADDR_PIRQB}; + + bus = dev->bus_addr; + device = dev->device_addr; + + PMD_DRV_LOG(INFO, + "Find %s on virtual PCI bus: %04x:%02x:00.0\n", + dev->name, bus, device); + + /* Get slot id that is connected to virtio-net */ + TAILQ_FOREACH(tmpdev, &s->head, next) { + if (strcmp(tmpdev->name, "virtio-net") == 0) { + virtio_net_slot = tmpdev->device_addr; + break; + } + } + + if (virtio_net_slot == 0) + return -1; + + /* + * Set interrupt routing for virtio-net device. + * Here is i440fx/piix3 connection settings + * --------------------------------------- + * PCI Slot3 -> PIRQC + * PCI Slot4 -> PIRQD + * PCI Slot5 -> PIRQA + * PCI Slot6 -> PIRQB + */ + if (pcislot2regaddr[virtio_net_slot] != 0xff) { + qtest_pci_outb(s, bus, device, 0, + pcislot2regaddr[virtio_net_slot], + VIRTIO_NET_IRQ_NUM); + } + + return 0; +} + +/* + * Common initialization of PCI device. + * To know detail, see pci specification. 
+ */ +static int +qtest_init_pci_device(struct qtest_session *s, struct qtest_pci_device *dev) +{ + uint8_t i, bus, device; + uint32_t val; + uint64_t val64; + + bus = dev->bus_addr; + device = dev->device_addr; + + PMD_DRV_LOG(INFO, + "Find %s on virtual PCI bus: %04x:%02x:00.0\n", + dev->name, bus, device); + + /* Check header type */ + val = qtest_pci_inb(s, bus, device, 0, REG_ADDR_HEADER_TYPE); + if (val != REG_VAL_HEADER_TYPE_ENDPOINT) { + PMD_DRV_LOG(ERR, "Unexpected header type %d\n", val); + return -1; + } + + /* Check BAR type */ + for (i = 0; i < NB_BAR; i++) { + val = qtest_pci_inl(s, bus, device, 0, dev->bar[i].addr); + + switch (dev->bar[i].type) { + case QTEST_PCI_BAR_IO: + if ((val & 0x1) != REG_VAL_BAR_IO) + dev->bar[i].type = QTEST_PCI_BAR_DISABLE; + break; + case QTEST_PCI_BAR_MEMORY_UNDER_1MB: + if ((val & 0x1) != REG_VAL_BAR_MEMORY) + dev->bar[i].type = QTEST_PCI_BAR_DISABLE; + if ((val & 0x6) != REG_VAL_BAR_LOCATE_UNDER_1MB) + dev->bar[i].type = QTEST_PCI_BAR_DISABLE; + break; + case QTEST_PCI_BAR_MEMORY_32: + if ((val & 0x1) != REG_VAL_BAR_MEMORY) + dev->bar[i].type = QTEST_PCI_BAR_DISABLE; + if ((val & 0x6) != REG_VAL_BAR_LOCATE_32) + dev->bar[i].type = QTEST_PCI_BAR_DISABLE; + break; + case QTEST_PCI_BAR_MEMORY_64: + + if ((val & 0x1) != REG_VAL_BAR_MEMORY) + dev->bar[i].type = QTEST_PCI_BAR_DISABLE; + if ((val & 0x6) != REG_VAL_BAR_LOCATE_64) + dev->bar[i].type = QTEST_PCI_BAR_DISABLE; + break; + case QTEST_PCI_BAR_DISABLE: + break; + } + } + + /* Enable device */ + val = qtest_pci_inl(s, bus, device, 0, REG_ADDR_COMMAND); + val |= REG_VAL_COMMAND_IO | REG_VAL_COMMAND_MEMORY | REG_VAL_COMMAND_MASTER; + qtest_pci_outl(s, bus, device, 0, REG_ADDR_COMMAND, val); + + /* Calculate BAR size */ + for (i = 0; i < NB_BAR; i++) { + switch (dev->bar[i].type) { + case QTEST_PCI_BAR_IO: + case QTEST_PCI_BAR_MEMORY_UNDER_1MB: + case QTEST_PCI_BAR_MEMORY_32: + qtest_pci_outl(s, bus, device, 0, + dev->bar[i].addr, 0xffffffff); + val = qtest_pci_inl(s, 
bus, device, + 0, dev->bar[i].addr); + dev->bar[i].region_size = ~(val & 0xfffffff0) + 1; + break; + case QTEST_PCI_BAR_MEMORY_64: + qtest_pci_outq(s, bus, device, 0, + dev->bar[i].addr, 0xffffffffffffffff); + val64 = qtest_pci_inq(s, bus, device, + 0, dev->bar[i].addr); + dev->bar[i].region_size = + ~(val64 & 0xfffffffffffffff0) + 1; + break; + case QTEST_PCI_BAR_DISABLE: + break; + } + } + + /* Set BAR region */ + for (i = 0; i < NB_BAR; i++) { + switch (dev->bar[i].type) { + case QTEST_PCI_BAR_IO: + case QTEST_PCI_BAR_MEMORY_UNDER_1MB: + case QTEST_PCI_BAR_MEMORY_32: + qtest_pci_outl(s, bus, device, 0, dev->bar[i].addr, + dev->bar[i].region_start); + PMD_DRV_LOG(INFO, "Set BAR of %s device: 0x%lx - 0x%lx\n", + dev->name, dev->bar[i].region_start, + dev->bar[i].region_start + dev->bar[i].region_size); + break; + case QTEST_PCI_BAR_MEMORY_64: + qtest_pci_outq(s, bus, device, 0, dev->bar[i].addr, + dev->bar[i].region_start); + PMD_DRV_LOG(INFO, "Set BAR of %s device: 0x%lx - 0x%lx\n", + dev->name, dev->bar[i].region_start, + dev->bar[i].region_start + dev->bar[i].region_size); + break; + case QTEST_PCI_BAR_DISABLE: + break; + } + } + + return 0; +} + +static void +qtest_find_pci_device(struct qtest_session *s, uint16_t bus, uint8_t device) +{ + struct qtest_pci_device *dev; + uint32_t val; + + val = qtest_pci_inl(s, bus, device, 0, 0); + TAILQ_FOREACH(dev, &s->head, next) { + if (val == ((uint32_t)dev->device_id << 16 | dev->vendor_id)) { + dev->bus_addr = bus; + dev->device_addr = device; + return; + } + + } +} + +static int +qtest_init_pci_devices(struct qtest_session *s) +{ + struct qtest_pci_device *dev; + uint16_t bus; + uint8_t device; + int ret; + + /* Find devices */ + bus = 0; + do { + device = 0; + do { + qtest_find_pci_device(s, bus, device); + } while (device++ != NB_DEVICE - 1); + } while (bus++ != NB_BUS - 1); + + /* Initialize devices */ + TAILQ_FOREACH(dev, &s->head, next) { + ret = dev->init(s, dev); + if (ret != 0) + return ret; + } + + return 0; 
+} + +struct rte_pci_id +qtest_get_pci_id_of_virtio_net(void) +{ + struct rte_pci_id id = {VIRTIO_NET_DEVICE_ID, + VIRTIO_NET_VENDOR_ID, PCI_ANY_ID, PCI_ANY_ID}; + + return id; +} + +static int +qtest_register_target_devices(struct qtest_session *s) +{ + struct qtest_pci_device *virtio_net, *ivshmem, *piix3; + const struct rte_memseg *ms; + + ms = rte_eal_get_physmem_layout(); + /* if EAL memory size isn't pow of 2, ivshmem will refuse it */ + if ((ms[0].len & (ms[0].len - 1)) != 0) { + PMD_DRV_LOG(ERR, "memory size must be power of 2\n"); + return -1; + } + + virtio_net = malloc(sizeof(*virtio_net)); + if (virtio_net == NULL) + return -1; + + ivshmem = malloc(sizeof(*ivshmem)); + if (ivshmem == NULL) + return -1; + + piix3 = malloc(sizeof(*piix3)); + if (piix3 == NULL) + return -1; + + memset(virtio_net, 0, sizeof(*virtio_net)); + memset(ivshmem, 0, sizeof(*ivshmem)); + + TAILQ_INIT(&s->head); + + virtio_net->name = "virtio-net"; + virtio_net->device_id = VIRTIO_NET_DEVICE_ID; + virtio_net->vendor_id = VIRTIO_NET_VENDOR_ID; + virtio_net->init = qtest_init_pci_device; + virtio_net->bar[0].addr = REG_ADDR_BAR0; + virtio_net->bar[0].type = QTEST_PCI_BAR_IO; + virtio_net->bar[0].region_start = VIRTIO_NET_IO_START; + virtio_net->bar[1].addr = REG_ADDR_BAR1; + virtio_net->bar[1].type = QTEST_PCI_BAR_MEMORY_32; + virtio_net->bar[1].region_start = VIRTIO_NET_MEMORY1_START; + virtio_net->bar[4].addr = REG_ADDR_BAR4; + virtio_net->bar[4].type = QTEST_PCI_BAR_MEMORY_64; + virtio_net->bar[4].region_start = VIRTIO_NET_MEMORY2_START; + TAILQ_INSERT_TAIL(&s->head, virtio_net, next); + + ivshmem->name = "ivshmem"; + ivshmem->device_id = IVSHMEM_DEVICE_ID; + ivshmem->vendor_id = IVSHMEM_VENDOR_ID; + ivshmem->init = qtest_init_pci_device; + ivshmem->bar[0].addr = REG_ADDR_BAR0; + ivshmem->bar[0].type = QTEST_PCI_BAR_MEMORY_32; + ivshmem->bar[0].region_start = IVSHMEM_MEMORY_START; + ivshmem->bar[2].addr = REG_ADDR_BAR2; + ivshmem->bar[2].type = QTEST_PCI_BAR_MEMORY_64; + /* In host 
mode, only one memory segment is valid */ + ivshmem->bar[2].region_start = (uint64_t)ms[0].addr; + TAILQ_INSERT_TAIL(&s->head, ivshmem, next); + + /* piix3 is needed to route irqs from virtio-net to ioapic */ + piix3->name = "piix3"; + piix3->device_id = PIIX3_DEVICE_ID; + piix3->vendor_id = PIIX3_VENDOR_ID; + piix3->init = qtest_init_piix3_device; + TAILQ_INSERT_TAIL(&s->head, piix3, next); + + return 0; +} + +static int +qtest_send_message_to_ivshmem(int sock_fd, uint64_t client_id, int shm_fd) +{ + struct iovec iov; + struct msghdr msgh; + size_t fdsize = sizeof(int); + char control[CMSG_SPACE(fdsize)]; + struct cmsghdr *cmsg; + int ret; + + memset(&msgh, 0, sizeof(msgh)); + iov.iov_base = &client_id; + iov.iov_len = sizeof(client_id); + + msgh.msg_iov = &iov; + msgh.msg_iovlen = 1; + + if (shm_fd >= 0) { + msgh.msg_control = &control; + msgh.msg_controllen = sizeof(control); + cmsg = CMSG_FIRSTHDR(&msgh); + cmsg->cmsg_len = CMSG_LEN(fdsize); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SCM_RIGHTS; + memcpy(CMSG_DATA(cmsg), &shm_fd, fdsize); + } + + do { + ret = sendmsg(sock_fd, &msgh, 0); + } while (ret < 0 && errno == EINTR); + + if (ret < 0) { + PMD_DRV_LOG(ERR, "sendmsg error\n"); + return ret; + } + + return ret; +} + +static int +qtest_setup_shared_memory(struct qtest_session *s) +{ + int shm_fd, ret; + + rte_memseg_info_get(0, &shm_fd, NULL, NULL); + + /* send our protocol version first */ + ret = qtest_send_message_to_ivshmem(s->ivshmem_socket, + IVSHMEM_PROTOCOL_VERSION, -1); + if (ret < 0) { + PMD_DRV_LOG(ERR, + "Failed to send protocol version to ivshmem\n"); + return -1; + } + + /* send client id */ + ret = qtest_send_message_to_ivshmem(s->ivshmem_socket, 0, -1); + if (ret < 0) { + PMD_DRV_LOG(ERR, "Failed to send VMID to ivshmem\n"); + return -1; + } + + /* send message to ivshmem */ + ret = qtest_send_message_to_ivshmem(s->ivshmem_socket, -1, shm_fd); + if (ret < 0) { + PMD_DRV_LOG(ERR, "Failed to send file descriptor to ivshmem\n"); + return
-1; + } + + /* close EAL memory again */ + close(shm_fd); + + return 0; +} + +int +qtest_vdev_init(struct rte_eth_dev_data *data, + int qtest_socket, int ivshmem_socket) +{ + struct virtio_hw *hw = ((struct rte_eth_dev_data *)data)->dev_private; + struct qtest_session *s; + int ret; + + s = rte_zmalloc(NULL, sizeof(*s), RTE_CACHE_LINE_SIZE); + + ret = pipe(s->msgfds.pipefd); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to initialize message pipe\n"); + return -1; + } + + ret = pipe(s->irqfds.pipefd); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to initialize irq pipe\n"); + return -1; + } + + ret = qtest_register_target_devices(s); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to initialize qtest session\n"); + return -1; + } + + ret = pthread_mutex_init(&s->qtest_session_lock, NULL); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to initialize mutex\n"); + return -1; + } + + rte_atomic16_set(&s->enable_intr, 0); + s->qtest_socket = qtest_socket; + s->ivshmem_socket = ivshmem_socket; + hw->qsession = (void *)s; + + ret = pthread_create(&s->event_th, NULL, qtest_event_handler, s); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to create event handler\n"); + return -1; + } + + ret = pthread_create(&s->intr_th, NULL, qtest_intr_handler, s); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to create interrupt handler\n"); + return -1; + } + + ret = qtest_intr_initialize(data); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to initialize interrupt\n"); + return -1; + } + + ret = qtest_setup_shared_memory(s); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to setup shared memory\n"); + return -1; + } + + ret = qtest_init_pci_devices(s); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Failed to initialize devices\n"); + return -1; + } + + return 0; +} + +static void +qtest_remove_target_devices(struct qtest_session *s) +{ + struct qtest_pci_device *dev, *next; + + for (dev = TAILQ_FIRST(&s->head); dev != NULL; dev = next) { + next = TAILQ_NEXT(dev, next); + TAILQ_REMOVE(&s->head, dev, next); + 
free(dev); + } +} + +void +qtest_vdev_uninit(struct rte_eth_dev_data *data) +{ + struct virtio_hw *hw = ((struct rte_eth_dev_data *)data)->dev_private; + struct qtest_session *s; + + s = (struct qtest_session *)hw->qsession; + + qtest_close_sockets(s); + + pthread_cancel(s->event_th); + pthread_join(s->event_th, NULL); + + pthread_cancel(s->intr_th); + pthread_join(s->intr_th, NULL); + + pthread_mutex_destroy(&s->qtest_session_lock); + + qtest_remove_target_devices(s); + + rte_free(s); +} diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index c477b05..e32f1dd 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -36,6 +36,10 @@ #include #include #include +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE +#include +#include +#endif #include #include @@ -52,6 +56,10 @@ #include #include #include +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE +#include +#include +#endif #include "virtio_ethdev.h" #include "virtio_pci.h" @@ -160,8 +168,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, if ((vq->vq_free_cnt < ((uint32_t)pkt_num + 2)) || (pkt_num < 1)) return -1; - memcpy(vq->virtio_net_hdr_mz->addr, ctrl, - sizeof(struct virtio_pmd_ctrl)); + memcpy(vq->virtio_net_hdr_vaddr, ctrl, sizeof(struct virtio_pmd_ctrl)); /* * Format is enforced in qemu code: @@ -170,14 +177,14 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, * One RX packet for ACK. 
*/ vq->vq_ring.desc[head].flags = VRING_DESC_F_NEXT; - vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mz->phys_addr; + vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mem; vq->vq_ring.desc[head].len = sizeof(struct virtio_net_ctrl_hdr); vq->vq_free_cnt--; i = vq->vq_ring.desc[head].next; for (k = 0; k < pkt_num; k++) { vq->vq_ring.desc[i].flags = VRING_DESC_F_NEXT; - vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mem + sizeof(struct virtio_net_ctrl_hdr) + sizeof(ctrl->status) + sizeof(uint8_t)*sum; vq->vq_ring.desc[i].len = dlen[k]; @@ -187,7 +194,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, } vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE; - vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mem + sizeof(struct virtio_net_ctrl_hdr); vq->vq_ring.desc[i].len = sizeof(ctrl->status); vq->vq_free_cnt--; @@ -232,7 +239,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, PMD_INIT_LOG(DEBUG, "vq->vq_free_cnt=%d\nvq->vq_desc_head_idx=%d", vq->vq_free_cnt, vq->vq_desc_head_idx); - memcpy(&result, vq->virtio_net_hdr_mz->addr, + memcpy(&result, vq->virtio_net_hdr_vaddr, sizeof(struct virtio_pmd_ctrl)); return result.status; @@ -270,6 +277,9 @@ virtio_dev_queue_release(struct virtqueue *vq) { hw = vq->hw; hw->vtpci_ops->del_queue(hw, vq); + rte_memzone_free(vq->virtio_net_hdr_mz); + rte_memzone_free(vq->mz); + rte_free(vq->sw_ring); rte_free(vq); } @@ -366,66 +376,81 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, } } - /* - * Virtio PCI device VIRTIO_PCI_QUEUE_PF register is 32bit, - * and only accepts 32 bit page frame number. - * Check if the allocated physical memory exceeds 16TB. 
- */ - if ((mz->phys_addr + vq->vq_ring_size - 1) >> (VIRTIO_PCI_QUEUE_ADDR_SHIFT + 32)) { - PMD_INIT_LOG(ERR, "vring address shouldn't be above 16TB!"); - rte_free(vq); - return -ENOMEM; - } - memset(mz->addr, 0, sizeof(mz->len)); vq->mz = mz; - vq->vq_ring_mem = mz->phys_addr; vq->vq_ring_virt_mem = mz->addr; - PMD_INIT_LOG(DEBUG, "vq->vq_ring_mem: 0x%"PRIx64, (uint64_t)mz->phys_addr); - PMD_INIT_LOG(DEBUG, "vq->vq_ring_virt_mem: 0x%"PRIx64, (uint64_t)(uintptr_t)mz->addr); + + + if (dev->dev_type == RTE_ETH_DEV_PCI) { + vq->vq_ring_mem = mz->phys_addr; + + /* Virtio PCI device VIRTIO_PCI_QUEUE_PF register is 32bit, + * and only accepts 32 bit page frame number. + * Check if the allocated physical memory exceeds 16TB. + */ + uint64_t last_physaddr = vq->vq_ring_mem + vq->vq_ring_size - 1; + if (last_physaddr >> (VIRTIO_PCI_QUEUE_ADDR_SHIFT + 32)) { + PMD_INIT_LOG(ERR, "vring address shouldn't be above 16TB!"); + rte_free(vq); + return -ENOMEM; + } + } +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else { /* RTE_ETH_DEV_VIRTUAL */ + /* Use virtual addr to fill!!! 
*/ + vq->vq_ring_mem = (phys_addr_t)mz->addr; + + /* TODO: check last_physaddr */ + } +#endif + + PMD_INIT_LOG(DEBUG, "vq->vq_ring_mem: 0x%"PRIx64, + (uint64_t)vq->vq_ring_mem); + PMD_INIT_LOG(DEBUG, "vq->vq_ring_virt_mem: 0x%"PRIx64, + (uint64_t)(uintptr_t)vq->vq_ring_virt_mem); + vq->virtio_net_hdr_mz = NULL; vq->virtio_net_hdr_mem = 0; + uint64_t hdr_size = 0; if (queue_type == VTNET_TQ) { /* * For each xmit packet, allocate a virtio_net_hdr */ snprintf(vq_name, sizeof(vq_name), "port%d_tvq%d_hdrzone", dev->data->port_id, queue_idx); - vq->virtio_net_hdr_mz = rte_memzone_reserve_aligned(vq_name, - vq_size * hw->vtnet_hdr_size, - socket_id, 0, RTE_CACHE_LINE_SIZE); - if (vq->virtio_net_hdr_mz == NULL) { - if (rte_errno == EEXIST) - vq->virtio_net_hdr_mz = - rte_memzone_lookup(vq_name); - if (vq->virtio_net_hdr_mz == NULL) { - rte_free(vq); - return -ENOMEM; - } - } - vq->virtio_net_hdr_mem = - vq->virtio_net_hdr_mz->phys_addr; - memset(vq->virtio_net_hdr_mz->addr, 0, - vq_size * hw->vtnet_hdr_size); + hdr_size = vq_size * hw->vtnet_hdr_size; } else if (queue_type == VTNET_CQ) { - /* Allocate a page for control vq command, data and status */ snprintf(vq_name, sizeof(vq_name), "port%d_cvq_hdrzone", dev->data->port_id); - vq->virtio_net_hdr_mz = rte_memzone_reserve_aligned(vq_name, - PAGE_SIZE, socket_id, 0, RTE_CACHE_LINE_SIZE); - if (vq->virtio_net_hdr_mz == NULL) { + /* Allocate a page for control vq command, data and status */ + hdr_size = PAGE_SIZE; + } + + if (hdr_size) { /* queue_type is VTNET_TQ or VTNET_CQ */ + mz = rte_memzone_reserve_aligned(vq_name, + hdr_size, socket_id, 0, RTE_CACHE_LINE_SIZE); + if (mz == NULL) { if (rte_errno == EEXIST) - vq->virtio_net_hdr_mz = - rte_memzone_lookup(vq_name); - if (vq->virtio_net_hdr_mz == NULL) { + mz = rte_memzone_lookup(vq_name); + if (mz == NULL) { rte_free(vq); return -ENOMEM; } } - vq->virtio_net_hdr_mem = - vq->virtio_net_hdr_mz->phys_addr; - memset(vq->virtio_net_hdr_mz->addr, 0, PAGE_SIZE); + 
vq->virtio_net_hdr_mz = mz; + vq->virtio_net_hdr_vaddr = mz->addr; + memset(vq->virtio_net_hdr_vaddr, 0, hdr_size); + + if (dev->dev_type == RTE_ETH_DEV_PCI) { + vq->virtio_net_hdr_mem = mz->phys_addr; + } +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else { + /* Use vaddr!!! */ + vq->virtio_net_hdr_mem = (phys_addr_t)mz->addr; + } +#endif } hw->vtpci_ops->setup_queue(hw, vq); @@ -479,12 +504,18 @@ virtio_dev_close(struct rte_eth_dev *dev) PMD_INIT_LOG(DEBUG, "virtio_dev_close"); /* reset the NIC */ - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) + if (((dev->dev_type == RTE_ETH_DEV_PCI) && + (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)) || + ((dev->dev_type == RTE_ETH_DEV_VIRTUAL) && + (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC))) { + vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR); + } vtpci_reset(hw); hw->started = 0; - virtio_dev_free_mbufs(dev); - virtio_free_queues(dev); + if ((dev->data->rx_queues != NULL) && (dev->data->tx_queues != NULL)) { + virtio_dev_free_mbufs(dev); + virtio_free_queues(dev); + } } static void @@ -983,14 +1014,30 @@ virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle, isr = vtpci_isr(hw); PMD_DRV_LOG(INFO, "interrupt status = %#x", isr); - if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) - PMD_DRV_LOG(ERR, "interrupt enable failed"); - - if (isr & VIRTIO_PCI_ISR_CONFIG) { + if (dev->dev_type == RTE_ETH_DEV_PCI) { + if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) + PMD_DRV_LOG(ERR, "interrupt enable failed"); + if (isr & VIRTIO_PCI_ISR_CONFIG) { + if (virtio_dev_link_update(dev, 0) == 0) + _rte_eth_dev_callback_process(dev, + RTE_ETH_EVENT_INTR_LSC); + } + } +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) { + if (qtest_intr_enable(dev->data) < 0) + PMD_DRV_LOG(ERR, "interrupt enable failed"); + /* + * If last qtest message is interrupt, 'isr' will be 0 + * because socket has been closed already. + * But still we want to notify EAL of this event.
+ * So just ignore isr value. + */ if (virtio_dev_link_update(dev, 0) == 0) _rte_eth_dev_callback_process(dev, - RTE_ETH_EVENT_INTR_LSC); + RTE_ETH_EVENT_INTR_LSC); } +#endif } @@ -1014,7 +1061,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) struct virtio_hw *hw = eth_dev->data->dev_private; struct virtio_net_config *config; struct virtio_net_config local_config; - struct rte_pci_device *pci_dev; + struct rte_pci_device *pci_dev = eth_dev->pci_dev; + struct rte_pci_id id; RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)); @@ -1052,8 +1100,14 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) return -1; /* If host does not support status then disable LSC */ - if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) - pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC; + if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) { + if (eth_dev->dev_type == RTE_ETH_DEV_PCI) + pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC; +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else if (eth_dev->dev_type == RTE_ETH_DEV_VIRTUAL) + eth_dev->data->dev_flags &= ~RTE_ETH_DEV_INTR_LSC; +#endif + } rte_eth_copy_pci_info(eth_dev, pci_dev); @@ -1132,14 +1186,30 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) PMD_INIT_LOG(DEBUG, "hw->max_rx_queues=%d hw->max_tx_queues=%d", hw->max_rx_queues, hw->max_tx_queues); + + memset(&id, 0, sizeof(id)); + if (eth_dev->dev_type == RTE_ETH_DEV_PCI) + id = pci_dev->id; +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else if (eth_dev->dev_type == RTE_ETH_DEV_VIRTUAL) + id = qtest_get_pci_id_of_virtio_net(); +#endif + PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x", - eth_dev->data->port_id, pci_dev->id.vendor_id, - pci_dev->id.device_id); + eth_dev->data->port_id, + id.vendor_id, id.device_id); /* Setup interrupt callback */ - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) + if ((eth_dev->dev_type == RTE_ETH_DEV_PCI) && + (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)) rte_intr_callback_register(&pci_dev->intr_handle, - 
virtio_interrupt_handler, eth_dev); + virtio_interrupt_handler, eth_dev); +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else if ((eth_dev->dev_type == RTE_ETH_DEV_VIRTUAL) && + (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)) + qtest_intr_callback_register(eth_dev->data, + virtio_interrupt_handler, eth_dev); +#endif virtio_dev_cq_start(eth_dev); @@ -1173,10 +1243,18 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev) eth_dev->data->mac_addrs = NULL; /* reset interrupt callback */ - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) + if ((eth_dev->dev_type == RTE_ETH_DEV_PCI) && + (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)) rte_intr_callback_unregister(&pci_dev->intr_handle, virtio_interrupt_handler, eth_dev); +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else if ((eth_dev->dev_type == RTE_ETH_DEV_VIRTUAL) && + (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)) + qtest_intr_callback_unregister(eth_dev->data, + virtio_interrupt_handler, eth_dev); +#endif + vtpci_uninit(eth_dev, hw); PMD_INIT_LOG(DEBUG, "dev_uninit completed"); @@ -1241,11 +1319,15 @@ virtio_dev_configure(struct rte_eth_dev *dev) return -ENOTSUP; } - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) + if (((dev->dev_type == RTE_ETH_DEV_PCI) && + (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)) || + ((dev->dev_type == RTE_ETH_DEV_VIRTUAL) && + (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC))) { if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) { PMD_DRV_LOG(ERR, "failed to set config vector"); return -EBUSY; } + } return 0; } @@ -1260,15 +1342,31 @@ virtio_dev_start(struct rte_eth_dev *dev) /* check if lsc interrupt feature is enabled */ if (dev->data->dev_conf.intr_conf.lsc) { - if (!(pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)) { - PMD_DRV_LOG(ERR, "link status not supported by host"); - return -ENOTSUP; - } + if (dev->dev_type == RTE_ETH_DEV_PCI) { + if (!(pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)) { + PMD_DRV_LOG(ERR, + "link status not supported by host"); + return 
-ENOTSUP; + } - if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) { - PMD_DRV_LOG(ERR, "interrupt enable failed"); - return -EIO; + if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) { + PMD_DRV_LOG(ERR, "interrupt enable failed"); + return -EIO; + } } +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) { + if (!(dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)) { + PMD_DRV_LOG(ERR, + "link status not supported by host"); + return -ENOTSUP; + } + if (qtest_intr_enable(dev->data) < 0) { + PMD_DRV_LOG(ERR, "interrupt enable failed"); + return -EIO; + } + } +#endif } /* Initialize Link state */ @@ -1365,8 +1463,15 @@ virtio_dev_stop(struct rte_eth_dev *dev) PMD_INIT_LOG(DEBUG, "stop"); - if (dev->data->dev_conf.intr_conf.lsc) - rte_intr_disable(&dev->pci_dev->intr_handle); + if (dev->data->dev_conf.intr_conf.lsc) { + if (dev->dev_type == RTE_ETH_DEV_PCI) + rte_intr_disable(&dev->pci_dev->intr_handle); + +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) + qtest_intr_disable(dev->data); +#endif + } memset(&link, 0, sizeof(link)); virtio_dev_atomic_write_link_status(dev, &link); @@ -1411,7 +1516,13 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) { struct virtio_hw *hw = dev->data->dev_private; - dev_info->driver_name = dev->driver->pci_drv.name; + if (dev->dev_type == RTE_ETH_DEV_PCI) + dev_info->driver_name = dev->driver->pci_drv.name; +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + else if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) + dev_info->driver_name = dev->data->drv_name; +#endif + dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues; dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues; dev_info->min_rx_bufsize = VIRTIO_MIN_RX_BUFSIZE; @@ -1439,3 +1550,196 @@ static struct rte_driver rte_virtio_driver = { }; PMD_REGISTER_DRIVER(rte_virtio_driver); + +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + +#define ETH_VIRTIO_NET_ARG_QTEST_PATH "qtest" +#define ETH_VIRTIO_NET_ARG_IVSHMEM_PATH 
"ivshmem" + +static const char *valid_args[] = { + ETH_VIRTIO_NET_ARG_QTEST_PATH, + ETH_VIRTIO_NET_ARG_IVSHMEM_PATH, + NULL +}; + +static int +get_string_arg(const char *key __rte_unused, + const char *value, void *extra_args) +{ + int ret, fd, loop = 3; + int *pfd = extra_args; + struct sockaddr_un sa = {0}; + + if ((value == NULL) || (extra_args == NULL)) + return -EINVAL; + + fd = socket(AF_UNIX, SOCK_STREAM, 0); + if (fd < 0) + return -1; + + sa.sun_family = AF_UNIX; + strncpy(sa.sun_path, value, sizeof(sa.sun_path)); + + while (loop--) { + /* + * may need to wait for qtest and ivshmem + * sockets are prepared by QEMU. + */ + ret = connect(fd, (struct sockaddr *)&sa, + sizeof(struct sockaddr_un)); + if (ret == 0) + break; + else + usleep(100000); + } + + if (ret != 0) { + close(fd); + return -1; + } + + *pfd = fd; + + return 0; +} + +static struct rte_eth_dev * +virtio_net_eth_dev_alloc(const char *name) +{ + struct rte_eth_dev *eth_dev; + struct rte_eth_dev_data *data; + struct virtio_hw *hw; + + eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL); + if (eth_dev == NULL) + rte_panic("cannot alloc rte_eth_dev\n"); + + data = eth_dev->data; + + hw = rte_zmalloc(NULL, sizeof(*hw), 0); + if (!hw) + rte_panic("malloc virtio_hw failed\n"); + + data->dev_private = hw; + eth_dev->driver = &rte_virtio_pmd; + return eth_dev; +} + +/* + * Initialization when "CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE" is enabled. 
+ */ +static int +rte_virtio_net_pmd_init(const char *name, const char *params) +{ + struct rte_kvargs *kvlist = NULL; + struct rte_eth_dev *eth_dev = NULL; + int ret, qtest_sock, ivshmem_sock; + struct rte_mem_config *mcfg; + + if (params == NULL || params[0] == '\0') + goto error; + + /* get pointer to global configuration */ + mcfg = rte_eal_get_configuration()->mem_config; + + /* Check if EAL memory consists of one memory segment */ + if ((RTE_MAX_MEMSEG > 1) && (mcfg->memseg[1].addr != NULL)) { + PMD_INIT_LOG(ERR, "Non-contiguous memory"); + goto error; + } + + kvlist = rte_kvargs_parse(params, valid_args); + if (!kvlist) { + PMD_INIT_LOG(ERR, "error when parsing param"); + goto error; + } + + if (rte_kvargs_count(kvlist, ETH_VIRTIO_NET_ARG_IVSHMEM_PATH) == 1) { + ret = rte_kvargs_process(kvlist, ETH_VIRTIO_NET_ARG_IVSHMEM_PATH, + &get_string_arg, &ivshmem_sock); + if (ret != 0) { + PMD_INIT_LOG(ERR, + "Failed to connect to ivshmem socket"); + goto error; + } + } else { + PMD_INIT_LOG(ERR, "No argument specified for %s", + ETH_VIRTIO_NET_ARG_IVSHMEM_PATH); + goto error; + } + + if (rte_kvargs_count(kvlist, ETH_VIRTIO_NET_ARG_QTEST_PATH) == 1) { + ret = rte_kvargs_process(kvlist, ETH_VIRTIO_NET_ARG_QTEST_PATH, + &get_string_arg, &qtest_sock); + if (ret != 0) { + PMD_INIT_LOG(ERR, + "Failed to connect to qtest socket"); + goto error; + } + } else { + PMD_INIT_LOG(ERR, "No argument specified for %s", + ETH_VIRTIO_NET_ARG_QTEST_PATH); + goto error; + } + + eth_dev = virtio_net_eth_dev_alloc(name); + + qtest_vdev_init(eth_dev->data, qtest_sock, ivshmem_sock); + + /* originally, this will be called in rte_eal_pci_probe() */ + eth_virtio_dev_init(eth_dev); + + eth_dev->driver = NULL; + eth_dev->data->dev_flags |= RTE_ETH_DEV_DETACHABLE; + eth_dev->data->kdrv = RTE_KDRV_NONE; + eth_dev->data->drv_name = "rte_virtio_pmd"; + + rte_kvargs_free(kvlist); + return 0; + +error: + rte_kvargs_free(kvlist); + return -EFAULT; +} + +/* + * Finalization when
"CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE" is enabled. + */ +static int +rte_virtio_net_pmd_uninit(const char *name) +{ + struct rte_eth_dev *eth_dev = NULL; + int ret; + + if (name == NULL) + return -EINVAL; + + /* find the ethdev entry */ + eth_dev = rte_eth_dev_allocated(name); + if (eth_dev == NULL) + return -ENODEV; + + ret = eth_virtio_dev_uninit(eth_dev); + if (ret != 0) + return -EFAULT; + + qtest_vdev_uninit(eth_dev->data); + rte_free(eth_dev->data->dev_private); + + ret = rte_eth_dev_release_port(eth_dev); + if (ret != 0) + return -EFAULT; + + return 0; +} + +static struct rte_driver rte_virtio_net_driver = { + .name = "eth_virtio_net", + .type = PMD_VDEV, + .init = rte_virtio_net_pmd_init, + .uninit = rte_virtio_net_pmd_uninit, +}; + +PMD_REGISTER_DRIVER(rte_virtio_net_driver); + +#endif /* RTE_LIBRTE_VIRTIO_HOST_MODE */ diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h index fed9571..81e6465 100644 --- a/drivers/net/virtio/virtio_ethdev.h +++ b/drivers/net/virtio/virtio_ethdev.h @@ -123,5 +123,17 @@ uint16_t virtio_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts, #define VTNET_LRO_FEATURES (VIRTIO_NET_F_GUEST_TSO4 | \ VIRTIO_NET_F_GUEST_TSO6 | VIRTIO_NET_F_GUEST_ECN) +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE +int qtest_vdev_init(struct rte_eth_dev_data *data, + int qtest_socket, int ivshmem_socket); +void qtest_vdev_uninit(struct rte_eth_dev_data *data); +void qtest_intr_callback_register(void *data, + rte_intr_callback_fn cb, void *cb_arg); +void qtest_intr_callback_unregister(void *data, + rte_intr_callback_fn cb, void *cb_arg); +int qtest_intr_enable(void *data); +int qtest_intr_disable(void *data); +struct rte_pci_id qtest_get_pci_id_of_virtio_net(void); +#endif /* RTE_LIBRTE_VIRTIO_HOST_MODE */ #endif /* _VIRTIO_ETHDEV_H_ */ diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c index 98eef85..2121234 100644 --- a/drivers/net/virtio/virtio_pci.c +++ b/drivers/net/virtio/virtio_pci.c @@ 
-145,6 +145,98 @@ static const struct virtio_pci_dev_ops phys_modern_dev_ops = { .write32 = phys_modern_write32, }; +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE +static uint8_t +virt_legacy_read8(struct virtio_hw *hw, uint8_t *addr) +{ + return qtest_in(hw, (uint16_t)(hw->io_base + (uint64_t)addr), 'b'); +} + +static uint16_t +virt_legacy_read16(struct virtio_hw *hw, uint16_t *addr) +{ + return qtest_in(hw, (uint16_t)(hw->io_base + (uint64_t)addr), 'w'); +} + +static uint32_t +virt_legacy_read32(struct virtio_hw *hw, uint32_t *addr) +{ + return qtest_in(hw, (uint16_t)(hw->io_base + (uint64_t)addr), 'l'); +} + +static void +virt_legacy_write8(struct virtio_hw *hw, uint8_t *addr, uint8_t val) +{ + qtest_out(hw, (uint16_t)(hw->io_base + (uint64_t)addr), val, 'b'); +} + +static void +virt_legacy_write16(struct virtio_hw *hw, uint16_t *addr, uint16_t val) +{ + qtest_out(hw, (uint16_t)(hw->io_base + (uint64_t)addr), val, 'w'); +} + +static void +virt_legacy_write32(struct virtio_hw *hw, uint32_t *addr, uint32_t val) +{ + qtest_out(hw, (uint16_t)(hw->io_base + (uint64_t)addr), val, 'l'); +} + +static const struct virtio_pci_dev_ops virt_legacy_dev_ops = { + .read8 = virt_legacy_read8, + .read16 = virt_legacy_read16, + .read32 = virt_legacy_read32, + .write8 = virt_legacy_write8, + .write16 = virt_legacy_write16, + .write32 = virt_legacy_write32, +}; + +static uint8_t +virt_modern_read8(struct virtio_hw *hw, uint8_t *addr) +{ + return qtest_read(hw, (uint64_t)addr, 'b'); +} + +static uint16_t +virt_modern_read16(struct virtio_hw *hw, uint16_t *addr) +{ + return qtest_read(hw, (uint64_t)addr, 'w'); +} + +static uint32_t +virt_modern_read32(struct virtio_hw *hw, uint32_t *addr) +{ + return qtest_read(hw, (uint64_t)addr, 'l'); +} + +static void +virt_modern_write8(struct virtio_hw *hw, uint8_t *addr, uint8_t val) +{ + qtest_write(hw, (uint64_t)addr, val, 'b'); +} + +static void +virt_modern_write16(struct virtio_hw *hw, uint16_t *addr, uint16_t val) +{ + qtest_write(hw, 
(uint64_t)addr, val, 'w'); +} + +static void +virt_modern_write32(struct virtio_hw *hw, uint32_t *addr, uint32_t val) +{ + qtest_write(hw, (uint64_t)addr, val, 'l'); +} + +static const struct virtio_pci_dev_ops virt_modern_dev_ops = { + .read8 = virt_modern_read8, + .read16 = virt_modern_read16, + .read32 = virt_modern_read32, + .write8 = virt_modern_write8, + .write16 = virt_modern_write16, + .write32 = virt_modern_write32, +}; +#endif /* RTE_LIBRTE_VIRTIO_HOST_MODE */ + static int vtpci_dev_init(struct rte_eth_dev *dev, struct virtio_hw *hw) { @@ -154,6 +246,17 @@ vtpci_dev_init(struct rte_eth_dev *dev, struct virtio_hw *hw) else hw->vtpci_dev_ops = &phys_legacy_dev_ops; return 0; + } else if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) { +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + if (strncmp(dev->data->name, "eth_virtio_net", + strlen("eth_virtio_net")) == 0) { + if (hw->modern == 1) + hw->vtpci_dev_ops = &virt_modern_dev_ops; + else + hw->vtpci_dev_ops = &virt_legacy_dev_ops; + return 0; + } +#endif } PMD_DRV_LOG(ERR, "Unknown virtio-net device."); @@ -224,12 +327,81 @@ static const struct virtio_pci_cfg_ops phys_cfg_ops = { .read = phys_read_pci_cfg, }; +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE +static int +virt_map_pci_cfg(struct virtio_hw *hw __rte_unused) +{ + return 0; +} + +static void +virt_unmap_pci_cfg(struct virtio_hw *hw __rte_unused) +{ + return; +} + +static int +virt_read_pci_cfg(struct virtio_hw *hw, void *buf, size_t len, off_t offset) +{ + qtest_read_pci_cfg(hw, "virtio-net", buf, len, offset); + return 0; +} + +static void * +virt_get_mapped_addr(struct virtio_hw *hw, uint8_t bar, + uint32_t offset, uint32_t length) +{ + uint64_t base; + uint64_t size; + + if (qtest_get_bar_size(hw, "virtio-net", bar, &size) < 0) { + PMD_INIT_LOG(ERR, "invalid bar: %u", bar); + return NULL; + } + + if (offset + length < offset) { + PMD_INIT_LOG(ERR, "offset(%u) + length(%u) overflows", + offset, length); + return NULL; + } + + if (offset + length > size) { + PMD_INIT_LOG(ERR, + 
"invalid cap: overflows bar space: %u > %"PRIu64, + offset + length, size); + return NULL; + } + + if (qtest_get_bar_addr(hw, "virtio-net", bar, &base) < 0) { + PMD_INIT_LOG(ERR, "invalid bar: %u", bar); + return NULL; + } + + return (void *)(base + offset); +} + +static const struct virtio_pci_cfg_ops virt_cfg_ops = { + .map = virt_map_pci_cfg, + .unmap = virt_unmap_pci_cfg, + .get_mapped_addr = virt_get_mapped_addr, + .read = virt_read_pci_cfg, +}; +#endif /* RTE_LIBRTE_VIRTIO_HOST_MODE */ + static int vtpci_cfg_init(struct rte_eth_dev *dev, struct virtio_hw *hw) { if (dev->dev_type == RTE_ETH_DEV_PCI) { hw->vtpci_cfg_ops = &phys_cfg_ops; return 0; + } else if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) { +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + if (strncmp(dev->data->name, "eth_virtio_net", + strlen("eth_virtio_net")) == 0) { + hw->vtpci_cfg_ops = &virt_cfg_ops; + return 0; + } +#endif } PMD_DRV_LOG(ERR, "Unkown virtio-net device."); @@ -785,7 +957,7 @@ modern_setup_queue(struct virtio_hw *hw, struct virtqueue *vq) uint64_t desc_addr, avail_addr, used_addr; uint16_t notify_off; - desc_addr = vq->mz->phys_addr; + desc_addr = vq->vq_ring_mem; avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc); used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail, ring[vq->vq_nentries]), @@ -1019,6 +1191,14 @@ vtpci_modern_init(struct rte_eth_dev *dev, struct virtio_hw *hw) if (dev->dev_type == RTE_ETH_DEV_PCI) pci_dev->driver->drv_flags |= RTE_PCI_DRV_INTR_LSC; + else if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) { +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + if (strncmp(dev->data->name, "eth_virtio_net", + strlen("eth_virtio_net")) == 0) { + dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC; + } +#endif + } hw->vtpci_ops = &modern_ops; hw->modern = 1; @@ -1037,6 +1217,14 @@ vtpci_legacy_init(struct rte_eth_dev *dev, struct virtio_hw *hw) return -1; hw->use_msix = legacy_virtio_has_msix(&pci_dev->addr); + } else if (dev->dev_type == RTE_ETH_DEV_VIRTUAL) { +#ifdef 
RTE_LIBRTE_VIRTIO_HOST_MODE + if (strncmp(dev->data->name, "eth_virtio_net", + strlen("eth_virtio_net")) == 0) { + hw->use_msix = 0; + dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC; + } +#endif } hw->io_base = (uint32_t)(uintptr_t) diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h index 7b5ad54..cdc23b5 100644 --- a/drivers/net/virtio/virtio_pci.h +++ b/drivers/net/virtio/virtio_pci.h @@ -267,6 +267,9 @@ struct virtio_net_config; struct virtio_hw { struct virtqueue *cvq; +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE + void *qsession; +#endif uint32_t io_base; uint64_t guest_features; uint32_t max_tx_queues; @@ -366,4 +369,17 @@ uint8_t vtpci_isr(struct virtio_hw *); uint16_t vtpci_irq_config(struct virtio_hw *, uint16_t); +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE +uint32_t qtest_in(struct virtio_hw *, uint16_t, char type); +void qtest_out(struct virtio_hw *, uint16_t, uint64_t, char type); +uint32_t qtest_read(struct virtio_hw *, uint64_t, char type); +void qtest_write(struct virtio_hw *, uint64_t, uint64_t, char type); +int qtest_read_pci_cfg(struct virtio_hw *hw, const char *name, + void *buf, size_t len, off_t offset); +int qtest_get_bar_addr(struct virtio_hw *hw, const char *name, + uint8_t bar, uint64_t *addr); +int qtest_get_bar_size(struct virtio_hw *hw, const char *name, + uint8_t bar, uint64_t *size); +#endif + #endif /* _VIRTIO_PCI_H_ */ diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 41a1366..f842c79 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -191,8 +191,7 @@ virtqueue_enqueue_recv_refill(struct virtqueue *vq, struct rte_mbuf *cookie) start_dp = vq->vq_ring.desc; start_dp[idx].addr = - (uint64_t)(cookie->buf_physaddr + RTE_PKTMBUF_HEADROOM - - hw->vtnet_hdr_size); + RTE_MBUF_DATA_DMA_ADDR(cookie) - hw->vtnet_hdr_size; start_dp[idx].len = cookie->buf_len - RTE_PKTMBUF_HEADROOM + hw->vtnet_hdr_size; start_dp[idx].flags = VRING_DESC_F_WRITE; diff --git 
a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h index 99d4fa9..b772e04 100644 --- a/drivers/net/virtio/virtqueue.h +++ b/drivers/net/virtio/virtqueue.h @@ -66,8 +66,13 @@ struct rte_mbuf; #define VIRTQUEUE_MAX_NAME_SZ 32 +#ifdef RTE_LIBRTE_VIRTIO_HOST_MODE +#define RTE_MBUF_DATA_DMA_ADDR(mb) \ + ((uint64_t)(mb)->buf_addr + (mb)->data_off) +#else #define RTE_MBUF_DATA_DMA_ADDR(mb) \ (uint64_t) ((mb)->buf_physaddr + (mb)->data_off) +#endif #define VTNET_SQ_RQ_QUEUE_IDX 0 #define VTNET_SQ_TQ_QUEUE_IDX 1 @@ -167,7 +172,8 @@ struct virtqueue { void *vq_ring_virt_mem; /**< linear address of vring*/ unsigned int vq_ring_size; - phys_addr_t vq_ring_mem; /**< physical address of vring */ + phys_addr_t vq_ring_mem; /**< physical address of vring for non-vdev, + virtual address of vring for vdev */ struct vring vq_ring; /**< vring keeping desc, used and avail */ uint16_t vq_free_cnt; /**< num of desc available */ @@ -188,6 +194,7 @@ struct virtqueue { uint16_t vq_avail_idx; uint64_t mbuf_initializer; /**< value to init mbufs. */ phys_addr_t virtio_net_hdr_mem; /**< hdr for each xmit packet */ + void *virtio_net_hdr_vaddr; /**< linear address of virtio net hdr */ struct rte_mbuf **sw_ring; /**< RX software ring. */ /* dummy mbuf, for wraparound when processing RX ring. */
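---

For reviewers unfamiliar with the ivshmem handshake above: qtest_send_message_to_ivshmem() relies on the standard SCM_RIGHTS ancillary-data mechanism to hand the EAL shared-memory file descriptor to QEMU over a Unix-domain socket. The following is a minimal standalone sketch of that technique, not code from this patch; the helper names (send_msg_with_fd, recv_msg_with_fd) are illustrative, and a socketpair stands in for the ivshmem server socket.

```c
/* Sketch of fd passing over a Unix socket, as used by the ivshmem protocol:
 * an 8-byte message is sent, optionally carrying one file descriptor in a
 * SOL_SOCKET/SCM_RIGHTS control message. Assumes POSIX; helper names are
 * hypothetical, not from the patch. */
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

static int send_msg_with_fd(int sock_fd, uint64_t msg, int fd)
{
	struct iovec iov = { .iov_base = &msg, .iov_len = sizeof(msg) };
	struct msghdr msgh;
	char control[CMSG_SPACE(sizeof(int))];
	struct cmsghdr *cmsg;
	int ret;

	memset(&msgh, 0, sizeof(msgh));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	if (fd >= 0) {
		/* attach the descriptor as SCM_RIGHTS ancillary data */
		msgh.msg_control = control;
		msgh.msg_controllen = sizeof(control);
		cmsg = CMSG_FIRSTHDR(&msgh);
		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
	}

	do {
		ret = (int)sendmsg(sock_fd, &msgh, 0);
	} while (ret < 0 && errno == EINTR);

	return ret;
}

static int recv_msg_with_fd(int sock_fd, uint64_t *msg, int *fd)
{
	struct iovec iov = { .iov_base = msg, .iov_len = sizeof(*msg) };
	struct msghdr msgh;
	char control[CMSG_SPACE(sizeof(int))];
	struct cmsghdr *cmsg;
	int ret;

	memset(&msgh, 0, sizeof(msgh));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = control;
	msgh.msg_controllen = sizeof(control);

	ret = (int)recvmsg(sock_fd, &msgh, 0);
	if (ret < 0)
		return ret;

	/* the kernel installs a fresh descriptor in the receiver */
	*fd = -1;
	cmsg = CMSG_FIRSTHDR(&msgh);
	if (cmsg != NULL && cmsg->cmsg_level == SOL_SOCKET &&
	    cmsg->cmsg_type == SCM_RIGHTS)
		memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
	return ret;
}
```

The received descriptor is a new fd referring to the same open file description, which is why QEMU can mmap() the EAL shared memory the PMD created. Passing fd = -1 (as the protocol-version and VMID messages above do) sends the payload with no ancillary data.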