From patchwork Mon Apr 2 11:46:53 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhihong Wang
X-Patchwork-Id: 36857
X-Patchwork-Delegate: maxime.coquelin@redhat.com
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 0DFFDA495;
 Mon, 2 Apr 2018 13:47:36 +0200 (CEST)
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by dpdk.org (Postfix) with ESMTP id 9A861AAD7
 for ; Mon, 2 Apr 2018 13:47:33 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 02 Apr 2018 04:47:30 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.48,395,1517904000"; d="scan'208";a="187970105"
Received: from unknown (HELO dpdk99.sh.intel.com) ([10.67.110.156])
 by orsmga004.jf.intel.com with ESMTP; 02 Apr 2018 04:47:29 -0700
From: Zhihong Wang
To: dev@dpdk.org
Cc: jianfeng.tan@intel.com, tiwei.bie@intel.com, maxime.coquelin@redhat.com,
 yliu@fridaylinux.org, cunming.liang@intel.com, xiao.w.wang@intel.com,
 dan.daly@intel.com, Zhihong Wang
Date: Mon, 2 Apr 2018 19:46:53 +0800
Message-Id: <20180402114656.17090-3-zhihong.wang@intel.com>
X-Mailer: git-send-email 2.13.6
In-Reply-To: <20180402114656.17090-1-zhihong.wang@intel.com>
References: <1517614137-62926-1-git-send-email-zhihong.wang@intel.com>
 <20180402114656.17090-1-zhihong.wang@intel.com>
Subject: [dpdk-dev] [PATCH v5 2/5] vhost: support selective datapath
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This patch set introduces support for selective datapath in DPDK vhost-user
lib.
vDPA stands for vhost Data Path Acceleration. The idea is to let devices
that are compatible with the virtio ring layout serve the virtio driver
directly, which enables datapath acceleration.

A set of device ops is defined for device specific operations:

     a. get_queue_num: Called to get the number of queues supported by
        the device.

     b. get_features: Called to get the features supported by the device.

     c. get_protocol_features: Called to get the protocol features
        supported by the device.

     d. dev_conf: Called to configure the actual device when the virtio
        device becomes ready.

     e. dev_close: Called to close the actual device when the virtio
        device is stopped.

     f. set_vring_state: Called to change the state of the vring in the
        actual device when the vring state changes.

     g. set_features: Called to set the negotiated features on the device.

     h. migration_done: Called to allow the device to respond to RARP
        sending.

     i. get_vfio_group_fd: Called to get the VFIO group fd of the device.

     j. get_vfio_device_fd: Called to get the VFIO device fd of the device.

     k. get_notify_area: Called to get the notify area info of the queue.

Signed-off-by: Zhihong Wang
Reviewed-by: Maxime Coquelin

---
Changes in v5:

 1. Rename the vDPA device ops to follow convention.

 2. Improve sanity check.

---
Changes in v4:

 1. Remove the "engine" concept in the lib.

---
Changes in v2:

 1. Add VFIO related vDPA device ops.

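Note for reviewers: the ops list above can be pictured as a table of
callbacks that a device driver fills in and the vhost lib calls through.
The following is only a minimal sketch of that pattern, not code from this
series. It builds without DPDK, so struct sketch_vdpa_dev_ops is a reduced
stand-in for struct rte_vdpa_dev_ops, and every sketch_* name, the queue
count and the feature value are hypothetical. Treating ops as optional and
checking them for NULL before the call is an assumption of the sketch; the
patch does not state which ops a driver must provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for struct rte_vdpa_dev_ops (assumption: only four of
 * the ops are modelled; a real driver would include rte_vdpa.h). */
struct sketch_vdpa_dev_ops {
	int (*get_queue_num)(int did, uint32_t *queue_num);
	int (*get_features)(int did, uint64_t *features);
	int (*dev_conf)(int vid);
	int (*dev_close)(int vid);
};

/* Hypothetical driver callbacks; the reported values are made up. */
static int
sketch_get_queue_num(int did, uint32_t *queue_num)
{
	(void)did;
	*queue_num = 4;
	return 0;
}

static int
sketch_get_features(int did, uint64_t *features)
{
	(void)did;
	*features = 1ULL << 32; /* bit 32 is VIRTIO_F_VERSION_1 */
	return 0;
}

static int
sketch_dev_conf(int vid)
{
	(void)vid;
	return 0;
}

/* The driver fills in the ops it supports; dev_close stays NULL here to
 * show an op that the caller has to check for. */
static struct sketch_vdpa_dev_ops sketch_ops = {
	.get_queue_num = sketch_get_queue_num,
	.get_features = sketch_get_features,
	.dev_conf = sketch_dev_conf,
};

/* Hypothetical caller-side helper: query the device capabilities through
 * the ops table, failing if a needed op is missing. */
static int
sketch_query_caps(const struct sketch_vdpa_dev_ops *ops, int did,
		uint32_t *queue_num, uint64_t *features)
{
	assert(queue_num != NULL && features != NULL);

	if (ops == NULL || ops->get_queue_num == NULL ||
			ops->get_features == NULL)
		return -1;

	if (ops->get_queue_num(did, queue_num) < 0)
		return -1;

	return ops->get_features(did, features);
}
```

A caller would pass &sketch_ops and a device id to sketch_query_caps() and
read back the queue count and feature bits; a negative return means the
device could not report its capabilities.
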
 lib/librte_vhost/Makefile              |   4 +-
 lib/librte_vhost/rte_vdpa.h            |  87 +++++++++++++++++++++++++
 lib/librte_vhost/rte_vhost_version.map |   7 ++
 lib/librte_vhost/vdpa.c                | 115 +++++++++++++++++++++++++++++++++
 4 files changed, 211 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_vhost/rte_vdpa.h
 create mode 100644 lib/librte_vhost/vdpa.c

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 5d6c6abae..37044ac03 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -22,9 +22,9 @@ LDLIBS += -lrte_eal -lrte_mempool -lrte_mbuf -lrte_ethdev -lrte_net
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c iotlb.c socket.c vhost.c \
-					vhost_user.c virtio_net.c
+					vhost_user.c virtio_net.c vdpa.c
 
 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h rte_vdpa.h
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h
new file mode 100644
index 000000000..90465ca26
--- /dev/null
+++ b/lib/librte_vhost/rte_vdpa.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _RTE_VDPA_H_
+#define _RTE_VDPA_H_
+
+/**
+ * @file
+ *
+ * Device specific vhost lib
+ */
+
+#include
+#include "rte_vhost.h"
+
+#define MAX_VDPA_NAME_LEN 128
+
+enum vdpa_addr_type {
+	PCI_ADDR,
+	VDPA_ADDR_MAX
+};
+
+struct rte_vdpa_dev_addr {
+	enum vdpa_addr_type type;
+	union {
+		uint8_t __dummy[64];
+		struct rte_pci_addr pci_addr;
+	};
+};
+
+struct rte_vdpa_dev_ops {
+	/* Get capabilities of this device */
+	int (*get_queue_num)(int did, uint32_t *queue_num);
+	int (*get_features)(int did, uint64_t *features);
+	int (*get_protocol_features)(int did, uint64_t *protocol_features);
+
+	/* Driver configure/close the device */
+	int (*dev_conf)(int vid);
+	int (*dev_close)(int vid);
+
+	/* Enable/disable this vring */
+	int (*set_vring_state)(int vid, int vring, int state);
+
+	/* Set features when changed */
+	int (*set_features)(int vid);
+
+	/* Destination operations when migration done */
+	int (*migration_done)(int vid);
+
+	/* Get the vfio group fd */
+	int (*get_vfio_group_fd)(int vid);
+
+	/* Get the vfio device fd */
+	int (*get_vfio_device_fd)(int vid);
+
+	/* Get the notify area info of the queue */
+	int (*get_notify_area)(int vid, int qid,
+			uint64_t *offset, uint64_t *size);
+
+	/* Reserved for future extension */
+	void *reserved[5];
+};
+
+struct rte_vdpa_device {
+	struct rte_vdpa_dev_addr addr;
+	struct rte_vdpa_dev_ops *ops;
+} __rte_cache_aligned;
+
+/* Register a vdpa device, return did if successful, -1 on failure */
+int __rte_experimental
+rte_vdpa_register_device(struct rte_vdpa_dev_addr *addr,
+		struct rte_vdpa_dev_ops *ops);
+
+/* Unregister a vdpa device, return -1 on failure */
+int __rte_experimental
+rte_vdpa_unregister_device(int did);
+
+/* Find did of a vdpa device, return -1 on failure */
+int __rte_experimental
+rte_vdpa_find_device_id(struct rte_vdpa_dev_addr *addr);
+
+/* Find a vdpa device based on did */
+struct rte_vdpa_device * __rte_experimental
+rte_vdpa_get_device(int did);
+
+#endif /* _RTE_VDPA_H_ */
diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
index df0103129..d3453a2a7 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -59,3 +59,10 @@ DPDK_18.02 {
 	rte_vhost_vring_call;
 
 } DPDK_17.08;
+
+EXPERIMENTAL {
+	rte_vdpa_register_device;
+	rte_vdpa_unregister_device;
+	rte_vdpa_find_device_id;
+	rte_vdpa_get_device;
+} DPDK_18.02;
diff --git a/lib/librte_vhost/vdpa.c b/lib/librte_vhost/vdpa.c
new file mode 100644
index 000000000..4b339b1c2
--- /dev/null
+++ b/lib/librte_vhost/vdpa.c
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+/**
+ * @file
+ *
+ * Device specific vhost lib
+ */
+
+#include
+
+#include
+#include "rte_vdpa.h"
+#include "vhost.h"
+
+static struct rte_vdpa_device *vdpa_devices[MAX_VHOST_DEVICE];
+static uint32_t vdpa_device_num;
+
+static bool
+is_same_vdpa_device(struct rte_vdpa_dev_addr *a,
+		struct rte_vdpa_dev_addr *b)
+{
+	bool ret = true;
+
+	if (a->type != b->type)
+		return false;
+
+	switch (a->type) {
+	case PCI_ADDR:
+		if (a->pci_addr.domain != b->pci_addr.domain ||
+				a->pci_addr.bus != b->pci_addr.bus ||
+				a->pci_addr.devid != b->pci_addr.devid ||
+				a->pci_addr.function != b->pci_addr.function)
+			ret = false;
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+int
+rte_vdpa_register_device(struct rte_vdpa_dev_addr *addr,
+		struct rte_vdpa_dev_ops *ops)
+{
+	struct rte_vdpa_device *dev;
+	char device_name[MAX_VDPA_NAME_LEN];
+	int i;
+
+	if (vdpa_device_num >= MAX_VHOST_DEVICE)
+		return -1;
+
+	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
+		if (vdpa_devices[i] && is_same_vdpa_device(addr,
+					&vdpa_devices[i]->addr))
+			return -1;
+	}
+
+	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
+		if (vdpa_devices[i] == NULL)
+			break;
+	}
+
+	sprintf(device_name, "vdpa-dev-%d", i);
+	dev = rte_zmalloc(device_name, sizeof(struct rte_vdpa_device),
+			RTE_CACHE_LINE_SIZE);
+	if (!dev)
+		return -1;
+
+	memcpy(&dev->addr, addr, sizeof(struct rte_vdpa_dev_addr));
+	dev->ops = ops;
+	vdpa_devices[i] = dev;
+	vdpa_device_num++;
+
+	return i;
+}
+
+int
+rte_vdpa_unregister_device(int did)
+{
+	if (did < 0 || did >= MAX_VHOST_DEVICE || vdpa_devices[did] == NULL)
+		return -1;
+
+	rte_free(vdpa_devices[did]);
+	vdpa_devices[did] = NULL;
+	vdpa_device_num--;
+
+	return did;
+}
+
+int
+rte_vdpa_find_device_id(struct rte_vdpa_dev_addr *addr)
+{
+	struct rte_vdpa_device *dev;
+	int i;
+
+	for (i = 0; i < MAX_VHOST_DEVICE; ++i) {
+		dev = vdpa_devices[i];
+		if (dev && is_same_vdpa_device(&dev->addr, addr))
+			return i;
+	}
+
+	return -1;
+}
+
+struct rte_vdpa_device *
+rte_vdpa_get_device(int did)
+{
+	if (did < 0 || did >= MAX_VHOST_DEVICE)
+		return NULL;
+
+	return vdpa_devices[did];
+}
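
For readers who want to try the register/find/unregister semantics of
vdpa.c without a DPDK build, here is a rough standalone model. It is a
sketch under stated assumptions rather than the library code: the sketch_*
names are hypothetical, struct sketch_addr keeps only the PCI fields that
is_same_vdpa_device() compares, SKETCH_MAX_DEVICE stands in for
MAX_VHOST_DEVICE, and plain malloc()/free() replace rte_zmalloc() and
rte_free(). The lookup returns the slot whose stored address compares equal
to the requested one, and that slot index is what the patch calls the did.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_DEVICE 4 /* stand-in for MAX_VHOST_DEVICE */

/* Reduced stand-in for struct rte_vdpa_dev_addr: PCI address only. */
struct sketch_addr {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

static struct sketch_addr *sketch_devices[SKETCH_MAX_DEVICE];

/* True when both addresses name the same PCI device. */
static bool
sketch_same_addr(const struct sketch_addr *a, const struct sketch_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
		a->devid == b->devid && a->function == b->function;
}

/* Return the slot index (did) holding addr, or -1 if it is not found. */
static int
sketch_find(const struct sketch_addr *addr)
{
	int i;

	for (i = 0; i < SKETCH_MAX_DEVICE; i++) {
		if (sketch_devices[i] != NULL &&
				sketch_same_addr(sketch_devices[i], addr))
			return i;
	}

	return -1;
}

/* Store a copy of addr in the first free slot; reject duplicates. */
static int
sketch_register(const struct sketch_addr *addr)
{
	int i;

	if (sketch_find(addr) >= 0)
		return -1;

	for (i = 0; i < SKETCH_MAX_DEVICE; i++) {
		if (sketch_devices[i] == NULL)
			break;
	}
	if (i == SKETCH_MAX_DEVICE)
		return -1;

	sketch_devices[i] = malloc(sizeof(*addr));
	if (sketch_devices[i] == NULL)
		return -1;

	*sketch_devices[i] = *addr;
	return i;
}

/* Free the slot and return did, or -1 if the slot is invalid or empty. */
static int
sketch_unregister(int did)
{
	if (did < 0 || did >= SKETCH_MAX_DEVICE ||
			sketch_devices[did] == NULL)
		return -1;

	free(sketch_devices[did]);
	sketch_devices[did] = NULL;
	return did;
}

/* Register two devices, look up the second, then release both slots. */
static int
sketch_round_trip(void)
{
	struct sketch_addr a = { 0, 0x3b, 0, 0 };
	struct sketch_addr b = { 0, 0x3b, 0, 1 };
	int did_a = sketch_register(&a);
	int did_b = sketch_register(&b);
	int found;

	assert(did_a == 0 && did_b == 1);
	found = sketch_find(&b);
	sketch_unregister(did_a);
	sketch_unregister(did_b);

	return found;
}
```

Starting from an empty registry, sketch_round_trip() returns the did of the
second device. Registering the same address twice fails with -1, matching
the duplicate check in rte_vdpa_register_device().
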