From patchwork Wed Apr 3 07:18:42 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tiwei Bie
X-Patchwork-Id: 52143
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id DBD6B5B38;
	Wed, 3 Apr 2019 09:19:30 +0200 (CEST)
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
	by dpdk.org (Postfix) with ESMTP id 645D04F9A
	for ; Wed, 3 Apr 2019 09:19:25 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga007.jf.intel.com ([10.7.209.58])
	by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
	03 Apr 2019 00:19:24 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.60,303,1549958400"; d="scan'208";a="128206235"
Received: from dpdk-tbie.sh.intel.com ([10.67.104.173])
	by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 00:19:23 -0700
From: Tiwei Bie
To: dev@dpdk.org
Cc: cunming.liang@intel.com, bruce.richardson@intel.com,
	alejandro.lucero@netronome.com
Date: Wed, 3 Apr 2019 15:18:42 +0800
Message-Id: <20190403071844.21126-2-tiwei.bie@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190403071844.21126-1-tiwei.bie@intel.com>
References: <20190403071844.21126-1-tiwei.bie@intel.com>
Subject: [dpdk-dev] [RFC 1/3] eal: add a helper for reading string from sysfs
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This patch adds a helper for reading a string from a sysfs file.
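
For illustration only (this is not part of the patch), here is a minimal
sketch of how such a helper behaves, written against plain stdio so that it
builds without DPDK. The name read_sysfs_line is hypothetical. Unlike the
helper in this patch, the sketch also strips the trailing newline, which the
mdev bus code in patch 2/3 does at the call site instead, and it rejects
buffer sizes that do not fit the int argument of fgets().

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the helper added by this patch: read the
 * first line of a file into buf (at most sz - 1 characters).
 * Returns 0 on success and -1 on any error.
 */
int
read_sysfs_line(const char *filename, char *buf, size_t sz)
{
	FILE *f;

	/* fgets() takes an int size, so reject sizes it cannot represent. */
	if (filename == NULL || buf == NULL || sz == 0 || sz > INT_MAX)
		return -1;

	f = fopen(filename, "r");
	if (f == NULL)
		return -1;

	if (fgets(buf, (int)sz, f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Assumption of this sketch: callers want the newline removed. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would typically pass a path such as the mdev_type/device_api file of
a device directory and compare the result with a string like "vfio-pci".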
Signed-off-by: Cunming Liang Signed-off-by: Tiwei Bie --- lib/librte_eal/common/eal_filesystem.h | 7 +++++++ lib/librte_eal/freebsd/eal/eal.c | 22 ++++++++++++++++++++++ lib/librte_eal/linux/eal/eal.c | 22 ++++++++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 4 files changed, 52 insertions(+) diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h index 89a3added..2c823b27d 100644 --- a/lib/librte_eal/common/eal_filesystem.h +++ b/lib/librte_eal/common/eal_filesystem.h @@ -116,4 +116,11 @@ eal_get_hugefile_lock_path(char *buffer, size_t buflen, int f_id) * Used to read information from files on /sys */ int eal_parse_sysfs_value(const char *filename, unsigned long *val); +/** + * Function to read a line from a file on the filesystem. + * Used to read information from files on /sys + */ +int __rte_experimental +rte_eal_parse_sysfs_str(const char *filename, char *buf, unsigned long sz); + #endif /* EAL_FILESYSTEM_H */ diff --git a/lib/librte_eal/freebsd/eal/eal.c b/lib/librte_eal/freebsd/eal/eal.c index 4e86b10b1..816cb9b91 100644 --- a/lib/librte_eal/freebsd/eal/eal.c +++ b/lib/librte_eal/freebsd/eal/eal.c @@ -208,6 +208,28 @@ eal_parse_sysfs_value(const char *filename, unsigned long *val) return 0; } +int +rte_eal_parse_sysfs_str(const char *filename, char *buf, unsigned long sz) +{ + FILE *f; + + f = fopen(filename, "r"); + if (f == NULL) { + RTE_LOG(ERR, EAL, "%s(): cannot open sysfs file %s\n", + __func__, filename); + return -1; + } + + if (fgets(buf, sz, f) == NULL) { + RTE_LOG(ERR, EAL, "%s(): cannot read sysfs file %s\n", + __func__, filename); + fclose(f); + return -1; + } + + fclose(f); + return 0; +} /* create memory configuration in shared/mmap memory. Take out * a write lock on the memsegs, so we can auto-detect primary/secondary. 
diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c index 13f401684..865cb19d7 100644 --- a/lib/librte_eal/linux/eal/eal.c +++ b/lib/librte_eal/linux/eal/eal.c @@ -293,6 +293,28 @@ eal_parse_sysfs_value(const char *filename, unsigned long *val) return 0; } +int +rte_eal_parse_sysfs_str(const char *filename, char *buf, unsigned long sz) +{ + FILE *f; + + f = fopen(filename, "r"); + if (f == NULL) { + RTE_LOG(ERR, EAL, "%s(): cannot open sysfs file %s\n", + __func__, filename); + return -1; + } + + if (fgets(buf, sz, f) == NULL) { + RTE_LOG(ERR, EAL, "%s(): cannot read sysfs file %s\n", + __func__, filename); + fclose(f); + return -1; + } + + fclose(f); + return 0; +} /* create memory configuration in shared/mmap memory. Take out * a write lock on the memsegs, so we can auto-detect primary/secondary. diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index d6e375135..d16258ffc 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -298,6 +298,7 @@ EXPERIMENTAL { rte_devargs_remove; rte_devargs_type_count; rte_eal_cleanup; + rte_eal_parse_sysfs_str; rte_extmem_attach; rte_extmem_detach; rte_extmem_register; From patchwork Wed Apr 3 07:18:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tiwei Bie X-Patchwork-Id: 52144 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 07FE05F11; Wed, 3 Apr 2019 09:19:34 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 69FFC5B1C for ; Wed, 3 Apr 2019 09:19:27 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga101.jf.intel.com with 
	ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 00:19:26 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.60,303,1549958400"; d="scan'208";a="128206258"
Received: from dpdk-tbie.sh.intel.com ([10.67.104.173])
	by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 00:19:25 -0700
From: Tiwei Bie
To: dev@dpdk.org
Cc: cunming.liang@intel.com, bruce.richardson@intel.com,
	alejandro.lucero@netronome.com
Date: Wed, 3 Apr 2019 15:18:43 +0800
Message-Id: <20190403071844.21126-3-tiwei.bie@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190403071844.21126-1-tiwei.bie@intel.com>
References: <20190403071844.21126-1-tiwei.bie@intel.com>
Subject: [dpdk-dev] [RFC 2/3] bus/mdev: add mdev bus support
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This patch adds mdev (mediated device) bus support to DPDK. The bus driver
scans all mdev devices in the system and probes each one based on its
device API (mdev_type/device_api).
Signed-off-by: Cunming Liang Signed-off-by: Tiwei Bie --- config/common_base | 5 + config/common_linux | 1 + drivers/bus/Makefile | 1 + drivers/bus/mdev/Makefile | 41 +++ drivers/bus/mdev/linux/Makefile | 6 + drivers/bus/mdev/linux/mdev.c | 117 ++++++++ drivers/bus/mdev/mdev.c | 310 ++++++++++++++++++++++ drivers/bus/mdev/meson.build | 15 ++ drivers/bus/mdev/private.h | 90 +++++++ drivers/bus/mdev/rte_bus_mdev.h | 141 ++++++++++ drivers/bus/mdev/rte_bus_mdev_version.map | 12 + drivers/bus/meson.build | 2 +- mk/rte.app.mk | 1 + 13 files changed, 741 insertions(+), 1 deletion(-) create mode 100644 drivers/bus/mdev/Makefile create mode 100644 drivers/bus/mdev/linux/Makefile create mode 100644 drivers/bus/mdev/linux/mdev.c create mode 100644 drivers/bus/mdev/mdev.c create mode 100644 drivers/bus/mdev/meson.build create mode 100644 drivers/bus/mdev/private.h create mode 100644 drivers/bus/mdev/rte_bus_mdev.h create mode 100644 drivers/bus/mdev/rte_bus_mdev_version.map diff --git a/config/common_base b/config/common_base index 6292bc4af..d29e9a089 100644 --- a/config/common_base +++ b/config/common_base @@ -168,6 +168,11 @@ CONFIG_RTE_LIBRTE_COMMON_DPAAX=n # CONFIG_RTE_LIBRTE_IFPGA_BUS=y +# +# Compile the mdev bus +# +CONFIG_RTE_LIBRTE_MDEV_BUS=n + # # Compile PCI bus driver # diff --git a/config/common_linux b/config/common_linux index 75334273d..7de9624c0 100644 --- a/config/common_linux +++ b/config/common_linux @@ -25,6 +25,7 @@ CONFIG_RTE_LIBRTE_AVP_PMD=y CONFIG_RTE_LIBRTE_VDEV_NETVSC_PMD=y CONFIG_RTE_LIBRTE_NFP_PMD=y CONFIG_RTE_LIBRTE_POWER=y +CONFIG_RTE_LIBRTE_MDEV_BUS=y CONFIG_RTE_VIRTIO_USER=y CONFIG_RTE_PROC_INFO=y diff --git a/drivers/bus/Makefile b/drivers/bus/Makefile index cea3b55e6..b2144ee63 100644 --- a/drivers/bus/Makefile +++ b/drivers/bus/Makefile @@ -8,6 +8,7 @@ ifeq ($(CONFIG_RTE_EAL_VFIO),y) DIRS-$(CONFIG_RTE_LIBRTE_FSLMC_BUS) += fslmc endif DIRS-$(CONFIG_RTE_LIBRTE_IFPGA_BUS) += ifpga +DIRS-$(CONFIG_RTE_LIBRTE_MDEV_BUS) += mdev 
DIRS-$(CONFIG_RTE_LIBRTE_PCI_BUS) += pci DIRS-$(CONFIG_RTE_LIBRTE_VDEV_BUS) += vdev DIRS-$(CONFIG_RTE_LIBRTE_VMBUS) += vmbus diff --git a/drivers/bus/mdev/Makefile b/drivers/bus/mdev/Makefile new file mode 100644 index 000000000..b2faee395 --- /dev/null +++ b/drivers/bus/mdev/Makefile @@ -0,0 +1,41 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2019 Intel Corporation + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_bus_mdev.a + +CFLAGS += -O3 +CFLAGS += $(WERROR_FLAGS) +CFLAGS += -DALLOW_EXPERIMENTAL_API +CFLAGS += -I$(SRCDIR) + +# versioning export map +EXPORT_MAP := rte_bus_mdev_version.map + +# library version +LIBABIVER := 1 + +ifneq ($(CONFIG_RTE_EXEC_ENV_LINUX),) +SYSTEM := linux +endif +ifneq ($(CONFIG_RTE_EXEC_ENV_FREEBSD),) +$(error "Mdev bus not implemented for BSD yet") +endif + +CFLAGS += -I$(RTE_SDK)/drivers/bus/mdev/$(SYSTEM) +CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common +CFLAGS += -I$(RTE_SDK)/lib/librte_eal/$(SYSTEM)/eal + +LDLIBS += -lrte_eal + +include $(RTE_SDK)/drivers/bus/mdev/$(SYSTEM)/Makefile +SRCS-$(CONFIG_RTE_LIBRTE_MDEV_BUS) := $(addprefix $(SYSTEM)/,$(SRCS)) +SRCS-$(CONFIG_RTE_LIBRTE_MDEV_BUS) += mdev.c + +SYMLINK-$(CONFIG_RTE_LIBRTE_MDEV_BUS)-include += rte_bus_mdev.h + +include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/bus/mdev/linux/Makefile b/drivers/bus/mdev/linux/Makefile new file mode 100644 index 000000000..a777ad3d4 --- /dev/null +++ b/drivers/bus/mdev/linux/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2019 Intel Corporation + +SRCS += mdev.c + +CFLAGS += -D_GNU_SOURCE diff --git a/drivers/bus/mdev/linux/mdev.c b/drivers/bus/mdev/linux/mdev.c new file mode 100644 index 000000000..ecfe0eba6 --- /dev/null +++ b/drivers/bus/mdev/linux/mdev.c @@ -0,0 +1,117 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#include +#include + +#include +#include + +#include "eal_filesystem.h" + +#include "private.h" + +static 
int +mdev_scan_one(const char *dirname, const rte_uuid_t addr) +{ + struct rte_mdev_device *mdev; + char device_api[PATH_MAX]; + char filename[PATH_MAX]; + char *ptr; + + mdev = malloc(sizeof(*mdev)); + if (mdev == NULL) + return -1; + + memset(mdev, 0, sizeof(*mdev)); + mdev->device.bus = &rte_mdev_bus.bus; + rte_uuid_copy(mdev->addr, addr); + + /* get device_api */ + snprintf(filename, sizeof(filename), "%s/mdev_type/device_api", + dirname); + if (rte_eal_parse_sysfs_str(filename, device_api, + sizeof(device_api)) < 0) { + free(mdev); + return -1; + } + + ptr = strchr(device_api, '\n'); + if (ptr != NULL) + *ptr = '\0'; + + mdev_name_set(mdev); + + if (strcmp(device_api, "vfio-pci") == 0) { + /* device api */ + mdev->dev_api = RTE_MDEV_DEV_API_VFIO_PCI; + + if (TAILQ_EMPTY(&rte_mdev_bus.device_list)) + rte_mdev_add_device(mdev); + else { + struct rte_mdev_device *dev; + int ret; + + TAILQ_FOREACH(dev, &rte_mdev_bus.device_list, next) { + ret = rte_uuid_compare(mdev->addr, dev->addr); + if (ret > 0) + continue; + + if (ret < 0) + rte_mdev_insert_device(dev, mdev); + else /* already registered */ + free(mdev); + + return 0; + } + + rte_mdev_add_device(mdev); + } + } else { + RTE_LOG(DEBUG, EAL, "%s(): mdev device_api %s is not supported\n", + __func__, device_api); + } + + return 0; +} + +/* + * Scan the content of the mdev bus, and the devices in the devices + * list + */ +int +rte_mdev_scan(void) +{ + struct dirent *e; + DIR *dir; + char dirname[PATH_MAX]; + rte_uuid_t addr; + + dir = opendir(rte_mdev_get_sysfs_path()); + if (dir == NULL) { + RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n", + __func__, strerror(errno)); + return -1; + } + + while ((e = readdir(dir)) != NULL) { + if (e->d_name[0] == '.') + continue; + + if (rte_uuid_parse(e->d_name, addr) != 0) + continue; + + snprintf(dirname, sizeof(dirname), "%s/%s", + rte_mdev_get_sysfs_path(), e->d_name); + + if (mdev_scan_one(dirname, addr) < 0) + goto error; + } + closedir(dir); + return 0; + +error: + 
closedir(dir); + return -1; +} diff --git a/drivers/bus/mdev/mdev.c b/drivers/bus/mdev/mdev.c new file mode 100644 index 000000000..2f9209cca --- /dev/null +++ b/drivers/bus/mdev/mdev.c @@ -0,0 +1,310 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "private.h" + +#define SYSFS_MDEV_DEVICES "/sys/bus/mdev/devices" + +const char *rte_mdev_get_sysfs_path(void) +{ + const char *path = NULL; + + path = getenv("SYSFS_MDEV_DEVICES"); + if (path == NULL) + return SYSFS_MDEV_DEVICES; + + return path; +} + +static void +rte_mdev_device_name(const rte_uuid_t addr, char *output, size_t size) +{ + RTE_VERIFY(size >= RTE_UUID_STRLEN); + rte_uuid_unparse(addr, output, size); +} + +static struct rte_devargs * +mdev_devargs_lookup(struct rte_mdev_device *dev) +{ + struct rte_devargs *devargs; + rte_uuid_t addr; + + RTE_EAL_DEVARGS_FOREACH("mdev", devargs) { + devargs->bus->parse(devargs->name, addr); + if (!rte_uuid_compare(dev->addr, addr)) + return devargs; + } + return NULL; +} + +void +mdev_name_set(struct rte_mdev_device *dev) +{ + struct rte_devargs *devargs; + + /* Each device has its internal, canonical name set. */ + rte_mdev_device_name(dev->addr, dev->name, sizeof(dev->name)); + devargs = mdev_devargs_lookup(dev); + dev->device.devargs = devargs; + /* In blacklist mode, if the device is not blacklisted, no + * rte_devargs exists for it. + */ + if (devargs != NULL) + /* If an rte_devargs exists, the generic rte_device uses the + * given name as its name. + */ + dev->device.name = dev->device.devargs->name; + else + /* Otherwise, it uses the internal, canonical form. 
*/ + dev->device.name = dev->name; +} + +void +rte_mdev_register(struct rte_mdev_driver *driver) +{ + TAILQ_INSERT_TAIL(&rte_mdev_bus.driver_list, driver, next); + driver->bus = &rte_mdev_bus; +} + +void +rte_mdev_unregister(struct rte_mdev_driver *driver) +{ + TAILQ_REMOVE(&rte_mdev_bus.driver_list, driver, next); + driver->bus = NULL; +} + +void +rte_mdev_add_device(struct rte_mdev_device *mdev) +{ + TAILQ_INSERT_TAIL(&rte_mdev_bus.device_list, mdev, next); +} + +void +rte_mdev_insert_device(struct rte_mdev_device *exist_mdev, + struct rte_mdev_device *new_mdev) +{ + TAILQ_INSERT_BEFORE(exist_mdev, new_mdev, next); +} + +void +rte_mdev_remove_device(struct rte_mdev_device *mdev) +{ + TAILQ_REMOVE(&rte_mdev_bus.device_list, mdev, next); +} + +static struct rte_device * +mdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + const struct rte_mdev_device *pstart; + struct rte_mdev_device *pdev; + + if (start != NULL) { + pstart = RTE_DEV_TO_MDEV_CONST(start); + pdev = TAILQ_NEXT(pstart, next); + } else { + pdev = TAILQ_FIRST(&rte_mdev_bus.device_list); + } + while (pdev != NULL) { + if (cmp(&pdev->device, data) == 0) + return &pdev->device; + pdev = TAILQ_NEXT(pdev, next); + } + return NULL; +} + +int +rte_mdev_match(const struct rte_mdev_driver *mdev_drv, + const struct rte_mdev_device *mdev_dev) +{ + if (mdev_drv->dev_api == mdev_dev->dev_api) + return 1; + + return 0; +} + +static int +rte_mdev_probe_one_driver(struct rte_mdev_driver *dr, + struct rte_mdev_device *dev) +{ + int ret; + + if (dr == NULL || dev == NULL) + return -EINVAL; + + /* no initialization when blacklisted, return without error */ + if (dev->device.devargs != NULL && + dev->device.devargs->policy == RTE_DEV_BLACKLISTED) { + RTE_LOG(INFO, EAL, "Device is blacklisted, not initializing\n"); + return 1; + } + + /* The device is not blacklisted; Check if driver supports it */ + if (!rte_mdev_match(dr, dev)) { + /* Match of device and driver failed */ + return 
1; + } + + /* reference driver structure */ + dev->driver = dr; + + /* call the driver probe() function */ + ret = dr->probe(dr, dev); + if (ret != 0) + dev->driver = NULL; + + return ret; +} + +static int +mdev_probe_all_drivers(struct rte_mdev_device *dev) +{ + struct rte_mdev_driver *dr = NULL; + int rc = 0; + + if (dev == NULL) + return -1; + + /* Check if a driver is already loaded */ + if (dev->driver != NULL) + return 0; + + FOREACH_DRIVER_ON_MDEV_BUS(dr) { + rc = rte_mdev_probe_one_driver(dr, dev); + if (rc < 0) + /* negative value is an error */ + return -1; + if (rc > 0) + /* positive value means driver doesn't support it */ + continue; + return 0; + } + return 1; +} + +int +rte_mdev_probe(void) +{ + struct rte_mdev_device *mdev = NULL; + size_t probed = 0, failed = 0; + struct rte_devargs *devargs; + int probe_all = 0; + int ret = 0; + + if (rte_mdev_bus.bus.conf.scan_mode != RTE_BUS_SCAN_WHITELIST) + probe_all = 1; + + FOREACH_DEVICE_ON_MDEV_BUS(mdev) { + probed++; + + devargs = mdev->device.devargs; + /* probe all or only whitelisted devices */ + if (probe_all) + ret = mdev_probe_all_drivers(mdev); + else if (devargs != NULL && + devargs->policy == RTE_DEV_WHITELISTED) + ret = mdev_probe_all_drivers(mdev); + if (ret < 0) { + char name[RTE_UUID_STRLEN]; + rte_uuid_unparse(mdev->addr, name, sizeof(name)); + RTE_LOG(ERR, EAL, "Requested device %s cannot be used\n", + name); + rte_errno = errno; + failed++; + ret = 0; + } + } + + return (probed && probed == failed) ? 
-1 : 0; +} + +static int +mdev_plug(struct rte_device *dev) +{ + return mdev_probe_all_drivers(RTE_DEV_TO_MDEV(dev)); +} + +static int +rte_mdev_detach_dev(struct rte_mdev_device *dev) +{ + struct rte_mdev_driver *dr; + int ret = 0; + + if (dev == NULL) + return -EINVAL; + + dr = dev->driver; + + if (dr->remove) { + ret = dr->remove(dev); + if (ret != 0) + return ret; + } + + /* clear driver structure */ + dev->driver = NULL; + + return 0; +} + +static int +mdev_unplug(struct rte_device *dev) +{ + struct rte_mdev_device *pmdev; + int ret; + + pmdev = RTE_DEV_TO_MDEV(dev); + ret = rte_mdev_detach_dev(pmdev); + if (ret == 0) { + rte_mdev_remove_device(pmdev); + free(pmdev); + } + return ret; +} + +static int +mdev_parse(const char *name, void *addr) +{ + rte_uuid_t uuid; + int parse; + + parse = (rte_uuid_parse(name, uuid) == 0); + if (parse && addr != NULL) + rte_uuid_copy(addr, uuid); + return parse == false; +} + +struct rte_mdev_bus rte_mdev_bus = { + .bus = { + .scan = rte_mdev_scan, + .probe = rte_mdev_probe, + .find_device = mdev_find_device, + .plug = mdev_plug, + .unplug = mdev_unplug, + .parse = mdev_parse, + }, + .device_list = TAILQ_HEAD_INITIALIZER(rte_mdev_bus.device_list), + .driver_list = TAILQ_HEAD_INITIALIZER(rte_mdev_bus.driver_list), +}; + +RTE_REGISTER_BUS(mdev, rte_mdev_bus.bus); diff --git a/drivers/bus/mdev/meson.build b/drivers/bus/mdev/meson.build new file mode 100644 index 000000000..33c701cb9 --- /dev/null +++ b/drivers/bus/mdev/meson.build @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2019 Intel Corporation + +version = 1 +allow_experimental_apis = true +install_headers('rte_bus_mdev.h') +sources = files('mdev.c') + +if host_machine.system() == 'linux' + sources += files('linux/mdev.c') + includes += include_directories('linux') + cflags += ['-D_GNU_SOURCE'] +else + build = false +endif diff --git a/drivers/bus/mdev/private.h b/drivers/bus/mdev/private.h new file mode 100644 index 000000000..81cfe3045 --- 
/dev/null +++ b/drivers/bus/mdev/private.h @@ -0,0 +1,90 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _MDEV_PRIVATE_H_ +#define _MDEV_PRIVATE_H_ + +#include +#include +#include + +struct rte_mdev_driver; +struct rte_mdev_device; + +extern struct rte_mdev_bus rte_mdev_bus; + +/** + * Probe the mdev bus. + * + * @return + * - 0 on success. + * - !0 on error. + */ +int rte_mdev_probe(void); + +/** + * Scan the content of the mdev bus, and the devices in the devices + * list. + * + * @return + * 0 on success, negative on error + */ +int rte_mdev_scan(void); + +/** + * Set the name of a mdev device. + */ +void mdev_name_set(struct rte_mdev_device *dev); + +/** + * Add a mdev device to the mdev bus (append to mdev device list). This function + * also updates the bus references of the mdev device (and the generic device + * object embedded within. + * + * @param mdev + * mdev device to add + * @return void + */ +void rte_mdev_add_device(struct rte_mdev_device *mdev); + +/** + * Insert a mdev device in the mdev bus at a particular location in the device + * list. It also updates the mdev bus reference of the new devices to be + * inserted. + * + * @param exist_mdev + * existing mdev device in mdev bus + * @param new_mdev + * mdev device to be added before exist_mdev + * @return void + */ +void rte_mdev_insert_device(struct rte_mdev_device *exist_mdev, + struct rte_mdev_device *new_mdev); + +/** + * Remove a mdev device from the mdev bus. This sets to NULL the bus references + * in the mdev device object as well as the generic device object. + * + * @param mdev_device + * mdev device to be removed from mdev bus + * @return void + */ +void rte_mdev_remove_device(struct rte_mdev_device *mdev_device); + +/** + * Match the mdev driver and device using mdev device_api. 
+ * + * @param mdev_drv + * mdev driver from which device_api would be extracted + * @param mdev_dev + * mdev device to match against the driver + * @return + * 1 for successful match + * 0 for unsuccessful match + */ +int +rte_mdev_match(const struct rte_mdev_driver *mdev_drv, + const struct rte_mdev_device *mdev_dev); + +#endif /* _MDEV_PRIVATE_H_ */ diff --git a/drivers/bus/mdev/rte_bus_mdev.h b/drivers/bus/mdev/rte_bus_mdev.h new file mode 100644 index 000000000..913521ace --- /dev/null +++ b/drivers/bus/mdev/rte_bus_mdev.h @@ -0,0 +1,141 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _RTE_BUS_MDEV_H_ +#define _RTE_BUS_MDEV_H_ + +/** + * @file + * + * RTE Mdev Bus Interface + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +struct rte_devargs; + +enum rte_mdev_device_api { + RTE_MDEV_DEV_API_VFIO_PCI = 0, + RTE_MDEV_DEV_API_MAX, +}; + +struct rte_mdev_bus; +struct rte_mdev_driver; +struct rte_mdev_device; + +/** Pathname of mdev devices directory. */ +const char * __rte_experimental rte_mdev_get_sysfs_path(void); + +/** + * Register a mdev driver. + * + * @param driver + * A pointer to a rte_mdev_driver structure describing the driver + * to be registered. + */ +void __rte_experimental rte_mdev_register(struct rte_mdev_driver *driver); + +#define RTE_MDEV_REGISTER_DRIVER(nm, mdev_drv) \ +RTE_INIT(mdevinitfn_ ##nm) \ +{ \ + (mdev_drv).driver.name = RTE_STR(nm); \ + rte_mdev_register(&mdev_drv); \ +} \ +RTE_PMD_EXPORT_NAME(nm, __COUNTER__) + +/** + * Unregister a mdev driver. + * + * @param driver + * A pointer to a rte_mdev_driver structure describing the driver + * to be unregistered. + */ +void __rte_experimental rte_mdev_unregister(struct rte_mdev_driver *driver); + +/** + * Initialisation function for the driver called during mdev probing. 
+ */ +typedef int (mdev_probe_t)(struct rte_mdev_driver *, struct rte_mdev_device *); + +/** + * Uninitialisation function for the driver called during hotplugging. + */ +typedef int (mdev_remove_t)(struct rte_mdev_device *); + +/** + * A structure describing a mdev driver. + */ +struct rte_mdev_driver { + TAILQ_ENTRY(rte_mdev_driver) next; /**< Next in list. */ + struct rte_driver driver; /**< Inherit core driver. */ + struct rte_mdev_bus *bus; /**< Mdev bus reference. */ + mdev_probe_t *probe; /**< Device probe function. */ + mdev_remove_t *remove; /**< Device remove function. */ + enum rte_mdev_device_api dev_api; /**< Device API. */ +}; + +/** + * A structure describing a mdev device. + */ +struct rte_mdev_device { + TAILQ_ENTRY(rte_mdev_device) next; /**< Next mdev device. */ + struct rte_device device; /**< Inherit core device. */ + enum rte_mdev_device_api dev_api; /**< Device API. */ + struct rte_mdev_driver *driver; /**< Associated driver. */ + rte_uuid_t addr; /**< Location. */ + char name[RTE_UUID_STRLEN]; /**< Location (ASCII). */ + void *private; /**< Driver-specific data. */ +}; + +/** + * @internal + * Helper macro for drivers that need to convert to struct rte_mdev_device. 
+ */ +#define RTE_DEV_TO_MDEV(ptr) container_of(ptr, struct rte_mdev_device, device) + +#define RTE_DEV_TO_MDEV_CONST(ptr) \ + container_of(ptr, const struct rte_mdev_device, device) + +/** List of mdev devices */ +TAILQ_HEAD(rte_mdev_device_list, rte_mdev_device); +/** List of mdev drivers */ +TAILQ_HEAD(rte_mdev_driver_list, rte_mdev_driver); + +/** + * Structure describing the mdev bus + */ +struct rte_mdev_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_mdev_device_list device_list; /**< List of mdev devices */ + struct rte_mdev_driver_list driver_list; /**< List of mdev drivers */ +}; + +/* Mdev Bus iterators */ +#define FOREACH_DEVICE_ON_MDEV_BUS(p) \ + TAILQ_FOREACH(p, &(rte_mdev_bus.device_list), next) + +#define FOREACH_DRIVER_ON_MDEV_BUS(p) \ + TAILQ_FOREACH(p, &(rte_mdev_bus.driver_list), next) + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_BUS_MDEV_H_ */ diff --git a/drivers/bus/mdev/rte_bus_mdev_version.map b/drivers/bus/mdev/rte_bus_mdev_version.map new file mode 100644 index 000000000..7f73bf96b --- /dev/null +++ b/drivers/bus/mdev/rte_bus_mdev_version.map @@ -0,0 +1,12 @@ +DPDK_19.05 { + + local: *; +}; + +EXPERIMENTAL { + global: + + rte_mdev_get_sysfs_path; + rte_mdev_register; + rte_mdev_unregister; +}; diff --git a/drivers/bus/meson.build b/drivers/bus/meson.build index 80de2d91d..f0ab19a03 100644 --- a/drivers/bus/meson.build +++ b/drivers/bus/meson.build @@ -1,7 +1,7 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2017 Intel Corporation -drivers = ['dpaa', 'fslmc', 'ifpga', 'pci', 'vdev', 'vmbus'] +drivers = ['dpaa', 'fslmc', 'ifpga', 'mdev', 'pci', 'vdev', 'vmbus'] std_deps = ['eal'] config_flag_fmt = 'RTE_LIBRTE_@0@_BUS' driver_name_fmt = 'rte_bus_@0@' diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 262132fc6..f8abe8237 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -123,6 +123,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y) _LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax endif 
+_LDLIBS-$(CONFIG_RTE_LIBRTE_MDEV_BUS) += -lrte_bus_mdev
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PCI_BUS) += -lrte_bus_pci
 _LDLIBS-$(CONFIG_RTE_LIBRTE_VDEV_BUS) += -lrte_bus_vdev
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_BUS) += -lrte_bus_dpaa

From patchwork Wed Apr 3 07:18:44 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tiwei Bie
X-Patchwork-Id: 52145
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 5A2AA5F29;
	Wed, 3 Apr 2019 09:19:36 +0200 (CEST)
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
	by dpdk.org (Postfix) with ESMTP id 6B6CD5B2A
	for ; Wed, 3 Apr 2019 09:19:28 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga007.jf.intel.com ([10.7.209.58])
	by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
	03 Apr 2019 00:19:28 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.60,303,1549958400"; d="scan'208";a="128206262"
Received: from dpdk-tbie.sh.intel.com ([10.67.104.173])
	by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 00:19:26 -0700
From: Tiwei Bie
To: dev@dpdk.org
Cc: cunming.liang@intel.com, bruce.richardson@intel.com,
	alejandro.lucero@netronome.com
Date: Wed, 3 Apr 2019 15:18:44 +0800
Message-Id: <20190403071844.21126-4-tiwei.bie@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190403071844.21126-1-tiwei.bie@intel.com>
References: <20190403071844.21126-1-tiwei.bie@intel.com>
Subject: [dpdk-dev] [RFC 3/3] bus/pci: add mdev support
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This patch adds mdev support to the PCI bus driver.
A mdev driver is introduced to probe the mdev devices whose device API is "vfio-pci" on the mdev bus. PS. There are some hacks in this patch for now. Signed-off-by: Cunming Liang Signed-off-by: Tiwei Bie --- drivers/bus/pci/Makefile | 3 + drivers/bus/pci/linux/Makefile | 4 + drivers/bus/pci/linux/pci_vfio.c | 35 ++- drivers/bus/pci/linux/pci_vfio_mdev.c | 305 ++++++++++++++++++++++++++ drivers/bus/pci/meson.build | 4 +- drivers/bus/pci/pci_common.c | 17 +- drivers/bus/pci/private.h | 9 + drivers/bus/pci/rte_bus_pci.h | 11 +- 8 files changed, 370 insertions(+), 18 deletions(-) create mode 100644 drivers/bus/pci/linux/pci_vfio_mdev.c diff --git a/drivers/bus/pci/Makefile b/drivers/bus/pci/Makefile index de53ce1bf..085ec9066 100644 --- a/drivers/bus/pci/Makefile +++ b/drivers/bus/pci/Makefile @@ -27,6 +27,9 @@ CFLAGS += -DALLOW_EXPERIMENTAL_API LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_pci -lrte_kvargs +ifeq ($(CONFIG_RTE_LIBRTE_MDEV_BUS),y) +LDLIBS += -lrte_bus_mdev +endif include $(RTE_SDK)/drivers/bus/pci/$(SYSTEM)/Makefile SRCS-$(CONFIG_RTE_LIBRTE_PCI_BUS) := $(addprefix $(SYSTEM)/,$(SRCS)) diff --git a/drivers/bus/pci/linux/Makefile b/drivers/bus/pci/linux/Makefile index 90404468b..88bbc2390 100644 --- a/drivers/bus/pci/linux/Makefile +++ b/drivers/bus/pci/linux/Makefile @@ -4,3 +4,7 @@ SRCS += pci.c SRCS += pci_uio.c SRCS += pci_vfio.c + +ifeq ($(CONFIG_RTE_LIBRTE_MDEV_BUS),y) + SRCS += pci_vfio_mdev.c +endif diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c index ebf6ccd3c..c2c4c6a50 100644 --- a/drivers/bus/pci/linux/pci_vfio.c +++ b/drivers/bus/pci/linux/pci_vfio.c @@ -13,6 +13,9 @@ #include #include +#ifdef RTE_LIBRTE_MDEV_BUS +#include +#endif #include #include #include @@ -20,6 +23,7 @@ #include #include #include +#include #include "eal_filesystem.h" @@ -648,6 +652,7 @@ pci_vfio_map_resource_primary(struct rte_pci_device *dev) { struct vfio_device_info device_info = { .argsz = 
sizeof(device_info) }; char pci_addr[PATH_MAX] = {0}; + const char *sysfs_path; int vfio_dev_fd; struct rte_pci_addr *loc = &dev->addr; int i, ret; @@ -663,10 +668,20 @@ pci_vfio_map_resource_primary(struct rte_pci_device *dev) #endif /* store PCI address string */ - snprintf(pci_addr, sizeof(pci_addr), PCI_PRI_FMT, + if (dev->use_uuid) { +#ifdef RTE_LIBRTE_MDEV_BUS + sysfs_path = rte_mdev_get_sysfs_path(); + rte_uuid_unparse(dev->uuid, pci_addr, sizeof(pci_addr)); +#else + return -1; +#endif + } else { + sysfs_path = rte_pci_get_sysfs_path(); + snprintf(pci_addr, sizeof(pci_addr), PCI_PRI_FMT, loc->domain, loc->bus, loc->devid, loc->function); + } - ret = rte_vfio_setup_device(rte_pci_get_sysfs_path(), pci_addr, + ret = rte_vfio_setup_device(sysfs_path, pci_addr, &vfio_dev_fd, &device_info); if (ret) return ret; @@ -793,6 +808,7 @@ pci_vfio_map_resource_secondary(struct rte_pci_device *dev) { struct vfio_device_info device_info = { .argsz = sizeof(device_info) }; char pci_addr[PATH_MAX] = {0}; + const char *sysfs_path; int vfio_dev_fd; struct rte_pci_addr *loc = &dev->addr; int i, ret; @@ -808,8 +824,19 @@ pci_vfio_map_resource_secondary(struct rte_pci_device *dev) #endif /* store PCI address string */ - snprintf(pci_addr, sizeof(pci_addr), PCI_PRI_FMT, + if (dev->use_uuid) { +#ifdef RTE_LIBRTE_MDEV_BUS + sysfs_path = rte_mdev_get_sysfs_path(); + rte_uuid_unparse(dev->uuid, pci_addr, sizeof(pci_addr)); +#else + return -1; +#endif + } else { + sysfs_path = rte_pci_get_sysfs_path(); + snprintf(pci_addr, sizeof(pci_addr), PCI_PRI_FMT, loc->domain, loc->bus, loc->devid, loc->function); + } + /* if we're in a secondary process, just find our tailq entry */ TAILQ_FOREACH(vfio_res, vfio_res_list, next) { @@ -825,7 +852,7 @@ pci_vfio_map_resource_secondary(struct rte_pci_device *dev) return -1; } - ret = rte_vfio_setup_device(rte_pci_get_sysfs_path(), pci_addr, + ret = rte_vfio_setup_device(sysfs_path, pci_addr, &vfio_dev_fd, &device_info); if (ret) return ret; diff --git 
a/drivers/bus/pci/linux/pci_vfio_mdev.c b/drivers/bus/pci/linux/pci_vfio_mdev.c new file mode 100644 index 000000000..92498c2fe --- /dev/null +++ b/drivers/bus/pci/linux/pci_vfio_mdev.c @@ -0,0 +1,305 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "eal_private.h" +#include "eal_filesystem.h" + +#include "private.h" + +extern struct rte_pci_bus rte_pci_bus; + +static int +get_pci_id(const char *sysfs_base, const char *dev_addr, + struct rte_pci_id *pci_id) +{ + int ret = 0; + int iommu_group_num; + int vfio_group_fd; + int vfio_dev_fd; + int container; + int class; + char name[PATH_MAX]; + struct vfio_group_status group_status = { + .argsz = sizeof(group_status) }; + + container = open("/dev/vfio/vfio", O_RDWR); + if (container < 0) { + RTE_LOG(WARNING, EAL, "Failed to open VFIO container\n"); + ret = -1; + goto out; + } + + if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION) { + /* Unknown API version */ + RTE_LOG(WARNING, EAL, "Unknown VFIO API version\n"); + ret = -1; + goto close_container; + } + + if (rte_vfio_get_group_num(sysfs_base, dev_addr, + &iommu_group_num) <= 0) { + RTE_LOG(WARNING, EAL, "%s not managed by VFIO driver\n", + dev_addr); + ret = -1; + goto close_container; + } + + snprintf(name, sizeof(name), "/dev/vfio/%d", iommu_group_num); + + vfio_group_fd = open(name, O_RDWR); + if (vfio_group_fd < 0) { + ret = -1; + goto close_container; + } + + /* if group_fd == 0, that means the device isn't managed by VFIO */ + if (vfio_group_fd == 0) { + RTE_LOG(WARNING, EAL, "%s not managed by VFIO driver\n", + dev_addr); + ret = -1; + goto close_group; + } + + if (ioctl(vfio_group_fd, VFIO_GROUP_GET_STATUS, &group_status)) { + RTE_LOG(ERR, EAL, "%s cannot get group status, error %i (%s)\n", + dev_addr, errno, strerror(errno)); + ret = -1; + goto close_group; + } 
+ + if (!(group_status.flags & VFIO_GROUP_FLAGS_VIABLE)) { + RTE_LOG(ERR, EAL, "%s VFIO group is not viable!\n", dev_addr); + ret = -1; + goto close_group; + } + + if (!(group_status.flags & VFIO_GROUP_FLAGS_CONTAINER_SET)) { + if (ioctl(vfio_group_fd, VFIO_GROUP_SET_CONTAINER, + &container)) { + RTE_LOG(ERR, EAL, "%s cannot add VFIO group to container, error %i (%s)\n", + dev_addr, errno, strerror(errno)); + ret = -1; + goto close_group; + } + } + + if (ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU)) { + RTE_LOG(ERR, EAL, "%s cannot set iommu, error %i (%s)\n", + dev_addr, errno, strerror(errno)); + ret = -1; + goto close_group; + } + + vfio_dev_fd = ioctl(vfio_group_fd, VFIO_GROUP_GET_DEVICE_FD, dev_addr); + if (vfio_dev_fd < 0) { + /* if we cannot get a device fd, this implies a problem with + * the VFIO group or the container not having IOMMU configured. + */ + RTE_LOG(ERR, EAL, "Getting a vfio_dev_fd for %s failed errno %d\n", + dev_addr, errno); + ret = -1; + goto close_group; + } + + /* vendor_id */ + if (pread64(vfio_dev_fd, &pci_id->vendor_id, sizeof(uint16_t), + VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + + PCI_VENDOR_ID) != sizeof(uint16_t)) { + RTE_LOG(ERR, EAL, "Cannot read VendorID from PCI config space\n"); + ret = -1; + goto close_device; + } + + /* device_id */ + if (pread64(vfio_dev_fd, &pci_id->device_id, sizeof(uint16_t), + VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + + PCI_DEVICE_ID) != sizeof(uint16_t)) { + RTE_LOG(ERR, EAL, "Cannot read DeviceID from PCI config space\n"); + ret = -1; + goto close_device; + } + + /* subsystem_vendor_id */ + if (pread64(vfio_dev_fd, &pci_id->subsystem_vendor_id, sizeof(uint16_t), + VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + + PCI_SUBSYSTEM_VENDOR_ID) != sizeof(uint16_t)) { + RTE_LOG(ERR, EAL, "Cannot read SubVendorID from PCI config space\n"); + ret = -1; + goto close_device; + } + + /* subsystem_device_id */ + if (pread64(vfio_dev_fd, &pci_id->subsystem_device_id, 
sizeof(uint16_t), + VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + + PCI_SUBSYSTEM_ID) != sizeof(uint16_t)) { + RTE_LOG(ERR, EAL, "Cannot read SubDeviceID from PCI config space\n"); + ret = -1; + goto close_device; + } + + /* class_id */ + if (pread64(vfio_dev_fd, &class, sizeof(uint32_t), + VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + + PCI_CLASS_REVISION) != sizeof(uint32_t)) { + RTE_LOG(ERR, EAL, "Cannot read ClassID from PCI config space\n"); + ret = -1; + goto close_device; + } + pci_id->class_id = class >> 8; + +close_device: + if (close(vfio_dev_fd) < 0) { + RTE_LOG(INFO, EAL, "Error when closing VFIO device for %s\n", + dev_addr); + ret = -1; + } + +close_group: + if (close(vfio_group_fd) < 0) { + RTE_LOG(INFO, EAL, "Error when closing VFIO group for %s\n", + dev_addr); + ret = -1; + } + +close_container: + if (close(container) < 0) { + RTE_LOG(INFO, EAL, "Error when closing VFIO container\n"); + ret = -1; + } + +out: + return ret; +} + +static int vfio_pci_probe(struct rte_mdev_driver *mdev_drv __rte_unused, + struct rte_mdev_device *mdev_dev) +{ + char name[RTE_UUID_STRLEN]; + struct rte_pci_device *dev; + struct rte_bus *bus; + int ret; + + bus = rte_bus_find_by_name("pci"); + if (bus == NULL) { + RTE_LOG(ERR, EAL, "Cannot find bus pci\n"); + return -ENOENT; + } + + if (bus->plug == NULL) { + RTE_LOG(ERR, EAL, "Function plug not supported by bus (%s)\n", + bus->name); + return -ENOTSUP; + } + + dev = malloc(sizeof(*dev)); + if (dev == NULL) + return -ENOMEM; + + memset(dev, 0, sizeof(*dev)); + dev->device.bus = &rte_pci_bus.bus; + rte_uuid_unparse(mdev_dev->addr, name, sizeof(name)); + + if (get_pci_id(rte_mdev_get_sysfs_path(), name, &dev->id)) { + free(dev); + return -1; + } + + snprintf(dev->name, sizeof(dev->name), "%s", name); + dev->device.name = dev->name; + dev->kdrv = RTE_KDRV_VFIO; + dev->use_uuid = 1; + rte_uuid_copy(dev->uuid, mdev_dev->addr); + + // TODO: dev->device.devargs, etc + + memset(&dev->addr, -1, sizeof(dev->addr)); 
// XXX: TODO + + /* device is valid, add to the list (sorted) */ + if (TAILQ_EMPTY(&rte_pci_bus.device_list)) { + rte_pci_add_device(dev); + } else { + struct rte_pci_device *dev2; + int ret; + + TAILQ_FOREACH(dev2, &rte_pci_bus.device_list, next) { + // XXX + ret = rte_pci_addr_cmp(&dev->addr, &dev2->addr); + if (ret == 0) + ret = strncmp(dev->name, dev2->name, + sizeof(dev->name)); + if (ret > 0) + continue; + if (ret < 0) { + rte_pci_insert_device(dev2, dev); + goto plug; + } + /* already registered */ + free(dev); + return 0; + } + + rte_pci_add_device(dev); + } + +plug: + ret = bus->plug(&dev->device); + if (ret != 0) { + rte_pci_remove_device(dev); + free(dev); + } else { + mdev_dev->private = dev; + } + return ret; +} + +static int vfio_pci_remove(struct rte_mdev_device *mdev_dev) +{ + struct rte_pci_device *dev = mdev_dev->private; + struct rte_bus *bus; + int ret; + + if (dev == NULL) + return 0; + + bus = rte_bus_find_by_name("pci"); + if (bus == NULL) { + RTE_LOG(ERR, EAL, "Cannot find bus pci\n"); + return -ENOENT; + } + + if (bus->unplug == NULL) { + RTE_LOG(ERR, EAL, "Function unplug not supported by bus (%s)\n", + bus->name); + return -ENOTSUP; + } + + ret = bus->unplug(&dev->device); + if (ret == 0) + mdev_dev->private = NULL; + + return ret; +} + +static struct rte_mdev_driver vfio_pci_drv = { + .dev_api = RTE_MDEV_DEV_API_VFIO_PCI, + .probe = vfio_pci_probe, + .remove = vfio_pci_remove +}; + +RTE_MDEV_REGISTER_DRIVER(mdev_vfio_pci, vfio_pci_drv); diff --git a/drivers/bus/pci/meson.build b/drivers/bus/pci/meson.build index a3140ff97..c3e884657 100644 --- a/drivers/bus/pci/meson.build +++ b/drivers/bus/pci/meson.build @@ -11,8 +11,10 @@ sources = files('pci_common.c', if host_machine.system() == 'linux' sources += files('linux/pci.c', 'linux/pci_uio.c', - 'linux/pci_vfio.c') + 'linux/pci_vfio.c', + 'linux/pci_vfio_mdev.c') includes += include_directories('linux') + deps += ['bus_mdev'] else sources += files('bsd/pci.c') includes += 
include_directories('bsd') diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c index 704b9d71a..6b47333e6 100644 --- a/drivers/bus/pci/pci_common.c +++ b/drivers/bus/pci/pci_common.c @@ -124,21 +124,17 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr, { int ret; bool already_probed; - struct rte_pci_addr *loc; if ((dr == NULL) || (dev == NULL)) return -EINVAL; - loc = &dev->addr; - /* The device is not blacklisted; Check if driver supports it */ if (!rte_pci_match(dr, dev)) /* Match of device and driver failed */ return 1; - RTE_LOG(INFO, EAL, "PCI device "PCI_PRI_FMT" on NUMA socket %i\n", - loc->domain, loc->bus, loc->devid, loc->function, - dev->device.numa_node); + RTE_LOG(INFO, EAL, "PCI device %s on NUMA socket %i\n", + dev->name, dev->device.numa_node); /* no initialization when blacklisted, return without error */ if (dev->device.devargs != NULL && @@ -208,7 +204,6 @@ rte_pci_probe_one_driver(struct rte_pci_driver *dr, static int rte_pci_detach_dev(struct rte_pci_device *dev) { - struct rte_pci_addr *loc; struct rte_pci_driver *dr; int ret = 0; @@ -216,11 +211,9 @@ rte_pci_detach_dev(struct rte_pci_device *dev) return -EINVAL; dr = dev->driver; - loc = &dev->addr; - RTE_LOG(DEBUG, EAL, "PCI device "PCI_PRI_FMT" on NUMA socket %i\n", - loc->domain, loc->bus, loc->devid, - loc->function, dev->device.numa_node); + RTE_LOG(DEBUG, EAL, "PCI device %s on NUMA socket %i\n", + dev->name, dev->device.numa_node); RTE_LOG(DEBUG, EAL, " remove driver: %x:%x %s\n", dev->id.vendor_id, dev->id.device_id, dr->driver.name); @@ -387,7 +380,7 @@ rte_pci_insert_device(struct rte_pci_device *exist_pci_dev, } /* Remove a device from PCI bus */ -static void +void rte_pci_remove_device(struct rte_pci_device *pci_dev) { TAILQ_REMOVE(&rte_pci_bus.device_list, pci_dev, next); diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h index 13c3324bb..d5815ee44 100644 --- a/drivers/bus/pci/private.h +++ b/drivers/bus/pci/private.h @@ -67,6 +67,15 @@ 
void rte_pci_add_device(struct rte_pci_device *pci_dev); void rte_pci_insert_device(struct rte_pci_device *exist_pci_dev, struct rte_pci_device *new_pci_dev); +/** + * Remove a PCI device from the PCI Bus. + * + * @param pci_dev + * PCI device to remove + * @return void + */ +void rte_pci_remove_device(struct rte_pci_device *pci_dev); + /** * Update a pci device object by asking the kernel for the latest information. * diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h index 06e004cd3..465a44935 100644 --- a/drivers/bus/pci/rte_bus_pci.h +++ b/drivers/bus/pci/rte_bus_pci.h @@ -51,6 +51,13 @@ TAILQ_HEAD(rte_pci_driver_list, rte_pci_driver); struct rte_devargs; +/* It's RTE_UUID_STRLEN, which is bigger than PCI_PRI_STR_SIZE. */ +#define RTE_PCI_NAME_LEN (36 + 1) + +// XXX: we can't include rte_uuid.h directly due to the conflicts +// introduced by stdbool.h +typedef unsigned char rte_uuid_t[16]; + /** * A structure describing a PCI device. */ @@ -58,6 +65,8 @@ struct rte_pci_device { TAILQ_ENTRY(rte_pci_device) next; /**< Next probed PCI device. */ struct rte_device device; /**< Inherit core device */ struct rte_pci_addr addr; /**< PCI location. */ + rte_uuid_t uuid; /**< Mdev location. */ + uint8_t use_uuid; /**< True if uuid field valid. */ struct rte_pci_id id; /**< PCI ID. */ struct rte_mem_resource mem_resource[PCI_MAX_RESOURCE]; /**< PCI Memory Resource */ @@ -65,7 +74,7 @@ struct rte_pci_device { struct rte_pci_driver *driver; /**< PCI driver used in probing */ uint16_t max_vfs; /**< sriov enable if not zero */ enum rte_kernel_driver kdrv; /**< Kernel driver passthrough */ - char name[PCI_PRI_STR_SIZE+1]; /**< PCI location (ASCII) */ + char name[RTE_PCI_NAME_LEN]; /**< PCI/Mdev location (ASCII) */ struct rte_intr_handle vfio_req_intr_handle; /**< Handler of VFIO request interrupt */ };