From patchwork Sun Feb 2 16:03:41 2020
X-Patchwork-Submitter: Matan Azrad
X-Patchwork-Id: 65463
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Sun, 2 Feb 2020 16:03:41 +0000
Message-Id: <1580659433-25581-2-git-send-email-matan@mellanox.com>
In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com>
References: <1580292549-27439-1-git-send-email-matan@mellanox.com>
 <1580659433-25581-1-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v3 01/13] drivers: introduce mlx5 vDPA driver

Add a new driver to support vDPA operations on Mellanox devices. The
first Mellanox devices to support vDPA operations are the ConnectX-6 Dx
and BlueField-1 HCAs, on both their PF and VF ports.

Like the mlx5 PMD, this driver depends on rdma-core, and it uses mlx5
DevX to create HW objects directly through the FW. Hence, the
common/mlx5 library is linked to the mlx5_vdpa driver.

Due to the above dependencies, this driver is not compiled by default.

Register a new log type for this driver.
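To illustrate the application side (a minimal sketch, not part of this
patch; the helper name and the trimmed error handling are illustrative):
once the device has been probed with the ``class=vdpa`` devarg described
in the guide below, a vhost application can look up the vDPA device by
its PCI address and attach it to a vhost-user socket:

    #include <rte_pci.h>
    #include <rte_vdpa.h>
    #include <rte_vhost.h>

    /* Hypothetical helper: bind the mlx5 vDPA device at pci_addr to a
     * vhost-user socket so the vhost library drives it via its vDPA ops. */
    static int
    attach_mlx5_vdpa(const char *pci_addr, const char *socket_path)
    {
            struct rte_vdpa_dev_addr addr = { .type = PCI_ADDR };
            int did;

            if (rte_pci_addr_parse(pci_addr, &addr.pci_addr) != 0)
                    return -1;
            did = rte_vdpa_find_device_id(&addr);
            if (did < 0)
                    return -1; /* Device was not probed with class=vdpa. */
            if (rte_vhost_driver_register(socket_path, RTE_VHOST_USER_CLIENT) != 0)
                    return -1;
            return rte_vhost_driver_attach_vdpa_device(socket_path, did);
    }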
Signed-off-by: Matan Azrad
Acked-by: Viacheslav Ovsiienko
Reviewed-by: Maxime Coquelin
---
 MAINTAINERS | 7 +
 config/common_base | 5 +
 doc/guides/rel_notes/release_20_02.rst | 5 +
 doc/guides/vdpadevs/features/mlx5.ini | 14 ++
 doc/guides/vdpadevs/index.rst | 1 +
 doc/guides/vdpadevs/mlx5.rst | 111 +++++++++++
 drivers/common/Makefile | 2 +-
 drivers/common/mlx5/Makefile | 17 +-
 drivers/meson.build | 8 +-
 drivers/vdpa/Makefile | 2 +
 drivers/vdpa/meson.build | 3 +-
 drivers/vdpa/mlx5/Makefile | 51 ++++++
 drivers/vdpa/mlx5/meson.build | 33 ++++
 drivers/vdpa/mlx5/mlx5_vdpa.c | 234 ++++++++++++++++++++++++
 drivers/vdpa/mlx5/mlx5_vdpa_utils.h | 20 ++
 drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map | 3 +
 mk/rte.app.mk | 15 +-
 17 files changed, 514 insertions(+), 17 deletions(-)
 create mode 100644 doc/guides/vdpadevs/features/mlx5.ini
 create mode 100644 doc/guides/vdpadevs/mlx5.rst
 create mode 100644 drivers/vdpa/mlx5/Makefile
 create mode 100644 drivers/vdpa/mlx5/meson.build
 create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa.c
 create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_utils.h
 create mode 100644 drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 150d507..f697e9a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1103,6 +1103,13 @@ F: drivers/vdpa/ifc/
 F: doc/guides/vdpadevs/ifc.rst
 F: doc/guides/vdpadevs/features/ifcvf.ini

+Mellanox mlx5 vDPA
+M: Matan Azrad
+M: Viacheslav Ovsiienko
+F: drivers/vdpa/mlx5/
+F: doc/guides/vdpadevs/mlx5.rst
+F: doc/guides/vdpadevs/features/mlx5.ini
+
 Eventdev Drivers
 ----------------

diff --git a/config/common_base b/config/common_base
index c897dd0..6ea9c63 100644
--- a/config/common_base
+++ b/config/common_base
@@ -366,6 +366,11 @@ CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
 CONFIG_RTE_LIBRTE_MLX5_PMD=n
 CONFIG_RTE_LIBRTE_MLX5_DEBUG=n

+#
+# Compile vdpa-oriented Mellanox ConnectX-6 & Bluefield (MLX5) PMD
+#
+CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=n
+
 # Linking method for mlx4/5 dependency on ibverbs and related libraries
 # Default linking is dynamic by linker.
 # Other options are: dynamic by dlopen at run-time, or statically embedded.
diff --git a/doc/guides/rel_notes/release_20_02.rst b/doc/guides/rel_notes/release_20_02.rst
index 50e2c14..690e7db 100644
--- a/doc/guides/rel_notes/release_20_02.rst
+++ b/doc/guides/rel_notes/release_20_02.rst
@@ -113,6 +113,11 @@ New Features
   * Added support for RSS using L3/L4 source/destination only.
   * Added support for matching on GTP tunnel header item.

+* **Added a new vDPA PMD based on Mellanox devices.**
+
+  Added a new Mellanox vDPA (``mlx5_vdpa``) PMD.
+  See the :doc:`../vdpadevs/mlx5` guide for more details on this driver.
+
 * **Updated testpmd application.**

   Added support for ESP and L2TPv3 over IP rte_flow patterns to the testpmd
diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini
new file mode 100644
index 0000000..d635bdf
--- /dev/null
+++ b/doc/guides/vdpadevs/features/mlx5.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'mlx5' VDPA driver.
+;
+; Refer to default.ini for the full list of available driver features.
+;
+[Features]
+Other kdrv = Y
+ARMv8 = Y
+Power8 = Y
+x86-32 = Y
+x86-64 = Y
+Usage doc = Y
+Design doc = Y
+
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
index 9657108..1a13efe 100644
--- a/doc/guides/vdpadevs/index.rst
+++ b/doc/guides/vdpadevs/index.rst
@@ -13,3 +13,4 @@ which can be used from an application through vhost API.
 features_overview
 ifc
+mlx5
diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
new file mode 100644
index 0000000..ce7c8a7
--- /dev/null
+++ b/doc/guides/vdpadevs/mlx5.rst
@@ -0,0 +1,111 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright 2019 Mellanox Technologies, Ltd
+
+MLX5 vDPA driver
+================
+
+The MLX5 vDPA (vhost data path acceleration) driver library
+(**librte_pmd_mlx5_vdpa**) provides support for **Mellanox ConnectX-6**,
+**Mellanox ConnectX-6 Dx** and **Mellanox BlueField** families of
+10/25/40/50/100/200 Gb/s adapters as well as their virtual functions (VF) in
+SR-IOV context.
+
+.. note::
+
+   Due to external dependencies, this driver is disabled in the default
+   configuration of the "make" build. It can be enabled with
+   ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=y`` or by using the "meson" build
+   system, which detects the dependencies automatically.
+
+
+Design
+------
+
+For security reasons and robustness, this driver only deals with virtual
+memory addresses. The way resource allocations are handled by the kernel,
+combined with hardware specifications that allow handling virtual memory
+addresses directly, ensures that DPDK applications cannot access random
+physical memory (or memory that does not belong to the current process).
+
+The PMD can use libibverbs and libmlx5 to access the device firmware
+or the hardware components directly.
+There are different levels of objects and bypassing abilities
+to get the best performance:
+
+- Verbs is a complete high-level generic API
+- Direct Verbs is a device-specific API
+- DevX allows access to firmware objects
+- Direct Rules manages flow steering at the low-level hardware layer
+
+Enabling librte_pmd_mlx5_vdpa causes DPDK applications to be linked against
+libibverbs.
+
+A Mellanox mlx5 PCI device can be probed by either the net/mlx5 driver or
+the vdpa/mlx5 driver, but not by both in parallel. Hence, the user should
+select the driver with the ``class`` parameter in the device argument list.
+By default, the mlx5 device is probed by the net/mlx5 driver.
+
+Supported NICs
+--------------
+
+* Mellanox(R) ConnectX(R)-6 200G MCX654106A-HCAT (4x200G)
+* Mellanox(R) ConnectX(R)-6 Dx EN 100G MCX623106AN-CDAT (2x100G)
+* Mellanox(R) ConnectX(R)-6 Dx EN 200G MCX623105AN-VDAT (1x200G)
+* Mellanox(R) BlueField SmartNIC 25G MBF1M332A-ASCAT (2x25G)
+
+Prerequisites
+-------------
+
+- Mellanox OFED version: **4.7**
+  See the :doc:`../../nics/mlx5` guide for more Mellanox OFED details.
+
+Compilation options
+~~~~~~~~~~~~~~~~~~~
+
+These options can be modified in the ``.config`` file.
+
+- ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` (default **n**)
+
+  Toggle compilation of librte_pmd_mlx5_vdpa itself.
+
+- ``CONFIG_RTE_IBVERBS_LINK_DLOPEN`` (default **n**)
+
+  Build PMD with additional code to make it loadable without hard
+  dependencies on **libibverbs** or **libmlx5**, which may not be installed
+  on the target system.
+
+  In this mode, their presence is still required for it to run properly,
+  however their absence won't prevent a DPDK application from starting (with
+  ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
+  missing with ``ldd(1)``.
+
+  It works by moving these dependencies to a purpose-built rdma-core "glue"
+  plug-in which must either be installed in a directory whose name is based
+  on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if set, or in a
+  standard location for the dynamic linker (e.g. ``/lib``) if left to the
+  default empty string (``""``).
+
+  This option has no performance impact.
+
+- ``CONFIG_RTE_IBVERBS_LINK_STATIC`` (default **n**)
+
+  Embed the static flavor of the dependencies **libibverbs** and **libmlx5**
+  in the PMD shared library or the executable static binary.
+
+.. note::
+
+   For BlueField, the target should be set to ``arm64-bluefield-linux-gcc``.
+   This enables ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` and sets
+   ``RTE_CACHE_LINE_SIZE`` to 64. The default armv8a configuration of both
+   the make and meson builds sets it to 128, which degrades performance.
+
+Run-time configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+- **ethtool** operations on related kernel interfaces also affect the PMD.
+
+- ``class`` parameter [string]
+
+  Select the class of the driver that should probe the device.
+  Set to ``vdpa`` for the mlx5 vDPA driver.
+
diff --git a/drivers/common/Makefile b/drivers/common/Makefile
index 4775d4b..96bd7ac 100644
--- a/drivers/common/Makefile
+++ b/drivers/common/Makefile
@@ -35,7 +35,7 @@ ifneq (,$(findstring y,$(IAVF-y)))
 DIRS-y += iavf
 endif

-ifeq ($(CONFIG_RTE_LIBRTE_MLX5_PMD),y)
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
 DIRS-y += mlx5
 endif

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index 624d331..f32933d 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -10,15 +10,16 @@ LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
 LIB_GLUE_VERSION = 20.02.0

 # Sources.
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
 ifneq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue.c
+SRCS-y += mlx5_glue.c
 endif
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_common.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
-
+SRCS-y += mlx5_devx_cmds.c
+SRCS-y += mlx5_common.c
+SRCS-y += mlx5_nl.c
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
+INSTALL-y-lib += $(LIB_GLUE)
+endif
 endif

 # Basic CFLAGS.
@@ -317,7 +318,9 @@ mlx5_autoconf.h: mlx5_autoconf.h.new
	cmp '$<' '$@' $(AUTOCONF_OUTPUT) || \
	mv '$<' '$@'

-$(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
+$(SRCS-y:.c=.o): mlx5_autoconf.h
+endif

 # Generate dependency plug-in for rdma-core when the PMD must not be linked
 # directly, so that applications do not inherit this dependency.
diff --git a/drivers/meson.build b/drivers/meson.build
index 29708cc..bd154fa 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -42,6 +42,7 @@ foreach class:dpdk_driver_classes
		build = true # set to false to disable, e.g.
missing deps reason = '' # set if build == false to explain name = drv + fmt_name = '' allow_experimental_apis = false sources = [] objs = [] @@ -98,8 +99,11 @@ foreach class:dpdk_driver_classes else class_drivers += name - dpdk_conf.set(config_flag_fmt.format(name.to_upper()),1) - lib_name = driver_name_fmt.format(name) + if fmt_name == '' + fmt_name = name + endif + dpdk_conf.set(config_flag_fmt.format(fmt_name.to_upper()),1) + lib_name = driver_name_fmt.format(fmt_name) if allow_experimental_apis cflags += '-DALLOW_EXPERIMENTAL_API' diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile index b5a7a11..6e88359 100644 --- a/drivers/vdpa/Makefile +++ b/drivers/vdpa/Makefile @@ -7,4 +7,6 @@ ifeq ($(CONFIG_RTE_EAL_VFIO),y) DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc endif +DIRS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5 + include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build index 2f047b5..e3ed54a 100644 --- a/drivers/vdpa/meson.build +++ b/drivers/vdpa/meson.build @@ -1,7 +1,8 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright 2019 Mellanox Technologies, Ltd -drivers = ['ifc'] +drivers = ['ifc', + 'mlx5',] std_deps = ['bus_pci', 'kvargs'] std_deps += ['vhost'] config_flag_fmt = 'RTE_LIBRTE_@0@_PMD' diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile new file mode 100644 index 0000000..1ab5296 --- /dev/null +++ b/drivers/vdpa/mlx5/Makefile @@ -0,0 +1,51 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2019 Mellanox Technologies, Ltd + +include $(RTE_SDK)/mk/rte.vars.mk + +# Library name. +LIB = librte_pmd_mlx5_vdpa.a + +# Sources. +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c + +# Basic CFLAGS. +CFLAGS += -O3 +CFLAGS += -std=c11 -Wall -Wextra +CFLAGS += -g +CFLAGS += -I$(RTE_SDK)/drivers/common/mlx5 +CFLAGS += -I$(RTE_SDK)/drivers/net/mlx5_vdpa +CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5 +CFLAGS += -D_BSD_SOURCE +CFLAGS += -D_DEFAULT_SOURCE +CFLAGS += -D_XOPEN_SOURCE=600 +CFLAGS += $(WERROR_FLAGS) +CFLAGS += -Wno-strict-prototypes +LDLIBS += -lrte_common_mlx5 +LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci + +# A few warnings cannot be avoided in external headers. +CFLAGS += -Wno-error=cast-qual + +EXPORT_MAP := rte_pmd_mlx5_vdpa_version.map +# memseg walk is not part of stable API +CFLAGS += -DALLOW_EXPERIMENTAL_API + +# DEBUG which is usually provided on the command-line may enable +# CONFIG_RTE_LIBRTE_MLX5_DEBUG. +ifeq ($(DEBUG),1) +CONFIG_RTE_LIBRTE_MLX5_DEBUG := y +endif + +# User-defined CFLAGS. 
+ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DEBUG),y)
+CFLAGS += -pedantic
+ifneq ($(CONFIG_RTE_TOOLCHAIN_ICC),y)
+CFLAGS += -DPEDANTIC
+endif
+AUTO_CONFIG_CFLAGS += -Wno-pedantic
+else
+CFLAGS += -UPEDANTIC
+endif
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build
new file mode 100644
index 0000000..6d3ab98
--- /dev/null
+++ b/drivers/vdpa/mlx5/meson.build
@@ -0,0 +1,33 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2019 Mellanox Technologies, Ltd
+
+if not is_linux
+	build = false
+	reason = 'only supported on Linux'
+	subdir_done()
+endif
+
+fmt_name = 'mlx5_vdpa'
+allow_experimental_apis = true
+deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal']
+sources = files(
+	'mlx5_vdpa.c',
+)
+cflags_options = [
+	'-std=c11',
+	'-Wno-strict-prototypes',
+	'-D_BSD_SOURCE',
+	'-D_DEFAULT_SOURCE',
+	'-D_XOPEN_SOURCE=600'
+]
+foreach option:cflags_options
+	if cc.has_argument(option)
+		cflags += option
+	endif
+endforeach
+
+if get_option('buildtype').contains('debug')
+	cflags += [ '-pedantic', '-DPEDANTIC' ]
+else
+	cflags += [ '-UPEDANTIC' ]
+endif
\ No newline at end of file
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
new file mode 100644
index 0000000..80204b3
--- /dev/null
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -0,0 +1,234 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+#include
+#include
+#include
+#include
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-Wpedantic"
+#endif
+#include
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-Wpedantic"
+#endif
+
+#include
+#include
+
+#include "mlx5_vdpa_utils.h"
+
+
+struct mlx5_vdpa_priv {
+	TAILQ_ENTRY(mlx5_vdpa_priv) next;
+	int id; /* vDPA device id. */
+	struct ibv_context *ctx; /* Device context. */
+	struct rte_vdpa_dev_addr dev_addr;
+};
+
+TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
+					  TAILQ_HEAD_INITIALIZER(priv_list);
+static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
+int mlx5_vdpa_logtype;
+
+static struct rte_vdpa_dev_ops mlx5_vdpa_ops = {
+	.get_queue_num = NULL,
+	.get_features = NULL,
+	.get_protocol_features = NULL,
+	.dev_conf = NULL,
+	.dev_close = NULL,
+	.set_vring_state = NULL,
+	.set_features = NULL,
+	.migration_done = NULL,
+	.get_vfio_group_fd = NULL,
+	.get_vfio_device_fd = NULL,
+	.get_notify_area = NULL,
+};
+
+/**
+ * DPDK callback to register a PCI device.
+ *
+ * This function spawns a vdpa device out of a given PCI device.
+ *
+ * @param[in] pci_drv
+ *   PCI driver structure (mlx5_vdpa_driver).
+ * @param[in] pci_dev
+ *   PCI device information.
+ *
+ * @return
+ *   0 on success, 1 to skip this driver, a negative errno value otherwise
+ *   and rte_errno is set.
+ */
+static int
+mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+		    struct rte_pci_device *pci_dev __rte_unused)
+{
+	struct ibv_device **ibv_list;
+	struct ibv_device *ibv_match = NULL;
+	struct mlx5_vdpa_priv *priv = NULL;
+	struct ibv_context *ctx = NULL;
+	int ret;
+
+	if (mlx5_class_get(pci_dev->device.devargs) != MLX5_CLASS_VDPA) {
+		DRV_LOG(DEBUG, "Skip probing - should be probed by other mlx5"
+			" driver.");
+		return 1;
+	}
+	errno = 0;
+	ibv_list = mlx5_glue->get_device_list(&ret);
+	if (!ibv_list) {
+		rte_errno = ENOSYS;
+		DRV_LOG(ERR, "Failed to get device list, is ib_uverbs loaded?");
+		return -rte_errno;
+	}
+	while (ret-- > 0) {
+		struct rte_pci_addr pci_addr;
+
+		DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[ret]->name);
+		if (mlx5_dev_to_pci_addr(ibv_list[ret]->ibdev_path, &pci_addr))
+			continue;
+		if (pci_dev->addr.domain != pci_addr.domain ||
+		    pci_dev->addr.bus != pci_addr.bus ||
+		    pci_dev->addr.devid != pci_addr.devid ||
+		    pci_dev->addr.function != pci_addr.function)
+			continue;
+		DRV_LOG(INFO, "PCI information matches for device \"%s\".",
+			ibv_list[ret]->name);
+		ibv_match = ibv_list[ret];
+		break;
+	}
+	mlx5_glue->free_device_list(ibv_list);
+	if (!ibv_match) {
+		DRV_LOG(ERR, "No matching IB device for PCI slot "
+			"%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 ".",
+			pci_dev->addr.domain, pci_dev->addr.bus,
+			pci_dev->addr.devid, pci_dev->addr.function);
+		rte_errno = ENOENT;
+		return -rte_errno;
+	}
+	ctx = mlx5_glue->dv_open_device(ibv_match);
+	if (!ctx) {
+		DRV_LOG(ERR, "Failed to open IB device \"%s\".",
+			ibv_match->name);
+		rte_errno = ENODEV;
+		return -rte_errno;
+	}
+	priv = rte_zmalloc("mlx5 vDPA device private", sizeof(*priv),
+			   RTE_CACHE_LINE_SIZE);
+	if (!priv) {
+		DRV_LOG(ERR, "Failed to allocate private memory.");
+		rte_errno = ENOMEM;
+		goto error;
+	}
+	priv->ctx = ctx;
+	priv->dev_addr.pci_addr = pci_dev->addr;
+	priv->dev_addr.type = PCI_ADDR;
+	priv->id = rte_vdpa_register_device(&priv->dev_addr, &mlx5_vdpa_ops);
+	if (priv->id < 0) {
+		DRV_LOG(ERR, "Failed to register vDPA device.");
+		rte_errno = rte_errno ? rte_errno : EINVAL;
+		goto error;
+	}
+	pthread_mutex_lock(&priv_list_lock);
+	TAILQ_INSERT_TAIL(&priv_list, priv, next);
+	pthread_mutex_unlock(&priv_list_lock);
+	return 0;
+
+error:
+	if (priv)
+		rte_free(priv);
+	if (ctx)
+		mlx5_glue->close_device(ctx);
+	return -rte_errno;
+}
+
+/**
+ * DPDK callback to remove a PCI device.
+ *
+ * This function removes all vDPA devices belonging to a given PCI device.
+ *
+ * @param[in] pci_dev
+ *   Pointer to the PCI device.
+ *
+ * @return
+ *   0 on success, the function cannot fail.
+ */ +static int +mlx5_vdpa_pci_remove(struct rte_pci_device *pci_dev) +{ + struct mlx5_vdpa_priv *priv = NULL; + int found = 0; + + pthread_mutex_lock(&priv_list_lock); + TAILQ_FOREACH(priv, &priv_list, next) { + if (memcmp(&priv->dev_addr.pci_addr, &pci_dev->addr, + sizeof(pci_dev->addr)) == 0) { + found = 1; + break; + } + } + if (found) { + TAILQ_REMOVE(&priv_list, priv, next); + mlx5_glue->close_device(priv->ctx); + rte_free(priv); + } + pthread_mutex_unlock(&priv_list_lock); + return 0; +} + +static const struct rte_pci_id mlx5_vdpa_pci_id_map[] = { + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX5BF) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX5BFVF) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6VF) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6DX) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6DXVF) + }, + { + .vendor_id = 0 + } +}; + +static struct rte_pci_driver mlx5_vdpa_driver = { + .driver = { + .name = "mlx5_vdpa", + }, + .id_table = mlx5_vdpa_pci_id_map, + .probe = mlx5_vdpa_pci_probe, + .remove = mlx5_vdpa_pci_remove, + .drv_flags = 0, +}; + +/** + * Driver initialization routine. + */ +RTE_INIT(rte_mlx5_vdpa_init) +{ + /* Initialize common log type. */ + mlx5_vdpa_logtype = rte_log_register("pmd.vdpa.mlx5"); + if (mlx5_vdpa_logtype >= 0) + rte_log_set_level(mlx5_vdpa_logtype, RTE_LOG_NOTICE); + if (mlx5_glue) + rte_pci_register(&mlx5_vdpa_driver); +} + +RTE_PMD_EXPORT_NAME(net_mlx5_vdpa, __COUNTER__); +RTE_PMD_REGISTER_PCI_TABLE(net_mlx5_vdpa, mlx5_vdpa_pci_id_map); +RTE_PMD_REGISTER_KMOD_DEP(net_mlx5_vdpa, "* ib_uverbs & mlx5_core & mlx5_ib"); diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_utils.h b/drivers/vdpa/mlx5/mlx5_vdpa_utils.h new file mode 100644 index 0000000..a239df9 --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_utils.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ + +#ifndef RTE_PMD_MLX5_VDPA_UTILS_H_ +#define RTE_PMD_MLX5_VDPA_UTILS_H_ + +#include + + +extern int mlx5_vdpa_logtype; + +#define MLX5_VDPA_LOG_PREFIX "mlx5_vdpa" +/* Generic printf()-like logging macro with automatic line feed. */ +#define DRV_LOG(level, ...) 
\
+	PMD_DRV_LOG_(level, mlx5_vdpa_logtype, MLX5_VDPA_LOG_PREFIX, \
+		__VA_ARGS__ PMD_DRV_LOG_STRIP PMD_DRV_LOG_OPAREN, \
+		PMD_DRV_LOG_CPAREN)
+
+#endif /* RTE_PMD_MLX5_VDPA_UTILS_H_ */
diff --git a/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map b/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map
new file mode 100644
index 0000000..143836e
--- /dev/null
+++ b/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map
@@ -0,0 +1,3 @@
+DPDK_20.02 {
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 45f4cad..b33cd8a 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -196,18 +196,21 @@ endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += -lrte_pmd_lio
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -lrte_pmd_mlx4
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -lrte_common_mlx5
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
+_LDLIBS-y += -lrte_common_mlx5
+endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -lrte_pmd_mlx5
+_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += -lrte_pmd_mlx5_vdpa
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -ldl
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -ldl
+_LDLIBS-y += -ldl
 else ifeq ($(CONFIG_RTE_IBVERBS_LINK_STATIC),y)
 LIBS_IBVERBS_STATIC = $(shell $(RTE_SDK)/buildtools/options-ibverbs-static.sh)
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += $(LIBS_IBVERBS_STATIC)
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += $(LIBS_IBVERBS_STATIC)
+_LDLIBS-y += $(LIBS_IBVERBS_STATIC)
 else
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
+_LDLIBS-y += -libverbs -lmlx5
+endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -libverbs -lmlx4
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -libverbs -lmlx5
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MVPP2_PMD) += -lrte_pmd_mvpp2
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += -lrte_pmd_mvneta

From patchwork Sun Feb 2 16:03:42 2020
X-Patchwork-Submitter: Matan Azrad
X-Patchwork-Id: 65464
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Sun, 2 Feb 2020 16:03:42 +0000
Message-Id: <1580659433-25581-3-git-send-email-matan@mellanox.com>
In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com>
References: <1580292549-27439-1-git-send-email-matan@mellanox.com>
 <1580659433-25581-1-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v3 02/13] vdpa/mlx5: support queues number operation
Support the get_queue_num operation, which reports the maximum number of
queues supported by the device. This number comes from the DevX
capabilities.

Signed-off-by: Matan Azrad
Acked-by: Viacheslav Ovsiienko
Reviewed-by: Maxime Coquelin
---
 drivers/vdpa/mlx5/mlx5_vdpa.c | 54 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index 80204b3..5246fd2 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -15,6 +15,7 @@
 #include
 #include
+#include

 #include "mlx5_vdpa_utils.h"

@@ -24,6 +25,7 @@ struct mlx5_vdpa_priv {
	int id; /* vDPA device id. */
	struct ibv_context *ctx; /* Device context. */
	struct rte_vdpa_dev_addr dev_addr;
+	struct mlx5_hca_vdpa_attr caps;
 };

 TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
@@ -31,8 +33,43 @@ struct mlx5_vdpa_priv {
 static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
 int mlx5_vdpa_logtype;

+static struct mlx5_vdpa_priv *
+mlx5_vdpa_find_priv_resource_by_did(int did)
+{
+	struct mlx5_vdpa_priv *priv;
+	int found = 0;
+
+	pthread_mutex_lock(&priv_list_lock);
+	TAILQ_FOREACH(priv, &priv_list, next) {
+		if (did == priv->id) {
+			found = 1;
+			break;
+		}
+	}
+	pthread_mutex_unlock(&priv_list_lock);
+	if (!found) {
+		DRV_LOG(ERR, "Invalid device id: %d.", did);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	return priv;
+}
+
+static int
+mlx5_vdpa_get_queue_num(int did, uint32_t *queue_num)
+{
+	struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did);
+
+	if (priv == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d.", did);
+		return -1;
+	}
+	*queue_num = priv->caps.max_num_virtio_queues;
+	return 0;
+}
+
 static struct rte_vdpa_dev_ops mlx5_vdpa_ops = {
-	.get_queue_num = NULL,
+	.get_queue_num = mlx5_vdpa_get_queue_num,
	.get_features = NULL,
	.get_protocol_features = NULL,
	.dev_conf = NULL,
@@ -67,6 +104,7 @@ struct mlx5_vdpa_priv {
	struct ibv_device *ibv_match = NULL;
	struct mlx5_vdpa_priv *priv = NULL;
	struct ibv_context *ctx = NULL;
+	struct mlx5_hca_attr attr;
	int ret;

	if (mlx5_class_get(pci_dev->device.devargs) != MLX5_CLASS_VDPA) {
@@ -120,6 +158,20 @@ struct mlx5_vdpa_priv {
		rte_errno = ENOMEM;
		goto error;
	}
+	ret = mlx5_devx_cmd_query_hca_attr(ctx, &attr);
+	if (ret) {
+		DRV_LOG(ERR, "Unable to read HCA capabilities.");
+		rte_errno = ENOTSUP;
+		goto error;
+	} else {
+		if (!attr.vdpa.valid || !attr.vdpa.max_num_virtio_queues) {
+			DRV_LOG(ERR, "Not enough capabilities to support vdpa,"
+				" maybe old FW/OFED version?");
+			rte_errno = ENOTSUP;
+			goto error;
+		}
+		priv->caps = attr.vdpa;
+	}
	priv->ctx = ctx;
	priv->dev_addr.pci_addr = pci_dev->addr;
	priv->dev_addr.type = PCI_ADDR;

From patchwork Sun Feb 2 16:03:43 2020
X-Patchwork-Submitter: Matan Azrad
X-Patchwork-Id: 65465
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Sun, 2 Feb 2020 16:03:43 +0000
Message-Id: <1580659433-25581-4-git-send-email-matan@mellanox.com>
In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com>
References: <1580292549-27439-1-git-send-email-matan@mellanox.com>
 <1580659433-25581-1-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v3 03/13] vdpa/mlx5: support features get operations
Add support for the get_features and get_protocol_features operations.
Some of the features are reported by the DevX capabilities.

Signed-off-by: Matan Azrad
Acked-by: Viacheslav Ovsiienko
Reviewed-by: Maxime Coquelin
---
 doc/guides/vdpadevs/features/mlx5.ini | 7 ++++
 drivers/vdpa/mlx5/mlx5_vdpa.c | 66 +++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini
index d635bdf..fea491d 100644
--- a/doc/guides/vdpadevs/features/mlx5.ini
+++ b/doc/guides/vdpadevs/features/mlx5.ini
@@ -4,6 +4,13 @@
 ; Refer to default.ini for the full list of available driver features.
 ;
 [Features]
+
+any layout = Y
+guest announce = Y
+mq = Y
+proto mq = Y
+proto log shmfd = Y
+proto host notifier = Y
 Other kdrv = Y
 ARMv8 = Y
 Power8 = Y
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index 5246fd2..00d3a19 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -1,6 +1,8 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright 2019 Mellanox Technologies, Ltd
  */
+#include
+
 #include
 #include
 #include
@@ -16,6 +18,7 @@
 #include
 #include
 #include
+#include

 #include "mlx5_vdpa_utils.h"

@@ -28,6 +31,27 @@ struct mlx5_vdpa_priv {
	struct mlx5_hca_vdpa_attr caps;
 };

+#ifndef VIRTIO_F_ORDER_PLATFORM
+#define VIRTIO_F_ORDER_PLATFORM 36
+#endif
+
+#ifndef VIRTIO_F_RING_PACKED
+#define VIRTIO_F_RING_PACKED 34
+#endif
+
+#define MLX5_VDPA_DEFAULT_FEATURES ((1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
+			    (1ULL << VIRTIO_F_ANY_LAYOUT) | \
+			    (1ULL << VIRTIO_NET_F_MQ) | \
+			    (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
+			    (1ULL << VIRTIO_F_ORDER_PLATFORM))
+
+#define MLX5_VDPA_PROTOCOL_FEATURES \
+			    ((1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \
+			     (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
+			     (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
+			     (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) | \
+			     (1ULL << VHOST_USER_PROTOCOL_F_MQ))
+
 TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
					  TAILQ_HEAD_INITIALIZER(priv_list);
 static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
@@ -68,10 +92,48 @@ struct mlx5_vdpa_priv {
	return 0;
 }

+static int
+mlx5_vdpa_get_vdpa_features(int did, uint64_t *features)
+{
+	struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did);
+
+	if (priv == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d.", did);
+		return -1;
+	}
+	*features = MLX5_VDPA_DEFAULT_FEATURES;
+	if (priv->caps.virtio_queue_type & (1 << MLX5_VIRTQ_TYPE_PACKED))
+		*features |= (1ULL << VIRTIO_F_RING_PACKED);
+	if (priv->caps.tso_ipv4)
+		*features |= (1ULL << VIRTIO_NET_F_HOST_TSO4);
+	if (priv->caps.tso_ipv6)
+		*features |= (1ULL << VIRTIO_NET_F_HOST_TSO6);
+	if (priv->caps.tx_csum)
+		*features |= (1ULL << VIRTIO_NET_F_CSUM);
+	if (priv->caps.rx_csum)
+		*features |= (1ULL << VIRTIO_NET_F_GUEST_CSUM);
+	if (priv->caps.virtio_version_1_0)
+		*features |= (1ULL << VIRTIO_F_VERSION_1);
+	return 0;
+}
+
+static int
+mlx5_vdpa_get_protocol_features(int did, uint64_t *features)
+{
+	struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did);
+
+	if (priv == NULL) {
+		DRV_LOG(ERR, "Invalid device id: %d.", did);
+		return -1;
+	}
+	*features = MLX5_VDPA_PROTOCOL_FEATURES;
+	return 0;
+}
+
 static struct rte_vdpa_dev_ops mlx5_vdpa_ops = {
	.get_queue_num = mlx5_vdpa_get_queue_num,
-	.get_features = NULL,
-	.get_protocol_features = NULL,
+	.get_features = mlx5_vdpa_get_vdpa_features,
+	.get_protocol_features = mlx5_vdpa_get_protocol_features,
	.dev_conf = NULL,
	.dev_close = NULL,
	.set_vring_state = NULL,

From patchwork Sun Feb 2 16:03:44 2020
X-Patchwork-Submitter: Matan Azrad
X-Patchwork-Id: 65466
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Sun, 2 Feb 2020 16:03:44 +0000
Message-Id: <1580659433-25581-5-git-send-email-matan@mellanox.com>
In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com>
References: <1580292549-27439-1-git-send-email-matan@mellanox.com>
 <1580659433-25581-1-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v3 04/13] vdpa/mlx5: prepare memory regions

Memory regions are created in order to map the guest physical addresses
used on the virtio device guest side to the host physical addresses used
by the HW on the host side. This way, for example, the HW can translate
the addresses of the packets posted by the guest and take the packets
from the correct place.

The design is to work with a single MR which is configured to the virtio
queues in the HW, hence many direct MRs are grouped into a single
indirect MR.

Create functions to prepare and release MRs with all the related
resources that are required for them. Create a new file,
mlx5_vdpa_mem.c, to manage all the MR related code in the driver.

Signed-off-by: Matan Azrad
Acked-by: Viacheslav Ovsiienko
Acked-by: Maxime Coquelin
---
 drivers/vdpa/mlx5/Makefile | 4 +-
 drivers/vdpa/mlx5/meson.build | 5 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c | 17 +-
 drivers/vdpa/mlx5/mlx5_vdpa.h | 66 ++++++++
 drivers/vdpa/mlx5/mlx5_vdpa_mem.c | 346 ++++++++++++++++++++++++++++++++++++
 5 files changed, 420 insertions(+), 18 deletions(-)
 create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa.h
 create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_mem.c

diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile
index 1ab5296..bceab1e 100644
--- a/drivers/vdpa/mlx5/Makefile
+++ b/drivers/vdpa/mlx5/Makefile
@@ -8,6 +8,7 @@ LIB = librte_pmd_mlx5_vdpa.a

 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c

 # Basic CFLAGS.
 CFLAGS += -O3
@@ -15,6 +16,7 @@ CFLAGS += -std=c11 -Wall -Wextra
 CFLAGS += -g
 CFLAGS += -I$(RTE_SDK)/drivers/common/mlx5
 CFLAGS += -I$(RTE_SDK)/drivers/net/mlx5_vdpa
+CFLAGS += -I$(RTE_SDK)/lib/librte_sched
 CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5
 CFLAGS += -D_BSD_SOURCE
 CFLAGS += -D_DEFAULT_SOURCE
@@ -22,7 +24,7 @@ CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -Wno-strict-prototypes
 LDLIBS += -lrte_common_mlx5
-LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci
+LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci -lrte_sched

 # A few warnings cannot be avoided in external headers.
CFLAGS += -Wno-error=cast-qual diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 6d3ab98..47f9537 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -9,9 +9,10 @@ endif fmt_name = 'mlx5_vdpa' allow_experimental_apis = true -deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal'] +deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal', 'sched'] sources = files( 'mlx5_vdpa.c', + 'mlx5_vdpa_mem.c', ) cflags_options = [ '-std=c11', @@ -30,4 +31,4 @@ if get_option('buildtype').contains('debug') cflags += [ '-pedantic', '-DPEDANTIC' ] else cflags += [ '-UPEDANTIC' ] -endif \ No newline at end of file +endif diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 00d3a19..16107cf 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -7,13 +7,6 @@ #include #include #include -#ifdef PEDANTIC -#pragma GCC diagnostic ignored "-Wpedantic" -#endif -#include -#ifdef PEDANTIC -#pragma GCC diagnostic error "-Wpedantic" -#endif #include #include @@ -21,16 +14,9 @@ #include #include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" -struct mlx5_vdpa_priv { - TAILQ_ENTRY(mlx5_vdpa_priv) next; - int id; /* vDPA device id. */ - struct ibv_context *ctx; /* Device context. */ - struct rte_vdpa_dev_addr dev_addr; - struct mlx5_hca_vdpa_attr caps; -}; - #ifndef VIRTIO_F_ORDER_PLATFORM #define VIRTIO_F_ORDER_PLATFORM 36 #endif @@ -243,6 +229,7 @@ struct mlx5_vdpa_priv { rte_errno = rte_errno ? rte_errno : EINVAL; goto error; } + SLIST_INIT(&priv->mr_list); pthread_mutex_lock(&priv_list_lock); TAILQ_INSERT_TAIL(&priv_list, priv, next); pthread_mutex_unlock(&priv_list_lock); diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h new file mode 100644 index 0000000..f367991 --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -0,0 +1,66 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ + +#ifndef RTE_PMD_MLX5_VDPA_H_ +#define RTE_PMD_MLX5_VDPA_H_ + +#include + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +#include +#include +#ifdef PEDANTIC +#pragma GCC diagnostic error "-Wpedantic" +#endif + +#include +#include + +struct mlx5_vdpa_query_mr { + SLIST_ENTRY(mlx5_vdpa_query_mr) next; + void *addr; + uint64_t length; + struct mlx5dv_devx_umem *umem; + struct mlx5_devx_obj *mkey; + int is_indirect; +}; + +struct mlx5_vdpa_priv { + TAILQ_ENTRY(mlx5_vdpa_priv) next; + int id; /* vDPA device id. */ + int vid; /* vhost device id. */ + struct ibv_context *ctx; /* Device context. */ + struct rte_vdpa_dev_addr dev_addr; + struct mlx5_hca_vdpa_attr caps; + uint32_t pdn; /* Protection Domain number. */ + struct ibv_pd *pd; + uint32_t gpa_mkey_index; + struct ibv_mr *null_mr; + struct rte_vhost_memory *vmem; + SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; +}; + +/** + * Release all the prepared memory regions and all their related resources. + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_mem_dereg(struct mlx5_vdpa_priv *priv); + +/** + * Register all the memory regions of the virtio device to the HW and allocate + * all their related resources. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */
+int mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv);
+
+#endif /* RTE_PMD_MLX5_VDPA_H_ */
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_mem.c b/drivers/vdpa/mlx5/mlx5_vdpa_mem.c
new file mode 100644
index 0000000..398ca35
--- /dev/null
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_mem.c
@@ -0,0 +1,346 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+#include
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include "mlx5_vdpa_utils.h"
+#include "mlx5_vdpa.h"
+
+static int
+mlx5_vdpa_pd_prepare(struct mlx5_vdpa_priv *priv)
+{
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	if (priv->pd)
+		return 0;
+	priv->pd = mlx5_glue->alloc_pd(priv->ctx);
+	if (priv->pd == NULL) {
+		DRV_LOG(ERR, "Failed to allocate PD.");
+		return errno ? -errno : -ENOMEM;
+	}
+	struct mlx5dv_obj obj;
+	struct mlx5dv_pd pd_info;
+	int ret = 0;
+
+	obj.pd.in = priv->pd;
+	obj.pd.out = &pd_info;
+	ret = mlx5_glue->dv_init_obj(&obj, MLX5DV_OBJ_PD);
+	if (ret) {
+		DRV_LOG(ERR, "Failed to get PD object info.");
+		mlx5_glue->dealloc_pd(priv->pd);
+		priv->pd = NULL;
+		return -errno;
+	}
+	priv->pdn = pd_info.pdn;
+	return 0;
+#else
+	(void)priv;
+	DRV_LOG(ERR, "Cannot get pdn - no DV support.");
+	return -ENOTSUP;
+#endif /* HAVE_IBV_FLOW_DV_SUPPORT */
+}
+
+void
+mlx5_vdpa_mem_dereg(struct mlx5_vdpa_priv *priv)
+{
+	struct mlx5_vdpa_query_mr *entry;
+	struct mlx5_vdpa_query_mr *next;
+
+	entry = SLIST_FIRST(&priv->mr_list);
+	while (entry) {
+		next = SLIST_NEXT(entry, next);
+		claim_zero(mlx5_devx_cmd_destroy(entry->mkey));
+		if (!entry->is_indirect)
+			claim_zero(mlx5_glue->devx_umem_dereg(entry->umem));
+		SLIST_REMOVE(&priv->mr_list, entry, mlx5_vdpa_query_mr, next);
+		rte_free(entry);
+		entry = next;
+	}
+	SLIST_INIT(&priv->mr_list);
+	if (priv->null_mr) {
+		claim_zero(mlx5_glue->dereg_mr(priv->null_mr));
+		priv->null_mr = NULL;
+	}
+	if (priv->pd) {
+		claim_zero(mlx5_glue->dealloc_pd(priv->pd));
+		priv->pd = NULL;
+	}
+	if (priv->vmem) {
+		free(priv->vmem);
+		priv->vmem = NULL;
+	}
+}
+
+static int
+mlx5_vdpa_regions_addr_cmp(const void *a, const void *b)
+{
+	const struct rte_vhost_mem_region *region_a = a;
+	const struct rte_vhost_mem_region *region_b = b;
+
+	if (region_a->guest_phys_addr < region_b->guest_phys_addr)
+		return -1;
+	if (region_a->guest_phys_addr > region_b->guest_phys_addr)
+		return 1;
+	return 0;
+}
+
+#define KLM_NUM_MAX_ALIGN(sz) (RTE_ALIGN_CEIL(sz, MLX5_MAX_KLM_BYTE_COUNT) / \
+			       MLX5_MAX_KLM_BYTE_COUNT)
+
+/*
+ * Allocate and sort the region list and choose the indirect mkey mode:
+ * 1. Calculate the GCD, the guest memory size and the number of indirect
+ *    mkey entries per mode.
+ * 2. Align the GCD to the maximum allowed size (2G) and to be a power of 2.
+ * 3. Decide the indirect mkey mode according to the next rules:
+ *    a. If both the KLM_FBS entries number and the KLM entries number are
+ *       bigger than the maximum allowed (MLX5_DEVX_MAX_KLM_ENTRIES) - error.
+ *    b. KLM mode if the KLM_FBS entries number is bigger than the maximum
+ *       allowed (MLX5_DEVX_MAX_KLM_ENTRIES).
+ *    c. KLM mode if the GCD is smaller than the minimum allowed (4K).
+ *    d. KLM mode if the total size of the KLM entries is in one cache line
+ *       and the total size of the KLM_FBS entries is not in one cache line.
+ *    e. Otherwise, KLM_FBS mode.
+ */
+static struct rte_vhost_memory *
+mlx5_vdpa_vhost_mem_regions_prepare(int vid, uint8_t *mode, uint64_t *mem_size,
+				    uint64_t *gcd, uint32_t *entries_num)
+{
+	struct rte_vhost_memory *mem;
+	uint64_t size;
+	uint64_t klm_entries_num = 0;
+	uint64_t klm_fbs_entries_num;
+	uint32_t i;
+	int ret = rte_vhost_get_mem_table(vid, &mem);
+
+	if (ret < 0) {
+		DRV_LOG(ERR, "Failed to get VM memory layout vid =%d.", vid);
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	qsort(mem->regions, mem->nregions, sizeof(mem->regions[0]),
+	      mlx5_vdpa_regions_addr_cmp);
+	*mem_size = (mem->regions[(mem->nregions - 1)].guest_phys_addr) +
+		    (mem->regions[(mem->nregions - 1)].size) -
+		    (mem->regions[0].guest_phys_addr);
+	*gcd = 0;
+	for (i = 0; i < mem->nregions; ++i) {
+		DRV_LOG(INFO, "Region %u: HVA 0x%" PRIx64 ", GPA 0x%" PRIx64
+			", size 0x%" PRIx64 ".", i,
+			mem->regions[i].host_user_addr,
+			mem->regions[i].guest_phys_addr, mem->regions[i].size);
+		if (i > 0) {
+			/* Handle a hole between regions. */
+			size = mem->regions[i].guest_phys_addr -
+			       (mem->regions[i - 1].guest_phys_addr +
+				mem->regions[i - 1].size);
+			*gcd = rte_get_gcd(*gcd, size);
+			klm_entries_num += KLM_NUM_MAX_ALIGN(size);
+		}
+		size = mem->regions[i].size;
+		*gcd = rte_get_gcd(*gcd, size);
+		klm_entries_num += KLM_NUM_MAX_ALIGN(size);
+	}
+	if (*gcd > MLX5_MAX_KLM_BYTE_COUNT)
+		*gcd = rte_get_gcd(*gcd, MLX5_MAX_KLM_BYTE_COUNT);
+	if (!RTE_IS_POWER_OF_2(*gcd)) {
+		uint64_t candidate_gcd = rte_align64prevpow2(*gcd);
+
+		while (candidate_gcd > 1 && (*gcd % candidate_gcd))
+			candidate_gcd /= 2;
+		DRV_LOG(DEBUG, "GCD 0x%" PRIx64 " is not power of 2. Adjusted "
+			"GCD is 0x%" PRIx64 ".", *gcd, candidate_gcd);
+		*gcd = candidate_gcd;
+	}
+	klm_fbs_entries_num = *mem_size / *gcd;
+	if (*gcd < MLX5_MIN_KLM_FIXED_BUFFER_SIZE || klm_fbs_entries_num >
+	    MLX5_DEVX_MAX_KLM_ENTRIES ||
+	    ((klm_entries_num * sizeof(struct mlx5_klm)) <=
+	    RTE_CACHE_LINE_SIZE && (klm_fbs_entries_num *
+	    sizeof(struct mlx5_klm)) >
+	    RTE_CACHE_LINE_SIZE)) {
+		*mode = MLX5_MKC_ACCESS_MODE_KLM;
+		*entries_num = klm_entries_num;
+		DRV_LOG(INFO, "Indirect mkey mode is KLM.");
+	} else {
+		*mode = MLX5_MKC_ACCESS_MODE_KLM_FBS;
+		*entries_num = klm_fbs_entries_num;
+		DRV_LOG(INFO, "Indirect mkey mode is KLM Fixed Buffer Size.");
+	}
+	DRV_LOG(DEBUG, "Memory registration information: nregions = %u, "
+		"mem_size = 0x%" PRIx64 ", GCD = 0x%" PRIx64
+		", klm_fbs_entries_num = 0x%" PRIx64 ", klm_entries_num = 0x%"
+		PRIx64 ".", mem->nregions, *mem_size, *gcd, klm_fbs_entries_num,
+		klm_entries_num);
+	if (*entries_num > MLX5_DEVX_MAX_KLM_ENTRIES) {
+		DRV_LOG(ERR, "Failed to prepare memory of vid %d - memory is "
+			"too fragmented.", vid);
+		free(mem);
+		return NULL;
+	}
+	return mem;
+}
+
+#define KLM_SIZE_MAX_ALIGN(sz) ((sz) > MLX5_MAX_KLM_BYTE_COUNT ? \
+				MLX5_MAX_KLM_BYTE_COUNT : (sz))
+
+/*
+ * The target here is to group all the physical memory regions of the
+ * virtio device in one indirect mkey.
+ * For KLM Fixed Buffer Size mode (the HW finds the translation entry in one
+ * read according to the guest physical address):
+ *   All the sub-direct mkeys of it must be in the same size, hence, each
+ *   one of them should be in the GCD size of all the virtio memory
+ *   regions and the holes between them.
+ * For KLM mode (each entry may be in a different size so the HW must
+ * iterate the entries):
+ *   Each virtio memory region and each hole between them have one entry;
+ *   the maximum allowed size (2G) just needs to be covered by splitting
+ *   entries whose associated memory regions are bigger than 2G.
+ * It means that each virtio memory region may be mapped to more than + * one direct mkey in the 2 modes. + * All the holes of invalid memory between the virtio memory regions + * will be mapped to the null memory region for security. + */ +int +mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_devx_mkey_attr mkey_attr; + struct mlx5_vdpa_query_mr *entry = NULL; + struct rte_vhost_mem_region *reg = NULL; + uint8_t mode; + uint32_t entries_num = 0; + uint32_t i; + uint64_t gcd; + uint64_t klm_size; + uint64_t mem_size; + uint64_t k; + int klm_index = 0; + int ret; + struct rte_vhost_memory *mem = mlx5_vdpa_vhost_mem_regions_prepare + (priv->vid, &mode, &mem_size, &gcd, &entries_num); + struct mlx5_klm klm_array[entries_num]; + + if (!mem) + return -rte_errno; + priv->vmem = mem; + ret = mlx5_vdpa_pd_prepare(priv); + if (ret) + goto error; + priv->null_mr = mlx5_glue->alloc_null_mr(priv->pd); + if (!priv->null_mr) { + DRV_LOG(ERR, "Failed to allocate null MR."); + ret = -errno; + goto error; + } + DRV_LOG(DEBUG, "Dump fill Mkey = %u.", priv->null_mr->lkey); + for (i = 0; i < mem->nregions; i++) { + reg = &mem->regions[i]; + entry = rte_zmalloc(__func__, sizeof(*entry), 0); + if (!entry) { + ret = -ENOMEM; + DRV_LOG(ERR, "Failed to allocate mem entry memory."); + goto error; + } + entry->umem = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)reg->host_user_addr, + reg->size, IBV_ACCESS_LOCAL_WRITE); + if (!entry->umem) { + DRV_LOG(ERR, "Failed to register Umem by Devx."); + ret = -errno; + goto error; + } + mkey_attr.addr = (uintptr_t)(reg->guest_phys_addr); + mkey_attr.size = reg->size; + mkey_attr.umem_id = entry->umem->umem_id; + mkey_attr.pd = priv->pdn; + mkey_attr.pg_access = 1; + mkey_attr.klm_array = NULL; + mkey_attr.klm_num = 0; + entry->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + if (!entry->mkey) { + DRV_LOG(ERR, "Failed to create direct Mkey."); + ret = -rte_errno; + goto error; + } + entry->addr = (void *)(uintptr_t)(reg->host_user_addr); + entry->length = reg->size; + entry->is_indirect = 0; + if (i > 0) { + uint64_t sadd; + uint64_t empty_region_sz = reg->guest_phys_addr - + (mem->regions[i - 1].guest_phys_addr + + mem->regions[i - 1].size); + + if (empty_region_sz > 0) { + sadd = mem->regions[i - 1].guest_phys_addr + + mem->regions[i - 1].size; + klm_size = mode == MLX5_MKC_ACCESS_MODE_KLM ? + KLM_SIZE_MAX_ALIGN(empty_region_sz) : gcd; + for (k = 0; k < empty_region_sz; + k += klm_size) { + klm_array[klm_index].byte_count = + k + klm_size > empty_region_sz ? + empty_region_sz - k : klm_size; + klm_array[klm_index].mkey = + priv->null_mr->lkey; + klm_array[klm_index].address = sadd + k; + klm_index++; + } + } + } + klm_size = mode == MLX5_MKC_ACCESS_MODE_KLM ? + KLM_SIZE_MAX_ALIGN(reg->size) : gcd; + for (k = 0; k < reg->size; k += klm_size) { + klm_array[klm_index].byte_count = k + klm_size > + reg->size ? reg->size - k : klm_size; + klm_array[klm_index].mkey = entry->mkey->id; + klm_array[klm_index].address = reg->guest_phys_addr + k; + klm_index++; + } + SLIST_INSERT_HEAD(&priv->mr_list, entry, next); + } + mkey_attr.addr = (uintptr_t)(mem->regions[0].guest_phys_addr); + mkey_attr.size = mem_size; + mkey_attr.pd = priv->pdn; + mkey_attr.umem_id = 0; + /* Must be zero for KLM mode. */ + mkey_attr.log_entity_size = mode == MLX5_MKC_ACCESS_MODE_KLM_FBS ? 
+					 rte_log2_u64(gcd) : 0;
+	mkey_attr.pg_access = 0;
+	mkey_attr.klm_array = klm_array;
+	mkey_attr.klm_num = klm_index;
+	entry = rte_zmalloc(__func__, sizeof(*entry), 0);
+	if (!entry) {
+		DRV_LOG(ERR, "Failed to allocate memory for indirect entry.");
+		ret = -ENOMEM;
+		goto error;
+	}
+	entry->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr);
+	if (!entry->mkey) {
+		DRV_LOG(ERR, "Failed to create indirect Mkey.");
+		ret = -rte_errno;
+		goto error;
+	}
+	entry->is_indirect = 1;
+	SLIST_INSERT_HEAD(&priv->mr_list, entry, next);
+	priv->gpa_mkey_index = entry->mkey->id;
+	return 0;
+error:
+	if (entry) {
+		if (entry->mkey)
+			mlx5_devx_cmd_destroy(entry->mkey);
+		if (entry->umem)
+			mlx5_glue->devx_umem_dereg(entry->umem);
+		rte_free(entry);
+	}
+	mlx5_vdpa_mem_dereg(priv);
+	rte_errno = -ret;
+	return ret;
+}

From patchwork Sun Feb 2 16:03:45 2020
X-Patchwork-Submitter: Matan Azrad
X-Patchwork-Id: 65467
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Sun, 2 Feb 2020 16:03:45 +0000
Message-Id: <1580659433-25581-6-git-send-email-matan@mellanox.com>
In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com>
References: <1580292549-27439-1-git-send-email-matan@mellanox.com>
 <1580659433-25581-1-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v3 05/13] vdpa/mlx5: prepare HW queues

In preparation for the virtio queue creation, 2 QPs and a CQ may be
created per virtio queue.

The design is to trigger an event for the guest and for the vdpa driver
when the HW posts a new CQE after the packet transition.

This patch adds the basic operations to create and destroy the above HW
objects and to trigger the CQE events when a new CQE is posted.
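The guest-side half of that event flow can be pictured with a small
sketch (an illustration only, not code from this patch; the helper name
is hypothetical): once the DevX event channel reports a completion on a
virtqueue CQ, the driver kicks the guest through the callfd eventfd that
vhost provided for that queue:

    #include <stdint.h>
    #include <unistd.h>

    /* Hypothetical relay: forward one HW completion event to the guest. */
    static void
    mlx5_vdpa_kick_guest(int callfd)
    {
            uint64_t kick = 1;

            /* An eventfd write of a non-zero counter wakes the guest
             * driver; the CQ itself is re-armed separately so further
             * CQEs keep generating events. */
            if (write(callfd, &kick, sizeof(kick)) != sizeof(kick))
                    DRV_LOG(WARNING, "Failed to notify guest.");
    }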
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko --- drivers/common/mlx5/mlx5_prm.h | 4 + drivers/vdpa/mlx5/Makefile | 1 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.h | 89 ++++++++ drivers/vdpa/mlx5/mlx5_vdpa_event.c | 400 ++++++++++++++++++++++++++++++++++++ 5 files changed, 495 insertions(+) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_event.c diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index 15940c4..855b37a 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -391,6 +391,10 @@ struct mlx5_cqe { /* CQE format value. */ #define MLX5_COMPRESSED 0x3 +/* CQ doorbell cmd types. */ +#define MLX5_CQ_DBR_CMD_SOL_ONLY (1 << 24) +#define MLX5_CQ_DBR_CMD_ALL (0 << 24) + /* Action type of header modification. */ enum { MLX5_MODIFICATION_TYPE_SET = 0x1, diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index bceab1e..086af1b 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -9,6 +9,7 @@ LIB = librte_pmd_mlx5_vdpa.a # Sources. SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c # Basic CFLAGS. CFLAGS += -O3 diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 47f9537..3da0d76 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -13,6 +13,7 @@ deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal', 'sched'] sources = files( 'mlx5_vdpa.c', 'mlx5_vdpa_mem.c', + 'mlx5_vdpa_event.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index f367991..6282635 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -15,9 +15,40 @@ #ifdef PEDANTIC #pragma GCC diagnostic error "-Wpedantic" #endif +#include +#include #include #include +#include + + +#define MLX5_VDPA_INTR_RETRIES 256 +#define MLX5_VDPA_INTR_RETRIES_USEC 1000 + +struct mlx5_vdpa_cq { + uint16_t log_desc_n; + uint32_t cq_ci:24; + uint32_t arm_sn:2; + rte_spinlock_t sl; + struct mlx5_devx_obj *cq; + struct mlx5dv_devx_umem *umem_obj; + union { + volatile void *umem_buf; + volatile struct mlx5_cqe *cqes; + }; + volatile uint32_t *db_rec; + uint64_t errors; +}; + +struct mlx5_vdpa_event_qp { + struct mlx5_vdpa_cq cq; + struct mlx5_devx_obj *fw_qp; + struct mlx5_devx_obj *sw_qp; + struct mlx5dv_devx_umem *umem_obj; + void *umem_buf; + volatile uint32_t *db_rec; +}; struct mlx5_vdpa_query_mr { SLIST_ENTRY(mlx5_vdpa_query_mr) next; @@ -40,6 +71,10 @@ struct mlx5_vdpa_priv { uint32_t gpa_mkey_index; struct ibv_mr *null_mr; struct rte_vhost_memory *vmem; + uint32_t eqn; + struct mlx5dv_devx_event_channel *eventc; + struct mlx5dv_devx_uar *uar; + struct rte_intr_handle intr_handle; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; @@ -63,4 +98,58 @@ struct mlx5_vdpa_priv { */ int mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv); + +/** + * Create an event QP and all its related resources. + * + * @param[in] priv + * The vdpa driver private structure. + * @param[in] desc_n + * Number of descriptors. + * @param[in] callfd + * The guest notification file descriptor. + * @param[in/out] eqp + * Pointer to the event QP structure. + * + * @return + * 0 on success, -1 otherwise and rte_errno is set. 
+ */ +int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, + int callfd, struct mlx5_vdpa_event_qp *eqp); + +/** + * Destroy an event QP and all its related resources. + * + * @param[in/out] eqp + * Pointer to the event QP structure. + */ +void mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp); + +/** + * Release all the event global resources. + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_event_qp_global_release(struct mlx5_vdpa_priv *priv); + +/** + * Setup CQE event. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +int mlx5_vdpa_cqe_event_setup(struct mlx5_vdpa_priv *priv); + +/** + * Unset CQE event . + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c new file mode 100644 index 0000000..c50e58e --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c @@ -0,0 +1,400 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + + +void +mlx5_vdpa_event_qp_global_release(struct mlx5_vdpa_priv *priv) +{ + if (priv->uar) { + mlx5_glue->devx_free_uar(priv->uar); + priv->uar = NULL; + } + if (priv->eventc) { + mlx5_glue->devx_destroy_event_channel(priv->eventc); + priv->eventc = NULL; + } + priv->eqn = 0; +} + +/* Prepare all the global resources for all the event objects.*/ +static int +mlx5_vdpa_event_qp_global_prepare(struct mlx5_vdpa_priv *priv) +{ + uint32_t lcore; + + if (priv->eventc) + return 0; + lcore = (uint32_t)rte_lcore_to_cpu_id(-1); + if (mlx5_glue->devx_query_eqn(priv->ctx, lcore, &priv->eqn)) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to query EQ number %d.", rte_errno); + return -1; + } + priv->eventc = mlx5_glue->devx_create_event_channel(priv->ctx, + MLX5DV_DEVX_CREATE_EVENT_CHANNEL_FLAGS_OMIT_EV_DATA); + if (!priv->eventc) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to create event channel %d.", + rte_errno); + goto error; + } + priv->uar = mlx5_glue->devx_alloc_uar(priv->ctx, 0); + if (!priv->uar) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to allocate UAR."); + goto error; + } + return 0; +error: + mlx5_vdpa_event_qp_global_release(priv); + return -1; +} + +static void +mlx5_vdpa_cq_destroy(struct mlx5_vdpa_cq *cq) +{ + if (cq->cq) + claim_zero(mlx5_devx_cmd_destroy(cq->cq)); + if (cq->umem_obj) + claim_zero(mlx5_glue->devx_umem_dereg(cq->umem_obj)); + if (cq->umem_buf) + rte_free((void *)(uintptr_t)cq->umem_buf); + memset(cq, 0, sizeof(*cq)); +} + +static inline void +mlx5_vdpa_cq_arm(struct mlx5_vdpa_priv *priv, struct mlx5_vdpa_cq *cq) +{ + const unsigned int cqe_mask = (1 << cq->log_desc_n) - 1; + uint32_t arm_sn = cq->arm_sn << MLX5_CQ_SQN_OFFSET; + uint32_t cq_ci = cq->cq_ci & MLX5_CI_MASK & cqe_mask; + uint32_t doorbell_hi = arm_sn | MLX5_CQ_DBR_CMD_ALL | cq_ci; + uint64_t doorbell = ((uint64_t)doorbell_hi << 32) | cq->cq->id; + uint64_t db_be = rte_cpu_to_be_64(doorbell); + uint32_t *addr = RTE_PTR_ADD(priv->uar->base_addr, MLX5_CQ_DOORBELL); + + rte_io_wmb(); + cq->db_rec[MLX5_CQ_ARM_DB] = rte_cpu_to_be_32(doorbell_hi); + rte_wmb(); +#ifdef RTE_ARCH_64 + *(uint64_t *)addr = db_be; +#else + 
*(uint32_t *)addr = db_be; + rte_io_wmb(); + *((uint32_t *)addr + 1) = db_be >> 32; +#endif + cq->arm_sn++; +} + +static int +mlx5_vdpa_cq_create(struct mlx5_vdpa_priv *priv, uint16_t log_desc_n, + int callfd, struct mlx5_vdpa_cq *cq) +{ + struct mlx5_devx_cq_attr attr; + size_t pgsize = sysconf(_SC_PAGESIZE); + uint32_t umem_size; + int ret; + uint16_t event_nums[1] = {0}; + + cq->log_desc_n = log_desc_n; + umem_size = sizeof(struct mlx5_cqe) * (1 << log_desc_n) + + sizeof(*cq->db_rec) * 2; + cq->umem_buf = rte_zmalloc(__func__, umem_size, 4096); + if (!cq->umem_buf) { + DRV_LOG(ERR, "Failed to allocate memory for CQ."); + rte_errno = ENOMEM; + return -ENOMEM; + } + cq->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)cq->umem_buf, + umem_size, + IBV_ACCESS_LOCAL_WRITE); + if (!cq->umem_obj) { + DRV_LOG(ERR, "Failed to register umem for CQ."); + goto error; + } + attr.q_umem_valid = 1; + attr.db_umem_valid = 1; + attr.use_first_only = 0; + attr.overrun_ignore = 0; + attr.uar_page_id = priv->uar->page_id; + attr.q_umem_id = cq->umem_obj->umem_id; + attr.q_umem_offset = 0; + attr.db_umem_id = cq->umem_obj->umem_id; + attr.db_umem_offset = sizeof(struct mlx5_cqe) * (1 << log_desc_n); + attr.eqn = priv->eqn; + attr.log_cq_size = log_desc_n; + attr.log_page_size = rte_log2_u32(pgsize); + cq->cq = mlx5_devx_cmd_create_cq(priv->ctx, &attr); + if (!cq->cq) + goto error; + cq->db_rec = RTE_PTR_ADD(cq->umem_buf, (uintptr_t)attr.db_umem_offset); + cq->cq_ci = 0; + rte_spinlock_init(&cq->sl); + /* Subscribe CQ event to the event channel controlled by the driver. */ + ret = mlx5_glue->devx_subscribe_devx_event(priv->eventc, cq->cq->obj, + sizeof(event_nums), + event_nums, + (uint64_t)(uintptr_t)cq); + if (ret) { + DRV_LOG(ERR, "Failed to subscribe CQE event."); + rte_errno = errno; + goto error; + } + /* Subscribe CQ event to the guest FD only if it is not in poll mode. */ + if (callfd != -1) { + ret = mlx5_glue->devx_subscribe_devx_event_fd(priv->eventc, + callfd, + cq->cq->obj, 0); + if (ret) { + DRV_LOG(ERR, "Failed to subscribe CQE event fd."); + rte_errno = errno; + goto error; + } + } + /* First arming. */ + mlx5_vdpa_cq_arm(priv, cq); + return 0; +error: + mlx5_vdpa_cq_destroy(cq); + return -1; +} + +static inline void __rte_unused +mlx5_vdpa_cq_poll(struct mlx5_vdpa_priv *priv __rte_unused, + struct mlx5_vdpa_cq *cq) +{ + struct mlx5_vdpa_event_qp *eqp = + container_of(cq, struct mlx5_vdpa_event_qp, cq); + const unsigned int cqe_size = 1 << cq->log_desc_n; + const unsigned int cqe_mask = cqe_size - 1; + int ret; + + do { + volatile struct mlx5_cqe *cqe = cq->cqes + (cq->cq_ci & + cqe_mask); + + ret = check_cqe(cqe, cqe_size, cq->cq_ci); + switch (ret) { + case MLX5_CQE_STATUS_ERR: + cq->errors++; + /*fall-through*/ + case MLX5_CQE_STATUS_SW_OWN: + cq->cq_ci++; + break; + case MLX5_CQE_STATUS_HW_OWN: + default: + break; + } + } while (ret != MLX5_CQE_STATUS_HW_OWN); + rte_io_wmb(); + /* Ring CQ doorbell record. */ + cq->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci); + rte_io_wmb(); + /* Ring SW QP doorbell record. 
*/ + eqp->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cqe_size); +} + +static void +mlx5_vdpa_interrupt_handler(void *cb_arg) +{ +#ifndef HAVE_IBV_DEVX_EVENT + (void)cb_arg; + return; +#else + struct mlx5_vdpa_priv *priv = cb_arg; + union { + struct mlx5dv_devx_async_event_hdr event_resp; + uint8_t buf[sizeof(struct mlx5dv_devx_async_event_hdr) + 128]; + } out; + + while (mlx5_glue->devx_get_event(priv->eventc, &out.event_resp, + sizeof(out.buf)) >= + (ssize_t)sizeof(out.event_resp.cookie)) { + struct mlx5_vdpa_cq *cq = (struct mlx5_vdpa_cq *) + (uintptr_t)out.event_resp.cookie; + rte_spinlock_lock(&cq->sl); + mlx5_vdpa_cq_poll(priv, cq); + mlx5_vdpa_cq_arm(priv, cq); + rte_spinlock_unlock(&cq->sl); + DRV_LOG(DEBUG, "CQ %d event: new cq_ci = %u.", cq->cq->id, + cq->cq_ci); + } +#endif /* HAVE_IBV_DEVX_ASYNC */ +} + +int +mlx5_vdpa_cqe_event_setup(struct mlx5_vdpa_priv *priv) +{ + int flags = fcntl(priv->eventc->fd, F_GETFL); + int ret = fcntl(priv->eventc->fd, F_SETFL, flags | O_NONBLOCK); + if (ret) { + DRV_LOG(ERR, "Failed to change event channel FD."); + rte_errno = errno; + return -rte_errno; + } + priv->intr_handle.fd = priv->eventc->fd; + priv->intr_handle.type = RTE_INTR_HANDLE_EXT; + if (rte_intr_callback_register(&priv->intr_handle, + mlx5_vdpa_interrupt_handler, priv)) { + priv->intr_handle.fd = 0; + DRV_LOG(ERR, "Failed to register CQE interrupt %d.", rte_errno); + return -rte_errno; + } + return 0; +} + +void +mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv) +{ + int retries = MLX5_VDPA_INTR_RETRIES; + int ret = -EAGAIN; + + if (priv->intr_handle.fd) { + while (retries-- && ret == -EAGAIN) { + ret = rte_intr_callback_unregister(&priv->intr_handle, + mlx5_vdpa_interrupt_handler, + priv); + if (ret == -EAGAIN) { + DRV_LOG(DEBUG, "Try again to unregister fd %d " + "of CQ interrupt, retries = %d.", + priv->intr_handle.fd, retries); + usleep(MLX5_VDPA_INTR_RETRIES_USEC); + } + } + memset(&priv->intr_handle, 0, sizeof(priv->intr_handle)); + } +} + +void +mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp) +{ + if (eqp->sw_qp) + claim_zero(mlx5_devx_cmd_destroy(eqp->sw_qp)); + if (eqp->umem_obj) + claim_zero(mlx5_glue->devx_umem_dereg(eqp->umem_obj)); + if (eqp->umem_buf) + rte_free(eqp->umem_buf); + if (eqp->fw_qp) + claim_zero(mlx5_devx_cmd_destroy(eqp->fw_qp)); + mlx5_vdpa_cq_destroy(&eqp->cq); + memset(eqp, 0, sizeof(*eqp)); +} + +static int +mlx5_vdpa_qps2rts(struct mlx5_vdpa_event_qp *eqp) +{ + if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RST2INIT_QP, + eqp->sw_qp->id)) { + DRV_LOG(ERR, "Failed to modify FW QP to INIT state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RST2INIT_QP, + eqp->fw_qp->id)) { + DRV_LOG(ERR, "Failed to modify SW QP to INIT state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_INIT2RTR_QP, + eqp->sw_qp->id)) { + DRV_LOG(ERR, "Failed to modify FW QP to RTR state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_INIT2RTR_QP, + eqp->fw_qp->id)) { + DRV_LOG(ERR, "Failed to modify SW QP to RTR state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RTR2RTS_QP, + eqp->sw_qp->id)) { + DRV_LOG(ERR, "Failed to modify FW QP to RTS state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RTR2RTS_QP, + eqp->fw_qp->id)) { + DRV_LOG(ERR, "Failed to modify SW QP to RTS state(%u).", + rte_errno); + return 
-1; + } + return 0; +} + +int +mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, + int callfd, struct mlx5_vdpa_event_qp *eqp) +{ + struct mlx5_devx_qp_attr attr = {0}; + uint16_t log_desc_n = rte_log2_u32(desc_n); + uint32_t umem_size = (1 << log_desc_n) * MLX5_WSEG_SIZE + + sizeof(*eqp->db_rec) * 2; + + if (mlx5_vdpa_event_qp_global_prepare(priv)) + return -1; + if (mlx5_vdpa_cq_create(priv, log_desc_n, callfd, &eqp->cq)) + return -1; + attr.pd = priv->pdn; + eqp->fw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); + if (!eqp->fw_qp) { + DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno); + goto error; + } + eqp->umem_buf = rte_zmalloc(__func__, umem_size, 4096); + if (!eqp->umem_buf) { + DRV_LOG(ERR, "Failed to allocate memory for SW QP."); + rte_errno = ENOMEM; + goto error; + } + eqp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)eqp->umem_buf, + umem_size, + IBV_ACCESS_LOCAL_WRITE); + if (!eqp->umem_obj) { + DRV_LOG(ERR, "Failed to register umem for SW QP."); + goto error; + } + attr.uar_index = priv->uar->page_id; + attr.cqn = eqp->cq.cq->id; + attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); + attr.rq_size = 1 << log_desc_n; + attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE); + attr.sq_size = 0; /* No need SQ. */ + attr.dbr_umem_valid = 1; + attr.wq_umem_id = eqp->umem_obj->umem_id; + attr.wq_umem_offset = 0; + attr.dbr_umem_id = eqp->umem_obj->umem_id; + attr.dbr_address = (1 << log_desc_n) * MLX5_WSEG_SIZE; + eqp->sw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); + if (!eqp->sw_qp) { + DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno); + goto error; + } + eqp->db_rec = RTE_PTR_ADD(eqp->umem_buf, (uintptr_t)attr.dbr_address); + if (mlx5_vdpa_qps2rts(eqp)) + goto error; + /* First ringing. 
*/ + rte_write32(rte_cpu_to_be_32(1 << log_desc_n), &eqp->db_rec[0]); + return 0; +error: + mlx5_vdpa_event_qp_destroy(eqp); + return -1; +} From patchwork Sun Feb 2 16:03:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65468 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3D50AA04FA; Sun, 2 Feb 2020 17:05:08 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0B2B31BFF8; Sun, 2 Feb 2020 17:04:30 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 927D71BFEB for ; Sun, 2 Feb 2020 17:04:28 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:25 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3uj4032300; Sun, 2 Feb 2020 18:04:25 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:46 +0000 Message-Id: <1580659433-25581-7-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 06/13] vdpa/mlx5: prepare virtio queues X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The HW virtq object represents an emulated context for a VIRTIO_NET virtqueue which is created and managed by a VIRTIO_NET driver as defined in the VIRTIO specification. Add support to prepare and release all the basic HW resources needed for the user virtqs emulation according to the rte_vhost configurations. This patch prepares the basic configurations needed by DevX commands to create a virtq. Add a new file, mlx5_vdpa_virtq.c, to manage virtq operations. Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko --- drivers/vdpa/mlx5/Makefile | 1 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 1 + drivers/vdpa/mlx5/mlx5_vdpa.h | 36 ++++++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 212 ++++++++++++++++++++++++++++++++++++ 5 files changed, 251 insertions(+) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 086af1b..1b24400 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -10,6 +10,7 @@ LIB = librte_pmd_mlx5_vdpa.a SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_virtq.c # Basic CFLAGS.
CFLAGS += -O3 diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 3da0d76..732ddce 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -14,6 +14,7 @@ sources = files( 'mlx5_vdpa.c', 'mlx5_vdpa_mem.c', 'mlx5_vdpa_event.c', + 'mlx5_vdpa_virtq.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 16107cf..d76c3aa 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -230,6 +230,7 @@ goto error; } SLIST_INIT(&priv->mr_list); + SLIST_INIT(&priv->virtq_list); pthread_mutex_lock(&priv_list_lock); TAILQ_INSERT_TAIL(&priv_list, priv, next); pthread_mutex_unlock(&priv_list_lock); diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 6282635..9284420 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -59,6 +59,19 @@ struct mlx5_vdpa_query_mr { int is_indirect; }; +struct mlx5_vdpa_virtq { + SLIST_ENTRY(mlx5_vdpa_virtq) next; + uint16_t index; + uint16_t vq_size; + struct mlx5_devx_obj *virtq; + struct mlx5_vdpa_event_qp eqp; + struct { + struct mlx5dv_devx_umem *obj; + void *buf; + uint32_t size; + } umems[3]; +}; + struct mlx5_vdpa_priv { TAILQ_ENTRY(mlx5_vdpa_priv) next; int id; /* vDPA device id. */ @@ -75,6 +88,10 @@ struct mlx5_vdpa_priv { struct mlx5dv_devx_event_channel *eventc; struct mlx5dv_devx_uar *uar; struct rte_intr_handle intr_handle; + struct mlx5_devx_obj *td; + struct mlx5_devx_obj *tis; + uint16_t nr_virtqs; + SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; @@ -152,4 +169,23 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, */ void mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv); +/** + * Release a virtq and all its related resources. + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_virtqs_release(struct mlx5_vdpa_priv *priv); + +/** + * Create all the HW virtqs resources and all their related resources. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */ +int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c new file mode 100644 index 0000000..781bccf --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -0,0 +1,212 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include + +#include +#include + +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + + +static int +mlx5_vdpa_virtq_unset(struct mlx5_vdpa_virtq *virtq) +{ + int i; + + if (virtq->virtq) { + claim_zero(mlx5_devx_cmd_destroy(virtq->virtq)); + virtq->virtq = NULL; + } + for (i = 0; i < 3; ++i) { + if (virtq->umems[i].obj) + claim_zero(mlx5_glue->devx_umem_dereg + (virtq->umems[i].obj)); + if (virtq->umems[i].buf) + rte_free(virtq->umems[i].buf); + } + memset(&virtq->umems, 0, sizeof(virtq->umems)); + if (virtq->eqp.fw_qp) + mlx5_vdpa_event_qp_destroy(&virtq->eqp); + return 0; +} + +void +mlx5_vdpa_virtqs_release(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_vdpa_virtq *entry; + struct mlx5_vdpa_virtq *next; + + entry = SLIST_FIRST(&priv->virtq_list); + while (entry) { + next = SLIST_NEXT(entry, next); + mlx5_vdpa_virtq_unset(entry); + SLIST_REMOVE(&priv->virtq_list, entry, mlx5_vdpa_virtq, next); + rte_free(entry); + entry = next; + } + SLIST_INIT(&priv->virtq_list); + if (priv->tis) { + claim_zero(mlx5_devx_cmd_destroy(priv->tis)); + priv->tis = NULL; + } + if (priv->td) { + claim_zero(mlx5_devx_cmd_destroy(priv->td)); + priv->td = NULL; + } +} + +static uint64_t +mlx5_vdpa_hva_to_gpa(struct rte_vhost_memory *mem, uint64_t hva) +{ + struct rte_vhost_mem_region *reg; + uint32_t i; + uint64_t gpa = 0; + + for (i = 0; i < mem->nregions; i++) { + reg = &mem->regions[i]; + if (hva >= reg->host_user_addr && + hva < reg->host_user_addr + reg->size) { + gpa = hva - reg->host_user_addr + reg->guest_phys_addr; + break; + } + } + return gpa; +} + +static int +mlx5_vdpa_virtq_setup(struct mlx5_vdpa_priv *priv, + struct mlx5_vdpa_virtq *virtq, int index) +{ + struct rte_vhost_vring vq; + struct mlx5_devx_virtq_attr attr = {0}; + uint64_t gpa; + int ret; + int i; + uint16_t last_avail_idx; + uint16_t last_used_idx; + + ret = rte_vhost_get_vhost_vring(priv->vid, index, &vq); + if (ret) + return -1; + virtq->index = index; + virtq->vq_size = vq.size; + /* + * No need event QPs creation when the guest in poll mode or when the + * capability allows it. + */ + attr.event_mode = vq.callfd != -1 || !(priv->caps.event_mode & (1 << + MLX5_VIRTQ_EVENT_MODE_NO_MSIX)) ? + MLX5_VIRTQ_EVENT_MODE_QP : + MLX5_VIRTQ_EVENT_MODE_NO_MSIX; + if (attr.event_mode == MLX5_VIRTQ_EVENT_MODE_QP) { + ret = mlx5_vdpa_event_qp_create(priv, vq.size, vq.callfd, + &virtq->eqp); + if (ret) { + DRV_LOG(ERR, "Failed to create event QPs for virtq %d.", + index); + return -1; + } + attr.qp_id = virtq->eqp.fw_qp->id; + } else { + DRV_LOG(INFO, "Virtq %d is, for sure, working by poll mode, no" + " need event QPs and event mechanism.", index); + } + /* Setup 3 UMEMs for each virtq. 
*/ + for (i = 0; i < 3; ++i) { + virtq->umems[i].size = priv->caps.umems[i].a * vq.size + + priv->caps.umems[i].b; + virtq->umems[i].buf = rte_zmalloc(__func__, + virtq->umems[i].size, 4096); + if (!virtq->umems[i].buf) { + DRV_LOG(ERR, "Cannot allocate umem %d memory for virtq" + " %u.", i, index); + goto error; + } + virtq->umems[i].obj = mlx5_glue->devx_umem_reg(priv->ctx, + virtq->umems[i].buf, + virtq->umems[i].size, + IBV_ACCESS_LOCAL_WRITE); + if (!virtq->umems[i].obj) { + DRV_LOG(ERR, "Failed to register umem %d for virtq %u.", + i, index); + goto error; + } + attr.umems[i].id = virtq->umems[i].obj->umem_id; + attr.umems[i].offset = 0; + attr.umems[i].size = virtq->umems[i].size; + } + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.desc); + if (!gpa) { + DRV_LOG(ERR, "Fail to get GPA for descriptor ring."); + goto error; + } + attr.desc_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.used); + if (!gpa) { + DRV_LOG(ERR, "Fail to get GPA for used ring."); + goto error; + } + attr.used_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.avail); + if (!gpa) { + DRV_LOG(ERR, "Fail to get GPA for available ring."); + goto error; + } + attr.available_addr = gpa; + rte_vhost_get_vring_base(priv->vid, index, &last_avail_idx, + &last_used_idx); + DRV_LOG(INFO, "vid %d: Init last_avail_idx=%d, last_used_idx=%d for " + "virtq %d.", priv->vid, last_avail_idx, last_used_idx, index); + attr.hw_available_index = last_avail_idx; + attr.hw_used_index = last_used_idx; + attr.q_size = vq.size; + attr.mkey = priv->gpa_mkey_index; + attr.tis_id = priv->tis->id; + attr.queue_index = index; + virtq->virtq = mlx5_devx_cmd_create_virtq(priv->ctx, &attr); + if (!virtq->virtq) + goto error; + return 0; +error: + mlx5_vdpa_virtq_unset(virtq); + return -1; +} + +int +mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_devx_tis_attr tis_attr = {0}; + struct mlx5_vdpa_virtq *virtq; + uint32_t i; + uint16_t nr_vring = rte_vhost_get_vring_num(priv->vid); + + priv->td = mlx5_devx_cmd_create_td(priv->ctx); + if (!priv->td) { + DRV_LOG(ERR, "Failed to create transport domain."); + return -rte_errno; + } + tis_attr.transport_domain = priv->td->id; + priv->tis = mlx5_devx_cmd_create_tis(priv->ctx, &tis_attr); + if (!priv->tis) { + DRV_LOG(ERR, "Failed to create TIS."); + goto error; + } + for (i = 0; i < nr_vring; i++) { + virtq = rte_zmalloc(__func__, sizeof(*virtq), 0); + if (!virtq || mlx5_vdpa_virtq_setup(priv, virtq, i)) { + if (virtq) + rte_free(virtq); + goto error; + } + SLIST_INSERT_HEAD(&priv->virtq_list, virtq, next); + } + priv->nr_virtqs = nr_vring; + return 0; +error: + mlx5_vdpa_virtqs_release(priv); + return -1; +} From patchwork Sun Feb 2 16:03:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65469 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 881EEA04FA; Sun, 2 Feb 2020 17:05:20 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5F3DB1C0D5; Sun, 2 Feb 2020 17:04:35 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 5F7221C0D5 for ; Sun, 2 Feb 2020 17:04:33 +0100 (CET) Received: from Internal Mail-Server by 
MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:30 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3uj5032300; Sun, 2 Feb 2020 18:04:30 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:47 +0000 Message-Id: <1580659433-25581-8-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 07/13] vdpa/mlx5: support stateless offloads X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add support for the next features in virtq configuration: VIRTIO_F_RING_PACKED, VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_CSUM, VIRTIO_NET_F_GUEST_CSUM, VIRTIO_F_VERSION_1. Support for these features depends on the DevX capabilities reported by the device. Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Reviewed-by: Maxime Coquelin --- doc/guides/vdpadevs/features/mlx5.ini | 7 ++- drivers/vdpa/mlx5/mlx5_vdpa.c | 10 ---- drivers/vdpa/mlx5/mlx5_vdpa.h | 10 ++++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 108 ++++++++++++++++++++++++------ 4 files changed, 107 insertions(+), 28 deletions(-) diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini index fea491d..e4ee34b 100644 --- a/doc/guides/vdpadevs/features/mlx5.ini +++ b/doc/guides/vdpadevs/features/mlx5.ini @@ -4,10 +4,15 @@ ; Refer to default.ini for the full list of available driver features. ; [Features] - +csum = Y +guest csum = Y +host tso4 = Y +host tso6 = Y +version 1 = Y any layout = Y guest announce = Y mq = Y +packed = Y proto mq = Y proto log shmfd = Y proto host notifier = Y diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index d76c3aa..f625b5e 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -1,8 +1,6 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright 2019 Mellanox Technologies, Ltd */ -#include - #include #include #include @@ -17,14 +15,6 @@ #include "mlx5_vdpa.h" -#ifndef VIRTIO_F_ORDER_PLATFORM -#define VIRTIO_F_ORDER_PLATFORM 36 -#endif - -#ifndef VIRTIO_F_RING_PACKED -#define VIRTIO_F_RING_PACKED 34 -#endif - #define MLX5_VDPA_DEFAULT_FEATURES ((1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \ (1ULL << VIRTIO_F_ANY_LAYOUT) | \ (1ULL << VIRTIO_NET_F_MQ) | \ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 9284420..02cf139 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -5,6 +5,7 @@ #ifndef RTE_PMD_MLX5_VDPA_H_ #define RTE_PMD_MLX5_VDPA_H_ +#include #include #ifdef PEDANTIC @@ -26,6 +27,14 @@ #define MLX5_VDPA_INTR_RETRIES 256 #define MLX5_VDPA_INTR_RETRIES_USEC 1000 +#ifndef VIRTIO_F_ORDER_PLATFORM +#define VIRTIO_F_ORDER_PLATFORM 36 +#endif + +#ifndef VIRTIO_F_RING_PACKED +#define VIRTIO_F_RING_PACKED 34 +#endif + struct mlx5_vdpa_cq { uint16_t log_desc_n; uint32_t cq_ci:24; @@ -91,6 +100,7 @@ struct mlx5_vdpa_priv { struct mlx5_devx_obj *td; struct mlx5_devx_obj *tis; uint16_t nr_virtqs; + uint64_t features; /* Negotiated features.
*/ SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index 781bccf..e27af28 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -57,6 +57,7 @@ claim_zero(mlx5_devx_cmd_destroy(priv->td)); priv->td = NULL; } + priv->features = 0; } static uint64_t @@ -94,6 +95,14 @@ return -1; virtq->index = index; virtq->vq_size = vq.size; + attr.tso_ipv4 = !!(priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO4)); + attr.tso_ipv6 = !!(priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO6)); + attr.tx_csum = !!(priv->features & (1ULL << VIRTIO_NET_F_CSUM)); + attr.rx_csum = !!(priv->features & (1ULL << VIRTIO_NET_F_GUEST_CSUM)); + attr.virtio_version_1_0 = !!(priv->features & (1ULL << + VIRTIO_F_VERSION_1)); + attr.type = (priv->features & (1ULL << VIRTIO_F_RING_PACKED)) ? + MLX5_VIRTQ_TYPE_PACKED : MLX5_VIRTQ_TYPE_SPLIT; /* * No need event QPs creation when the guest in poll mode or when the * capability allows it. @@ -139,24 +148,29 @@ attr.umems[i].offset = 0; attr.umems[i].size = virtq->umems[i].size; } - gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.desc); - if (!gpa) { - DRV_LOG(ERR, "Fail to get GPA for descriptor ring."); - goto error; - } - attr.desc_addr = gpa; - gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.used); - if (!gpa) { - DRV_LOG(ERR, "Fail to get GPA for used ring."); - goto error; - } - attr.used_addr = gpa; - gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.avail); - if (!gpa) { - DRV_LOG(ERR, "Fail to get GPA for available ring."); - goto error; + if (attr.type == MLX5_VIRTQ_TYPE_SPLIT) { + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, + (uint64_t)(uintptr_t)vq.desc); + if (!gpa) { + DRV_LOG(ERR, "Failed to get descriptor ring GPA."); + goto error; + } + attr.desc_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, + (uint64_t)(uintptr_t)vq.used); + if (!gpa) { + DRV_LOG(ERR, "Failed to get GPA for used ring."); + goto error; + } + attr.used_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, + (uint64_t)(uintptr_t)vq.avail); + if (!gpa) { + DRV_LOG(ERR, "Failed to get GPA for available ring."); + goto error; + } + attr.available_addr = gpa; } - attr.available_addr = gpa; rte_vhost_get_vring_base(priv->vid, index, &last_avail_idx, &last_used_idx); DRV_LOG(INFO, "vid %d: Init last_avail_idx=%d, last_used_idx=%d for " @@ -176,6 +190,61 @@ return -1; } +static int +mlx5_vdpa_features_validate(struct mlx5_vdpa_priv *priv) +{ + if (priv->features & (1ULL << VIRTIO_F_RING_PACKED)) { + if (!(priv->caps.virtio_queue_type & (1 << + MLX5_VIRTQ_TYPE_PACKED))) { + DRV_LOG(ERR, "Failed to configur PACKED mode for vdev " + "%d - it was not reported by HW/driver" + " capability.", priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO4)) { + if (!priv->caps.tso_ipv4) { + DRV_LOG(ERR, "Failed to enable TSO4 for vdev %d - TSO4" + " was not reported by HW/driver capability.", + priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO6)) { + if (!priv->caps.tso_ipv6) { + DRV_LOG(ERR, "Failed to enable TSO6 for vdev %d - TSO6" + " was not reported by HW/driver capability.", + priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_CSUM)) { + if (!priv->caps.tx_csum) { + DRV_LOG(ERR, "Failed to enable CSUM for vdev %d - CSUM" + " was not reported by HW/driver capability.", + priv->vid); + 
return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_GUEST_CSUM)) { + if (!priv->caps.rx_csum) { + DRV_LOG(ERR, "Failed to enable GUEST CSUM for vdev %d" + " GUEST CSUM was not reported by HW/driver " + "capability.", priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_F_VERSION_1)) { + if (!priv->caps.virtio_version_1_0) { + DRV_LOG(ERR, "Failed to enable version 1 for vdev %d " + "version 1 was not reported by HW/driver" + " capability.", priv->vid); + return -ENOTSUP; + } + } + return 0; +} + int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) { @@ -183,7 +252,12 @@ struct mlx5_vdpa_virtq *virtq; uint32_t i; uint16_t nr_vring = rte_vhost_get_vring_num(priv->vid); + int ret = rte_vhost_get_negotiated_features(priv->vid, &priv->features); + if (ret || mlx5_vdpa_features_validate(priv)) { + DRV_LOG(ERR, "Failed to configure negotiated features."); + return -1; + } priv->td = mlx5_devx_cmd_create_td(priv->ctx); if (!priv->td) { DRV_LOG(ERR, "Failed to create transport domain."); From patchwork Sun Feb 2 16:03:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65470 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id D065AA04FA; Sun, 2 Feb 2020 17:05:31 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 945E61C0DC; Sun, 2 Feb 2020 17:04:38 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 613001C0DC for ; Sun, 2 Feb 2020 17:04:36 +0100 (CET) Received: from Internal Mail-Server by MTLPINE2 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:34 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3uj6032300; Sun, 2 Feb 2020 18:04:34 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:48 +0000 Message-Id: <1580659433-25581-9-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 08/13] vdpa/mlx5: add basic steering configurations X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add a steering object to be managed by a new file mlx5_vdpa_steer.c. Allow promiscuous flow to scatter the device Rx packets to the virtio queues using RSS action. In order to allow correct RSS in L3 and L4, split the flow to 7 flows as required by the device. 
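
The split into 7 flows mirrors the L3/L4 combinations the device can hash on: one catch-all flow plus one per protocol combination, so the RSS hash covers exactly the fields each packet class carries. An illustrative breakdown (the enum is hypothetical; the driver itself keeps these as a table of matcher and TIR attributes):

/* Illustrative only: the 7 sub-flows replacing a single promiscuous
 * flow, so L3/L4 RSS stays correct per packet class.
 */
enum rss_flow_class {
	RSS_NON_IP,   /* catch-all, no L3/L4 hash        */
	RSS_IPV4,     /* hash on IPv4 source/destination */
	RSS_IPV6,     /* hash on IPv6 source/destination */
	RSS_IPV4_UDP, /* IPv4 hash plus UDP port hash    */
	RSS_IPV4_TCP, /* IPv4 hash plus TCP port hash    */
	RSS_IPV6_UDP, /* IPv6 hash plus UDP port hash    */
	RSS_IPV6_TCP, /* IPv6 hash plus TCP port hash    */
};
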
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/Makefile | 2 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 1 + drivers/vdpa/mlx5/mlx5_vdpa.h | 34 +++++ drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 265 ++++++++++++++++++++++++++++++++++++ 5 files changed, 303 insertions(+) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_steer.c diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 1b24400..8362fef 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -11,6 +11,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_virtq.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_steer.c + # Basic CFLAGS. CFLAGS += -O3 diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 732ddce..3f85dda 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -15,6 +15,7 @@ sources = files( 'mlx5_vdpa_mem.c', 'mlx5_vdpa_event.c', 'mlx5_vdpa_virtq.c', + 'mlx5_vdpa_steer.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index f625b5e..28b94a3 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -209,6 +209,7 @@ goto error; } priv->caps = attr.vdpa; + priv->log_max_rqt_size = attr.log_max_rqt_size; } priv->ctx = ctx; priv->dev_addr.pci_addr = pci_dev->addr; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 02cf139..d7eb5ee 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -81,6 +81,18 @@ struct mlx5_vdpa_virtq { } umems[3]; }; +struct mlx5_vdpa_steer { + struct mlx5_devx_obj *rqt; + void *domain; + void *tbl; + struct { + struct mlx5dv_flow_matcher *matcher; + struct mlx5_devx_obj *tir; + void *tir_action; + void *flow; + } rss[7]; +}; + struct mlx5_vdpa_priv { TAILQ_ENTRY(mlx5_vdpa_priv) next; int id; /* vDPA device id. */ @@ -101,7 +113,9 @@ struct mlx5_vdpa_priv { struct mlx5_devx_obj *tis; uint16_t nr_virtqs; uint64_t features; /* Negotiated features. */ + uint16_t log_max_rqt_size; SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; + struct mlx5_vdpa_steer steer; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; @@ -198,4 +212,24 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, */ int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv); +/** + * Unset steering and release all its related resources- stop traffic. + * + * @param[in] priv + * The vdpa driver private structure. + */ +int mlx5_vdpa_steer_unset(struct mlx5_vdpa_priv *priv); + +/** + * Setup steering and all its related resources to enable RSS trafic from the + * device to all the Rx host queues. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative value otherwise. 
+ */ +int mlx5_vdpa_steer_setup(struct mlx5_vdpa_priv *priv); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c new file mode 100644 index 0000000..f365c10 --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c @@ -0,0 +1,265 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include + +#include +#include +#include + +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + +int +mlx5_vdpa_steer_unset(struct mlx5_vdpa_priv *priv) +{ + int ret __rte_unused; + unsigned i; + + for (i = 0; i < RTE_DIM(priv->steer.rss); ++i) { + if (priv->steer.rss[i].flow) { + claim_zero(mlx5_glue->dv_destroy_flow + (priv->steer.rss[i].flow)); + priv->steer.rss[i].flow = NULL; + } + if (priv->steer.rss[i].tir_action) { + claim_zero(mlx5_glue->destroy_flow_action + (priv->steer.rss[i].tir_action)); + priv->steer.rss[i].tir_action = NULL; + } + if (priv->steer.rss[i].tir) { + claim_zero(mlx5_devx_cmd_destroy + (priv->steer.rss[i].tir)); + priv->steer.rss[i].tir = NULL; + } + if (priv->steer.rss[i].matcher) { + claim_zero(mlx5_glue->dv_destroy_flow_matcher + (priv->steer.rss[i].matcher)); + priv->steer.rss[i].matcher = NULL; + } + } + if (priv->steer.tbl) { + claim_zero(mlx5_glue->dr_destroy_flow_tbl(priv->steer.tbl)); + priv->steer.tbl = NULL; + } + if (priv->steer.domain) { + claim_zero(mlx5_glue->dr_destroy_domain(priv->steer.domain)); + priv->steer.domain = NULL; + } + if (priv->steer.rqt) { + claim_zero(mlx5_devx_cmd_destroy(priv->steer.rqt)); + priv->steer.rqt = NULL; + } + return 0; +} + +/* + * According to VIRTIO_NET Spec the virtqueues index identity its type by: + * 0 receiveq1 + * 1 transmitq1 + * ... + * 2(N-1) receiveqN + * 2(N-1)+1 transmitqN + * 2N controlq + */ +static uint8_t +is_virtq_recvq(int virtq_index, int nr_vring) +{ + if (virtq_index % 2 == 0 && virtq_index != nr_vring - 1) + return 1; + return 0; +} + +#define MLX5_VDPA_DEFAULT_RQT_SIZE 512 +static int __rte_unused +mlx5_vdpa_rqt_prepare(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_vdpa_virtq *virtq; + uint32_t rqt_n = RTE_MIN(MLX5_VDPA_DEFAULT_RQT_SIZE, + 1 << priv->log_max_rqt_size); + struct mlx5_devx_rqt_attr *attr = rte_zmalloc(__func__, sizeof(*attr) + + rqt_n * + sizeof(uint32_t), 0); + uint32_t i = 0, j; + int ret = 0; + + if (!attr) { + DRV_LOG(ERR, "Failed to allocate RQT attributes memory."); + rte_errno = ENOMEM; + return -ENOMEM; + } + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + if (is_virtq_recvq(virtq->index, priv->nr_virtqs)) { + attr->rq_list[i] = virtq->virtq->id; + i++; + } + } + for (j = 0; i != rqt_n; ++i, ++j) + attr->rq_list[i] = attr->rq_list[j]; + attr->rq_type = MLX5_INLINE_Q_TYPE_VIRTQ; + attr->rqt_max_size = rqt_n; + attr->rqt_actual_size = rqt_n; + if (!priv->steer.rqt) { + priv->steer.rqt = mlx5_devx_cmd_create_rqt(priv->ctx, attr); + if (!priv->steer.rqt) { + DRV_LOG(ERR, "Failed to create RQT."); + ret = -rte_errno; + } + } else { + ret = mlx5_devx_cmd_modify_rqt(priv->steer.rqt, attr); + if (ret) + DRV_LOG(ERR, "Failed to modify RQT."); + } + rte_free(attr); + return ret; +} + +static int __rte_unused +mlx5_vdpa_rss_flows_create(struct mlx5_vdpa_priv *priv) +{ +#ifdef HAVE_MLX5DV_DR + struct mlx5_devx_tir_attr tir_att = { + .disp_type = MLX5_TIRC_DISP_TYPE_INDIRECT, + .rx_hash_fn = MLX5_RX_HASH_FN_TOEPLITZ, + .transport_domain = priv->td->id, + .indirect_table = priv->steer.rqt->id, + .rx_hash_symmetric = 1, + .rx_hash_toeplitz_key = { 0x2cc681d1, 0x5bdbf4f7, 
0xfca28319, + 0xdb1a3e94, 0x6b9e38d9, 0x2c9c03d1, + 0xad9944a7, 0xd9563d59, 0x063c25f3, + 0xfc1fdc2a }, + }; + struct { + size_t size; + /**< Size of match value. Do NOT split size and key! */ + uint32_t buf[MLX5_ST_SZ_DW(fte_match_param)]; + /**< Matcher value. This value is used as the mask or a key. */ + } matcher_mask = { + .size = sizeof(matcher_mask.buf), + }, + matcher_value = { + .size = sizeof(matcher_value.buf), + }; + struct mlx5dv_flow_matcher_attr dv_attr = { + .type = IBV_FLOW_ATTR_NORMAL, + .match_mask = (void *)&matcher_mask, + }; + void *match_m = matcher_mask.buf; + void *match_v = matcher_value.buf; + void *headers_m = MLX5_ADDR_OF(fte_match_param, match_m, outer_headers); + void *headers_v = MLX5_ADDR_OF(fte_match_param, match_v, outer_headers); + void *actions[1]; + const uint8_t l3_hash = + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_SRC_IP) | + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_DST_IP); + const uint8_t l4_hash = + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_L4_SPORT) | + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_L4_DPORT); + enum { PRIO, CRITERIA, IP_VER_M, IP_VER_V, IP_PROT_M, IP_PROT_V, L3_BIT, + L4_BIT, HASH, END}; + const uint8_t vars[RTE_DIM(priv->steer.rss)][END] = { + { 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + { 6, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 4, 0, 0, + MLX5_L3_PROT_TYPE_IPV4, 0, l3_hash }, + { 6, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 6, 0, 0, + MLX5_L3_PROT_TYPE_IPV6, 0, l3_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 4, 0xff, + IPPROTO_UDP, MLX5_L3_PROT_TYPE_IPV4, MLX5_L4_PROT_TYPE_UDP, + l3_hash | l4_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 4, 0xff, + IPPROTO_TCP, MLX5_L3_PROT_TYPE_IPV4, MLX5_L4_PROT_TYPE_TCP, + l3_hash | l4_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 6, 0xff, + IPPROTO_UDP, MLX5_L3_PROT_TYPE_IPV6, MLX5_L4_PROT_TYPE_UDP, + l3_hash | l4_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 6, 0xff, + IPPROTO_TCP, MLX5_L3_PROT_TYPE_IPV6, MLX5_L4_PROT_TYPE_TCP, + l3_hash | l4_hash }, + }; + unsigned i; + + for (i = 0; i < RTE_DIM(priv->steer.rss); ++i) { + dv_attr.priority = vars[i][PRIO]; + dv_attr.match_criteria_enable = vars[i][CRITERIA]; + MLX5_SET(fte_match_set_lyr_2_4, headers_m, ip_version, + vars[i][IP_VER_M]); + MLX5_SET(fte_match_set_lyr_2_4, headers_v, ip_version, + vars[i][IP_VER_V]); + MLX5_SET(fte_match_set_lyr_2_4, headers_m, ip_protocol, + vars[i][IP_PROT_M]); + MLX5_SET(fte_match_set_lyr_2_4, headers_v, ip_protocol, + vars[i][IP_PROT_V]); + tir_att.rx_hash_field_selector_outer.l3_prot_type = + vars[i][L3_BIT]; + tir_att.rx_hash_field_selector_outer.l4_prot_type = + vars[i][L4_BIT]; + tir_att.rx_hash_field_selector_outer.selected_fields = + vars[i][HASH]; + priv->steer.rss[i].matcher = mlx5_glue->dv_create_flow_matcher + (priv->ctx, &dv_attr, priv->steer.tbl); + if (!priv->steer.rss[i].matcher) { + DRV_LOG(ERR, "Failed to create matcher %d.", i); + goto error; + } + priv->steer.rss[i].tir = mlx5_devx_cmd_create_tir(priv->ctx, + &tir_att); + if (!priv->steer.rss[i].tir) { + DRV_LOG(ERR, "Failed to create TIR %d.", i); + goto error; + } + priv->steer.rss[i].tir_action = + mlx5_glue->dv_create_flow_action_dest_devx_tir + (priv->steer.rss[i].tir->obj); + if (!priv->steer.rss[i].tir_action) { + DRV_LOG(ERR, "Failed to create TIR action %d.", i); + goto error; + } + actions[0] = priv->steer.rss[i].tir_action; + priv->steer.rss[i].flow = mlx5_glue->dv_create_flow + (priv->steer.rss[i].matcher, + (void *)&matcher_value, 
1, actions); + if (!priv->steer.rss[i].flow) { + DRV_LOG(ERR, "Failed to create flow %d.", i); + goto error; + } + } + return 0; +error: + /* Resources will be freed by the caller. */ + return -1; +#else + (void)priv; + return -ENOTSUP; +#endif /* HAVE_MLX5DV_DR */ +} + +int +mlx5_vdpa_steer_setup(struct mlx5_vdpa_priv *priv) +{ +#ifdef HAVE_MLX5DV_DR + if (mlx5_vdpa_rqt_prepare(priv)) + return -1; + priv->steer.domain = mlx5_glue->dr_create_domain(priv->ctx, + MLX5DV_DR_DOMAIN_TYPE_NIC_RX); + if (!priv->steer.domain) { + DRV_LOG(ERR, "Failed to create Rx domain."); + goto error; + } + priv->steer.tbl = mlx5_glue->dr_create_flow_tbl(priv->steer.domain, 0); + if (!priv->steer.tbl) { + DRV_LOG(ERR, "Failed to create table 0 with Rx domain."); + goto error; + } + if (mlx5_vdpa_rss_flows_create(priv)) + goto error; + return 0; +error: + mlx5_vdpa_steer_unset(priv); + return -1; +#else + (void)priv; + return -ENOTSUP; +#endif /* HAVE_MLX5DV_DR */ +} From patchwork Sun Feb 2 16:03:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65471 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id A1D33A04FA; Sun, 2 Feb 2020 17:05:44 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 43F741C01F; Sun, 2 Feb 2020 17:04:44 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id A27261BFD9 for ; Sun, 2 Feb 2020 17:04:42 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:38 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3uj7032300; Sun, 2 Feb 2020 18:04:38 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:49 +0000 Message-Id: <1580659433-25581-10-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 09/13] vdpa/mlx5: support queue state operation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add support for the set_vring_state operation. Using the DevX API, the virtq state can be changed as described in the PRM: enable - move to ready state; disable - move to suspend state.
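
In DevX terms the whole operation is a single modify command on the VIRTIO_NET_Q object; a condensed sketch of what this patch implements, using the attribute type and state constants the series introduces (driver-internal headers assumed):

#include <mlx5_devx_cmds.h> /* driver-internal DevX command header */

/* Condensed form of the state change: enable maps the virtq to RDY,
 * disable maps it to SUSPEND, via one DevX modify command.
 */
static int
virtq_set_state(struct mlx5_devx_obj *virtq_obj, uint16_t index, int enable)
{
	struct mlx5_devx_virtq_attr attr = {
		.type = MLX5_VIRTQ_MODIFY_TYPE_STATE,
		.state = enable ? MLX5_VIRTQ_STATE_RDY :
				  MLX5_VIRTQ_STATE_SUSPEND,
		.queue_index = index,
	};

	return mlx5_devx_cmd_modify_virtq(virtq_obj, &attr);
}
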
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/mlx5_vdpa.c | 23 ++++++++++++++++++++++- drivers/vdpa/mlx5/mlx5_vdpa.h | 15 +++++++++++++++ drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 22 ++++++++++++++++++++-- drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 25 +++++++++++++++++++++---- 4 files changed, 78 insertions(+), 7 deletions(-) diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 28b94a3..3615681 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -106,13 +106,34 @@ return 0; } +static int +mlx5_vdpa_set_vring_state(int vid, int vring, int state) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + struct mlx5_vdpa_virtq *virtq = NULL; + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -EINVAL; + } + SLIST_FOREACH(virtq, &priv->virtq_list, next) + if (virtq->index == vring) + break; + if (!virtq) { + DRV_LOG(ERR, "Invalid or unconfigured vring id: %d.", vring); + return -EINVAL; + } + return mlx5_vdpa_virtq_enable(virtq, state); +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .get_queue_num = mlx5_vdpa_get_queue_num, .get_features = mlx5_vdpa_get_vdpa_features, .get_protocol_features = mlx5_vdpa_get_protocol_features, .dev_conf = NULL, .dev_close = NULL, - .set_vring_state = NULL, + .set_vring_state = mlx5_vdpa_set_vring_state, .set_features = NULL, .migration_done = NULL, .get_vfio_group_fd = NULL, diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index d7eb5ee..629a282 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -70,8 +70,10 @@ struct mlx5_vdpa_query_mr { struct mlx5_vdpa_virtq { SLIST_ENTRY(mlx5_vdpa_virtq) next; + uint8_t enable; uint16_t index; uint16_t vq_size; + struct mlx5_vdpa_priv *priv; struct mlx5_devx_obj *virtq; struct mlx5_vdpa_event_qp eqp; struct { @@ -213,6 +215,19 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv); /** + * Enable\Disable virtq.. + * + * @param[in] virtq + * The vdpa driver private virtq structure. + * @param[in] enable + * Set to enable, otherwise disable. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_virtq_enable(struct mlx5_vdpa_virtq *virtq, int enable); + +/** * Unset steering and release all its related resources- stop traffic. 
* * @param[in] priv diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c index f365c10..36017f1 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c @@ -73,7 +73,7 @@ } #define MLX5_VDPA_DEFAULT_RQT_SIZE 512 -static int __rte_unused +static int mlx5_vdpa_rqt_prepare(struct mlx5_vdpa_priv *priv) { struct mlx5_vdpa_virtq *virtq; @@ -91,7 +91,8 @@ return -ENOMEM; } SLIST_FOREACH(virtq, &priv->virtq_list, next) { - if (is_virtq_recvq(virtq->index, priv->nr_virtqs)) { + if (is_virtq_recvq(virtq->index, priv->nr_virtqs) && + virtq->enable) { attr->rq_list[i] = virtq->virtq->id; i++; } @@ -116,6 +117,23 @@ return ret; } +int +mlx5_vdpa_virtq_enable(struct mlx5_vdpa_virtq *virtq, int enable) +{ + struct mlx5_vdpa_priv *priv = virtq->priv; + int ret = 0; + + if (virtq->enable == !!enable) + return 0; + virtq->enable = !!enable; + if (is_virtq_recvq(virtq->index, priv->nr_virtqs)) { + ret = mlx5_vdpa_rqt_prepare(priv); + if (ret) + virtq->enable = !enable; + } + return ret; +} + static int __rte_unused mlx5_vdpa_rss_flows_create(struct mlx5_vdpa_priv *priv) { diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index e27af28..9967be3 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -15,13 +15,13 @@ static int mlx5_vdpa_virtq_unset(struct mlx5_vdpa_virtq *virtq) { - int i; + unsigned int i; if (virtq->virtq) { claim_zero(mlx5_devx_cmd_destroy(virtq->virtq)); virtq->virtq = NULL; } - for (i = 0; i < 3; ++i) { + for (i = 0; i < RTE_DIM(virtq->umems); ++i) { if (virtq->umems[i].obj) claim_zero(mlx5_glue->devx_umem_dereg (virtq->umems[i].obj)); @@ -60,6 +60,19 @@ priv->features = 0; } +static int +mlx5_vdpa_virtq_modify(struct mlx5_vdpa_virtq *virtq, int state) +{ + struct mlx5_devx_virtq_attr attr = { + .type = MLX5_VIRTQ_MODIFY_TYPE_STATE, + .state = state ? MLX5_VIRTQ_STATE_RDY : + MLX5_VIRTQ_STATE_SUSPEND, + .queue_index = virtq->index, + }; + + return mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr); +} + static uint64_t mlx5_vdpa_hva_to_gpa(struct rte_vhost_memory *mem, uint64_t hva) { @@ -86,7 +99,7 @@ struct mlx5_devx_virtq_attr attr = {0}; uint64_t gpa; int ret; - int i; + unsigned int i; uint16_t last_avail_idx; uint16_t last_used_idx; @@ -125,7 +138,7 @@ " need event QPs and event mechanism.", index); } /* Setup 3 UMEMs for each virtq. 
*/ - for (i = 0; i < 3; ++i) { + for (i = 0; i < RTE_DIM(virtq->umems); ++i) { virtq->umems[i].size = priv->caps.umems[i].a * vq.size + priv->caps.umems[i].b; virtq->umems[i].buf = rte_zmalloc(__func__, @@ -182,8 +195,12 @@ attr.tis_id = priv->tis->id; attr.queue_index = index; virtq->virtq = mlx5_devx_cmd_create_virtq(priv->ctx, &attr); + virtq->priv = priv; if (!virtq->virtq) goto error; + if (mlx5_vdpa_virtq_modify(virtq, 1)) + goto error; + virtq->enable = 1; return 0; error: mlx5_vdpa_virtq_unset(virtq); From patchwork Sun Feb 2 16:03:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65472 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id A911FA04FA; Sun, 2 Feb 2020 17:05:55 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 426441C132; Sun, 2 Feb 2020 17:04:45 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id A37001BFE0 for ; Sun, 2 Feb 2020 17:04:42 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:41 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3uj8032300; Sun, 2 Feb 2020 18:04:41 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:50 +0000 Message-Id: <1580659433-25581-12-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 10/13] vdpa/mlx5: map doorbell X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The HW can detect only 4-byte doorbell writes, while the virtio device writes only 2 bytes when it rings the doorbell. Relay the virtio doorbell, detected through the virtio queue kickfd, to the HW VAR space where the HW expects to get the virtio emulation doorbell. Use the EAL interrupt mechanism to get a notification when the guest posts a new event to the kickfd, and write 4 bytes to the HW doorbell space in the notification callback.
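
The relay itself is small: drain the 8-byte eventfd counter produced by the guest kick, then issue the 4-byte doorbell write the HW can detect, carrying the queue index. A sketch under those assumptions (kick_relay is illustrative; the real handler is mlx5_vdpa_virtq_handler in the diff below):

#include <errno.h>
#include <stdint.h>
#include <unistd.h>

#include <rte_io.h>

/* Sketch: virtio's 2-byte kick arrives as an eventfd count; the HW
 * needs a 4-byte write of the queue index into the mapped VAR page.
 */
static void
kick_relay(int kickfd, uint16_t queue_index, void *db_addr)
{
	uint64_t buf;

	while (read(kickfd, &buf, sizeof(buf)) < 0 && errno == EINTR)
		; /* Retry if interrupted by a signal. */
	rte_write32(queue_index, db_addr);
}
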
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 ++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 80 +++++++++++++++++++++++++++++++++++++ 2 files changed, 83 insertions(+) diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 629a282..5424be5 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -81,6 +81,7 @@ struct mlx5_vdpa_virtq { void *buf; uint32_t size; } umems[3]; + struct rte_intr_handle intr_handle; }; struct mlx5_vdpa_steer { @@ -118,6 +119,8 @@ struct mlx5_vdpa_priv { uint16_t log_max_rqt_size; SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; struct mlx5_vdpa_steer steer; + struct mlx5dv_var *var; + void *virtq_db_addr; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index 9967be3..32a13ce 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -2,9 +2,12 @@ * Copyright 2019 Mellanox Technologies, Ltd */ #include +#include +#include #include #include +#include #include @@ -12,11 +15,52 @@ #include "mlx5_vdpa.h" +static void +mlx5_vdpa_virtq_handler(void *cb_arg) +{ + struct mlx5_vdpa_virtq *virtq = cb_arg; + struct mlx5_vdpa_priv *priv = virtq->priv; + uint64_t buf; + int nbytes; + + do { + nbytes = read(virtq->intr_handle.fd, &buf, 8); + if (nbytes < 0) { + if (errno == EINTR || + errno == EWOULDBLOCK || + errno == EAGAIN) + continue; + DRV_LOG(ERR, "Failed to read kickfd of virtq %d: %s", + virtq->index, strerror(errno)); + } + break; + } while (1); + rte_write32(virtq->index, priv->virtq_db_addr); + DRV_LOG(DEBUG, "Ring virtq %u doorbell.", virtq->index); +} + static int mlx5_vdpa_virtq_unset(struct mlx5_vdpa_virtq *virtq) { unsigned int i; + int retries = MLX5_VDPA_INTR_RETRIES; + int ret = -EAGAIN; + if (virtq->intr_handle.fd) { + while (retries-- && ret == -EAGAIN) { + ret = rte_intr_callback_unregister(&virtq->intr_handle, + mlx5_vdpa_virtq_handler, + virtq); + if (ret == -EAGAIN) { + DRV_LOG(DEBUG, "Try again to unregister fd %d " + "of virtq %d interrupt, retries = %d.", + virtq->intr_handle.fd, + (int)virtq->index, retries); + usleep(MLX5_VDPA_INTR_RETRIES_USEC); + } + } + memset(&virtq->intr_handle, 0, sizeof(virtq->intr_handle)); + } if (virtq->virtq) { claim_zero(mlx5_devx_cmd_destroy(virtq->virtq)); virtq->virtq = NULL; @@ -57,6 +101,14 @@ claim_zero(mlx5_devx_cmd_destroy(priv->td)); priv->td = NULL; } + if (priv->virtq_db_addr) { + claim_zero(munmap(priv->virtq_db_addr, priv->var->length)); + priv->virtq_db_addr = NULL; + } + if (priv->var) { + mlx5_glue->dv_free_var(priv->var); + priv->var = NULL; + } priv->features = 0; } @@ -201,6 +253,17 @@ if (mlx5_vdpa_virtq_modify(virtq, 1)) goto error; virtq->enable = 1; + virtq->intr_handle.fd = vq.kickfd; + virtq->intr_handle.type = RTE_INTR_HANDLE_EXT; + if (rte_intr_callback_register(&virtq->intr_handle, + mlx5_vdpa_virtq_handler, virtq)) { + virtq->intr_handle.fd = 0; + DRV_LOG(ERR, "Failed to register virtq %d interrupt.", index); + goto error; + } else { + DRV_LOG(DEBUG, "Register fd %d interrupt for virtq %d.", + virtq->intr_handle.fd, index); + } return 0; error: mlx5_vdpa_virtq_unset(virtq); @@ -275,6 +338,23 @@ DRV_LOG(ERR, "Failed to configure negotiated features."); return -1; } + priv->var = mlx5_glue->dv_alloc_var(priv->ctx, 0); + if (!priv->var) { + DRV_LOG(ERR, "Failed to allocate VAR %u.\n", errno); + return -1; + } + /* Always map the entire page. 
*/ + priv->virtq_db_addr = mmap(NULL, priv->var->length, PROT_READ | + PROT_WRITE, MAP_SHARED, priv->ctx->cmd_fd, + priv->var->mmap_off); + if (priv->virtq_db_addr == MAP_FAILED) { + DRV_LOG(ERR, "Failed to map doorbell page %u.", errno); + priv->virtq_db_addr = NULL; + goto error; + } else { + DRV_LOG(DEBUG, "VAR address of doorbell mapping is %p.", + priv->virtq_db_addr); + } priv->td = mlx5_devx_cmd_create_td(priv->ctx); if (!priv->td) { DRV_LOG(ERR, "Failed to create transport domain."); From patchwork Sun Feb 2 16:03:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65473 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id A3D1CA04FA; Sun, 2 Feb 2020 17:06:06 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 3BCF91BFDF; Sun, 2 Feb 2020 17:04:48 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 67CB41C137 for ; Sun, 2 Feb 2020 17:04:46 +0100 (CET) Received: from Internal Mail-Server by MTLPINE2 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:45 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3uj9032300; Sun, 2 Feb 2020 18:04:45 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:51 +0000 Message-Id: <1580659433-25581-12-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 11/13] vdpa/mlx5: support live migration X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add support for live migration feature by the HW: Create a single Mkey that maps the memory address space of the VHOST live migration log file. Modify VIRTIO_NET_Q object and provide vhost_log_page, dirty_bitmap_mkey, dirty_bitmap_size, dirty_bitmap_addr and dirty_bitmap_dump_enable. Modify VIRTIO_NET_Q object and move state to SUSPEND. Query VIRTIO_NET_Q and get hw_available_idx and hw_used_idx. 
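For orientation, here is a condensed sketch of how a vhost-user log-base message maps onto the HW through the helpers this patch introduces; error logging is trimmed, and mlx5_vdpa_features_set() in the diff below is the real entry point:

/* Sketch only: condensed from mlx5_vdpa_features_set() in this patch. */
static int
lm_setup_sketch(struct mlx5_vdpa_priv *priv, int vid)
{
	uint64_t log_base, log_size;

	if (rte_vhost_get_log_base(vid, &log_base, &log_size))
		return -1;
	/* A single Mkey maps the whole dirty bitmap file for the FW. */
	if (mlx5_vdpa_dirty_bitmap_set(priv, log_base, log_size))
		return -1;
	/* Turn on dirty-bitmap dumping for every virtq. */
	return mlx5_vdpa_logging_enable(priv, 1);
}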
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- doc/guides/vdpadevs/features/mlx5.ini | 1 + drivers/vdpa/mlx5/Makefile | 1 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 44 +++++++++++- drivers/vdpa/mlx5/mlx5_vdpa.h | 55 +++++++++++++++ drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 129 ++++++++++++++++++++++++++++++++++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 7 +- 7 files changed, 235 insertions(+), 3 deletions(-) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_lm.c diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini index e4ee34b..1da9c1b 100644 --- a/doc/guides/vdpadevs/features/mlx5.ini +++ b/doc/guides/vdpadevs/features/mlx5.ini @@ -9,6 +9,7 @@ guest csum = Y host tso4 = Y host tso6 = Y version 1 = Y +log all = Y any layout = Y guest announce = Y mq = Y diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 8362fef..d4a544c 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -12,6 +12,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_virtq.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_steer.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_lm.c # Basic CFLAGS. diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 3f85dda..bb96dad 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -16,6 +16,7 @@ sources = files( 'mlx5_vdpa_event.c', 'mlx5_vdpa_virtq.c', 'mlx5_vdpa_steer.c', + 'mlx5_vdpa_lm.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 3615681..1bb6c68 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -19,7 +19,8 @@ (1ULL << VIRTIO_F_ANY_LAYOUT) | \ (1ULL << VIRTIO_NET_F_MQ) | \ (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \ - (1ULL << VIRTIO_F_ORDER_PLATFORM)) + (1ULL << VIRTIO_F_ORDER_PLATFORM) | \ + (1ULL << VHOST_F_LOG_ALL)) #define MLX5_VDPA_PROTOCOL_FEATURES \ ((1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \ @@ -127,6 +128,45 @@ return mlx5_vdpa_virtq_enable(virtq, state); } +static int +mlx5_vdpa_features_set(int vid) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + uint64_t log_base, log_size; + uint64_t features; + int ret; + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -EINVAL; + } + ret = rte_vhost_get_negotiated_features(vid, &features); + if (ret) { + DRV_LOG(ERR, "Failed to get negotiated features."); + return ret; + } + if (RTE_VHOST_NEED_LOG(features)) { + ret = rte_vhost_get_log_base(vid, &log_base, &log_size); + if (ret) { + DRV_LOG(ERR, "Failed to get log base."); + return ret; + } + ret = mlx5_vdpa_dirty_bitmap_set(priv, log_base, log_size); + if (ret) { + DRV_LOG(ERR, "Failed to set dirty bitmap."); + return ret; + } + DRV_LOG(INFO, "mlx5 vdpa: enabling dirty logging..."); + ret = mlx5_vdpa_logging_enable(priv, 1); + if (ret) { + DRV_LOG(ERR, "Failed to enable dirty logging."); + return ret; + } + } + return 0; +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .get_queue_num = mlx5_vdpa_get_queue_num, .get_features = mlx5_vdpa_get_vdpa_features, @@ -134,7 +174,7 @@ .dev_conf = NULL, .dev_close = NULL, .set_vring_state = mlx5_vdpa_set_vring_state, - .set_features = NULL, + .set_features = mlx5_vdpa_features_set, .migration_done = NULL,
.get_vfio_group_fd = NULL, .get_vfio_device_fd = NULL, diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 5424be5..527436d 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -250,4 +250,59 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, */ int mlx5_vdpa_steer_setup(struct mlx5_vdpa_priv *priv); +/** + * Enable\Disable live migration logging. + * + * @param[in] priv + * The vdpa driver private structure. + * @param[in] enable + * Set for enable, unset for disable. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_logging_enable(struct mlx5_vdpa_priv *priv, int enable); + +/** + * Set dirty bitmap logging to allow live migration. + * + * @param[in] priv + * The vdpa driver private structure. + * @param[in] log_base + * Vhost log base. + * @param[in] log_size + * Vhost log size. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_dirty_bitmap_set(struct mlx5_vdpa_priv *priv, uint64_t log_base, + uint64_t log_size); + +/** + * Log all virtqs information for live migration. + * + * @param[in] priv + * The vdpa driver private structure. + * @param[in] enable + * Set for enable, unset for disable. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_lm_log(struct mlx5_vdpa_priv *priv); + +/** + * Modify virtq state to be ready or suspend. + * + * @param[in] virtq + * The vdpa driver private virtq structure. + * @param[in] state + * Set for ready, otherwise suspend. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_virtq_modify(struct mlx5_vdpa_virtq *virtq, int state); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c new file mode 100644 index 0000000..3358704 --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c @@ -0,0 +1,129 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + + +int +mlx5_vdpa_logging_enable(struct mlx5_vdpa_priv *priv, int enable) +{ + struct mlx5_devx_virtq_attr attr = { + .type = MLX5_VIRTQ_MODIFY_TYPE_DIRTY_BITMAP_DUMP_ENABLE, + .dirty_bitmap_dump_enable = enable, + }; + struct mlx5_vdpa_virtq *virtq; + + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + attr.queue_index = virtq->index; + if (mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr)) { + DRV_LOG(ERR, "Failed to modify virtq %d logging.", + virtq->index); + return -1; + } + } + return 0; +} + +int +mlx5_vdpa_dirty_bitmap_set(struct mlx5_vdpa_priv *priv, uint64_t log_base, + uint64_t log_size) +{ + struct mlx5_devx_mkey_attr mkey_attr = { + .addr = (uintptr_t)log_base, + .size = log_size, + .pd = priv->pdn, + .pg_access = 1, + .klm_array = NULL, + .klm_num = 0, + }; + struct mlx5_devx_virtq_attr attr = { + .type = MLX5_VIRTQ_MODIFY_TYPE_DIRTY_BITMAP_PARAMS, + .dirty_bitmap_addr = log_base, + .dirty_bitmap_size = log_size, + }; + struct mlx5_vdpa_query_mr *mr = rte_malloc(__func__, sizeof(*mr), 0); + struct mlx5_vdpa_virtq *virtq; + + if (!mr) { + DRV_LOG(ERR, "Failed to allocate mem for lm mr."); + return -1; + } + mr->umem = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)log_base, + log_size, IBV_ACCESS_LOCAL_WRITE); + if (!mr->umem) { + DRV_LOG(ERR, "Failed to register umem for lm mr."); + goto err; + } + mkey_attr.umem_id = mr->umem->umem_id; + mr->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + if 
(!mr->mkey) { + DRV_LOG(ERR, "Failed to create Mkey for lm."); + goto err; + } + attr.dirty_bitmap_mkey = mr->mkey->id; + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + attr.queue_index = virtq->index; + if (mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr)) { + DRV_LOG(ERR, "Failed to modify virtq %d for lm.", + virtq->index); + goto err; + } + } + mr->is_indirect = 0; + SLIST_INSERT_HEAD(&priv->mr_list, mr, next); + return 0; +err: + if (mr->mkey) + mlx5_devx_cmd_destroy(mr->mkey); + if (mr->umem) + mlx5_glue->devx_umem_dereg(mr->umem); + rte_free(mr); + return -1; +} + +#define MLX5_VDPA_USED_RING_LEN(size) \ + ((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3) + +int +mlx5_vdpa_lm_log(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_devx_virtq_attr attr = {0}; + struct mlx5_vdpa_virtq *virtq; + uint64_t features; + int ret = rte_vhost_get_negotiated_features(priv->vid, &features); + + if (ret) { + DRV_LOG(ERR, "Failed to get negotiated features."); + return -1; + } + if (!RTE_VHOST_NEED_LOG(features)) + return 0; + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + ret = mlx5_vdpa_virtq_modify(virtq, 0); + if (ret) + return -1; + if (mlx5_devx_cmd_query_virtq(virtq->virtq, &attr)) { + DRV_LOG(ERR, "Failed to query virtq %d.", virtq->index); + return -1; + } + DRV_LOG(INFO, "Query vid %d vring %d: hw_available_idx=%d, " + "hw_used_index=%d", priv->vid, virtq->index, + attr.hw_available_index, attr.hw_used_index); + ret = rte_vhost_set_vring_base(priv->vid, virtq->index, + attr.hw_available_index, + attr.hw_used_index); + if (ret) { + DRV_LOG(ERR, "Failed to set virtq %d base.", + virtq->index); + return -1; + } + rte_vhost_log_used_vring(priv->vid, virtq->index, 0, + MLX5_VDPA_USED_RING_LEN(virtq->vq_size)); + } + return 0; +} diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index 32a13ce..2312331 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -112,7 +112,7 @@ priv->features = 0; } -static int +int mlx5_vdpa_virtq_modify(struct mlx5_vdpa_virtq *virtq, int state) { struct mlx5_devx_virtq_attr attr = { @@ -253,6 +253,11 @@ if (mlx5_vdpa_virtq_modify(virtq, 1)) goto error; virtq->enable = 1; + virtq->priv = priv; + /* Be sure notifications are not missed during configuration. */ + claim_zero(rte_vhost_enable_guest_notification(priv->vid, index, 1)); + rte_write32(virtq->index, priv->virtq_db_addr); + /* Setup doorbell mapping. 
*/ virtq->intr_handle.fd = vq.kickfd; virtq->intr_handle.type = RTE_INTR_HANDLE_EXT; if (rte_intr_callback_register(&virtq->intr_handle, From patchwork Sun Feb 2 16:03:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65474 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0F590A04FA; Sun, 2 Feb 2020 17:06:19 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 475961BFE0; Sun, 2 Feb 2020 17:04:54 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id A1ADD1BFE0 for ; Sun, 2 Feb 2020 17:04:52 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:49 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3ujA032300; Sun, 2 Feb 2020 18:04:49 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:52 +0000 Message-Id: <1580659433-25581-13-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 12/13] vdpa/mlx5: support close and config operations X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Support the dev_conf and dev_close vDPA operations. These operations enable vDPA traffic.
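For orientation, the configure path added below reduces to the following ordered pipeline; this is a sketch built from the patch's own helpers with the error handling compressed, not the literal driver code:

/* dev_conf order matters: memory first, then virtqs, steering, events.
 * Any failure unwinds through mlx5_vdpa_dev_close(). */
static int
dev_config_sketch(struct mlx5_vdpa_priv *priv, int vid)
{
	priv->vid = vid;
	if (mlx5_vdpa_mem_register(priv) ||	/* guest memory MRs */
	    mlx5_vdpa_virtqs_prepare(priv) ||	/* HW virtq objects */
	    mlx5_vdpa_steer_setup(priv) ||	/* RX steering rules */
	    mlx5_vdpa_cqe_event_setup(priv)) {	/* completion events */
		mlx5_vdpa_dev_close(vid);
		return -1;
	}
	priv->configured = 1;
	return 0;
}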
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/mlx5_vdpa.c | 58 ++++++++++++++++++++++++++++++++++++++++--- drivers/vdpa/mlx5/mlx5_vdpa.h | 1 + 2 files changed, 55 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 1bb6c68..57619d2 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -167,12 +167,59 @@ return 0; } +static int +mlx5_vdpa_dev_close(int vid) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + int ret = 0; + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -1; + } + if (priv->configured) + ret |= mlx5_vdpa_lm_log(priv); + mlx5_vdpa_cqe_event_unset(priv); + ret |= mlx5_vdpa_steer_unset(priv); + mlx5_vdpa_virtqs_release(priv); + mlx5_vdpa_event_qp_global_release(priv); + mlx5_vdpa_mem_dereg(priv); + priv->configured = 0; + priv->vid = 0; + return ret; +} + +static int +mlx5_vdpa_dev_config(int vid) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -EINVAL; + } + if (priv->configured && mlx5_vdpa_dev_close(vid)) { + DRV_LOG(ERR, "Failed to reconfigure vid %d.", vid); + return -1; + } + priv->vid = vid; + if (mlx5_vdpa_mem_register(priv) || mlx5_vdpa_virtqs_prepare(priv) || + mlx5_vdpa_steer_setup(priv) || mlx5_vdpa_cqe_event_setup(priv)) { + mlx5_vdpa_dev_close(vid); + return -1; + } + priv->configured = 1; + return 0; +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .get_queue_num = mlx5_vdpa_get_queue_num, .get_features = mlx5_vdpa_get_vdpa_features, .get_protocol_features = mlx5_vdpa_get_protocol_features, - .dev_conf = NULL, - .dev_close = NULL, + .dev_conf = mlx5_vdpa_dev_config, + .dev_close = mlx5_vdpa_dev_close, .set_vring_state = mlx5_vdpa_set_vring_state, .set_features = mlx5_vdpa_features_set, .migration_done = NULL, @@ -321,12 +368,15 @@ break; } } - if (found) { + if (found) TAILQ_REMOVE(&priv_list, priv, next); + pthread_mutex_unlock(&priv_list_lock); + if (found) { + if (priv->configured) + mlx5_vdpa_dev_close(priv->vid); mlx5_glue->close_device(priv->ctx); rte_free(priv); } - pthread_mutex_unlock(&priv_list_lock); return 0; } diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 527436d..824e174 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -98,6 +98,7 @@ struct mlx5_vdpa_steer { struct mlx5_vdpa_priv { TAILQ_ENTRY(mlx5_vdpa_priv) next; + uint8_t configured; int id; /* vDPA device id. */ int vid; /* vhost device id. */ struct ibv_context *ctx; /* Device context. 
*/ From patchwork Sun Feb 2 16:03:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65475 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3D26CA04FA; Sun, 2 Feb 2020 17:06:34 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E4A9D1C18F; Sun, 2 Feb 2020 17:04:57 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id D588E1C18F for ; Sun, 2 Feb 2020 17:04:56 +0100 (CET) Received: from Internal Mail-Server by MTLPINE2 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 2 Feb 2020 18:04:51 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 012G3ujB032300; Sun, 2 Feb 2020 18:04:51 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Sun, 2 Feb 2020 16:03:53 +0000 Message-Id: <1580659433-25581-14-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580659433-25581-1-git-send-email-matan@mellanox.com> References: <1580292549-27439-1-git-send-email-matan@mellanox.com> <1580659433-25581-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v3 13/13] vdpa/mlx5: disable ROCE X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" In order to support virtio queue creation by the FW, ROCE mode must be disabled on the device. Do it via Netlink, which is equivalent to the following devlink tool commands: 1. devlink dev param set pci/[pci] name enable_roce value false cmode driverinit 2. devlink dev reload pci/[pci] Or via sysfs, which is equivalent to: echo 0 > /sys/bus/pci/devices/[pci]/roce_enable The IB device is matched again after ROCE is disabled. Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/Makefile | 2 +- drivers/vdpa/mlx5/meson.build | 2 +- drivers/vdpa/mlx5/mlx5_vdpa.c | 191 ++++++++++++++++++++++++++++++++++-------- 3 files changed, 160 insertions(+), 35 deletions(-) diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index d4a544c..7153217 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -29,7 +29,7 @@ CFLAGS += -D_XOPEN_SOURCE=600 CFLAGS += $(WERROR_FLAGS) CFLAGS += -Wno-strict-prototypes LDLIBS += -lrte_common_mlx5 -LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci -lrte_sched +LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_pci -lrte_bus_pci -lrte_sched # A few warnings cannot be avoided in external headers.
CFLAGS += -Wno-error=cast-qual diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index bb96dad..9c152e5 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -9,7 +9,7 @@ endif fmt_name = 'mlx5_vdpa' allow_experimental_apis = true -deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal', 'sched'] +deps += ['hash', 'common_mlx5', 'vhost', 'pci', 'bus_pci', 'eal', 'sched'] sources = files( 'mlx5_vdpa.c', 'mlx5_vdpa_mem.c', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 57619d2..710f305 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -1,15 +1,19 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright 2019 Mellanox Technologies, Ltd */ +#include + #include #include #include #include +#include #include #include #include #include +#include #include "mlx5_vdpa_utils.h" #include "mlx5_vdpa.h" @@ -228,6 +232,145 @@ .get_notify_area = NULL, }; +static struct ibv_device * +mlx5_vdpa_get_ib_device_match(struct rte_pci_addr *addr) +{ + int n; + struct ibv_device **ibv_list = mlx5_glue->get_device_list(&n); + struct ibv_device *ibv_match = NULL; + + if (!ibv_list) { + rte_errno = ENOSYS; + return NULL; + } + while (n-- > 0) { + struct rte_pci_addr pci_addr; + + DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[n]->name); + if (mlx5_dev_to_pci_addr(ibv_list[n]->ibdev_path, &pci_addr)) + continue; + if (memcmp(addr, &pci_addr, sizeof(pci_addr))) + continue; + ibv_match = ibv_list[n]; + break; + } + if (!ibv_match) + rte_errno = ENOENT; + mlx5_glue->free_device_list(ibv_list); + return ibv_match; +} + +/* Try to disable ROCE by Netlink/Devlink. */ +static int +mlx5_vdpa_nl_roce_disable(const char *addr) +{ + int nlsk_fd = mlx5_nl_init(NETLINK_GENERIC); + int devlink_id; + int enable; + int ret; + + if (nlsk_fd < 0) + return nlsk_fd; + devlink_id = mlx5_nl_devlink_family_id_get(nlsk_fd); + if (devlink_id < 0) { + ret = devlink_id; + DRV_LOG(DEBUG, "Failed to get devlink id for ROCE operations by" " Netlink."); + goto close; + } + ret = mlx5_nl_enable_roce_get(nlsk_fd, devlink_id, addr, &enable); + if (ret) { + DRV_LOG(DEBUG, "Failed to get ROCE enable by Netlink: %d.", ret); + goto close; + } else if (!enable) { + DRV_LOG(INFO, "ROCE is already disabled (Netlink)."); + goto close; + } + ret = mlx5_nl_enable_roce_set(nlsk_fd, devlink_id, addr, 0); + if (ret) + DRV_LOG(DEBUG, "Failed to disable ROCE by Netlink: %d.", ret); + else + DRV_LOG(INFO, "ROCE is disabled by Netlink successfully."); +close: + close(nlsk_fd); + return ret; +} + +/* Try to disable ROCE by sysfs.
*/ +static int +mlx5_vdpa_sys_roce_disable(const char *addr) +{ + FILE *file_o; + int enable; + int ret; + + MKSTR(file_p, "/sys/bus/pci/devices/%s/roce_enable", addr); + file_o = fopen(file_p, "rb"); + if (!file_o) { + rte_errno = ENOTSUP; + return -ENOTSUP; + } + ret = fscanf(file_o, "%d", &enable); + if (ret != 1) { + rte_errno = EINVAL; + ret = EINVAL; + goto close; + } else if (!enable) { + ret = 0; + DRV_LOG(INFO, "ROCE is already disabled (sysfs)."); + goto close; + } + fclose(file_o); + file_o = fopen(file_p, "wb"); + if (!file_o) { + rte_errno = ENOTSUP; + return -ENOTSUP; + } + fprintf(file_o, "0\n"); + ret = 0; +close: + if (ret) + DRV_LOG(DEBUG, "Failed to disable ROCE by sysfs: %d.", ret); + else + DRV_LOG(INFO, "ROCE is disabled by sysfs successfully."); + fclose(file_o); + return ret; +} + +#define MLX5_VDPA_MAX_RETRIES 20 +#define MLX5_VDPA_USEC 1000 +static int +mlx5_vdpa_roce_disable(struct rte_pci_addr *addr, struct ibv_device **ibv) +{ + char addr_name[64] = {0}; + + rte_pci_device_name(addr, addr_name, sizeof(addr_name)); + /* First try to disable ROCE by Netlink, then fall back to sysfs. */ + if (mlx5_vdpa_nl_roce_disable(addr_name) == 0 || + mlx5_vdpa_sys_roce_disable(addr_name) == 0) { + /* + * ROCE was disabled successfully; wait for the IB device to + * appear again after reload. + */ + int r; + struct ibv_device *ibv_new; + + for (r = MLX5_VDPA_MAX_RETRIES; r; r--) { + ibv_new = mlx5_vdpa_get_ib_device_match(addr); + if (ibv_new) { + *ibv = ibv_new; + return 0; + } + usleep(MLX5_VDPA_USEC); + } + DRV_LOG(ERR, "Cannot match device %s after ROCE disable, " "retries exceed %d", addr_name, MLX5_VDPA_MAX_RETRIES); + rte_errno = EAGAIN; + } + return -rte_errno; +} + /** * DPDK callback to register a PCI device. * @@ -246,8 +389,7 @@ mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_pci_device *pci_dev __rte_unused) { - struct ibv_device **ibv_list; - struct ibv_device *ibv_match = NULL; + struct ibv_device *ibv; struct mlx5_vdpa_priv *priv = NULL; struct ibv_context *ctx = NULL; struct mlx5_hca_attr attr; @@ -258,42 +400,25 @@ " driver."); return 1; } - errno = 0; - ibv_list = mlx5_glue->get_device_list(&ret); - if (!ibv_list) { - rte_errno = ENOSYS; - DRV_LOG(ERR, "Failed to get device list, is ib_uverbs loaded?"); + ibv = mlx5_vdpa_get_ib_device_match(&pci_dev->addr); + if (!ibv) { + DRV_LOG(ERR, "No matching IB device for PCI slot " + PCI_PRI_FMT ".", pci_dev->addr.domain, + pci_dev->addr.bus, pci_dev->addr.devid, + pci_dev->addr.function); return -rte_errno; - } - while (ret-- > 0) { - struct rte_pci_addr pci_addr; - - DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[ret]->name); - if (mlx5_dev_to_pci_addr(ibv_list[ret]->ibdev_path, &pci_addr)) - continue; - if (pci_dev->addr.domain != pci_addr.domain || - pci_dev->addr.bus != pci_addr.bus || - pci_dev->addr.devid != pci_addr.devid || - pci_dev->addr.function != pci_addr.function) - continue; + } else { DRV_LOG(INFO, "PCI information matches for device \"%s\".", - ibv_list[ret]->name); - ibv_match = ibv_list[ret]; - break; + ibv->name); } - mlx5_glue->free_device_list(ibv_list); - if (!ibv_match) { - DRV_LOG(ERR, "No matching IB device for PCI slot " - "%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 ".", - pci_dev->addr.domain, pci_dev->addr.bus, - pci_dev->addr.devid, pci_dev->addr.function); - rte_errno = ENOENT; - return -rte_errno; + if (mlx5_vdpa_roce_disable(&pci_dev->addr, &ibv) != 0) { + DRV_LOG(WARNING, "Failed to disable ROCE for \"%s\".", + ibv->name); + /* return -rte_errno; */ + } - ctx =
mlx5_glue->dv_open_device(ibv_match); + ctx = mlx5_glue->dv_open_device(ibv); if (!ctx) { - DRV_LOG(ERR, "Failed to open IB device \"%s\".", - ibv_match->name); + DRV_LOG(ERR, "Failed to open IB device \"%s\".", ibv->name); rte_errno = ENODEV; return -rte_errno; }
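To summarize the reworked probe path, here is a hedged, condensed sketch of the flow implemented above, using the patch's own helpers with error handling elided; it is a sketch, not the literal probe code:

/* Probe flow after this patch: match, disable ROCE, reopen via DevX. */
static struct ibv_context *
probe_flow_sketch(struct rte_pci_device *pci_dev)
{
	struct ibv_device *ibv;

	/* 1. Match the IB device by PCI address. */
	ibv = mlx5_vdpa_get_ib_device_match(&pci_dev->addr);
	if (!ibv)
		return NULL;
	/*
	 * 2. Disable ROCE, Netlink first with a sysfs fallback; the helper
	 *    also waits for the IB device to reappear after the reload.
	 */
	if (mlx5_vdpa_roce_disable(&pci_dev->addr, &ibv) != 0)
		return NULL;
	/* 3. Open a DevX-capable device context on the matched device. */
	return mlx5_glue->dv_open_device(ibv);
}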