From patchwork Wed Jan 29 10:08:57 2020
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Wed, 29 Jan 2020 10:08:57 +0000
Message-Id: <1580292549-27439-2-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v2 01/13] drivers: introduce mlx5 vDPA driver

Add a new driver to support vDPA operations by Mellanox devices. The first Mellanox devices to support vDPA operations are the ConnectX-6 Dx and BlueField-1 HCAs, on both their PF and VF ports. Like the mlx5 PMD, this driver depends on rdma-core; it also uses mlx5 DevX to create HW objects directly through the FW, hence the common/mlx5 library is linked into the mlx5_vdpa driver. Because of these dependencies, this driver is not compiled by default. Register a new log type for this driver.
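Probing is steered by a "class" device argument (e.g. -w <pci>,class=vdpa), checked before anything else in the probe callback below. A minimal sketch of such a check using only the rte_kvargs API; the real logic is mlx5_class_get() in common/mlx5, and the helper names here are illustrative:

#include <string.h>

#include <rte_common.h>
#include <rte_devargs.h>
#include <rte_kvargs.h>

/* Illustrative handler: set *opaque to 1 when "class" equals "vdpa". */
static int
class_check_handler(const char *key __rte_unused, const char *value,
		    void *opaque)
{
	if (strcmp(value, "vdpa") == 0)
		*(int *)opaque = 1;
	return 0;
}

/* Return 1 when the device arguments request the vDPA class. */
static int
devargs_request_vdpa(const struct rte_devargs *devargs)
{
	struct rte_kvargs *kvlist;
	int is_vdpa = 0;

	if (devargs == NULL)
		return 0; /* No devargs: keep the default (net) class. */
	kvlist = rte_kvargs_parse(devargs->args, NULL);
	if (kvlist == NULL)
		return 0;
	rte_kvargs_process(kvlist, "class", class_check_handler, &is_vdpa);
	rte_kvargs_free(kvlist);
	return is_vdpa;
}

A device without a "class" key is left to net/mlx5, which is why the probe below returns 1 ("not mine") rather than an error in that case.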
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Reviewed-by: Maxime Coquelin --- MAINTAINERS | 7 + config/common_base | 5 + doc/guides/rel_notes/release_20_02.rst | 5 + doc/guides/vdpadevs/features/mlx5.ini | 14 ++ doc/guides/vdpadevs/index.rst | 1 + doc/guides/vdpadevs/mlx5.rst | 111 ++++++++++++ drivers/common/Makefile | 2 +- drivers/common/mlx5/Makefile | 17 +- drivers/meson.build | 8 +- drivers/vdpa/Makefile | 2 + drivers/vdpa/meson.build | 3 +- drivers/vdpa/mlx5/Makefile | 36 ++++ drivers/vdpa/mlx5/meson.build | 29 +++ drivers/vdpa/mlx5/mlx5_vdpa.c | 227 ++++++++++++++++++++++++ drivers/vdpa/mlx5/mlx5_vdpa_utils.h | 20 +++ drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map | 3 + mk/rte.app.mk | 15 +- 17 files changed, 488 insertions(+), 17 deletions(-) create mode 100644 doc/guides/vdpadevs/features/mlx5.ini create mode 100644 doc/guides/vdpadevs/mlx5.rst create mode 100644 drivers/vdpa/mlx5/Makefile create mode 100644 drivers/vdpa/mlx5/meson.build create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa.c create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_utils.h create mode 100644 drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map diff --git a/MAINTAINERS b/MAINTAINERS index 150d507..f697e9a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1103,6 +1103,13 @@ F: drivers/vdpa/ifc/ F: doc/guides/vdpadevs/ifc.rst F: doc/guides/vdpadevs/features/ifcvf.ini +Mellanox mlx5 vDPA +M: Matan Azrad +M: Viacheslav Ovsiienko +F: drivers/vdpa/mlx5/ +F: doc/guides/vdpadevs/mlx5.rst +F: doc/guides/vdpadevs/features/mlx5.ini + Eventdev Drivers ---------------- diff --git a/config/common_base b/config/common_base index c897dd0..6ea9c63 100644 --- a/config/common_base +++ b/config/common_base @@ -366,6 +366,11 @@ CONFIG_RTE_LIBRTE_MLX4_DEBUG=n CONFIG_RTE_LIBRTE_MLX5_PMD=n CONFIG_RTE_LIBRTE_MLX5_DEBUG=n +# +# Compile vdpa-oriented Mellanox ConnectX-6 & Bluefield (MLX5) PMD +# +CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=n + # Linking method for mlx4/5 dependency on ibverbs and related libraries # Default linking is dynamic by linker. # Other options are: dynamic by dlopen at run-time, or statically embedded. diff --git a/doc/guides/rel_notes/release_20_02.rst b/doc/guides/rel_notes/release_20_02.rst index 50e2c14..690e7db 100644 --- a/doc/guides/rel_notes/release_20_02.rst +++ b/doc/guides/rel_notes/release_20_02.rst @@ -113,6 +113,11 @@ New Features * Added support for RSS using L3/L4 source/destination only. * Added support for matching on GTP tunnel header item. +* **Add new vDPA PMD based on Mellanox devices** + + Added a new Mellanox vDPA (``mlx5_vdpa``) PMD. + See the :doc:`../vdpadevs/mlx5` guide for more details on this driver. + * **Updated testpmd application.** Added support for ESP and L2TPv3 over IP rte_flow patterns to the testpmd diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini new file mode 100644 index 0000000..d635bdf --- /dev/null +++ b/doc/guides/vdpadevs/features/mlx5.ini @@ -0,0 +1,14 @@ +; +; Supported features of the 'mlx5' VDPA driver. +; +; Refer to default.ini for the full list of available driver features. +; +[Features] +Other kdrv = Y +ARMv8 = Y +Power8 = Y +x86-32 = Y +x86-64 = Y +Usage doc = Y +Design doc = Y + diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst index 9657108..1a13efe 100644 --- a/doc/guides/vdpadevs/index.rst +++ b/doc/guides/vdpadevs/index.rst @@ -13,3 +13,4 @@ which can be used from an application through vhost API. 
features_overview ifc + mlx5

diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst new file mode 100644 index 0000000..1861e71 --- /dev/null +++ b/doc/guides/vdpadevs/mlx5.rst @@ -0,0 +1,111 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright 2019 Mellanox Technologies, Ltd + +MLX5 vDPA driver +================ + +The MLX5 vDPA (vhost data path acceleration) driver library +(**librte_pmd_mlx5_vdpa**) provides support for **Mellanox ConnectX-6**, +**Mellanox ConnectX-6 Dx** and **Mellanox BlueField** families of +10/25/40/50/100/200 Gb/s adapters as well as their virtual functions (VF) in +SR-IOV context. + +.. note:: + + Due to external dependencies, this driver is disabled in the default + configuration of the "make" build. It can be enabled with + ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=y`` or by using the "meson" build system, + which detects the dependencies automatically. + + +Design +------ + +For security and robustness reasons, this driver only deals with virtual +memory addresses. The way resource allocations are handled by the kernel, +combined with hardware specifications that allow handling virtual memory +addresses directly, ensures that DPDK applications cannot access random +physical memory (or memory that does not belong to the current process). + +The PMD can use libibverbs and libmlx5 to access the device firmware +or directly the hardware components. +There are different levels of objects and bypassing abilities +to get the best performance: + +- Verbs is a complete high-level generic API +- Direct Verbs is a device-specific API +- DevX allows access to firmware objects +- Direct Rules manages flow steering at the low-level hardware layer + +Enabling librte_pmd_mlx5_vdpa causes DPDK applications to be linked against +libibverbs. + +A Mellanox mlx5 PCI device can be probed by either the net/mlx5 driver or the +vdpa/mlx5 driver, but not by both at the same time. Hence, the user should +select the driver with the ``class`` parameter in the device argument list. +By default, the mlx5 device will be probed by the net/mlx5 driver. + +Supported NICs +-------------- + +* Mellanox(R) ConnectX(R)-6 200G MCX654106A-HCAT (4x200G) +* Mellanox(R) ConnectX(R)-6 Dx EN 100G MCX623106AN-CDAT (2*100G) +* Mellanox(R) ConnectX(R)-6 Dx EN 200G MCX623105AN-VDAT (1*200G) +* Mellanox(R) BlueField SmartNIC 25G MBF1M332A-ASCAT (2*25G) + +Prerequisites +------------- + +- Mellanox OFED version: **4.7**; + see the :doc:`../../nics/mlx5` guide for more Mellanox OFED details. + +Compilation options +~~~~~~~~~~~~~~~~~~~ + +These options can be modified in the ``.config`` file. + +- ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` (default **n**) + + Toggle compilation of librte_pmd_mlx5_vdpa itself. + +- ``CONFIG_RTE_IBVERBS_LINK_DLOPEN`` (default **n**) + + Build PMD with additional code to make it loadable without hard + dependencies on **libibverbs** or **libmlx5**, which may not be installed + on the target system. + + In this mode, their presence is still required for it to run properly, + however their absence won't prevent a DPDK application from starting (with + ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as + missing with ``ldd(1)``. + + It works by moving these dependencies to a purpose-built rdma-core "glue" + plug-in which must either be installed in a directory whose name is based + on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if set, or in a + standard location for the dynamic linker (e.g. ``/lib``) if left to the + default empty string (``""``). + + This option has no performance impact.
+ +- ``CONFIG_RTE_IBVERBS_LINK_STATIC`` (default **n**) + + Embed the static flavor of the dependencies **libibverbs** and **libmlx5** + in the PMD shared library or the executable static binary. + +.. note:: + + For BlueField, the target should be set to ``arm64-bluefield-linux-gcc``. + This enables ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` and sets + ``RTE_CACHE_LINE_SIZE`` to 64. The default armv8a configuration of the make + and meson builds sets it to 128, which brings performance degradation. + +Run-time configuration +~~~~~~~~~~~~~~~~~~~~~~ + +- **ethtool** operations on related kernel interfaces also affect the PMD. + +- ``class`` parameter [string] + + Select the class of the driver that should probe the device: + ``vdpa`` for the mlx5 vDPA driver. +

diff --git a/drivers/common/Makefile b/drivers/common/Makefile index 4775d4b..96bd7ac 100644 --- a/drivers/common/Makefile +++ b/drivers/common/Makefile @@ -35,7 +35,7 @@ ifneq (,$(findstring y,$(IAVF-y))) DIRS-y += iavf endif -ifeq ($(CONFIG_RTE_LIBRTE_MLX5_PMD),y) +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y) DIRS-y += mlx5 endif

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile index 9d4d81f..c4b7999 100644 --- a/drivers/common/mlx5/Makefile +++ b/drivers/common/mlx5/Makefile @@ -10,15 +10,16 @@ LIB_GLUE_BASE = librte_pmd_mlx5_glue.so LIB_GLUE_VERSION = 20.02.0 # Sources. +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y) ifneq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y) -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue.c +SRCS-y += mlx5_glue.c endif -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_common.c -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c - +SRCS-y += mlx5_devx_cmds.c +SRCS-y += mlx5_common.c +SRCS-y += mlx5_nl.c ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y) -INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE) +INSTALL-y-lib += $(LIB_GLUE) +endif endif # Basic CFLAGS. @@ -317,7 +318,9 @@ mlx5_autoconf.h: mlx5_autoconf.h.new cmp '$<' '$@' $(AUTOCONF_OUTPUT) || \ mv '$<' '$@' -$(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y) +$(SRCS-y:.c=.o): mlx5_autoconf.h +endif # Generate dependency plug-in for rdma-core when the PMD must not be linked # directly, so that applications do not inherit this dependency.

diff --git a/drivers/meson.build b/drivers/meson.build index 29708cc..bd154fa 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -42,6 +42,7 @@ foreach class:dpdk_driver_classes build = true # set to false to disable, e.g.
missing deps reason = '' # set if build == false to explain name = drv + fmt_name = '' allow_experimental_apis = false sources = [] objs = [] @@ -98,8 +99,11 @@ foreach class:dpdk_driver_classes else class_drivers += name - dpdk_conf.set(config_flag_fmt.format(name.to_upper()),1) - lib_name = driver_name_fmt.format(name) + if fmt_name == '' + fmt_name = name + endif + dpdk_conf.set(config_flag_fmt.format(fmt_name.to_upper()),1) + lib_name = driver_name_fmt.format(fmt_name) if allow_experimental_apis cflags += '-DALLOW_EXPERIMENTAL_API' diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile index b5a7a11..6e88359 100644 --- a/drivers/vdpa/Makefile +++ b/drivers/vdpa/Makefile @@ -7,4 +7,6 @@ ifeq ($(CONFIG_RTE_EAL_VFIO),y) DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc endif +DIRS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5 + include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build index 2f047b5..e3ed54a 100644 --- a/drivers/vdpa/meson.build +++ b/drivers/vdpa/meson.build @@ -1,7 +1,8 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright 2019 Mellanox Technologies, Ltd -drivers = ['ifc'] +drivers = ['ifc', + 'mlx5',] std_deps = ['bus_pci', 'kvargs'] std_deps += ['vhost'] config_flag_fmt = 'RTE_LIBRTE_@0@_PMD' diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile new file mode 100644 index 0000000..c1c8cc0 --- /dev/null +++ b/drivers/vdpa/mlx5/Makefile @@ -0,0 +1,36 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2019 Mellanox Technologies, Ltd + +include $(RTE_SDK)/mk/rte.vars.mk + +# Library name. +LIB = librte_pmd_mlx5_vdpa.a + +# Sources. +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c + +# Basic CFLAGS. +CFLAGS += -O3 +CFLAGS += -std=c11 -Wall -Wextra +CFLAGS += -g +CFLAGS += -I$(RTE_SDK)/drivers/common/mlx5 +CFLAGS += -I$(RTE_SDK)/drivers/net/mlx5_vdpa +CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5 +CFLAGS += -D_BSD_SOURCE +CFLAGS += -D_DEFAULT_SOURCE +CFLAGS += -D_XOPEN_SOURCE=600 +CFLAGS += $(WERROR_FLAGS) +CFLAGS += -Wno-strict-prototypes +LDLIBS += -lrte_common_mlx5 +LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci + +# A few warnings cannot be avoided in external headers. 
+CFLAGS += -Wno-error=cast-qual + +EXPORT_MAP := rte_pmd_mlx5_vdpa_version.map +# memseg walk is not part of stable API +CFLAGS += -DALLOW_EXPERIMENTAL_API + +CFLAGS += -DNDEBUG -UPEDANTIC + +include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build new file mode 100644 index 0000000..4bca6ea --- /dev/null +++ b/drivers/vdpa/mlx5/meson.build @@ -0,0 +1,29 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2019 Mellanox Technologies, Ltd + +if not is_linux + build = false + reason = 'only supported on Linux' + subdir_done() +endif + +fmt_name = 'mlx5_vdpa' +allow_experimental_apis = true +deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal'] +sources = files( + 'mlx5_vdpa.c', +) +cflags_options = [ + '-std=c11', + '-Wno-strict-prototypes', + '-D_BSD_SOURCE', + '-D_DEFAULT_SOURCE', + '-D_XOPEN_SOURCE=600' +] +foreach option:cflags_options + if cc.has_argument(option) + cflags += option + endif +endforeach + +cflags += [ '-DNDEBUG', '-UPEDANTIC' ] diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c new file mode 100644 index 0000000..6286d7a --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -0,0 +1,227 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include +#include +#include +#include +#include + +#include +#include + +#include "mlx5_vdpa_utils.h" + + +struct mlx5_vdpa_priv { + TAILQ_ENTRY(mlx5_vdpa_priv) next; + int id; /* vDPA device id. */ + struct ibv_context *ctx; /* Device context. */ + struct rte_vdpa_dev_addr dev_addr; +}; + +TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list = + TAILQ_HEAD_INITIALIZER(priv_list); +static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER; +int mlx5_vdpa_logtype; + +static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { + .get_queue_num = NULL, + .get_features = NULL, + .get_protocol_features = NULL, + .dev_conf = NULL, + .dev_close = NULL, + .set_vring_state = NULL, + .set_features = NULL, + .migration_done = NULL, + .get_vfio_group_fd = NULL, + .get_vfio_device_fd = NULL, + .get_notify_area = NULL, +}; + +/** + * DPDK callback to register a PCI device. + * + * This function spawns vdpa device out of a given PCI device. + * + * @param[in] pci_drv + * PCI driver structure (mlx5_vpda_driver). + * @param[in] pci_dev + * PCI device information. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */ +static int +mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, + struct rte_pci_device *pci_dev __rte_unused) +{ + struct ibv_device **ibv_list; + struct ibv_device *ibv_match = NULL; + struct mlx5_vdpa_priv *priv = NULL; + struct ibv_context *ctx = NULL; + int ret; + + if (mlx5_class_get(pci_dev->device.devargs) != MLX5_CLASS_VDPA) { + DRV_LOG(DEBUG, "Skip probing - should be probed by other mlx5" + " driver."); + return 1; + } + errno = 0; + ibv_list = mlx5_glue->get_device_list(&ret); + if (!ibv_list) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to get device list, is ib_uverbs loaded?"); + return -ENOSYS; + } + while (ret-- > 0) { + struct rte_pci_addr pci_addr; + + DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[ret]->name); + if (mlx5_dev_to_pci_addr(ibv_list[ret]->ibdev_path, &pci_addr)) + continue; + if (pci_dev->addr.domain != pci_addr.domain || + pci_dev->addr.bus != pci_addr.bus || + pci_dev->addr.devid != pci_addr.devid || + pci_dev->addr.function != pci_addr.function) + continue; + DRV_LOG(INFO, "PCI information matches for device \"%s\".", + ibv_list[ret]->name); + ibv_match = ibv_list[ret]; + break; + } + mlx5_glue->free_device_list(ibv_list); + if (!ibv_match) { + DRV_LOG(ERR, "No matching IB device for PCI slot " + "%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 ".", + pci_dev->addr.domain, pci_dev->addr.bus, + pci_dev->addr.devid, pci_dev->addr.function); + rte_errno = ENOENT; + return -rte_errno; + } + ctx = mlx5_glue->dv_open_device(ibv_match); + if (!ctx) { + DRV_LOG(ERR, "Failed to open IB device \"%s\".", + ibv_match->name); + rte_errno = ENODEV; + return -rte_errno; + } + priv = rte_zmalloc("mlx5 vDPA device private", sizeof(*priv), + RTE_CACHE_LINE_SIZE); + if (!priv) { + DRV_LOG(ERR, "Failed to allocate private memory."); + rte_errno = ENOMEM; + goto error; + } + priv->ctx = ctx; + priv->dev_addr.pci_addr = pci_dev->addr; + priv->dev_addr.type = PCI_ADDR; + priv->id = rte_vdpa_register_device(&priv->dev_addr, &mlx5_vdpa_ops); + if (priv->id < 0) { + DRV_LOG(ERR, "Failed to register vDPA device."); + rte_errno = rte_errno ? rte_errno : EINVAL; + goto error; + } + pthread_mutex_lock(&priv_list_lock); + TAILQ_INSERT_TAIL(&priv_list, priv, next); + pthread_mutex_unlock(&priv_list_lock); + return 0; + +error: + if (priv) + rte_free(priv); + if (ctx) + mlx5_glue->close_device(ctx); + return -rte_errno; +} + +/** + * DPDK callback to remove a PCI device. + * + * This function removes all vDPA devices belong to a given PCI device. + * + * @param[in] pci_dev + * Pointer to the PCI device. + * + * @return + * 0 on success, the function cannot fail. 
+ */ +static int +mlx5_vdpa_pci_remove(struct rte_pci_device *pci_dev) +{ + struct mlx5_vdpa_priv *priv = NULL; + int found = 0; + + pthread_mutex_lock(&priv_list_lock); + TAILQ_FOREACH(priv, &priv_list, next) { + if (memcmp(&priv->dev_addr.pci_addr, &pci_dev->addr, + sizeof(pci_dev->addr)) == 0) { + found = 1; + break; + } + } + if (found) { + TAILQ_REMOVE(&priv_list, priv, next); + mlx5_glue->close_device(priv->ctx); + rte_free(priv); + } + pthread_mutex_unlock(&priv_list_lock); + return 0; +} + +static const struct rte_pci_id mlx5_vdpa_pci_id_map[] = { + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX5BF) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX5BFVF) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6VF) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6DX) + }, + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, + PCI_DEVICE_ID_MELLANOX_CONNECTX6DXVF) + }, + { + .vendor_id = 0 + } +}; + +static struct rte_pci_driver mlx5_vdpa_driver = { + .driver = { + .name = "mlx5_vdpa", + }, + .id_table = mlx5_vdpa_pci_id_map, + .probe = mlx5_vdpa_pci_probe, + .remove = mlx5_vdpa_pci_remove, + .drv_flags = 0, +}; + +/** + * Driver initialization routine. + */ +RTE_INIT(rte_mlx5_vdpa_init) +{ + /* Initialize common log type. */ + mlx5_vdpa_logtype = rte_log_register("pmd.vdpa.mlx5"); + if (mlx5_vdpa_logtype >= 0) + rte_log_set_level(mlx5_vdpa_logtype, RTE_LOG_NOTICE); + if (mlx5_glue) + rte_pci_register(&mlx5_vdpa_driver); +} + +RTE_PMD_EXPORT_NAME(net_mlx5_vdpa, __COUNTER__); +RTE_PMD_REGISTER_PCI_TABLE(net_mlx5_vdpa, mlx5_vdpa_pci_id_map); +RTE_PMD_REGISTER_KMOD_DEP(net_mlx5_vdpa, "* ib_uverbs & mlx5_core & mlx5_ib"); diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_utils.h b/drivers/vdpa/mlx5/mlx5_vdpa_utils.h new file mode 100644 index 0000000..a239df9 --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_utils.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ + +#ifndef RTE_PMD_MLX5_VDPA_UTILS_H_ +#define RTE_PMD_MLX5_VDPA_UTILS_H_ + +#include + + +extern int mlx5_vdpa_logtype; + +#define MLX5_VDPA_LOG_PREFIX "mlx5_vdpa" +/* Generic printf()-like logging macro with automatic line feed. */ +#define DRV_LOG(level, ...) 
\ + PMD_DRV_LOG_(level, mlx5_vdpa_logtype, MLX5_VDPA_LOG_PREFIX, \ + __VA_ARGS__ PMD_DRV_LOG_STRIP PMD_DRV_LOG_OPAREN, \ + PMD_DRV_LOG_CPAREN) + +#endif /* RTE_PMD_MLX5_VDPA_UTILS_H_ */

diff --git a/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map b/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map new file mode 100644 index 0000000..143836e --- /dev/null +++ b/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map @@ -0,0 +1,3 @@ +DPDK_20.02 { + local: *; +};

diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 45f4cad..b33cd8a 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -196,18 +196,21 @@ endif _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += -lrte_pmd_lio _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += -lrte_pmd_memif _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -lrte_pmd_mlx4 -_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -lrte_common_mlx5 +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y) +_LDLIBS-y += -lrte_common_mlx5 +endif _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -lrte_pmd_mlx5 +_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += -lrte_pmd_mlx5_vdpa ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y) -_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -ldl -_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -ldl +_LDLIBS-y += -ldl else ifeq ($(CONFIG_RTE_IBVERBS_LINK_STATIC),y) LIBS_IBVERBS_STATIC = $(shell $(RTE_SDK)/buildtools/options-ibverbs-static.sh) -_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += $(LIBS_IBVERBS_STATIC) -_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += $(LIBS_IBVERBS_STATIC) +_LDLIBS-y += $(LIBS_IBVERBS_STATIC) else +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y) +_LDLIBS-y += -libverbs -lmlx5 +endif _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -libverbs -lmlx4 -_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -libverbs -lmlx5 endif _LDLIBS-$(CONFIG_RTE_LIBRTE_MVPP2_PMD) += -lrte_pmd_mvpp2 _LDLIBS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += -lrte_pmd_mvneta

From patchwork Wed Jan 29 10:08:58 2020
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Wed, 29 Jan 2020 10:08:58 +0000
Message-Id: <1580292549-27439-3-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v2 02/13] vdpa/mlx5: support queues number operation
Support the get_queue_num operation to get the maximum number of queues supported by the device. This number comes from the DevX capabilities.

Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Reviewed-by: Maxime Coquelin --- drivers/vdpa/mlx5/mlx5_vdpa.c | 54 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 6286d7a..15e53f2 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -9,6 +9,7 @@ #include #include +#include #include "mlx5_vdpa_utils.h" @@ -18,6 +19,7 @@ struct mlx5_vdpa_priv { int id; /* vDPA device id. */ struct ibv_context *ctx; /* Device context. */ struct rte_vdpa_dev_addr dev_addr; + struct mlx5_hca_vdpa_attr caps; }; TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list = @@ -25,8 +27,43 @@ struct mlx5_vdpa_priv { static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER; int mlx5_vdpa_logtype; +static struct mlx5_vdpa_priv * +mlx5_vdpa_find_priv_resource_by_did(int did) +{ + struct mlx5_vdpa_priv *priv; + int found = 0; + + pthread_mutex_lock(&priv_list_lock); + TAILQ_FOREACH(priv, &priv_list, next) { + if (did == priv->id) { + found = 1; + break; + } + } + pthread_mutex_unlock(&priv_list_lock); + if (!found) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + rte_errno = EINVAL; + return NULL; + } + return priv; +} + +static int +mlx5_vdpa_get_queue_num(int did, uint32_t *queue_num) +{ + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -1; + } + *queue_num = priv->caps.max_num_virtio_queues; + return 0; +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { - .get_queue_num = NULL, + .get_queue_num = mlx5_vdpa_get_queue_num, .get_features = NULL, .get_protocol_features = NULL, .dev_conf = NULL, @@ -60,6 +97,7 @@ struct mlx5_vdpa_priv { struct ibv_device *ibv_match = NULL; struct mlx5_vdpa_priv *priv = NULL; struct ibv_context *ctx = NULL; + struct mlx5_hca_attr attr; int ret; if (mlx5_class_get(pci_dev->device.devargs) != MLX5_CLASS_VDPA) { @@ -113,6 +151,20 @@ struct mlx5_vdpa_priv { rte_errno = ENOMEM; goto error; } + ret = mlx5_devx_cmd_query_hca_attr(ctx, &attr); + if (ret) { + DRV_LOG(ERR, "Unable to read HCA capabilities."); + rte_errno = ENOTSUP; + goto error; + } else { + if (!attr.vdpa.valid || !attr.vdpa.max_num_virtio_queues) { + DRV_LOG(ERR, "Not enough capabilities to support vdpa," + " maybe old FW/OFED version?"); + rte_errno = ENOTSUP; + goto error; + } + priv->caps = attr.vdpa; + } priv->ctx = ctx; priv->dev_addr.pci_addr = pci_dev->addr; priv->dev_addr.type = PCI_ADDR;
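For context, a vhost application reaches this operation through the generic vDPA layer. A rough sketch, assuming the DPDK 20.02 rte_vdpa API (rte_vdpa_get_device() and the ops table); illustrative only, not part of the patch:

#include <stdint.h>

#include <rte_vdpa.h>

/* Illustrative: query the queue limit of a registered vDPA device by
 * device id, going through the same ops table this patch fills in.
 */
static int
query_max_queue_num(int did, uint32_t *queue_num)
{
	struct rte_vdpa_device *vdev = rte_vdpa_get_device(did);

	if (vdev == NULL || vdev->ops == NULL ||
	    vdev->ops->get_queue_num == NULL)
		return -1;
	return vdev->ops->get_queue_num(did, queue_num);
}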
From patchwork Wed Jan 29 10:08:59 2020
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Wed, 29 Jan 2020 10:08:59 +0000
Message-Id: <1580292549-27439-4-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v2 03/13] vdpa/mlx5: support features get operations

Add support for the get_features and get_protocol_features operations. Some of the features are reported according to the DevX capabilities.

Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Reviewed-by: Maxime Coquelin --- doc/guides/vdpadevs/features/mlx5.ini | 7 ++++ drivers/vdpa/mlx5/mlx5_vdpa.c | 66 +++++++++++++++++++++++++++++++++-- 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini index d635bdf..fea491d 100644 --- a/doc/guides/vdpadevs/features/mlx5.ini +++ b/doc/guides/vdpadevs/features/mlx5.ini @@ -4,6 +4,13 @@ ; Refer to default.ini for the full list of available driver features.
; [Features] + +any layout = Y +guest announce = Y +mq = Y +proto mq = Y +proto log shmfd = Y +proto host notifier = Y Other kdrv = Y ARMv8 = Y Power8 = Y

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 15e53f2..67e90fd 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -1,6 +1,8 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright 2019 Mellanox Technologies, Ltd */ +#include + #include #include #include @@ -10,6 +12,7 @@ #include #include #include +#include #include "mlx5_vdpa_utils.h" @@ -22,6 +25,27 @@ struct mlx5_vdpa_priv { struct mlx5_hca_vdpa_attr caps; }; +#ifndef VIRTIO_F_ORDER_PLATFORM +#define VIRTIO_F_ORDER_PLATFORM 36 +#endif + +#ifndef VIRTIO_F_RING_PACKED +#define VIRTIO_F_RING_PACKED 34 +#endif + +#define MLX5_VDPA_DEFAULT_FEATURES ((1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \ + (1ULL << VIRTIO_F_ANY_LAYOUT) | \ + (1ULL << VIRTIO_NET_F_MQ) | \ + (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \ + (1ULL << VIRTIO_F_ORDER_PLATFORM)) + +#define MLX5_VDPA_PROTOCOL_FEATURES \ + ((1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \ + (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \ + (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \ + (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) | \ + (1ULL << VHOST_USER_PROTOCOL_F_MQ)) + TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list = TAILQ_HEAD_INITIALIZER(priv_list); static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER; @@ -62,10 +86,48 @@ struct mlx5_vdpa_priv { return 0; } +static int +mlx5_vdpa_get_vdpa_features(int did, uint64_t *features) +{ + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -1; + } + *features = MLX5_VDPA_DEFAULT_FEATURES; + if (priv->caps.virtio_queue_type & (1 << MLX5_VIRTQ_TYPE_PACKED)) + *features |= (1ULL << VIRTIO_F_RING_PACKED); + if (priv->caps.tso_ipv4) + *features |= (1ULL << VIRTIO_NET_F_HOST_TSO4); + if (priv->caps.tso_ipv6) + *features |= (1ULL << VIRTIO_NET_F_HOST_TSO6); + if (priv->caps.tx_csum) + *features |= (1ULL << VIRTIO_NET_F_CSUM); + if (priv->caps.rx_csum) + *features |= (1ULL << VIRTIO_NET_F_GUEST_CSUM); + if (priv->caps.virtio_version_1_0) + *features |= (1ULL << VIRTIO_F_VERSION_1); + return 0; +} + +static int +mlx5_vdpa_get_protocol_features(int did, uint64_t *features) +{ + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -1; + } + *features = MLX5_VDPA_PROTOCOL_FEATURES; + return 0; +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .get_queue_num = mlx5_vdpa_get_queue_num, - .get_features = NULL, - .get_protocol_features = NULL, + .get_features = mlx5_vdpa_get_vdpa_features, + .get_protocol_features = mlx5_vdpa_get_protocol_features, .dev_conf = NULL,
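The feature words handed back by these operations are plain virtio bit masks; decoding them on the application side is simple bit testing. A small, self-contained illustration (the fallback defines mirror the ones in the patch above):

#include <stdint.h>
#include <stdio.h>

#ifndef VIRTIO_F_RING_PACKED
#define VIRTIO_F_RING_PACKED 34 /* same fallback value as the patch */
#endif
#ifndef VIRTIO_F_ORDER_PLATFORM
#define VIRTIO_F_ORDER_PLATFORM 36 /* same fallback value as the patch */
#endif

int
main(void)
{
	/* Example mask as mlx5_vdpa_get_vdpa_features() could return it. */
	uint64_t features = (1ULL << VIRTIO_F_RING_PACKED) |
			    (1ULL << VIRTIO_F_ORDER_PLATFORM);

	printf("packed ring: %s\n",
	       (features & (1ULL << VIRTIO_F_RING_PACKED)) ? "yes" : "no");
	printf("platform ordering: %s\n",
	       (features & (1ULL << VIRTIO_F_ORDER_PLATFORM)) ? "yes" : "no");
	return 0;
}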
From patchwork Wed Jan 29 10:09:00 2020
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Wed, 29 Jan 2020 10:09:00 +0000
Message-Id: <1580292549-27439-5-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v2 04/13] vdpa/mlx5: prepare memory regions

Memory regions are created in order to map the guest physical addresses used by the virtio device on the guest side to the host physical addresses used by the HW on the host side. In this way, for example, the HW can translate the addresses of the packets posted by the guest and take the packets from the correct place. The design is to work with a single MR which is configured to the virtio queues in the HW, hence a lot of direct MRs are grouped into a single indirect MR. Create functions to prepare and release MRs with all the related resources that they require. Create a new file, mlx5_vdpa_mem.c, to manage all the MR related code in the driver.

Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/Makefile | 4 +- drivers/vdpa/mlx5/meson.build | 3 +- drivers/vdpa/mlx5/mlx5_vdpa.c | 11 +- drivers/vdpa/mlx5/mlx5_vdpa.h | 60 +++++++ drivers/vdpa/mlx5/mlx5_vdpa_mem.c | 346 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 413 insertions(+), 11 deletions(-) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa.h create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_mem.c

diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index c1c8cc0..5472797 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -8,6 +8,7 @@ LIB = librte_pmd_mlx5_vdpa.a # Sources. SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c # Basic CFLAGS. CFLAGS += -O3 @@ -15,6 +16,7 @@ CFLAGS += -std=c11 -Wall -Wextra CFLAGS += -g CFLAGS += -I$(RTE_SDK)/drivers/common/mlx5 CFLAGS += -I$(RTE_SDK)/drivers/net/mlx5_vdpa +CFLAGS += -I$(RTE_SDK)/lib/librte_sched CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5 CFLAGS += -D_BSD_SOURCE CFLAGS += -D_DEFAULT_SOURCE @@ -22,7 +24,7 @@ CFLAGS += -D_XOPEN_SOURCE=600 CFLAGS += $(WERROR_FLAGS) CFLAGS += -Wno-strict-prototypes LDLIBS += -lrte_common_mlx5 -LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci +LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci -lrte_sched # A few warnings cannot be avoided in external headers.
CFLAGS += -Wno-error=cast-qual diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 4bca6ea..7e5dd95 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -9,9 +9,10 @@ endif fmt_name = 'mlx5_vdpa' allow_experimental_apis = true -deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal'] +deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal', 'sched'] sources = files( 'mlx5_vdpa.c', + 'mlx5_vdpa_mem.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 67e90fd..c67f93d 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -7,7 +7,6 @@ #include #include #include -#include #include #include @@ -15,16 +14,9 @@ #include #include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" -struct mlx5_vdpa_priv { - TAILQ_ENTRY(mlx5_vdpa_priv) next; - int id; /* vDPA device id. */ - struct ibv_context *ctx; /* Device context. */ - struct rte_vdpa_dev_addr dev_addr; - struct mlx5_hca_vdpa_attr caps; -}; - #ifndef VIRTIO_F_ORDER_PLATFORM #define VIRTIO_F_ORDER_PLATFORM 36 #endif @@ -236,6 +228,7 @@ struct mlx5_vdpa_priv { rte_errno = rte_errno ? rte_errno : EINVAL; goto error; } + SLIST_INIT(&priv->mr_list); pthread_mutex_lock(&priv_list_lock); TAILQ_INSERT_TAIL(&priv_list, priv, next); pthread_mutex_unlock(&priv_list_lock); diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h new file mode 100644 index 0000000..e27baea --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -0,0 +1,60 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ + +#ifndef RTE_PMD_MLX5_VDPA_H_ +#define RTE_PMD_MLX5_VDPA_H_ + +#include + +#include +#include + +#include +#include + +struct mlx5_vdpa_query_mr { + SLIST_ENTRY(mlx5_vdpa_query_mr) next; + void *addr; + uint64_t length; + struct mlx5dv_devx_umem *umem; + struct mlx5_devx_obj *mkey; + int is_indirect; +}; + +struct mlx5_vdpa_priv { + TAILQ_ENTRY(mlx5_vdpa_priv) next; + int id; /* vDPA device id. */ + int vid; /* vhost device id. */ + struct ibv_context *ctx; /* Device context. */ + struct rte_vdpa_dev_addr dev_addr; + struct mlx5_hca_vdpa_attr caps; + uint32_t pdn; /* Protection Domain number. */ + struct ibv_pd *pd; + uint32_t gpa_mkey_index; + struct ibv_mr *null_mr; + struct rte_vhost_memory *vmem; + SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; +}; + +/** + * Release all the prepared memory regions and all their related resources. + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_mem_dereg(struct mlx5_vdpa_priv *priv); + +/** + * Register all the memory regions of the virtio device to the HW and allocate + * all their related resources. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */ +int mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv); + +#endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_mem.c b/drivers/vdpa/mlx5/mlx5_vdpa_mem.c new file mode 100644 index 0000000..398ca35 --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_mem.c @@ -0,0 +1,346 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include + +#include +#include +#include +#include + +#include +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + +static int +mlx5_vdpa_pd_prepare(struct mlx5_vdpa_priv *priv) +{ +#ifdef HAVE_IBV_FLOW_DV_SUPPORT + if (priv->pd) + return 0; + priv->pd = mlx5_glue->alloc_pd(priv->ctx); + if (priv->pd == NULL) { + DRV_LOG(ERR, "Failed to allocate PD."); + return errno ? -errno : -ENOMEM; + } + struct mlx5dv_obj obj; + struct mlx5dv_pd pd_info; + int ret = 0; + + obj.pd.in = priv->pd; + obj.pd.out = &pd_info; + ret = mlx5_glue->dv_init_obj(&obj, MLX5DV_OBJ_PD); + if (ret) { + DRV_LOG(ERR, "Fail to get PD object info."); + mlx5_glue->dealloc_pd(priv->pd); + priv->pd = NULL; + return -errno; + } + priv->pdn = pd_info.pdn; + return 0; +#else + (void)priv; + DRV_LOG(ERR, "Cannot get pdn - no DV support."); + return -ENOTSUP; +#endif /* HAVE_IBV_FLOW_DV_SUPPORT */ +} + +void +mlx5_vdpa_mem_dereg(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_vdpa_query_mr *entry; + struct mlx5_vdpa_query_mr *next; + + entry = SLIST_FIRST(&priv->mr_list); + while (entry) { + next = SLIST_NEXT(entry, next); + claim_zero(mlx5_devx_cmd_destroy(entry->mkey)); + if (!entry->is_indirect) + claim_zero(mlx5_glue->devx_umem_dereg(entry->umem)); + SLIST_REMOVE(&priv->mr_list, entry, mlx5_vdpa_query_mr, next); + rte_free(entry); + entry = next; + } + SLIST_INIT(&priv->mr_list); + if (priv->null_mr) { + claim_zero(mlx5_glue->dereg_mr(priv->null_mr)); + priv->null_mr = NULL; + } + if (priv->pd) { + claim_zero(mlx5_glue->dealloc_pd(priv->pd)); + priv->pd = NULL; + } + if (priv->vmem) { + free(priv->vmem); + priv->vmem = NULL; + } +} + +static int +mlx5_vdpa_regions_addr_cmp(const void *a, const void *b) +{ + const struct rte_vhost_mem_region *region_a = a; + const struct rte_vhost_mem_region *region_b = b; + + if (region_a->guest_phys_addr < region_b->guest_phys_addr) + return -1; + if (region_a->guest_phys_addr > region_b->guest_phys_addr) + return 1; + return 0; +} + +#define KLM_NUM_MAX_ALIGN(sz) (RTE_ALIGN_CEIL(sz, MLX5_MAX_KLM_BYTE_COUNT) / \ + MLX5_MAX_KLM_BYTE_COUNT) + +/* + * Allocate and sort the region list and choose indirect mkey mode: + * 1. Calculate GCD, guest memory size and indirect mkey entries num per mode. + * 2. Align GCD to the maximum allowed size(2G) and to be power of 2. + * 2. Decide the indirect mkey mode according to the next rules: + * a. If both KLM_FBS entries number and KLM entries number are bigger + * than the maximum allowed(MLX5_DEVX_MAX_KLM_ENTRIES) - error. + * b. KLM mode if KLM_FBS entries number is bigger than the maximum + * allowed(MLX5_DEVX_MAX_KLM_ENTRIES). + * c. KLM mode if GCD is smaller than the minimum allowed(4K). + * d. KLM mode if the total size of KLM entries is in one cache line + * and the total size of KLM_FBS entries is not in one cache line. + * e. Otherwise, KLM_FBS mode. 
+ */ +static struct rte_vhost_memory * +mlx5_vdpa_vhost_mem_regions_prepare(int vid, uint8_t *mode, uint64_t *mem_size, + uint64_t *gcd, uint32_t *entries_num) +{ + struct rte_vhost_memory *mem; + uint64_t size; + uint64_t klm_entries_num = 0; + uint64_t klm_fbs_entries_num; + uint32_t i; + int ret = rte_vhost_get_mem_table(vid, &mem); + + if (ret < 0) { + DRV_LOG(ERR, "Failed to get VM memory layout vid =%d.", vid); + rte_errno = EINVAL; + return NULL; + } + qsort(mem->regions, mem->nregions, sizeof(mem->regions[0]), + mlx5_vdpa_regions_addr_cmp); + *mem_size = (mem->regions[(mem->nregions - 1)].guest_phys_addr) + + (mem->regions[(mem->nregions - 1)].size) - + (mem->regions[0].guest_phys_addr); + *gcd = 0; + for (i = 0; i < mem->nregions; ++i) { + DRV_LOG(INFO, "Region %u: HVA 0x%" PRIx64 ", GPA 0x%" PRIx64 + ", size 0x%" PRIx64 ".", i, + mem->regions[i].host_user_addr, + mem->regions[i].guest_phys_addr, mem->regions[i].size); + if (i > 0) { + /* Hole handle. */ + size = mem->regions[i].guest_phys_addr - + (mem->regions[i - 1].guest_phys_addr + + mem->regions[i - 1].size); + *gcd = rte_get_gcd(*gcd, size); + klm_entries_num += KLM_NUM_MAX_ALIGN(size); + } + size = mem->regions[i].size; + *gcd = rte_get_gcd(*gcd, size); + klm_entries_num += KLM_NUM_MAX_ALIGN(size); + } + if (*gcd > MLX5_MAX_KLM_BYTE_COUNT) + *gcd = rte_get_gcd(*gcd, MLX5_MAX_KLM_BYTE_COUNT); + if (!RTE_IS_POWER_OF_2(*gcd)) { + uint64_t candidate_gcd = rte_align64prevpow2(*gcd); + + while (candidate_gcd > 1 && (*gcd % candidate_gcd)) + candidate_gcd /= 2; + DRV_LOG(DEBUG, "GCD 0x%" PRIx64 " is not power of 2. Adjusted " + "GCD is 0x%" PRIx64 ".", *gcd, candidate_gcd); + *gcd = candidate_gcd; + } + klm_fbs_entries_num = *mem_size / *gcd; + if (*gcd < MLX5_MIN_KLM_FIXED_BUFFER_SIZE || klm_fbs_entries_num > + MLX5_DEVX_MAX_KLM_ENTRIES || + ((klm_entries_num * sizeof(struct mlx5_klm)) <= + RTE_CACHE_LINE_SIZE && (klm_fbs_entries_num * + sizeof(struct mlx5_klm)) > + RTE_CACHE_LINE_SIZE)) { + *mode = MLX5_MKC_ACCESS_MODE_KLM; + *entries_num = klm_entries_num; + DRV_LOG(INFO, "Indirect mkey mode is KLM."); + } else { + *mode = MLX5_MKC_ACCESS_MODE_KLM_FBS; + *entries_num = klm_fbs_entries_num; + DRV_LOG(INFO, "Indirect mkey mode is KLM Fixed Buffer Size."); + } + DRV_LOG(DEBUG, "Memory registration information: nregions = %u, " + "mem_size = 0x%" PRIx64 ", GCD = 0x%" PRIx64 + ", klm_fbs_entries_num = 0x%" PRIx64 ", klm_entries_num = 0x%" + PRIx64 ".", mem->nregions, *mem_size, *gcd, klm_fbs_entries_num, + klm_entries_num); + if (*entries_num > MLX5_DEVX_MAX_KLM_ENTRIES) { + DRV_LOG(ERR, "Failed to prepare memory of vid %d - memory is " + "too fragmented.", vid); + free(mem); + return NULL; + } + return mem; +} + +#define KLM_SIZE_MAX_ALIGN(sz) ((sz) > MLX5_MAX_KLM_BYTE_COUNT ? \ + MLX5_MAX_KLM_BYTE_COUNT : (sz)) + +/* + * The target here is to group all the physical memory regions of the + * virtio device in one indirect mkey. + * For KLM Fixed Buffer Size mode (HW find the translation entry in one + * read according to the guest phisical address): + * All the sub-direct mkeys of it must be in the same size, hence, each + * one of them should be in the GCD size of all the virtio memory + * regions and the holes between them. + * For KLM mode (each entry may be in different size so HW must iterate + * the entries): + * Each virtio memory region and each hole between them have one entry, + * just need to cover the maximum allowed size(2G) by splitting entries + * which their associated memory regions are bigger than 2G. 
+ * It means that each virtio memory region may be mapped to more than + * one direct mkey in the 2 modes. + * All the holes of invalid memory between the virtio memory regions + * will be mapped to the null memory region for security. + */ +int +mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_devx_mkey_attr mkey_attr; + struct mlx5_vdpa_query_mr *entry = NULL; + struct rte_vhost_mem_region *reg = NULL; + uint8_t mode; + uint32_t entries_num = 0; + uint32_t i; + uint64_t gcd; + uint64_t klm_size; + uint64_t mem_size; + uint64_t k; + int klm_index = 0; + int ret; + struct rte_vhost_memory *mem = mlx5_vdpa_vhost_mem_regions_prepare + (priv->vid, &mode, &mem_size, &gcd, &entries_num); + struct mlx5_klm klm_array[entries_num]; + + if (!mem) + return -rte_errno; + priv->vmem = mem; + ret = mlx5_vdpa_pd_prepare(priv); + if (ret) + goto error; + priv->null_mr = mlx5_glue->alloc_null_mr(priv->pd); + if (!priv->null_mr) { + DRV_LOG(ERR, "Failed to allocate null MR."); + ret = -errno; + goto error; + } + DRV_LOG(DEBUG, "Dump fill Mkey = %u.", priv->null_mr->lkey); + for (i = 0; i < mem->nregions; i++) { + reg = &mem->regions[i]; + entry = rte_zmalloc(__func__, sizeof(*entry), 0); + if (!entry) { + ret = -ENOMEM; + DRV_LOG(ERR, "Failed to allocate mem entry memory."); + goto error; + } + entry->umem = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)reg->host_user_addr, + reg->size, IBV_ACCESS_LOCAL_WRITE); + if (!entry->umem) { + DRV_LOG(ERR, "Failed to register Umem by Devx."); + ret = -errno; + goto error; + } + mkey_attr.addr = (uintptr_t)(reg->guest_phys_addr); + mkey_attr.size = reg->size; + mkey_attr.umem_id = entry->umem->umem_id; + mkey_attr.pd = priv->pdn; + mkey_attr.pg_access = 1; + mkey_attr.klm_array = NULL; + mkey_attr.klm_num = 0; + entry->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + if (!entry->mkey) { + DRV_LOG(ERR, "Failed to create direct Mkey."); + ret = -rte_errno; + goto error; + } + entry->addr = (void *)(uintptr_t)(reg->host_user_addr); + entry->length = reg->size; + entry->is_indirect = 0; + if (i > 0) { + uint64_t sadd; + uint64_t empty_region_sz = reg->guest_phys_addr - + (mem->regions[i - 1].guest_phys_addr + + mem->regions[i - 1].size); + + if (empty_region_sz > 0) { + sadd = mem->regions[i - 1].guest_phys_addr + + mem->regions[i - 1].size; + klm_size = mode == MLX5_MKC_ACCESS_MODE_KLM ? + KLM_SIZE_MAX_ALIGN(empty_region_sz) : gcd; + for (k = 0; k < empty_region_sz; + k += klm_size) { + klm_array[klm_index].byte_count = + k + klm_size > empty_region_sz ? + empty_region_sz - k : klm_size; + klm_array[klm_index].mkey = + priv->null_mr->lkey; + klm_array[klm_index].address = sadd + k; + klm_index++; + } + } + } + klm_size = mode == MLX5_MKC_ACCESS_MODE_KLM ? + KLM_SIZE_MAX_ALIGN(reg->size) : gcd; + for (k = 0; k < reg->size; k += klm_size) { + klm_array[klm_index].byte_count = k + klm_size > + reg->size ? reg->size - k : klm_size; + klm_array[klm_index].mkey = entry->mkey->id; + klm_array[klm_index].address = reg->guest_phys_addr + k; + klm_index++; + } + SLIST_INSERT_HEAD(&priv->mr_list, entry, next); + } + mkey_attr.addr = (uintptr_t)(mem->regions[0].guest_phys_addr); + mkey_attr.size = mem_size; + mkey_attr.pd = priv->pdn; + mkey_attr.umem_id = 0; + /* Must be zero for KLM mode. */ + mkey_attr.log_entity_size = mode == MLX5_MKC_ACCESS_MODE_KLM_FBS ? 
+ rte_log2_u64(gcd) : 0; + mkey_attr.pg_access = 0; + mkey_attr.klm_array = klm_array; + mkey_attr.klm_num = klm_index; + entry = rte_zmalloc(__func__, sizeof(*entry), 0); + if (!entry) { + DRV_LOG(ERR, "Failed to allocate memory for indirect entry."); + ret = -ENOMEM; + goto error; + } + entry->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + if (!entry->mkey) { + DRV_LOG(ERR, "Failed to create indirect Mkey."); + ret = -rte_errno; + goto error; + } + entry->is_indirect = 1; + SLIST_INSERT_HEAD(&priv->mr_list, entry, next); + priv->gpa_mkey_index = entry->mkey->id; + return 0; +error: + if (entry) { + if (entry->mkey) + mlx5_devx_cmd_destroy(entry->mkey); + if (entry->umem) + mlx5_glue->devx_umem_dereg(entry->umem); + rte_free(entry); + } + mlx5_vdpa_mem_dereg(priv); + rte_errno = -ret; + return ret; +}

From patchwork Wed Jan 29 10:09:01 2020
From: Matan Azrad
To: dev@dpdk.org, Viacheslav Ovsiienko
Cc: Maxime Coquelin
Date: Wed, 29 Jan 2020 10:09:01 +0000
Message-Id: <1580292549-27439-6-git-send-email-matan@mellanox.com>
Subject: [dpdk-dev] [PATCH v2 05/13] vdpa/mlx5: prepare HW queues

In preparation for the virtio queue creation, two QPs and a CQ may be created for each virtio queue. The design is to trigger an event for the guest and for the vDPA driver when the HW posts a new CQE after a packet transmission. This patch adds the basic operations to create and destroy the above HW objects and to trigger the CQE events when a new CQE is posted.
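The CQE event machinery below hinges on the CQ arm doorbell. A standalone illustration of the doorbell arithmetic used by mlx5_vdpa_cq_arm() further down; the offset value of 28 is an assumption mirroring the mlx5 PRM convention, and the cq_ci masking is simplified relative to the real code:

#include <stdint.h>
#include <stdio.h>

#define CQ_SQN_OFFSET 28          /* assumed to match MLX5_CQ_SQN_OFFSET */
#define CQ_DBR_CMD_ALL (0u << 24) /* mirrors MLX5_CQ_DBR_CMD_ALL above */

/* Compose the 64-bit arm doorbell: the high word carries the arm sequence
 * number, the command and the masked consumer index; the low word carries
 * the CQ number.
 */
static uint64_t
cq_arm_doorbell(uint32_t arm_sn, uint32_t cq_ci, uint32_t cqn,
		uint16_t log_desc_n)
{
	const uint32_t cqe_mask = (1u << log_desc_n) - 1;
	uint32_t doorbell_hi = (arm_sn << CQ_SQN_OFFSET) | CQ_DBR_CMD_ALL |
			       (cq_ci & cqe_mask);

	return ((uint64_t)doorbell_hi << 32) | cqn;
}

int
main(void)
{
	/* First arming of a 16-entry CQ (log_desc_n = 4) with number 0x20. */
	printf("doorbell = 0x%016llx\n",
	       (unsigned long long)cq_arm_doorbell(0, 0, 0x20, 4));
	return 0;
}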
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko --- drivers/common/mlx5/mlx5_prm.h | 4 + drivers/vdpa/mlx5/Makefile | 1 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.h | 89 ++++++++ drivers/vdpa/mlx5/mlx5_vdpa_event.c | 399 ++++++++++++++++++++++++++++++++++++ 5 files changed, 494 insertions(+) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_event.c diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index b48cd0a..b533798 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -392,6 +392,10 @@ struct mlx5_cqe { /* CQE format value. */ #define MLX5_COMPRESSED 0x3 +/* CQ doorbell cmd types. */ +#define MLX5_CQ_DBR_CMD_SOL_ONLY (1 << 24) +#define MLX5_CQ_DBR_CMD_ALL (0 << 24) + /* Action type of header modification. */ enum { MLX5_MODIFICATION_TYPE_SET = 0x1, diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 5472797..7f13756 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -9,6 +9,7 @@ LIB = librte_pmd_mlx5_vdpa.a # Sources. SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c # Basic CFLAGS. CFLAGS += -O3 diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 7e5dd95..c609f7c 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -13,6 +13,7 @@ deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal', 'sched'] sources = files( 'mlx5_vdpa.c', 'mlx5_vdpa_mem.c', + 'mlx5_vdpa_event.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index e27baea..30030b7 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -9,9 +9,40 @@ #include #include +#include +#include #include #include +#include + + +#define MLX5_VDPA_INTR_RETRIES 256 +#define MLX5_VDPA_INTR_RETRIES_USEC 1000 + +struct mlx5_vdpa_cq { + uint16_t log_desc_n; + uint32_t cq_ci:24; + uint32_t arm_sn:2; + rte_spinlock_t sl; + struct mlx5_devx_obj *cq; + struct mlx5dv_devx_umem *umem_obj; + union { + volatile void *umem_buf; + volatile struct mlx5_cqe *cqes; + }; + volatile uint32_t *db_rec; + uint64_t errors; +}; + +struct mlx5_vdpa_event_qp { + struct mlx5_vdpa_cq cq; + struct mlx5_devx_obj *fw_qp; + struct mlx5_devx_obj *sw_qp; + struct mlx5dv_devx_umem *umem_obj; + void *umem_buf; + volatile uint32_t *db_rec; +}; struct mlx5_vdpa_query_mr { SLIST_ENTRY(mlx5_vdpa_query_mr) next; @@ -34,6 +65,10 @@ struct mlx5_vdpa_priv { uint32_t gpa_mkey_index; struct ibv_mr *null_mr; struct rte_vhost_memory *vmem; + uint32_t eqn; + struct mlx5dv_devx_event_channel *eventc; + struct mlx5dv_devx_uar *uar; + struct rte_intr_handle intr_handle; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; @@ -57,4 +92,58 @@ struct mlx5_vdpa_priv { */ int mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv); + +/** + * Create an event QP and all its related resources. + * + * @param[in] priv + * The vdpa driver private structure. + * @param[in] desc_n + * Number of descriptors. + * @param[in] callfd + * The guest notification file descriptor. + * @param[in/out] eqp + * Pointer to the event QP structure. + * + * @return + * 0 on success, -1 otherwise and rte_errno is set. + */ +int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, + int callfd, struct mlx5_vdpa_event_qp *eqp); + +/** + * Destroy an event QP and all its related resources. 
+ * + * @param[in/out] eqp + * Pointer to the event QP structure. + */ +void mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp); + +/** + * Release all the event global resources. + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_event_qp_global_release(struct mlx5_vdpa_priv *priv); + +/** + * Setup CQE event. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +int mlx5_vdpa_cqe_event_setup(struct mlx5_vdpa_priv *priv); + +/** + * Unset CQE event . + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c new file mode 100644 index 0000000..35518ad --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c @@ -0,0 +1,399 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + + +void +mlx5_vdpa_event_qp_global_release(struct mlx5_vdpa_priv *priv) +{ + if (priv->uar) { + mlx5_glue->devx_free_uar(priv->uar); + priv->uar = NULL; + } + if (priv->eventc) { + mlx5_glue->devx_destroy_event_channel(priv->eventc); + priv->eventc = NULL; + } + priv->eqn = 0; +} + +/* Prepare all the global resources for all the event objects.*/ +static int +mlx5_vdpa_event_qp_global_prepare(struct mlx5_vdpa_priv *priv) +{ + uint32_t lcore; + + if (priv->eventc) + return 0; + lcore = (uint32_t)rte_lcore_to_cpu_id(-1); + if (mlx5_glue->devx_query_eqn(priv->ctx, lcore, &priv->eqn)) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to query EQ number %d.", rte_errno); + return -1; + } + priv->eventc = mlx5_glue->devx_create_event_channel(priv->ctx, + MLX5DV_DEVX_CREATE_EVENT_CHANNEL_FLAGS_OMIT_EV_DATA); + if (!priv->eventc) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to create event channel %d.", + rte_errno); + goto error; + } + priv->uar = mlx5_glue->devx_alloc_uar(priv->ctx, 0); + if (!priv->uar) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to allocate UAR."); + goto error; + } + return 0; +error: + mlx5_vdpa_event_qp_global_release(priv); + return -1; +} + +static void +mlx5_vdpa_cq_destroy(struct mlx5_vdpa_cq *cq) +{ + if (cq->cq) + claim_zero(mlx5_devx_cmd_destroy(cq->cq)); + if (cq->umem_obj) + claim_zero(mlx5_glue->devx_umem_dereg(cq->umem_obj)); + if (cq->umem_buf) + rte_free((void *)(uintptr_t)cq->umem_buf); + memset(cq, 0, sizeof(*cq)); +} + +static inline void +mlx5_vdpa_cq_arm(struct mlx5_vdpa_priv *priv, struct mlx5_vdpa_cq *cq) +{ + const unsigned int cqe_mask = (1 << cq->log_desc_n) - 1; + uint32_t arm_sn = cq->arm_sn << MLX5_CQ_SQN_OFFSET; + uint32_t cq_ci = cq->cq_ci & MLX5_CI_MASK & cqe_mask; + uint32_t doorbell_hi = arm_sn | MLX5_CQ_DBR_CMD_ALL | cq_ci; + uint64_t doorbell = ((uint64_t)doorbell_hi << 32) | cq->cq->id; + uint64_t db_be = rte_cpu_to_be_64(doorbell); + uint32_t *addr = RTE_PTR_ADD(priv->uar->base_addr, MLX5_CQ_DOORBELL); + + rte_io_wmb(); + cq->db_rec[MLX5_CQ_ARM_DB] = rte_cpu_to_be_32(doorbell_hi); + rte_wmb(); +#ifdef RTE_ARCH_64 + *(uint64_t *)addr = db_be; +#else + *(uint32_t *)addr = db_be; + rte_io_wmb(); + *((uint32_t *)addr + 1) = db_be >> 32; +#endif + cq->arm_sn++; +} + +static int +mlx5_vdpa_cq_create(struct mlx5_vdpa_priv *priv, uint16_t 
log_desc_n, + int callfd, struct mlx5_vdpa_cq *cq) +{ + struct mlx5_devx_cq_attr attr; + size_t pgsize = sysconf(_SC_PAGESIZE); + uint32_t umem_size; + int ret; + uint16_t event_nums[1] = {0}; + + cq->log_desc_n = log_desc_n; + umem_size = sizeof(struct mlx5_cqe) * (1 << log_desc_n) + + sizeof(*cq->db_rec) * 2; + cq->umem_buf = rte_zmalloc(__func__, umem_size, 4096); + if (!cq->umem_buf) { + DRV_LOG(ERR, "Failed to allocate memory for CQ."); + rte_errno = ENOMEM; + return -ENOMEM; + } + cq->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)cq->umem_buf, + umem_size, + IBV_ACCESS_LOCAL_WRITE); + if (!cq->umem_obj) { + DRV_LOG(ERR, "Failed to register umem for CQ."); + goto error; + } + attr.q_umem_valid = 1; + attr.db_umem_valid = 1; + attr.use_first_only = 0; + attr.overrun_ignore = 0; + attr.uar_page_id = priv->uar->page_id; + attr.q_umem_id = cq->umem_obj->umem_id; + attr.q_umem_offset = 0; + attr.db_umem_id = cq->umem_obj->umem_id; + attr.db_umem_offset = sizeof(struct mlx5_cqe) * (1 << log_desc_n); + attr.eqn = priv->eqn; + attr.log_cq_size = log_desc_n; + attr.log_page_size = rte_log2_u32(pgsize); + cq->cq = mlx5_devx_cmd_create_cq(priv->ctx, &attr); + if (!cq->cq) + goto error; + cq->db_rec = RTE_PTR_ADD(cq->umem_buf, (uintptr_t)attr.db_umem_offset); + cq->cq_ci = 0; + rte_spinlock_init(&cq->sl); + /* Subscribe CQ event to the event channel controlled by the driver. */ + ret = mlx5_glue->devx_subscribe_devx_event(priv->eventc, cq->cq->obj, + sizeof(event_nums), + event_nums, + (uint64_t)(uintptr_t)cq); + if (ret) { + DRV_LOG(ERR, "Failed to subscribe CQE event."); + rte_errno = errno; + goto error; + } + /* Subscribe CQ event to the guest FD only if it is not in poll mode. */ + if (callfd != -1) { + ret = mlx5_glue->devx_subscribe_devx_event_fd(priv->eventc, + callfd, + cq->cq->obj, 0); + if (ret) { + DRV_LOG(ERR, "Failed to subscribe CQE event fd."); + rte_errno = errno; + goto error; + } + } + /* First arming. */ + mlx5_vdpa_cq_arm(priv, cq); + return 0; +error: + mlx5_vdpa_cq_destroy(cq); + return -1; +} + +static inline void __rte_unused +mlx5_vdpa_cq_poll(struct mlx5_vdpa_priv *priv __rte_unused, + struct mlx5_vdpa_cq *cq) +{ + struct mlx5_vdpa_event_qp *eqp = + container_of(cq, struct mlx5_vdpa_event_qp, cq); + const unsigned int cqe_size = 1 << cq->log_desc_n; + const unsigned int cqe_mask = cqe_size - 1; + int ret; + + do { + volatile struct mlx5_cqe *cqe = cq->cqes + (cq->cq_ci & + cqe_mask); + + ret = check_cqe(cqe, cqe_size, cq->cq_ci); + switch (ret) { + case MLX5_CQE_STATUS_ERR: + cq->errors++; + /*fall-through*/ + case MLX5_CQE_STATUS_SW_OWN: + cq->cq_ci++; + break; + case MLX5_CQE_STATUS_HW_OWN: + default: + break; + } + } while (ret != MLX5_CQE_STATUS_HW_OWN); + rte_io_wmb(); + /* Ring CQ doorbell record. */ + cq->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci); + rte_io_wmb(); + /* Ring SW QP doorbell record. 
*/ + eqp->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cqe_size); +} + +static void +mlx5_vdpa_interrupt_handler(void *cb_arg) +{ +#ifndef HAVE_IBV_DEVX_EVENT + (void)cb_arg; + return; +#else + struct mlx5_vdpa_priv *priv = cb_arg; + union { + struct mlx5dv_devx_async_event_hdr event_resp; + uint8_t buf[sizeof(struct mlx5dv_devx_async_event_hdr) + 128]; + } out; + + while (mlx5_glue->devx_get_event(priv->eventc, &out.event_resp, + sizeof(out.buf)) >= + (ssize_t)sizeof(out.event_resp.cookie)) { + struct mlx5_vdpa_cq *cq = (struct mlx5_vdpa_cq *) + (uintptr_t)out.event_resp.cookie; + rte_spinlock_lock(&cq->sl); + mlx5_vdpa_cq_poll(priv, cq); + mlx5_vdpa_cq_arm(priv, cq); + rte_spinlock_unlock(&cq->sl); + DRV_LOG(DEBUG, "CQ %p event: new cq_ci = %u.", cq, cq->cq_ci); + } +#endif /* HAVE_IBV_DEVX_ASYNC */ +} + +int +mlx5_vdpa_cqe_event_setup(struct mlx5_vdpa_priv *priv) +{ + int flags = fcntl(priv->eventc->fd, F_GETFL); + int ret = fcntl(priv->eventc->fd, F_SETFL, flags | O_NONBLOCK); + if (ret) { + DRV_LOG(ERR, "Failed to change event channel FD."); + rte_errno = errno; + return -rte_errno; + } + priv->intr_handle.fd = priv->eventc->fd; + priv->intr_handle.type = RTE_INTR_HANDLE_EXT; + if (rte_intr_callback_register(&priv->intr_handle, + mlx5_vdpa_interrupt_handler, priv)) { + priv->intr_handle.fd = 0; + DRV_LOG(ERR, "Failed to register CQE interrupt %d.", rte_errno); + return -rte_errno; + } + return 0; +} + +void +mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv) +{ + int retries = MLX5_VDPA_INTR_RETRIES; + int ret = -EAGAIN; + + if (priv->intr_handle.fd) { + while (retries-- && ret == -EAGAIN) { + ret = rte_intr_callback_unregister(&priv->intr_handle, + mlx5_vdpa_interrupt_handler, + priv); + if (ret == -EAGAIN) { + DRV_LOG(DEBUG, "Try again to unregister fd %d " + "of CQ interrupt, retries = %d.", + priv->intr_handle.fd, retries); + usleep(MLX5_VDPA_INTR_RETRIES_USEC); + } + } + memset(&priv->intr_handle, 0, sizeof(priv->intr_handle)); + } +} + +void +mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp) +{ + if (eqp->sw_qp) + claim_zero(mlx5_devx_cmd_destroy(eqp->sw_qp)); + if (eqp->umem_obj) + claim_zero(mlx5_glue->devx_umem_dereg(eqp->umem_obj)); + if (eqp->umem_buf) + rte_free(eqp->umem_buf); + if (eqp->fw_qp) + claim_zero(mlx5_devx_cmd_destroy(eqp->fw_qp)); + mlx5_vdpa_cq_destroy(&eqp->cq); + memset(eqp, 0, sizeof(*eqp)); +} + +static int +mlx5_vdpa_qps2rts(struct mlx5_vdpa_event_qp *eqp) +{ + if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RST2INIT_QP, + eqp->sw_qp->id)) { + DRV_LOG(ERR, "Failed to modify FW QP to INIT state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RST2INIT_QP, + eqp->fw_qp->id)) { + DRV_LOG(ERR, "Failed to modify SW QP to INIT state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_INIT2RTR_QP, + eqp->sw_qp->id)) { + DRV_LOG(ERR, "Failed to modify FW QP to RTR state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_INIT2RTR_QP, + eqp->fw_qp->id)) { + DRV_LOG(ERR, "Failed to modify SW QP to RTR state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RTR2RTS_QP, + eqp->sw_qp->id)) { + DRV_LOG(ERR, "Failed to modify FW QP to RTS state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RTR2RTS_QP, + eqp->fw_qp->id)) { + DRV_LOG(ERR, "Failed to modify SW QP to RTS state(%u).", + rte_errno); + return -1; + } + 
return 0; +} + +int +mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, + int callfd, struct mlx5_vdpa_event_qp *eqp) +{ + struct mlx5_devx_qp_attr attr = {0}; + uint16_t log_desc_n = rte_log2_u32(desc_n); + uint32_t umem_size = (1 << log_desc_n) * MLX5_WSEG_SIZE + + sizeof(*eqp->db_rec) * 2; + + if (mlx5_vdpa_event_qp_global_prepare(priv)) + return -1; + if (mlx5_vdpa_cq_create(priv, log_desc_n, callfd, &eqp->cq)) + return -1; + attr.pd = priv->pdn; + eqp->fw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); + if (!eqp->fw_qp) { + DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno); + goto error; + } + eqp->umem_buf = rte_zmalloc(__func__, umem_size, 4096); + if (!eqp->umem_buf) { + DRV_LOG(ERR, "Failed to allocate memory for SW QP."); + rte_errno = ENOMEM; + goto error; + } + eqp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)eqp->umem_buf, + umem_size, + IBV_ACCESS_LOCAL_WRITE); + if (!eqp->umem_obj) { + DRV_LOG(ERR, "Failed to register umem for SW QP."); + goto error; + } + attr.uar_index = priv->uar->page_id; + attr.cqn = eqp->cq.cq->id; + attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); + attr.rq_size = 1 << log_desc_n; + attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE); + attr.sq_size = 0; /* No need SQ. */ + attr.dbr_umem_valid = 1; + attr.wq_umem_id = eqp->umem_obj->umem_id; + attr.wq_umem_offset = 0; + attr.dbr_umem_id = eqp->umem_obj->umem_id; + attr.dbr_address = (1 << log_desc_n) * MLX5_WSEG_SIZE; + eqp->sw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); + if (!eqp->sw_qp) { + DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno); + goto error; + } + eqp->db_rec = RTE_PTR_ADD(eqp->umem_buf, (uintptr_t)attr.dbr_address); + if (mlx5_vdpa_qps2rts(eqp)) + goto error; + /* First ringing. 
*/ + rte_write32(rte_cpu_to_be_32(1 << log_desc_n), &eqp->db_rec[0]); + return 0; +error: + mlx5_vdpa_event_qp_destroy(eqp); + return -1; +} From patchwork Wed Jan 29 10:09:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65294 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3CA12A0531; Wed, 29 Jan 2020 11:10:24 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B596E1BFE7; Wed, 29 Jan 2020 11:09:55 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id CFC741BFE0 for ; Wed, 29 Jan 2020 11:09:54 +0100 (CET) Received: from Internal Mail-Server by MTLPINE2 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:09:53 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHM032108; Wed, 29 Jan 2020 12:09:53 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:02 +0000 Message-Id: <1580292549-27439-7-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 06/13] vdpa/mlx5: prepare virtio queues X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The HW virtq object represents an emulated context for a VIRTIO_NET virtqueue which was created and managed by a VIRTIO_NET driver as defined in the VIRTIO Specification. Add support to prepare and release all the basic HW resources needed for the user virtqs emulation according to the rte_vhost configurations. This patch prepares the basic configurations needed by DevX commands to create a virtq. Add a new file mlx5_vdpa_virtq.c to manage virtq operations. Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko --- drivers/vdpa/mlx5/Makefile | 1 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 1 + drivers/vdpa/mlx5/mlx5_vdpa.h | 36 ++++++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 212 ++++++++++++++++++++++++++++++++++++ 5 files changed, 251 insertions(+) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 7f13756..353e262 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -10,6 +10,7 @@ LIB = librte_pmd_mlx5_vdpa.a SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_virtq.c # Basic CFLAGS.
CFLAGS += -O3 diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index c609f7c..e017f95 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -14,6 +14,7 @@ sources = files( 'mlx5_vdpa.c', 'mlx5_vdpa_mem.c', 'mlx5_vdpa_event.c', + 'mlx5_vdpa_virtq.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index c67f93d..4d30b35 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -229,6 +229,7 @@ goto error; } SLIST_INIT(&priv->mr_list); + SLIST_INIT(&priv->virtq_list); pthread_mutex_lock(&priv_list_lock); TAILQ_INSERT_TAIL(&priv_list, priv, next); pthread_mutex_unlock(&priv_list_lock); diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 30030b7..a7e2185 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -53,6 +53,19 @@ struct mlx5_vdpa_query_mr { int is_indirect; }; +struct mlx5_vdpa_virtq { + SLIST_ENTRY(mlx5_vdpa_virtq) next; + uint16_t index; + uint16_t vq_size; + struct mlx5_devx_obj *virtq; + struct mlx5_vdpa_event_qp eqp; + struct { + struct mlx5dv_devx_umem *obj; + void *buf; + uint32_t size; + } umems[3]; +}; + struct mlx5_vdpa_priv { TAILQ_ENTRY(mlx5_vdpa_priv) next; int id; /* vDPA device id. */ @@ -69,6 +82,10 @@ struct mlx5_vdpa_priv { struct mlx5dv_devx_event_channel *eventc; struct mlx5dv_devx_uar *uar; struct rte_intr_handle intr_handle; + struct mlx5_devx_obj *td; + struct mlx5_devx_obj *tis; + uint16_t nr_virtqs; + SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; @@ -146,4 +163,23 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, */ void mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv); +/** + * Release a virtq and all its related resources. + * + * @param[in] priv + * The vdpa driver private structure. + */ +void mlx5_vdpa_virtqs_release(struct mlx5_vdpa_priv *priv); + +/** + * Create all the HW virtqs resources and all their related resources. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */ +int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c new file mode 100644 index 0000000..781bccf --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -0,0 +1,212 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include + +#include +#include + +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + + +static int +mlx5_vdpa_virtq_unset(struct mlx5_vdpa_virtq *virtq) +{ + int i; + + if (virtq->virtq) { + claim_zero(mlx5_devx_cmd_destroy(virtq->virtq)); + virtq->virtq = NULL; + } + for (i = 0; i < 3; ++i) { + if (virtq->umems[i].obj) + claim_zero(mlx5_glue->devx_umem_dereg + (virtq->umems[i].obj)); + if (virtq->umems[i].buf) + rte_free(virtq->umems[i].buf); + } + memset(&virtq->umems, 0, sizeof(virtq->umems)); + if (virtq->eqp.fw_qp) + mlx5_vdpa_event_qp_destroy(&virtq->eqp); + return 0; +} + +void +mlx5_vdpa_virtqs_release(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_vdpa_virtq *entry; + struct mlx5_vdpa_virtq *next; + + entry = SLIST_FIRST(&priv->virtq_list); + while (entry) { + next = SLIST_NEXT(entry, next); + mlx5_vdpa_virtq_unset(entry); + SLIST_REMOVE(&priv->virtq_list, entry, mlx5_vdpa_virtq, next); + rte_free(entry); + entry = next; + } + SLIST_INIT(&priv->virtq_list); + if (priv->tis) { + claim_zero(mlx5_devx_cmd_destroy(priv->tis)); + priv->tis = NULL; + } + if (priv->td) { + claim_zero(mlx5_devx_cmd_destroy(priv->td)); + priv->td = NULL; + } +} + +static uint64_t +mlx5_vdpa_hva_to_gpa(struct rte_vhost_memory *mem, uint64_t hva) +{ + struct rte_vhost_mem_region *reg; + uint32_t i; + uint64_t gpa = 0; + + for (i = 0; i < mem->nregions; i++) { + reg = &mem->regions[i]; + if (hva >= reg->host_user_addr && + hva < reg->host_user_addr + reg->size) { + gpa = hva - reg->host_user_addr + reg->guest_phys_addr; + break; + } + } + return gpa; +} + +static int +mlx5_vdpa_virtq_setup(struct mlx5_vdpa_priv *priv, + struct mlx5_vdpa_virtq *virtq, int index) +{ + struct rte_vhost_vring vq; + struct mlx5_devx_virtq_attr attr = {0}; + uint64_t gpa; + int ret; + int i; + uint16_t last_avail_idx; + uint16_t last_used_idx; + + ret = rte_vhost_get_vhost_vring(priv->vid, index, &vq); + if (ret) + return -1; + virtq->index = index; + virtq->vq_size = vq.size; + /* + * No need event QPs creation when the guest in poll mode or when the + * capability allows it. + */ + attr.event_mode = vq.callfd != -1 || !(priv->caps.event_mode & (1 << + MLX5_VIRTQ_EVENT_MODE_NO_MSIX)) ? + MLX5_VIRTQ_EVENT_MODE_QP : + MLX5_VIRTQ_EVENT_MODE_NO_MSIX; + if (attr.event_mode == MLX5_VIRTQ_EVENT_MODE_QP) { + ret = mlx5_vdpa_event_qp_create(priv, vq.size, vq.callfd, + &virtq->eqp); + if (ret) { + DRV_LOG(ERR, "Failed to create event QPs for virtq %d.", + index); + return -1; + } + attr.qp_id = virtq->eqp.fw_qp->id; + } else { + DRV_LOG(INFO, "Virtq %d is, for sure, working by poll mode, no" + " need event QPs and event mechanism.", index); + } + /* Setup 3 UMEMs for each virtq. 
*/ + for (i = 0; i < 3; ++i) { + virtq->umems[i].size = priv->caps.umems[i].a * vq.size + + priv->caps.umems[i].b; + virtq->umems[i].buf = rte_zmalloc(__func__, + virtq->umems[i].size, 4096); + if (!virtq->umems[i].buf) { + DRV_LOG(ERR, "Cannot allocate umem %d memory for virtq" + " %u.", i, index); + goto error; + } + virtq->umems[i].obj = mlx5_glue->devx_umem_reg(priv->ctx, + virtq->umems[i].buf, + virtq->umems[i].size, + IBV_ACCESS_LOCAL_WRITE); + if (!virtq->umems[i].obj) { + DRV_LOG(ERR, "Failed to register umem %d for virtq %u.", + i, index); + goto error; + } + attr.umems[i].id = virtq->umems[i].obj->umem_id; + attr.umems[i].offset = 0; + attr.umems[i].size = virtq->umems[i].size; + } + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.desc); + if (!gpa) { + DRV_LOG(ERR, "Fail to get GPA for descriptor ring."); + goto error; + } + attr.desc_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.used); + if (!gpa) { + DRV_LOG(ERR, "Fail to get GPA for used ring."); + goto error; + } + attr.used_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.avail); + if (!gpa) { + DRV_LOG(ERR, "Fail to get GPA for available ring."); + goto error; + } + attr.available_addr = gpa; + rte_vhost_get_vring_base(priv->vid, index, &last_avail_idx, + &last_used_idx); + DRV_LOG(INFO, "vid %d: Init last_avail_idx=%d, last_used_idx=%d for " + "virtq %d.", priv->vid, last_avail_idx, last_used_idx, index); + attr.hw_available_index = last_avail_idx; + attr.hw_used_index = last_used_idx; + attr.q_size = vq.size; + attr.mkey = priv->gpa_mkey_index; + attr.tis_id = priv->tis->id; + attr.queue_index = index; + virtq->virtq = mlx5_devx_cmd_create_virtq(priv->ctx, &attr); + if (!virtq->virtq) + goto error; + return 0; +error: + mlx5_vdpa_virtq_unset(virtq); + return -1; +} + +int +mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_devx_tis_attr tis_attr = {0}; + struct mlx5_vdpa_virtq *virtq; + uint32_t i; + uint16_t nr_vring = rte_vhost_get_vring_num(priv->vid); + + priv->td = mlx5_devx_cmd_create_td(priv->ctx); + if (!priv->td) { + DRV_LOG(ERR, "Failed to create transport domain."); + return -rte_errno; + } + tis_attr.transport_domain = priv->td->id; + priv->tis = mlx5_devx_cmd_create_tis(priv->ctx, &tis_attr); + if (!priv->tis) { + DRV_LOG(ERR, "Failed to create TIS."); + goto error; + } + for (i = 0; i < nr_vring; i++) { + virtq = rte_zmalloc(__func__, sizeof(*virtq), 0); + if (!virtq || mlx5_vdpa_virtq_setup(priv, virtq, i)) { + if (virtq) + rte_free(virtq); + goto error; + } + SLIST_INSERT_HEAD(&priv->virtq_list, virtq, next); + } + priv->nr_virtqs = nr_vring; + return 0; +error: + mlx5_vdpa_virtqs_release(priv); + return -1; +} From patchwork Wed Jan 29 10:09:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65295 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 107AEA0531; Wed, 29 Jan 2020 11:10:35 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 36F4D1BFB0; Wed, 29 Jan 2020 11:10:07 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 3AB6F1BFAF for ; Wed, 29 Jan 2020 11:10:05 +0100 (CET) Received: from Internal Mail-Server by 
MTLPINE2 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:10:00 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHN032108; Wed, 29 Jan 2020 12:10:00 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:03 +0000 Message-Id: <1580292549-27439-8-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 07/13] vdpa/mlx5: support stateless offloads X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add support for the following features in virtq configuration: VIRTIO_F_RING_PACKED, VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_CSUM, VIRTIO_NET_F_GUEST_CSUM and VIRTIO_F_VERSION_1. Support for these features depends on the DevX capabilities reported by the device. Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Reviewed-by: Maxime Coquelin --- doc/guides/vdpadevs/features/mlx5.ini | 7 ++- drivers/vdpa/mlx5/mlx5_vdpa.c | 10 ---- drivers/vdpa/mlx5/mlx5_vdpa.h | 10 ++++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 108 ++++++++++++++++++++++++++++------ 4 files changed, 107 insertions(+), 28 deletions(-) diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini index fea491d..e4ee34b 100644 --- a/doc/guides/vdpadevs/features/mlx5.ini +++ b/doc/guides/vdpadevs/features/mlx5.ini @@ -4,10 +4,15 @@ ; Refer to default.ini for the full list of available driver features. ; [Features] - +csum = Y +guest csum = Y +host tso4 = Y +host tso6 = Y +version 1 = Y any layout = Y guest announce = Y mq = Y +packed = Y proto mq = Y proto log shmfd = Y proto host notifier = Y diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 4d30b35..dfbd0af 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -1,8 +1,6 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright 2019 Mellanox Technologies, Ltd */ -#include - #include #include #include @@ -17,14 +15,6 @@ #include "mlx5_vdpa.h" -#ifndef VIRTIO_F_ORDER_PLATFORM -#define VIRTIO_F_ORDER_PLATFORM 36 -#endif - -#ifndef VIRTIO_F_RING_PACKED -#define VIRTIO_F_RING_PACKED 34 -#endif - #define MLX5_VDPA_DEFAULT_FEATURES ((1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \ (1ULL << VIRTIO_F_ANY_LAYOUT) | \ (1ULL << VIRTIO_NET_F_MQ) | \ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index a7e2185..e530058 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -5,6 +5,7 @@ #ifndef RTE_PMD_MLX5_VDPA_H_ #define RTE_PMD_MLX5_VDPA_H_ +#include #include #include @@ -20,6 +21,14 @@ #define MLX5_VDPA_INTR_RETRIES 256 #define MLX5_VDPA_INTR_RETRIES_USEC 1000 +#ifndef VIRTIO_F_ORDER_PLATFORM +#define VIRTIO_F_ORDER_PLATFORM 36 +#endif + +#ifndef VIRTIO_F_RING_PACKED +#define VIRTIO_F_RING_PACKED 34 +#endif + struct mlx5_vdpa_cq { uint16_t log_desc_n; uint32_t cq_ci:24; @@ -85,6 +94,7 @@ struct mlx5_vdpa_priv { struct mlx5_devx_obj *td; struct mlx5_devx_obj *tis; uint16_t nr_virtqs; + uint64_t features; /* Negotiated features.
*/ SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index 781bccf..e27af28 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -57,6 +57,7 @@ claim_zero(mlx5_devx_cmd_destroy(priv->td)); priv->td = NULL; } + priv->features = 0; } static uint64_t @@ -94,6 +95,14 @@ return -1; virtq->index = index; virtq->vq_size = vq.size; + attr.tso_ipv4 = !!(priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO4)); + attr.tso_ipv6 = !!(priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO6)); + attr.tx_csum = !!(priv->features & (1ULL << VIRTIO_NET_F_CSUM)); + attr.rx_csum = !!(priv->features & (1ULL << VIRTIO_NET_F_GUEST_CSUM)); + attr.virtio_version_1_0 = !!(priv->features & (1ULL << + VIRTIO_F_VERSION_1)); + attr.type = (priv->features & (1ULL << VIRTIO_F_RING_PACKED)) ? + MLX5_VIRTQ_TYPE_PACKED : MLX5_VIRTQ_TYPE_SPLIT; /* * No need event QPs creation when the guest in poll mode or when the * capability allows it. @@ -139,24 +148,29 @@ attr.umems[i].offset = 0; attr.umems[i].size = virtq->umems[i].size; } - gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.desc); - if (!gpa) { - DRV_LOG(ERR, "Fail to get GPA for descriptor ring."); - goto error; - } - attr.desc_addr = gpa; - gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.used); - if (!gpa) { - DRV_LOG(ERR, "Fail to get GPA for used ring."); - goto error; - } - attr.used_addr = gpa; - gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, (uint64_t)(uintptr_t)vq.avail); - if (!gpa) { - DRV_LOG(ERR, "Fail to get GPA for available ring."); - goto error; + if (attr.type == MLX5_VIRTQ_TYPE_SPLIT) { + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, + (uint64_t)(uintptr_t)vq.desc); + if (!gpa) { + DRV_LOG(ERR, "Failed to get descriptor ring GPA."); + goto error; + } + attr.desc_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, + (uint64_t)(uintptr_t)vq.used); + if (!gpa) { + DRV_LOG(ERR, "Failed to get GPA for used ring."); + goto error; + } + attr.used_addr = gpa; + gpa = mlx5_vdpa_hva_to_gpa(priv->vmem, + (uint64_t)(uintptr_t)vq.avail); + if (!gpa) { + DRV_LOG(ERR, "Failed to get GPA for available ring."); + goto error; + } + attr.available_addr = gpa; } - attr.available_addr = gpa; rte_vhost_get_vring_base(priv->vid, index, &last_avail_idx, &last_used_idx); DRV_LOG(INFO, "vid %d: Init last_avail_idx=%d, last_used_idx=%d for " @@ -176,6 +190,61 @@ return -1; } +static int +mlx5_vdpa_features_validate(struct mlx5_vdpa_priv *priv) +{ + if (priv->features & (1ULL << VIRTIO_F_RING_PACKED)) { + if (!(priv->caps.virtio_queue_type & (1 << + MLX5_VIRTQ_TYPE_PACKED))) { + DRV_LOG(ERR, "Failed to configur PACKED mode for vdev " + "%d - it was not reported by HW/driver" + " capability.", priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO4)) { + if (!priv->caps.tso_ipv4) { + DRV_LOG(ERR, "Failed to enable TSO4 for vdev %d - TSO4" + " was not reported by HW/driver capability.", + priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_HOST_TSO6)) { + if (!priv->caps.tso_ipv6) { + DRV_LOG(ERR, "Failed to enable TSO6 for vdev %d - TSO6" + " was not reported by HW/driver capability.", + priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_CSUM)) { + if (!priv->caps.tx_csum) { + DRV_LOG(ERR, "Failed to enable CSUM for vdev %d - CSUM" + " was not reported by HW/driver capability.", + priv->vid); + 
return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_NET_F_GUEST_CSUM)) { + if (!priv->caps.rx_csum) { + DRV_LOG(ERR, "Failed to enable GUEST CSUM for vdev %d" + " GUEST CSUM was not reported by HW/driver " + "capability.", priv->vid); + return -ENOTSUP; + } + } + if (priv->features & (1ULL << VIRTIO_F_VERSION_1)) { + if (!priv->caps.virtio_version_1_0) { + DRV_LOG(ERR, "Failed to enable version 1 for vdev %d " + "version 1 was not reported by HW/driver" + " capability.", priv->vid); + return -ENOTSUP; + } + } + return 0; +} + int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) { @@ -183,7 +252,12 @@ struct mlx5_vdpa_virtq *virtq; uint32_t i; uint16_t nr_vring = rte_vhost_get_vring_num(priv->vid); + int ret = rte_vhost_get_negotiated_features(priv->vid, &priv->features); + if (ret || mlx5_vdpa_features_validate(priv)) { + DRV_LOG(ERR, "Failed to configure negotiated features."); + return -1; + } priv->td = mlx5_devx_cmd_create_td(priv->ctx); if (!priv->td) { DRV_LOG(ERR, "Failed to create transport domain."); From patchwork Wed Jan 29 10:09:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65296 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 529B1A0531; Wed, 29 Jan 2020 11:10:53 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id CE0031BFF6; Wed, 29 Jan 2020 11:10:13 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 8827D1BFF2 for ; Wed, 29 Jan 2020 11:10:09 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:10:07 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHO032108; Wed, 29 Jan 2020 12:10:07 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:04 +0000 Message-Id: <1580292549-27439-9-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 08/13] vdpa/mlx5: add basic steering configurations X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add a steering object to be managed by a new file mlx5_vdpa_steer.c. Allow a promiscuous flow to scatter the device Rx packets to the virtio queues using the RSS action. In order to allow correct RSS in L3 and L4, split the flow into 7 flows as required by the device.
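For reference, the 7 flows created by mlx5_vdpa_rss_flows_create() below break down as follows (this listing just restates the vars[] table in the new file):

	priority 7: any non-IP packet, no RSS hash (catch-all).
	priority 6: IPv4, hash on source/destination IP.
	priority 6: IPv6, hash on source/destination IP.
	priority 5: IPv4/UDP, hash on source/destination IP and L4 ports.
	priority 5: IPv4/TCP, hash on source/destination IP and L4 ports.
	priority 5: IPv6/UDP, hash on source/destination IP and L4 ports.
	priority 5: IPv6/TCP, hash on source/destination IP and L4 ports.
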
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/Makefile | 2 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 1 + drivers/vdpa/mlx5/mlx5_vdpa.h | 34 +++++ drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 265 ++++++++++++++++++++++++++++++++++++ 5 files changed, 303 insertions(+) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_steer.c diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 353e262..2f70a98 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -11,6 +11,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_virtq.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_steer.c + # Basic CFLAGS. CFLAGS += -O3 diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index e017f95..2849178 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -15,6 +15,7 @@ sources = files( 'mlx5_vdpa_mem.c', 'mlx5_vdpa_event.c', 'mlx5_vdpa_virtq.c', + 'mlx5_vdpa_steer.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index dfbd0af..12cfee2 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -208,6 +208,7 @@ goto error; } priv->caps = attr.vdpa; + priv->log_max_rqt_size = attr.log_max_rqt_size; } priv->ctx = ctx; priv->dev_addr.pci_addr = pci_dev->addr; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index e530058..2b0b285 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -75,6 +75,18 @@ struct mlx5_vdpa_virtq { } umems[3]; }; +struct mlx5_vdpa_steer { + struct mlx5_devx_obj *rqt; + void *domain; + void *tbl; + struct { + struct mlx5dv_flow_matcher *matcher; + struct mlx5_devx_obj *tir; + void *tir_action; + void *flow; + } rss[7]; +}; + struct mlx5_vdpa_priv { TAILQ_ENTRY(mlx5_vdpa_priv) next; int id; /* vDPA device id. */ @@ -95,7 +107,9 @@ struct mlx5_vdpa_priv { struct mlx5_devx_obj *tis; uint16_t nr_virtqs; uint64_t features; /* Negotiated features. */ + uint16_t log_max_rqt_size; SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; + struct mlx5_vdpa_steer steer; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; @@ -192,4 +206,24 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, */ int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv); +/** + * Unset steering and release all its related resources- stop traffic. + * + * @param[in] priv + * The vdpa driver private structure. + */ +int mlx5_vdpa_steer_unset(struct mlx5_vdpa_priv *priv); + +/** + * Setup steering and all its related resources to enable RSS trafic from the + * device to all the Rx host queues. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative value otherwise. 
+ */ +int mlx5_vdpa_steer_setup(struct mlx5_vdpa_priv *priv); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c new file mode 100644 index 0000000..f365c10 --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c @@ -0,0 +1,265 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include + +#include +#include +#include + +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + +int +mlx5_vdpa_steer_unset(struct mlx5_vdpa_priv *priv) +{ + int ret __rte_unused; + unsigned i; + + for (i = 0; i < RTE_DIM(priv->steer.rss); ++i) { + if (priv->steer.rss[i].flow) { + claim_zero(mlx5_glue->dv_destroy_flow + (priv->steer.rss[i].flow)); + priv->steer.rss[i].flow = NULL; + } + if (priv->steer.rss[i].tir_action) { + claim_zero(mlx5_glue->destroy_flow_action + (priv->steer.rss[i].tir_action)); + priv->steer.rss[i].tir_action = NULL; + } + if (priv->steer.rss[i].tir) { + claim_zero(mlx5_devx_cmd_destroy + (priv->steer.rss[i].tir)); + priv->steer.rss[i].tir = NULL; + } + if (priv->steer.rss[i].matcher) { + claim_zero(mlx5_glue->dv_destroy_flow_matcher + (priv->steer.rss[i].matcher)); + priv->steer.rss[i].matcher = NULL; + } + } + if (priv->steer.tbl) { + claim_zero(mlx5_glue->dr_destroy_flow_tbl(priv->steer.tbl)); + priv->steer.tbl = NULL; + } + if (priv->steer.domain) { + claim_zero(mlx5_glue->dr_destroy_domain(priv->steer.domain)); + priv->steer.domain = NULL; + } + if (priv->steer.rqt) { + claim_zero(mlx5_devx_cmd_destroy(priv->steer.rqt)); + priv->steer.rqt = NULL; + } + return 0; +} + +/* + * According to VIRTIO_NET Spec the virtqueues index identity its type by: + * 0 receiveq1 + * 1 transmitq1 + * ... + * 2(N-1) receiveqN + * 2(N-1)+1 transmitqN + * 2N controlq + */ +static uint8_t +is_virtq_recvq(int virtq_index, int nr_vring) +{ + if (virtq_index % 2 == 0 && virtq_index != nr_vring - 1) + return 1; + return 0; +} + +#define MLX5_VDPA_DEFAULT_RQT_SIZE 512 +static int __rte_unused +mlx5_vdpa_rqt_prepare(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_vdpa_virtq *virtq; + uint32_t rqt_n = RTE_MIN(MLX5_VDPA_DEFAULT_RQT_SIZE, + 1 << priv->log_max_rqt_size); + struct mlx5_devx_rqt_attr *attr = rte_zmalloc(__func__, sizeof(*attr) + + rqt_n * + sizeof(uint32_t), 0); + uint32_t i = 0, j; + int ret = 0; + + if (!attr) { + DRV_LOG(ERR, "Failed to allocate RQT attributes memory."); + rte_errno = ENOMEM; + return -ENOMEM; + } + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + if (is_virtq_recvq(virtq->index, priv->nr_virtqs)) { + attr->rq_list[i] = virtq->virtq->id; + i++; + } + } + for (j = 0; i != rqt_n; ++i, ++j) + attr->rq_list[i] = attr->rq_list[j]; + attr->rq_type = MLX5_INLINE_Q_TYPE_VIRTQ; + attr->rqt_max_size = rqt_n; + attr->rqt_actual_size = rqt_n; + if (!priv->steer.rqt) { + priv->steer.rqt = mlx5_devx_cmd_create_rqt(priv->ctx, attr); + if (!priv->steer.rqt) { + DRV_LOG(ERR, "Failed to create RQT."); + ret = -rte_errno; + } + } else { + ret = mlx5_devx_cmd_modify_rqt(priv->steer.rqt, attr); + if (ret) + DRV_LOG(ERR, "Failed to modify RQT."); + } + rte_free(attr); + return ret; +} + +static int __rte_unused +mlx5_vdpa_rss_flows_create(struct mlx5_vdpa_priv *priv) +{ +#ifdef HAVE_MLX5DV_DR + struct mlx5_devx_tir_attr tir_att = { + .disp_type = MLX5_TIRC_DISP_TYPE_INDIRECT, + .rx_hash_fn = MLX5_RX_HASH_FN_TOEPLITZ, + .transport_domain = priv->td->id, + .indirect_table = priv->steer.rqt->id, + .rx_hash_symmetric = 1, + .rx_hash_toeplitz_key = { 0x2cc681d1, 0x5bdbf4f7, 
0xfca28319, + 0xdb1a3e94, 0x6b9e38d9, 0x2c9c03d1, + 0xad9944a7, 0xd9563d59, 0x063c25f3, + 0xfc1fdc2a }, + }; + struct { + size_t size; + /**< Size of match value. Do NOT split size and key! */ + uint32_t buf[MLX5_ST_SZ_DW(fte_match_param)]; + /**< Matcher value. This value is used as the mask or a key. */ + } matcher_mask = { + .size = sizeof(matcher_mask.buf), + }, + matcher_value = { + .size = sizeof(matcher_value.buf), + }; + struct mlx5dv_flow_matcher_attr dv_attr = { + .type = IBV_FLOW_ATTR_NORMAL, + .match_mask = (void *)&matcher_mask, + }; + void *match_m = matcher_mask.buf; + void *match_v = matcher_value.buf; + void *headers_m = MLX5_ADDR_OF(fte_match_param, match_m, outer_headers); + void *headers_v = MLX5_ADDR_OF(fte_match_param, match_v, outer_headers); + void *actions[1]; + const uint8_t l3_hash = + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_SRC_IP) | + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_DST_IP); + const uint8_t l4_hash = + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_L4_SPORT) | + (1 << MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_L4_DPORT); + enum { PRIO, CRITERIA, IP_VER_M, IP_VER_V, IP_PROT_M, IP_PROT_V, L3_BIT, + L4_BIT, HASH, END}; + const uint8_t vars[RTE_DIM(priv->steer.rss)][END] = { + { 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + { 6, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 4, 0, 0, + MLX5_L3_PROT_TYPE_IPV4, 0, l3_hash }, + { 6, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 6, 0, 0, + MLX5_L3_PROT_TYPE_IPV6, 0, l3_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 4, 0xff, + IPPROTO_UDP, MLX5_L3_PROT_TYPE_IPV4, MLX5_L4_PROT_TYPE_UDP, + l3_hash | l4_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 4, 0xff, + IPPROTO_TCP, MLX5_L3_PROT_TYPE_IPV4, MLX5_L4_PROT_TYPE_TCP, + l3_hash | l4_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 6, 0xff, + IPPROTO_UDP, MLX5_L3_PROT_TYPE_IPV6, MLX5_L4_PROT_TYPE_UDP, + l3_hash | l4_hash }, + { 5, 1 << MLX5_MATCH_CRITERIA_ENABLE_OUTER_BIT, 0xf, 6, 0xff, + IPPROTO_TCP, MLX5_L3_PROT_TYPE_IPV6, MLX5_L4_PROT_TYPE_TCP, + l3_hash | l4_hash }, + }; + unsigned i; + + for (i = 0; i < RTE_DIM(priv->steer.rss); ++i) { + dv_attr.priority = vars[i][PRIO]; + dv_attr.match_criteria_enable = vars[i][CRITERIA]; + MLX5_SET(fte_match_set_lyr_2_4, headers_m, ip_version, + vars[i][IP_VER_M]); + MLX5_SET(fte_match_set_lyr_2_4, headers_v, ip_version, + vars[i][IP_VER_V]); + MLX5_SET(fte_match_set_lyr_2_4, headers_m, ip_protocol, + vars[i][IP_PROT_M]); + MLX5_SET(fte_match_set_lyr_2_4, headers_v, ip_protocol, + vars[i][IP_PROT_V]); + tir_att.rx_hash_field_selector_outer.l3_prot_type = + vars[i][L3_BIT]; + tir_att.rx_hash_field_selector_outer.l4_prot_type = + vars[i][L4_BIT]; + tir_att.rx_hash_field_selector_outer.selected_fields = + vars[i][HASH]; + priv->steer.rss[i].matcher = mlx5_glue->dv_create_flow_matcher + (priv->ctx, &dv_attr, priv->steer.tbl); + if (!priv->steer.rss[i].matcher) { + DRV_LOG(ERR, "Failed to create matcher %d.", i); + goto error; + } + priv->steer.rss[i].tir = mlx5_devx_cmd_create_tir(priv->ctx, + &tir_att); + if (!priv->steer.rss[i].tir) { + DRV_LOG(ERR, "Failed to create TIR %d.", i); + goto error; + } + priv->steer.rss[i].tir_action = + mlx5_glue->dv_create_flow_action_dest_devx_tir + (priv->steer.rss[i].tir->obj); + if (!priv->steer.rss[i].tir_action) { + DRV_LOG(ERR, "Failed to create TIR action %d.", i); + goto error; + } + actions[0] = priv->steer.rss[i].tir_action; + priv->steer.rss[i].flow = mlx5_glue->dv_create_flow + (priv->steer.rss[i].matcher, + (void *)&matcher_value, 
1, actions); + if (!priv->steer.rss[i].flow) { + DRV_LOG(ERR, "Failed to create flow %d.", i); + goto error; + } + } + return 0; +error: + /* Resources will be freed by the caller. */ + return -1; +#else + (void)priv; + return -ENOTSUP; +#endif /* HAVE_MLX5DV_DR */ +} + +int +mlx5_vdpa_steer_setup(struct mlx5_vdpa_priv *priv) +{ +#ifdef HAVE_MLX5DV_DR + if (mlx5_vdpa_rqt_prepare(priv)) + return -1; + priv->steer.domain = mlx5_glue->dr_create_domain(priv->ctx, + MLX5DV_DR_DOMAIN_TYPE_NIC_RX); + if (!priv->steer.domain) { + DRV_LOG(ERR, "Failed to create Rx domain."); + goto error; + } + priv->steer.tbl = mlx5_glue->dr_create_flow_tbl(priv->steer.domain, 0); + if (!priv->steer.tbl) { + DRV_LOG(ERR, "Failed to create table 0 with Rx domain."); + goto error; + } + if (mlx5_vdpa_rss_flows_create(priv)) + goto error; + return 0; +error: + mlx5_vdpa_steer_unset(priv); + return -1; +#else + (void)priv; + return -ENOTSUP; +#endif /* HAVE_MLX5DV_DR */ +} From patchwork Wed Jan 29 10:09:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65297 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id D953CA0531; Wed, 29 Jan 2020 11:11:14 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D15871C00D; Wed, 29 Jan 2020 11:10:18 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 526EF1BFFA for ; Wed, 29 Jan 2020 11:10:15 +0100 (CET) Received: from Internal Mail-Server by MTLPINE2 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:10:14 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHP032108; Wed, 29 Jan 2020 12:10:14 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:05 +0000 Message-Id: <1580292549-27439-10-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 09/13] vdpa/mlx5: support queue state operation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add support for set_vring_state operation. Using DevX API the virtq state can be changed as described in PRM: enable - move to ready state. disable - move to suspend state. 
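The core of the operation, condensed from the mlx5_vdpa_virtq_modify() helper added below (illustrative; error handling elided):

	struct mlx5_devx_virtq_attr attr = {
		.type = MLX5_VIRTQ_MODIFY_TYPE_STATE,
		.state = state ? MLX5_VIRTQ_STATE_RDY :
				 MLX5_VIRTQ_STATE_SUSPEND,
		.queue_index = virtq->index,
	};

	/* A single DevX modify command moves the virtq between states. */
	return mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr);
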
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/mlx5_vdpa.c | 23 ++++++++++++++++++++++- drivers/vdpa/mlx5/mlx5_vdpa.h | 15 +++++++++++++++ drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 22 ++++++++++++++++++++-- drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 25 +++++++++++++++++++++---- 4 files changed, 78 insertions(+), 7 deletions(-) diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 12cfee2..71189c4 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -106,13 +106,34 @@ return 0; } +static int +mlx5_vdpa_set_vring_state(int vid, int vring, int state) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + struct mlx5_vdpa_virtq *virtq = NULL; + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -EINVAL; + } + SLIST_FOREACH(virtq, &priv->virtq_list, next) + if (virtq->index == vring) + break; + if (!virtq) { + DRV_LOG(ERR, "Invalid or unconfigured vring id: %d.", vring); + return -EINVAL; + } + return mlx5_vdpa_virtq_enable(virtq, state); +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .get_queue_num = mlx5_vdpa_get_queue_num, .get_features = mlx5_vdpa_get_vdpa_features, .get_protocol_features = mlx5_vdpa_get_protocol_features, .dev_conf = NULL, .dev_close = NULL, - .set_vring_state = NULL, + .set_vring_state = mlx5_vdpa_set_vring_state, .set_features = NULL, .migration_done = NULL, .get_vfio_group_fd = NULL, diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 2b0b285..383a33e 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -64,8 +64,10 @@ struct mlx5_vdpa_query_mr { struct mlx5_vdpa_virtq { SLIST_ENTRY(mlx5_vdpa_virtq) next; + uint8_t enable; uint16_t index; uint16_t vq_size; + struct mlx5_vdpa_priv *priv; struct mlx5_devx_obj *virtq; struct mlx5_vdpa_event_qp eqp; struct { @@ -207,6 +209,19 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv); /** + * Enable\Disable virtq.. + * + * @param[in] virtq + * The vdpa driver private virtq structure. + * @param[in] enable + * Set to enable, otherwise disable. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_virtq_enable(struct mlx5_vdpa_virtq *virtq, int enable); + +/** * Unset steering and release all its related resources- stop traffic. 
* * @param[in] priv diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c index f365c10..36017f1 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c @@ -73,7 +73,7 @@ } #define MLX5_VDPA_DEFAULT_RQT_SIZE 512 -static int __rte_unused +static int mlx5_vdpa_rqt_prepare(struct mlx5_vdpa_priv *priv) { struct mlx5_vdpa_virtq *virtq; @@ -91,7 +91,8 @@ return -ENOMEM; } SLIST_FOREACH(virtq, &priv->virtq_list, next) { - if (is_virtq_recvq(virtq->index, priv->nr_virtqs)) { + if (is_virtq_recvq(virtq->index, priv->nr_virtqs) && + virtq->enable) { attr->rq_list[i] = virtq->virtq->id; i++; } @@ -116,6 +117,23 @@ return ret; } +int +mlx5_vdpa_virtq_enable(struct mlx5_vdpa_virtq *virtq, int enable) +{ + struct mlx5_vdpa_priv *priv = virtq->priv; + int ret = 0; + + if (virtq->enable == !!enable) + return 0; + virtq->enable = !!enable; + if (is_virtq_recvq(virtq->index, priv->nr_virtqs)) { + ret = mlx5_vdpa_rqt_prepare(priv); + if (ret) + virtq->enable = !enable; + } + return ret; +} + static int __rte_unused mlx5_vdpa_rss_flows_create(struct mlx5_vdpa_priv *priv) { diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index e27af28..60aa040 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -15,13 +15,13 @@ static int mlx5_vdpa_virtq_unset(struct mlx5_vdpa_virtq *virtq) { - int i; + unsigned int; if (virtq->virtq) { claim_zero(mlx5_devx_cmd_destroy(virtq->virtq)); virtq->virtq = NULL; } - for (i = 0; i < 3; ++i) { + for (i = 0; i < RTE_DIM(virtq->umems); ++i) { if (virtq->umems[i].obj) claim_zero(mlx5_glue->devx_umem_dereg (virtq->umems[i].obj)); @@ -60,6 +60,19 @@ priv->features = 0; } +static int +mlx5_vdpa_virtq_modify(struct mlx5_vdpa_virtq *virtq, int state) +{ + struct mlx5_devx_virtq_attr attr = { + .type = MLX5_VIRTQ_MODIFY_TYPE_STATE, + .state = state ? MLX5_VIRTQ_STATE_RDY : + MLX5_VIRTQ_STATE_SUSPEND, + .queue_index = virtq->index, + }; + + return mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr); +} + static uint64_t mlx5_vdpa_hva_to_gpa(struct rte_vhost_memory *mem, uint64_t hva) { @@ -86,7 +99,7 @@ struct mlx5_devx_virtq_attr attr = {0}; uint64_t gpa; int ret; - int i; + unsigned i; uint16_t last_avail_idx; uint16_t last_used_idx; @@ -125,7 +138,7 @@ " need event QPs and event mechanism.", index); } /* Setup 3 UMEMs for each virtq. 
*/ - for (i = 0; i < 3; ++i) { + for (i = 0; i < RTE_DIM(virtq->umems); ++i) { virtq->umems[i].size = priv->caps.umems[i].a * vq.size + priv->caps.umems[i].b; virtq->umems[i].buf = rte_zmalloc(__func__, @@ -182,8 +195,12 @@ attr.tis_id = priv->tis->id; attr.queue_index = index; virtq->virtq = mlx5_devx_cmd_create_virtq(priv->ctx, &attr); + virtq->priv = priv; if (!virtq->virtq) goto error; + if (mlx5_vdpa_virtq_modify(virtq, 1)) + goto error; + virtq->enable = 1; return 0; error: mlx5_vdpa_virtq_unset(virtq); From patchwork Wed Jan 29 10:09:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65299 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B3ED9A0531; Wed, 29 Jan 2020 11:11:38 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 54A861C01B; Wed, 29 Jan 2020 11:10:27 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 5E8031BFB1 for ; Wed, 29 Jan 2020 11:10:24 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:10:20 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHQ032108; Wed, 29 Jan 2020 12:10:20 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:06 +0000 Message-Id: <1580292549-27439-11-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 10/13] vdpa/mlx5: map doorbell X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The HW supports doorbell write detection only for writes of 4 bytes. The virtio device writes only 2 bytes when it rings the doorbell. Map the virtio doorbell, detected through the virtio queue kickfd, to the HW VAR space where the HW expects to get the virtio emulation doorbell. Use the EAL interrupt mechanism to get a notification when the guest writes a new event to the kickfd, and write 4 bytes to the HW doorbell space in the notification callback.
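The essence of the notification path, condensed from the handler added below (illustrative; the retry loop around read() is elided):

	uint64_t buf;

	/* Drain the 8-byte eventfd counter to consume the guest kick. */
	read(virtq->intr_handle.fd, &buf, 8);
	/* Issue the 4-byte write that the HW doorbell detection requires. */
	rte_write32(virtq->index, priv->virtq_db_addr);
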
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 ++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 82 ++++++++++++++++++++++++++++++++++++- 2 files changed, 84 insertions(+), 1 deletion(-) diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 383a33e..af78ea1 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -75,6 +75,7 @@ struct mlx5_vdpa_virtq { void *buf; uint32_t size; } umems[3]; + struct rte_intr_handle intr_handle; }; struct mlx5_vdpa_steer { @@ -112,6 +113,8 @@ struct mlx5_vdpa_priv { uint16_t log_max_rqt_size; SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list; struct mlx5_vdpa_steer steer; + struct mlx5dv_var *var; + void *virtq_db_addr; SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list; }; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index 60aa040..91347e9 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -2,9 +2,12 @@ * Copyright 2019 Mellanox Technologies, Ltd */ #include +#include +#include #include #include +#include #include @@ -12,11 +15,52 @@ #include "mlx5_vdpa.h" +static void +mlx5_vdpa_virtq_handler(void *cb_arg) +{ + struct mlx5_vdpa_virtq *virtq = cb_arg; + struct mlx5_vdpa_priv *priv = virtq->priv; + uint64_t buf; + int nbytes; + + do { + nbytes = read(virtq->intr_handle.fd, &buf, 8); + if (nbytes < 0) { + if (errno == EINTR || + errno == EWOULDBLOCK || + errno == EAGAIN) + continue; + DRV_LOG(ERR, "Failed to read kickfd of virtq %d: %s", + virtq->index, strerror(errno)); + } + break; + } while (1); + rte_write32(virtq->index, priv->virtq_db_addr); + DRV_LOG(DEBUG, "Ring virtq %u doorbell.", virtq->index); +} + static int mlx5_vdpa_virtq_unset(struct mlx5_vdpa_virtq *virtq) { - unsigned int; + unsigned int i; + int retries = MLX5_VDPA_INTR_RETRIES; + int ret = -EAGAIN; + if (virtq->intr_handle.fd) { + while (retries-- && ret == -EAGAIN) { + ret = rte_intr_callback_unregister(&virtq->intr_handle, + mlx5_vdpa_virtq_handler, + virtq); + if (ret == -EAGAIN) { + DRV_LOG(DEBUG, "Try again to unregister fd %d " + "of virtq %d interrupt, retries = %d.", + virtq->intr_handle.fd, + (int)virtq->index, retries); + usleep(MLX5_VDPA_INTR_RETRIES_USEC); + } + } + memset(&virtq->intr_handle, 0, sizeof(virtq->intr_handle)); + } if (virtq->virtq) { claim_zero(mlx5_devx_cmd_destroy(virtq->virtq)); virtq->virtq = NULL; @@ -57,6 +101,14 @@ claim_zero(mlx5_devx_cmd_destroy(priv->td)); priv->td = NULL; } + if (priv->virtq_db_addr) { + claim_zero(munmap(priv->virtq_db_addr, priv->var->length)); + priv->virtq_db_addr = NULL; + } + if (priv->var) { + mlx5_glue->dv_free_var(priv->var); + priv->var = NULL; + } priv->features = 0; } @@ -201,6 +253,17 @@ if (mlx5_vdpa_virtq_modify(virtq, 1)) goto error; virtq->enable = 1; + virtq->intr_handle.fd = vq.kickfd; + virtq->intr_handle.type = RTE_INTR_HANDLE_EXT; + if (rte_intr_callback_register(&virtq->intr_handle, + mlx5_vdpa_virtq_handler, virtq)) { + virtq->intr_handle.fd = 0; + DRV_LOG(ERR, "Failed to register virtq %d interrupt.", index); + goto error; + } else { + DRV_LOG(DEBUG, "Register fd %d interrupt for virtq %d.", + virtq->intr_handle.fd, index); + } return 0; error: mlx5_vdpa_virtq_unset(virtq); @@ -275,6 +338,23 @@ DRV_LOG(ERR, "Failed to configure negotiated features."); return -1; } + priv->var = mlx5_glue->dv_alloc_var(priv->ctx, 0); + if (!priv->var) { + DRV_LOG(ERR, "Failed to allocate VAR %u.\n", errno); + return -1; + } + /* 
Always map the entire page. */ + priv->virtq_db_addr = mmap(NULL, priv->var->length, PROT_READ | + PROT_WRITE, MAP_SHARED, priv->ctx->cmd_fd, + priv->var->mmap_off); + if (priv->virtq_db_addr == MAP_FAILED) { + DRV_LOG(ERR, "Failed to map doorbell page %u.", errno); + priv->virtq_db_addr = NULL; + goto error; + } else { + DRV_LOG(DEBUG, "VAR address of doorbell mapping is %p.", + priv->virtq_db_addr); + } priv->td = mlx5_devx_cmd_create_td(priv->ctx); if (!priv->td) { DRV_LOG(ERR, "Failed to create transport domain."); From patchwork Wed Jan 29 10:09:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65298 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 73F65A0531; Wed, 29 Jan 2020 11:11:28 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id EFF171BFB4; Wed, 29 Jan 2020 11:10:25 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 614B51BFB2 for ; Wed, 29 Jan 2020 11:10:24 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:10:23 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHR032108; Wed, 29 Jan 2020 12:10:23 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:07 +0000 Message-Id: <1580292549-27439-12-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 11/13] vdpa/mlx5: support live migration X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add support for the live migration feature by the HW: Create a single Mkey that maps the memory address space of the VHOST live migration log file. Modify the VIRTIO_NET_Q object to provide vhost_log_page, dirty_bitmap_mkey, dirty_bitmap_size, dirty_bitmap_addr and dirty_bitmap_dump_enable. Modify the VIRTIO_NET_Q object to move its state to SUSPEND. Query the VIRTIO_NET_Q object to get hw_available_idx and hw_used_idx.
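Schematically, the per-queue migration epilogue described above reduces to the following sketch; suspend_queue() and query_indexes() are hypothetical stand-ins for the DevX virtq modify/query commands, while rte_vhost_set_vring_base() is the vhost API the patch actually calls:

    #include <stdint.h>
    #include <rte_vhost.h>

    /* Hypothetical wrappers around the DevX virtq commands. */
    int suspend_queue(void *virtq);
    int query_indexes(void *virtq, uint16_t *hw_avail, uint16_t *hw_used);

    /*
     * Stop the HW queue, read back how far it has progressed and hand
     * the indexes to the vhost library so that the destination side
     * can resume the ring from the same point.
     */
    static int
    sync_queue_state(int vid, uint16_t qid, void *virtq)
    {
        uint16_t hw_avail, hw_used;

        if (suspend_queue(virtq) ||
            query_indexes(virtq, &hw_avail, &hw_used))
            return -1;
        return rte_vhost_set_vring_base(vid, qid, hw_avail, hw_used);
    }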
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- doc/guides/vdpadevs/features/mlx5.ini | 1 + drivers/vdpa/mlx5/Makefile | 1 + drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 44 +++++++++++- drivers/vdpa/mlx5/mlx5_vdpa.h | 55 ++++++++++++++ drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 130 ++++++++++++++++++++++++++++++++++ drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 7 +- 7 files changed, 236 insertions(+), 3 deletions(-) create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_lm.c diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini index e4ee34b..1da9c1b 100644 --- a/doc/guides/vdpadevs/features/mlx5.ini +++ b/doc/guides/vdpadevs/features/mlx5.ini @@ -9,6 +9,7 @@ guest csum = Y host tso4 = Y host tso6 = Y version 1 = Y +log all = Y any layout = Y guest announce = Y mq = Y diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 2f70a98..4d1f528 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -12,6 +12,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_mem.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_event.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_virtq.c SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_steer.c +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa_lm.c # Basic CFLAGS. diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 2849178..2e521b8 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -16,6 +16,7 @@ sources = files( 'mlx5_vdpa_event.c', 'mlx5_vdpa_virtq.c', 'mlx5_vdpa_steer.c', + 'mlx5_vdpa_lm.c', ) cflags_options = [ '-std=c11', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 71189c4..4ce0ba0 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -19,7 +19,8 @@ (1ULL << VIRTIO_F_ANY_LAYOUT) | \ (1ULL << VIRTIO_NET_F_MQ) | \ (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \ - (1ULL << VIRTIO_F_ORDER_PLATFORM)) + (1ULL << VIRTIO_F_ORDER_PLATFORM) | \ + (1ULL << VHOST_F_LOG_ALL)) #define MLX5_VDPA_PROTOCOL_FEATURES \ ((1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \ @@ -127,6 +128,45 @@ return mlx5_vdpa_virtq_enable(virtq, state); } +static int +mlx5_vdpa_features_set(int vid) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + uint64_t log_base, log_size; + uint64_t features; + int ret; + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -EINVAL; + } + ret = rte_vhost_get_negotiated_features(vid, &features); + if (ret) { + DRV_LOG(ERR, "Failed to get negotiated features."); + return ret; + } + if (RTE_VHOST_NEED_LOG(features)) { + ret = rte_vhost_get_log_base(vid, &log_base, &log_size); + if (ret) { + DRV_LOG(ERR, "Failed to get log base."); + return ret; + } + ret = mlx5_vdpa_dirty_bitmap_set(priv, log_base, log_size); + if (ret) { + DRV_LOG(ERR, "Failed to set dirty bitmap."); + return ret; + } + DRV_LOG(INFO, "Enabling dirty logging."); + ret = mlx5_vdpa_logging_enable(priv, 1); + if (ret) { + DRV_LOG(ERR, "Failed to enable dirty logging."); + return ret; + } + } + return 0; +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .get_queue_num = mlx5_vdpa_get_queue_num, .get_features = mlx5_vdpa_get_vdpa_features, @@ -134,7 +174,7 @@ .dev_conf = NULL, .dev_close = NULL, .set_vring_state = mlx5_vdpa_set_vring_state, - .set_features = NULL, + .set_features = mlx5_vdpa_features_set, .migration_done = NULL, 
.get_vfio_group_fd = NULL, .get_vfio_device_fd = NULL, diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index af78ea1..70264e4 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -244,4 +244,59 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, */ int mlx5_vdpa_steer_setup(struct mlx5_vdpa_priv *priv); +/** + * Enable/disable live migration logging. + * + * @param[in] priv + * The vdpa driver private structure. + * @param[in] enable + * Set for enable, unset for disable. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_logging_enable(struct mlx5_vdpa_priv *priv, int enable); + +/** + * Set dirty bitmap logging to allow live migration. + * + * @param[in] priv + * The vdpa driver private structure. + * @param[in] log_base + * Vhost log base. + * @param[in] log_size + * Vhost log size. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_dirty_bitmap_set(struct mlx5_vdpa_priv *priv, uint64_t log_base, + uint64_t log_size); + +/** + * Log all virtqs information for live migration. + * + * @param[in] priv + * The vdpa driver private structure. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_lm_log(struct mlx5_vdpa_priv *priv); + +/** + * Modify the virtq state to ready or suspend. + * + * @param[in] virtq + * The vdpa driver private virtq structure. + * @param[in] state + * Set for ready, otherwise suspend. + * + * @return + * 0 on success, a negative value otherwise. + */ +int mlx5_vdpa_virtq_modify(struct mlx5_vdpa_virtq *virtq, int state); + #endif /* RTE_PMD_MLX5_VDPA_H_ */ diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c new file mode 100644 index 0000000..cfeec5f --- /dev/null +++ b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c @@ -0,0 +1,130 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2019 Mellanox Technologies, Ltd + */ +#include +#include + +#include "mlx5_vdpa_utils.h" +#include "mlx5_vdpa.h" + + +int +mlx5_vdpa_logging_enable(struct mlx5_vdpa_priv *priv, int enable) +{ + struct mlx5_devx_virtq_attr attr = { + .type = MLX5_VIRTQ_MODIFY_TYPE_DIRTY_BITMAP_DUMP_ENABLE, + .dirty_bitmap_dump_enable = enable, + }; + struct mlx5_vdpa_virtq *virtq; + + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + attr.queue_index = virtq->index; + if (mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr)) { + DRV_LOG(ERR, "Failed to modify virtq %d logging.", + virtq->index); + return -1; + } + } + return 0; +} + +int +mlx5_vdpa_dirty_bitmap_set(struct mlx5_vdpa_priv *priv, uint64_t log_base, + uint64_t log_size) +{ + struct mlx5_devx_mkey_attr mkey_attr = { + .addr = (uintptr_t)log_base, + .size = log_size, + .pd = priv->pdn, + .pg_access = 1, + .klm_array = NULL, + .klm_num = 0, + }; + struct mlx5_devx_virtq_attr attr = { + .type = MLX5_VIRTQ_MODIFY_TYPE_DIRTY_BITMAP_PARAMS, + .dirty_bitmap_addr = log_base, + .dirty_bitmap_size = log_size, + }; + struct mlx5_vdpa_query_mr *mr = rte_malloc(__func__, sizeof(*mr), 0); + struct mlx5_vdpa_virtq *virtq; + + if (!mr) { + DRV_LOG(ERR, "Failed to allocate mem for lm mr."); + return -1; + } + mr->umem = mlx5_glue->devx_umem_reg(priv->ctx, + (void *)(uintptr_t)log_base, + log_size, IBV_ACCESS_LOCAL_WRITE); + if (!mr->umem) { + DRV_LOG(ERR, "Failed to register umem for lm mr."); + goto err; + } + mkey_attr.umem_id = mr->umem->umem_id; + mr->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + if 
(!mr->mkey) { + DRV_LOG(ERR, "Failed to create Mkey for lm."); + goto err; + } + attr.dirty_bitmap_mkey = mr->mkey->id; + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + attr.queue_index = virtq->index; + if (mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr)) { + DRV_LOG(ERR, "Failed to modify virtq %d for lm.", + virtq->index); + goto err; + } + } + mr->is_indirect = 0; + SLIST_INSERT_HEAD(&priv->mr_list, mr, next); + return 0; +err: + if (mr->mkey) + mlx5_devx_cmd_destroy(mr->mkey); + if (mr->umem) + mlx5_glue->devx_umem_dereg(mr->umem); + rte_free(mr); + return -1; +} + +#define MLX5_VDPA_USED_RING_LEN(size) \ + ((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3) + +int +mlx5_vdpa_lm_log(struct mlx5_vdpa_priv *priv) +{ + struct mlx5_devx_virtq_attr attr = {0}; + struct mlx5_vdpa_virtq *virtq; + uint64_t features; + int ret = rte_vhost_get_negotiated_features(priv->vid, &features); + + if (ret) { + DRV_LOG(ERR, "Failed to get negotiated features."); + return -1; + } + if (RTE_VHOST_NEED_LOG(features)) { + SLIST_FOREACH(virtq, &priv->virtq_list, next) { + ret = mlx5_vdpa_virtq_modify(virtq, 0); + if (ret) + return -1; + if (mlx5_devx_cmd_query_virtq(virtq->virtq, &attr)) { + DRV_LOG(ERR, "Failed to query virtq %d.", + virtq->index); + return -1; + } + DRV_LOG(INFO, "Query vid %d vring %d: hw_available_idx=" + "%d, hw_used_index=%d", priv->vid, virtq->index, + attr.hw_available_index, attr.hw_used_index); + ret = rte_vhost_set_vring_base(priv->vid, virtq->index, + attr.hw_available_index, + attr.hw_used_index); + if (ret) { + DRV_LOG(ERR, "Failed to set virtq %d base.", + virtq->index); + return -1; + } + rte_vhost_log_used_vring(priv->vid, virtq->index, 0, + MLX5_VDPA_USED_RING_LEN(virtq->vq_size)); + } + } + return 0; +} diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index 91347e9..af058a1 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -112,7 +112,7 @@ priv->features = 0; } -static int +int mlx5_vdpa_virtq_modify(struct mlx5_vdpa_virtq *virtq, int state) { struct mlx5_devx_virtq_attr attr = { @@ -253,6 +253,11 @@ if (mlx5_vdpa_virtq_modify(virtq, 1)) goto error; virtq->enable = 1; + virtq->priv = priv; + /* Be sure notifications are not missed during configuration. */ + claim_zero(rte_vhost_enable_guest_notification(priv->vid, index, 1)); + rte_write32(virtq->index, priv->virtq_db_addr); + /* Setup doorbell mapping. 
*/ virtq->intr_handle.fd = vq.kickfd; virtq->intr_handle.type = RTE_INTR_HANDLE_EXT; if (rte_intr_callback_register(&virtq->intr_handle, From patchwork Wed Jan 29 10:09:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65301 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3A3A2A0531; Wed, 29 Jan 2020 11:11:59 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 075FD1C02D; Wed, 29 Jan 2020 11:10:34 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 978F81BFCD for ; Wed, 29 Jan 2020 11:10:30 +0100 (CET) Received: from Internal Mail-Server by MTLPINE2 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:10:26 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHS032108; Wed, 29 Jan 2020 12:10:25 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:08 +0000 Message-Id: <1580292549-27439-13-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 12/13] vdpa/mlx5: support close and config operations X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Support the dev_conf and dev_close operations. These operations allow starting and stopping vDPA traffic.
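The reconfiguration contract implemented below can be summarized by this sketch; dev_state, setup() and teardown() are illustrative names, not the driver's API:

    struct dev_state {
        int configured;
        int vid;
    };

    int setup(struct dev_state *d);    /* register memory, create virtqs, ... */
    int teardown(struct dev_state *d); /* release all HW resources */

    /*
     * dev_conf must be safe to call on an already-configured device:
     * close the previous state first so setup() starts from a clean
     * slate, and roll back on any setup failure.
     */
    static int
    dev_config(struct dev_state *d, int vid)
    {
        if (d->configured && teardown(d))
            return -1;
        d->vid = vid;
        if (setup(d)) {
            teardown(d);
            return -1;
        }
        d->configured = 1;
        return 0;
    }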
Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/mlx5_vdpa.c | 58 ++++++++++++++++++++++++++++++++++++++++--- drivers/vdpa/mlx5/mlx5_vdpa.h | 1 + 2 files changed, 55 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index 4ce0ba0..c8ef3b4 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -167,12 +167,59 @@ return 0; } +static int +mlx5_vdpa_dev_close(int vid) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + int ret = 0; + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -1; + } + if (priv->configured) + ret |= mlx5_vdpa_lm_log(priv); + mlx5_vdpa_cqe_event_unset(priv); + ret |= mlx5_vdpa_steer_unset(priv); + mlx5_vdpa_virtqs_release(priv); + mlx5_vdpa_event_qp_global_release(priv); + mlx5_vdpa_mem_dereg(priv); + priv->configured = 0; + priv->vid = 0; + return ret; +} + +static int +mlx5_vdpa_dev_config(int vid) +{ + int did = rte_vhost_get_vdpa_device_id(vid); + struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did); + + if (priv == NULL) { + DRV_LOG(ERR, "Invalid device id: %d.", did); + return -EINVAL; + } + if (priv->configured && mlx5_vdpa_dev_close(vid)) { + DRV_LOG(ERR, "Failed to reconfigure vid %d.", vid); + return -1; + } + priv->vid = vid; + if (mlx5_vdpa_mem_register(priv) || mlx5_vdpa_virtqs_prepare(priv) || + mlx5_vdpa_steer_setup(priv) || mlx5_vdpa_cqe_event_setup(priv)) { + mlx5_vdpa_dev_close(vid); + return -1; + } + priv->configured = 1; + return 0; +} + static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .get_queue_num = mlx5_vdpa_get_queue_num, .get_features = mlx5_vdpa_get_vdpa_features, .get_protocol_features = mlx5_vdpa_get_protocol_features, - .dev_conf = NULL, - .dev_close = NULL, + .dev_conf = mlx5_vdpa_dev_config, + .dev_close = mlx5_vdpa_dev_close, .set_vring_state = mlx5_vdpa_set_vring_state, .set_features = mlx5_vdpa_features_set, .migration_done = NULL, @@ -320,12 +367,15 @@ break; } } - if (found) { + if (found) TAILQ_REMOVE(&priv_list, priv, next); + pthread_mutex_unlock(&priv_list_lock); + if (found) { + if (priv->configured) + mlx5_vdpa_dev_close(priv->vid); mlx5_glue->close_device(priv->ctx); rte_free(priv); } - pthread_mutex_unlock(&priv_list_lock); return 0; } diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 70264e4..75e96d6 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -92,6 +92,7 @@ struct mlx5_vdpa_steer { struct mlx5_vdpa_priv { TAILQ_ENTRY(mlx5_vdpa_priv) next; + uint8_t configured; int id; /* vDPA device id. */ int vid; /* vhost device id. */ struct ibv_context *ctx; /* Device context. 
*/ From patchwork Wed Jan 29 10:09:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Azrad X-Patchwork-Id: 65300 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id CC9EEA0531; Wed, 29 Jan 2020 11:11:50 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 519791BFD6; Wed, 29 Jan 2020 11:10:31 +0100 (CET) Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 8144F1C029 for ; Wed, 29 Jan 2020 11:10:29 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from asafp@mellanox.com) with ESMTPS (AES256-SHA encrypted); 29 Jan 2020 12:10:27 +0200 Received: from pegasus07.mtr.labs.mlnx (pegasus07.mtr.labs.mlnx [10.210.16.112]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 00TA9BHT032108; Wed, 29 Jan 2020 12:10:27 +0200 From: Matan Azrad To: dev@dpdk.org, Viacheslav Ovsiienko Cc: Maxime Coquelin Date: Wed, 29 Jan 2020 10:09:09 +0000 Message-Id: <1580292549-27439-14-git-send-email-matan@mellanox.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1580292549-27439-1-git-send-email-matan@mellanox.com> References: <1579539790-3882-1-git-send-email-matan@mellanox.com> <1580292549-27439-1-git-send-email-matan@mellanox.com> Subject: [dpdk-dev] [PATCH v2 13/13] vdpa/mlx5: disable ROCE X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" In order to support virtio queue creation by the FW, ROCE mode must be disabled in the device. Disable it through netlink, which is the equivalent of the following devlink tool commands: 1. devlink dev param set pci/[pci] name enable_roce value false cmode driverinit 2. devlink dev reload pci/[pci] Or through sysfs, the equivalent of: echo 0 > /sys/bus/pci/devices/[pci]/roce_enable The IB device is matched again after ROCE is disabled. Signed-off-by: Matan Azrad Acked-by: Viacheslav Ovsiienko Acked-by: Maxime Coquelin --- drivers/vdpa/mlx5/Makefile | 2 +- drivers/vdpa/mlx5/meson.build | 2 +- drivers/vdpa/mlx5/mlx5_vdpa.c | 192 ++++++++++++++++++++++++++++++++++-------- 3 files changed, 161 insertions(+), 35 deletions(-) diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile index 4d1f528..5cdcdd4 100644 --- a/drivers/vdpa/mlx5/Makefile +++ b/drivers/vdpa/mlx5/Makefile @@ -29,7 +29,7 @@ CFLAGS += -D_XOPEN_SOURCE=600 CFLAGS += $(WERROR_FLAGS) CFLAGS += -Wno-strict-prototypes LDLIBS += -lrte_common_mlx5 -LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci -lrte_sched +LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_pci -lrte_bus_pci -lrte_sched # A few warnings cannot be avoided in external headers. 
CFLAGS += -Wno-error=cast-qual diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build index 2e521b8..e5236b3 100644 --- a/drivers/vdpa/mlx5/meson.build +++ b/drivers/vdpa/mlx5/meson.build @@ -9,7 +9,7 @@ endif fmt_name = 'mlx5_vdpa' allow_experimental_apis = true -deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal', 'sched'] +deps += ['hash', 'common_mlx5', 'vhost', 'pci', 'bus_pci', 'eal', 'sched'] sources = files( 'mlx5_vdpa.c', 'mlx5_vdpa_mem.c', diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index c8ef3b4..e098be9 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -1,15 +1,19 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright 2019 Mellanox Technologies, Ltd */ +#include + #include #include #include #include +#include #include #include #include #include +#include #include "mlx5_vdpa_utils.h" #include "mlx5_vdpa.h" @@ -228,6 +232,145 @@ .get_notify_area = NULL, }; +static struct ibv_device * +mlx5_vdpa_get_ib_device_match(struct rte_pci_addr *addr) +{ + int n; + struct ibv_device **ibv_list = mlx5_glue->get_device_list(&n); + struct ibv_device *ibv_match = NULL; + + if (!ibv_list) { + rte_errno = ENOSYS; + return NULL; + } + while (n-- > 0) { + struct rte_pci_addr pci_addr; + + DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[n]->name); + if (mlx5_dev_to_pci_addr(ibv_list[n]->ibdev_path, &pci_addr)) + continue; + if (memcmp(addr, &pci_addr, sizeof(pci_addr))) + continue; + ibv_match = ibv_list[n]; + break; + } + if (!ibv_match) + rte_errno = ENOENT; + mlx5_glue->free_device_list(ibv_list); + return ibv_match; +} + +/* Try to disable ROCE by Netlink/Devlink. */ +static int +mlx5_vdpa_nl_roce_disable(const char *addr) +{ + int nlsk_fd = mlx5_nl_init(NETLINK_GENERIC); + int devlink_id; + int enable; + int ret; + + if (nlsk_fd < 0) + return nlsk_fd; + devlink_id = mlx5_nl_devlink_family_id_get(nlsk_fd); + if (devlink_id < 0) { + ret = devlink_id; + DRV_LOG(DEBUG, "Failed to get devlink id for ROCE operations by" + " Netlink."); + goto close; + } + ret = mlx5_nl_enable_roce_get(nlsk_fd, devlink_id, addr, &enable); + if (ret) { + DRV_LOG(DEBUG, "Failed to get ROCE enable by Netlink: %d.", + ret); + goto close; + } else if (!enable) { + DRV_LOG(INFO, "ROCE is already disabled (Netlink)."); + goto close; + } + ret = mlx5_nl_enable_roce_set(nlsk_fd, devlink_id, addr, 0); + if (ret) + DRV_LOG(DEBUG, "Failed to disable ROCE by Netlink: %d.", ret); + else + DRV_LOG(INFO, "ROCE is disabled by Netlink successfully."); +close: + close(nlsk_fd); + return ret; +} + +/* Try to disable ROCE by sysfs. 
*/ +static int +mlx5_vdpa_sys_roce_disable(const char *addr) +{ + FILE *file_o; + int enable; + int ret; + + MKSTR(file_p, "/sys/bus/pci/devices/%s/roce_enable", addr); + file_o = fopen(file_p, "rb"); + if (!file_o) { + rte_errno = ENOTSUP; + return -ENOTSUP; + } + ret = fscanf(file_o, "%d", &enable); + if (ret != 1) { + rte_errno = EINVAL; + ret = EINVAL; + goto close; + } else if (!enable) { + ret = 0; + DRV_LOG(INFO, "ROCE is already disabled (sysfs)."); + goto close; + } + fclose(file_o); + file_o = fopen(file_p, "wb"); + if (!file_o) { + rte_errno = ENOTSUP; + return -ENOTSUP; + } + fprintf(file_o, "0\n"); + ret = 0; +close: + if (ret) + DRV_LOG(DEBUG, "Failed to disable ROCE by sysfs: %d.", ret); + else + DRV_LOG(INFO, "ROCE is disabled by sysfs successfully."); + fclose(file_o); + return ret; +} + +#define MLX5_VDPA_MAX_RETRIES 20 +#define MLX5_VDPA_USEC 1000 +static int +mlx5_vdpa_roce_disable(struct rte_pci_addr *addr, struct ibv_device **ibv) +{ + char addr_name[64] = {0}; + + rte_pci_device_name(addr, addr_name, sizeof(addr_name)); + /* Firstly try to disable ROCE by Netlink and fallback to sysfs. */ + if (mlx5_vdpa_nl_roce_disable(addr_name) == 0 || + mlx5_vdpa_sys_roce_disable(addr_name) == 0) { + /* + * Succeed to disable ROCE, wait for the IB device to appear + * again after reload. + */ + int r; + struct ibv_device *ibv_new; + + for (r = MLX5_VDPA_MAX_RETRIES; r; r--) { + ibv_new = mlx5_vdpa_get_ib_device_match(addr); + if (ibv_new) { + *ibv = ibv_new; + return 0; + } + usleep(MLX5_VDPA_USEC); + } + DRV_LOG(ERR, "Cannot match device %s after ROCE disable, " + "retries exceeded %d", addr_name, MLX5_VDPA_MAX_RETRIES); + rte_errno = EAGAIN; + } + return -rte_errno; +} + /** * DPDK callback to register a PCI device. * @@ -245,8 +388,7 @@ mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_pci_device *pci_dev __rte_unused) { - struct ibv_device **ibv_list; - struct ibv_device *ibv_match = NULL; + struct ibv_device *ibv; struct mlx5_vdpa_priv *priv = NULL; struct ibv_context *ctx = NULL; struct mlx5_hca_attr attr; @@ -257,42 +399,26 @@ " driver."); return 1; } - errno = 0; - ibv_list = mlx5_glue->get_device_list(&ret); - if (!ibv_list) { - rte_errno = errno; - DRV_LOG(ERR, "Failed to get device list, is ib_uverbs loaded?"); - return -ENOSYS; - } - while (ret-- > 0) { - struct rte_pci_addr pci_addr; - - DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[ret]->name); - if (mlx5_dev_to_pci_addr(ibv_list[ret]->ibdev_path, &pci_addr)) - continue; - if (pci_dev->addr.domain != pci_addr.domain || - pci_dev->addr.bus != pci_addr.bus || - pci_dev->addr.devid != pci_addr.devid || - pci_dev->addr.function != pci_addr.function) - continue; - DRV_LOG(INFO, "PCI information matches for device \"%s\".", - ibv_list[ret]->name); - ibv_match = ibv_list[ret]; - break; - } - mlx5_glue->free_device_list(ibv_list); - if (!ibv_match) { + ibv = mlx5_vdpa_get_ib_device_match(&pci_dev->addr); + if (!ibv) { DRV_LOG(ERR, "No matching IB device for PCI slot " - "%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 ".", - pci_dev->addr.domain, pci_dev->addr.bus, - pci_dev->addr.devid, pci_dev->addr.function); + PCI_PRI_FMT ".", pci_dev->addr.domain, + pci_dev->addr.bus, pci_dev->addr.devid, + pci_dev->addr.function); rte_errno = ENOENT; return -rte_errno; + } else { + DRV_LOG(INFO, "PCI information matches for device \"%s\".", + ibv->name); + } + if (mlx5_vdpa_roce_disable(&pci_dev->addr, &ibv) != 0) { + DRV_LOG(WARNING, "Failed to disable ROCE for \"%s\".", + ibv->name); + /* return -rte_errno; */ + } 
- ctx = mlx5_glue->dv_open_device(ibv_match); + ctx = mlx5_glue->dv_open_device(ibv); if (!ctx) { - DRV_LOG(ERR, "Failed to open IB device \"%s\".", - ibv_match->name); + DRV_LOG(ERR, "Failed to open IB device \"%s\".", ibv->name); rte_errno = ENODEV; return -rte_errno; }
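For reference, the disable-and-rematch flow used in the probe path above boils down to the following sketch; sysfs_disable_roce() and find_ib_device() are illustrative placeholders for the sysfs write and the ibv device-list scan shown in the patch:

    #include <stdio.h>
    #include <unistd.h>

    int find_ib_device(const char *pci_addr, void **ibv); /* placeholder */

    /* Write "0" to the device's roce_enable sysfs attribute. */
    static int
    sysfs_disable_roce(const char *pci_addr)
    {
        char path[128];
        FILE *f;

        snprintf(path, sizeof(path),
                 "/sys/bus/pci/devices/%s/roce_enable", pci_addr);
        f = fopen(path, "wb");
        if (f == NULL)
            return -1;
        fprintf(f, "0\n");
        fclose(f);
        return 0;
    }

    /* ROCE disabling reloads the IB device; poll until it reappears. */
    static int
    disable_roce_and_rematch(const char *pci_addr, void **ibv)
    {
        int retries = 20;

        if (sysfs_disable_roce(pci_addr))
            return -1;
        while (retries--) {
            if (find_ib_device(pci_addr, ibv) == 0)
                return 0;
            usleep(1000);
        }
        return -1;
    }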