diff mbox series

[v2,01/13] drivers: introduce mlx5 vDPA driver

Message ID 1580292549-27439-2-git-send-email-matan@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Maxime Coquelin
Headers show
Series Introduce mlx5 vDPA driver | expand

Checks

Context Check Description
ci/Intel-compilation fail apply issues
ci/checkpatch warning coding style issues

Commit Message

Matan Azrad Jan. 29, 2020, 10:08 a.m. UTC
Add a new driver to support vDPA operations by Mellanox devices.

The first Mellanox devices which support vDPA operations are
ConnectX6DX and Bluefield1 HCA for their PF ports and VF ports.

This driver is depending on rdma-core like the mlx5 PMD, also it is
going to use mlx5 DevX to create HW objects directly by the FW.
Hence, the common/mlx5 library is linked to the mlx5_vdpa driver.

This driver will not be compiled by default due to the above
dependencies.

Register a new log type for this driver.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 MAINTAINERS                                     |   7 +
 config/common_base                              |   5 +
 doc/guides/rel_notes/release_20_02.rst          |   5 +
 doc/guides/vdpadevs/features/mlx5.ini           |  14 ++
 doc/guides/vdpadevs/index.rst                   |   1 +
 doc/guides/vdpadevs/mlx5.rst                    | 111 ++++++++++++
 drivers/common/Makefile                         |   2 +-
 drivers/common/mlx5/Makefile                    |  17 +-
 drivers/meson.build                             |   8 +-
 drivers/vdpa/Makefile                           |   2 +
 drivers/vdpa/meson.build                        |   3 +-
 drivers/vdpa/mlx5/Makefile                      |  36 ++++
 drivers/vdpa/mlx5/meson.build                   |  29 +++
 drivers/vdpa/mlx5/mlx5_vdpa.c                   | 227 ++++++++++++++++++++++++
 drivers/vdpa/mlx5/mlx5_vdpa_utils.h             |  20 +++
 drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map |   3 +
 mk/rte.app.mk                                   |  15 +-
 17 files changed, 488 insertions(+), 17 deletions(-)
 create mode 100644 doc/guides/vdpadevs/features/mlx5.ini
 create mode 100644 doc/guides/vdpadevs/mlx5.rst
 create mode 100644 drivers/vdpa/mlx5/Makefile
 create mode 100644 drivers/vdpa/mlx5/meson.build
 create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa.c
 create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_utils.h
 create mode 100644 drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map

Comments

Maxime Coquelin Jan. 30, 2020, 2:38 p.m. UTC | #1
On 1/29/20 11:08 AM, Matan Azrad wrote:
> Add a new driver to support vDPA operations by Mellanox devices.
> 
> The first Mellanox devices which support vDPA operations are
> ConnectX6DX and Bluefield1 HCA for their PF ports and VF ports.
> 
> This driver is depending on rdma-core like the mlx5 PMD, also it is
> going to use mlx5 DevX to create HW objects directly by the FW.
> Hence, the common/mlx5 library is linked to the mlx5_vdpa driver.

If possible, I would really appreciate to have the information on
the versions required for the above dependencies. Better if it is also
mentionned in the guide.

> This driver will not be compiled by default due to the above
> dependencies.
> 
> Register a new log type for this driver.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
>  MAINTAINERS                                     |   7 +
>  config/common_base                              |   5 +
>  doc/guides/rel_notes/release_20_02.rst          |   5 +
>  doc/guides/vdpadevs/features/mlx5.ini           |  14 ++
>  doc/guides/vdpadevs/index.rst                   |   1 +
>  doc/guides/vdpadevs/mlx5.rst                    | 111 ++++++++++++
>  drivers/common/Makefile                         |   2 +-
>  drivers/common/mlx5/Makefile                    |  17 +-
>  drivers/meson.build                             |   8 +-
>  drivers/vdpa/Makefile                           |   2 +
>  drivers/vdpa/meson.build                        |   3 +-
>  drivers/vdpa/mlx5/Makefile                      |  36 ++++
>  drivers/vdpa/mlx5/meson.build                   |  29 +++
>  drivers/vdpa/mlx5/mlx5_vdpa.c                   | 227 ++++++++++++++++++++++++
>  drivers/vdpa/mlx5/mlx5_vdpa_utils.h             |  20 +++
>  drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map |   3 +
>  mk/rte.app.mk                                   |  15 +-
>  17 files changed, 488 insertions(+), 17 deletions(-)
>  create mode 100644 doc/guides/vdpadevs/features/mlx5.ini
>  create mode 100644 doc/guides/vdpadevs/mlx5.rst
>  create mode 100644 drivers/vdpa/mlx5/Makefile
>  create mode 100644 drivers/vdpa/mlx5/meson.build
>  create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa.c
>  create mode 100644 drivers/vdpa/mlx5/mlx5_vdpa_utils.h
>  create mode 100644 drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 150d507..f697e9a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1103,6 +1103,13 @@ F: drivers/vdpa/ifc/
>  F: doc/guides/vdpadevs/ifc.rst
>  F: doc/guides/vdpadevs/features/ifcvf.ini
>  
> +Mellanox mlx5 vDPA
> +M: Matan Azrad <matan@mellanox.com>
> +M: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> +F: drivers/vdpa/mlx5/
> +F: doc/guides/vdpadevs/mlx5.rst
> +F: doc/guides/vdpadevs/features/mlx5.ini
> +
>  
>  Eventdev Drivers
>  ----------------
> diff --git a/config/common_base b/config/common_base
> index c897dd0..6ea9c63 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -366,6 +366,11 @@ CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
>  CONFIG_RTE_LIBRTE_MLX5_PMD=n
>  CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
>  
> +#
> +# Compile vdpa-oriented Mellanox ConnectX-6 & Bluefield (MLX5) PMD
> +#
> +CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=n
> +
>  # Linking method for mlx4/5 dependency on ibverbs and related libraries
>  # Default linking is dynamic by linker.
>  # Other options are: dynamic by dlopen at run-time, or statically embedded.
> diff --git a/doc/guides/rel_notes/release_20_02.rst b/doc/guides/rel_notes/release_20_02.rst
> index 50e2c14..690e7db 100644
> --- a/doc/guides/rel_notes/release_20_02.rst
> +++ b/doc/guides/rel_notes/release_20_02.rst
> @@ -113,6 +113,11 @@ New Features
>    * Added support for RSS using L3/L4 source/destination only.
>    * Added support for matching on GTP tunnel header item.
>  
> +* **Add new vDPA PMD based on Mellanox devices**
> +
> +  Added a new Mellanox vDPA  (``mlx5_vdpa``) PMD.
> +  See the :doc:`../vdpadevs/mlx5` guide for more details on this driver.
> +
>  * **Updated testpmd application.**
>  
>    Added support for ESP and L2TPv3 over IP rte_flow patterns to the testpmd
> diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini
> new file mode 100644
> index 0000000..d635bdf
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features/mlx5.ini
> @@ -0,0 +1,14 @@
> +;
> +; Supported features of the 'mlx5' VDPA driver.
> +;
> +; Refer to default.ini for the full list of available driver features.
> +;
> +[Features]
> +Other kdrv           = Y
> +ARMv8                = Y
> +Power8               = Y
> +x86-32               = Y
> +x86-64               = Y
> +Usage doc            = Y
> +Design doc           = Y
> +
> diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
> index 9657108..1a13efe 100644
> --- a/doc/guides/vdpadevs/index.rst
> +++ b/doc/guides/vdpadevs/index.rst
> @@ -13,3 +13,4 @@ which can be used from an application through vhost API.
>  
>      features_overview
>      ifc
> +    mlx5
> diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
> new file mode 100644
> index 0000000..1861e71
> --- /dev/null
> +++ b/doc/guides/vdpadevs/mlx5.rst
> @@ -0,0 +1,111 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright 2019 Mellanox Technologies, Ltd
> +
> +MLX5 vDPA driver
> +================
> +
> +The MLX5 vDPA (vhost data path acceleration) driver library
> +(**librte_pmd_mlx5_vdpa**) provides support for **Mellanox ConnectX-6**,
> +**Mellanox ConnectX-6DX** and **Mellanox BlueField** families of
> +10/25/40/50/100/200 Gb/s adapters as well as their virtual functions (VF) in
> +SR-IOV context.
> +
> +.. note::
> +
> +   Due to external dependencies, this driver is disabled in default
> +   configuration of the "make" build. It can be enabled with
> +   ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=y`` or by using "meson" build system which
> +   will detect dependencies.
> +
> +
> +Design
> +------
> +
> +For security reasons and robustness, this driver only deals with virtual
> +memory addresses. The way resources allocations are handled by the kernel,
> +combined with hardware specifications that allow to handle virtual memory
> +addresses directly, ensure that DPDK applications cannot access random
> +physical memory (or memory that does not belong to the current process).
> +
> +The PMD can use libibverbs and libmlx5 to access the device firmware
> +or directly the hardware components.
> +There are different levels of objects and bypassing abilities
> +to get the best performances:
> +
> +- Verbs is a complete high-level generic API
> +- Direct Verbs is a device-specific API
> +- DevX allows to access firmware objects
> +- Direct Rules manages flow steering at low-level hardware layer
> +
> +Enabling librte_pmd_mlx5_vdpa causes DPDK applications to be linked against
> +libibverbs.
> +
> +A Mellanox mlx5 PCI device can be probed by either net/mlx5 driver or vdpa/mlx5
> +driver but not in parallel. Hence, the user should decide the driver by the
> +``class`` parameter in the device argument list.
> +By default, the mlx5 device will be probed by the net/mlx5 driver. 
> +
> +Supported NICs
> +--------------
> +
> +* Mellanox(R) ConnectX(R)-6 200G MCX654106A-HCAT (4x200G)
> +* Mellanox(R) ConnectX(R)-6DX EN 100G MCX623106AN-CDAT (2*100G)
> +* Mellanox(R) ConnectX(R)-6DX EN 200G MCX623105AN-VDAT (1*200G)
> +* Mellanox(R) BlueField SmartNIC 25G MBF1M332A-ASCAT (2*25G)
> +
> +Prerequisites
> +-------------
> +
> +- Mellanox OFED version: **4.7**
> +  see :doc:`../../nics/mlx5` guide for more Mellanox OFED details.
> +
> +Compilation options
> +~~~~~~~~~~~~~~~~~~~
> +
> +These options can be modified in the ``.config`` file.
> +
> +- ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` (default **n**)
> +
> +  Toggle compilation of librte_pmd_mlx5 itself.
> +
> +- ``CONFIG_RTE_IBVERBS_LINK_DLOPEN`` (default **n**)
> +
> +  Build PMD with additional code to make it loadable without hard
> +  dependencies on **libibverbs** nor **libmlx5**, which may not be installed
> +  on the target system.
> +
> +  In this mode, their presence is still required for it to run properly,
> +  however their absence won't prevent a DPDK application from starting (with
> +  ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
> +  missing with ``ldd(1)``.
> +
> +  It works by moving these dependencies to a purpose-built rdma-core "glue"
> +  plug-in which must either be installed in a directory whose name is based
> +  on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if set, or in a
> +  standard location for the dynamic linker (e.g. ``/lib``) if left to the
> +  default empty string (``""``).
> +
> +  This option has no performance impact.
> +
> +- ``CONFIG_RTE_IBVERBS_LINK_STATIC`` (default **n**)
> +
> +  Embed static flavor of the dependencies **libibverbs** and **libmlx5**
> +  in the PMD shared library or the executable static binary.
> +
> +.. note::
> +
> +   For BlueField, target should be set to ``arm64-bluefield-linux-gcc``. This
> +   will enable ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` and set
> +   ``RTE_CACHE_LINE_SIZE`` to 64. Default armv8a configuration of make build and
> +   meson build set it to 128 then brings performance degradation.
> +
> +Run-time configuration
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +- **ethtool** operations on related kernel interfaces also affect the PMD.
> +
> +- ``class`` parameter [string]
> +
> +  Select the class of the driver that should probe the device.
> +  `vdpa` for the mlx5 vDPA driver.
> +
> diff --git a/drivers/common/Makefile b/drivers/common/Makefile
> index 4775d4b..96bd7ac 100644
> --- a/drivers/common/Makefile
> +++ b/drivers/common/Makefile
> @@ -35,7 +35,7 @@ ifneq (,$(findstring y,$(IAVF-y)))
>  DIRS-y += iavf
>  endif
>  
> -ifeq ($(CONFIG_RTE_LIBRTE_MLX5_PMD),y)
> +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
>  DIRS-y += mlx5
>  endif
>  
> diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
> index 9d4d81f..c4b7999 100644
> --- a/drivers/common/mlx5/Makefile
> +++ b/drivers/common/mlx5/Makefile
> @@ -10,15 +10,16 @@ LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
>  LIB_GLUE_VERSION = 20.02.0
>  
>  # Sources.
> +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
>  ifneq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
> -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue.c
> +SRCS-y += mlx5_glue.c
>  endif
> -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
> -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_common.c
> -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
> -
> +SRCS-y += mlx5_devx_cmds.c
> +SRCS-y += mlx5_common.c
> +SRCS-y += mlx5_nl.c
>  ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
> -INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
> +INSTALL-y-lib += $(LIB_GLUE)
> +endif
>  endif
>  
>  # Basic CFLAGS.
> @@ -317,7 +318,9 @@ mlx5_autoconf.h: mlx5_autoconf.h.new
>  		cmp '$<' '$@' $(AUTOCONF_OUTPUT) || \
>  		mv '$<' '$@'
>  
> -$(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
> +ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
> +$(SRCS-y:.c=.o): mlx5_autoconf.h
> +endif
>  
>  # Generate dependency plug-in for rdma-core when the PMD must not be linked
>  # directly, so that applications do not inherit this dependency.
> diff --git a/drivers/meson.build b/drivers/meson.build
> index 29708cc..bd154fa 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -42,6 +42,7 @@ foreach class:dpdk_driver_classes
>  		build = true # set to false to disable, e.g. missing deps
>  		reason = '<unknown reason>' # set if build == false to explain
>  		name = drv
> +		fmt_name = ''
>  		allow_experimental_apis = false
>  		sources = []
>  		objs = []
> @@ -98,8 +99,11 @@ foreach class:dpdk_driver_classes
>  		else
>  			class_drivers += name
>  
> -			dpdk_conf.set(config_flag_fmt.format(name.to_upper()),1)
> -			lib_name = driver_name_fmt.format(name)
> +			if fmt_name == ''
> +				fmt_name = name
> +			endif
> +			dpdk_conf.set(config_flag_fmt.format(fmt_name.to_upper()),1)
> +			lib_name = driver_name_fmt.format(fmt_name)
>  
>  			if allow_experimental_apis
>  				cflags += '-DALLOW_EXPERIMENTAL_API'
> diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> index b5a7a11..6e88359 100644
> --- a/drivers/vdpa/Makefile
> +++ b/drivers/vdpa/Makefile
> @@ -7,4 +7,6 @@ ifeq ($(CONFIG_RTE_EAL_VFIO),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
>  endif
>  
> +DIRS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5
> +
>  include $(RTE_SDK)/mk/rte.subdir.mk
> diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build
> index 2f047b5..e3ed54a 100644
> --- a/drivers/vdpa/meson.build
> +++ b/drivers/vdpa/meson.build
> @@ -1,7 +1,8 @@
>  # SPDX-License-Identifier: BSD-3-Clause
>  # Copyright 2019 Mellanox Technologies, Ltd
>  
> -drivers = ['ifc']
> +drivers = ['ifc',
> +	   'mlx5',]
>  std_deps = ['bus_pci', 'kvargs']
>  std_deps += ['vhost']
>  config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
> diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile
> new file mode 100644
> index 0000000..c1c8cc0
> --- /dev/null
> +++ b/drivers/vdpa/mlx5/Makefile
> @@ -0,0 +1,36 @@
> +#   SPDX-License-Identifier: BSD-3-Clause
> +#   Copyright 2019 Mellanox Technologies, Ltd
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# Library name.
> +LIB = librte_pmd_mlx5_vdpa.a
> +
> +# Sources.
> +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c
> +
> +# Basic CFLAGS.
> +CFLAGS += -O3
> +CFLAGS += -std=c11 -Wall -Wextra
> +CFLAGS += -g
> +CFLAGS += -I$(RTE_SDK)/drivers/common/mlx5
> +CFLAGS += -I$(RTE_SDK)/drivers/net/mlx5_vdpa
> +CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5
> +CFLAGS += -D_BSD_SOURCE
> +CFLAGS += -D_DEFAULT_SOURCE
> +CFLAGS += -D_XOPEN_SOURCE=600
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -Wno-strict-prototypes
> +LDLIBS += -lrte_common_mlx5
> +LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci
> +
> +# A few warnings cannot be avoided in external headers.
> +CFLAGS += -Wno-error=cast-qual
> +
> +EXPORT_MAP := rte_pmd_mlx5_vdpa_version.map
> +# memseg walk is not part of stable API
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +
> +CFLAGS += -DNDEBUG -UPEDANTIC
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build
> new file mode 100644
> index 0000000..4bca6ea
> --- /dev/null
> +++ b/drivers/vdpa/mlx5/meson.build
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2019 Mellanox Technologies, Ltd
> +
> +if not is_linux
> +	build = false
> +	reason = 'only supported on Linux'
> +	subdir_done()
> +endif
> +
> +fmt_name = 'mlx5_vdpa'
> +allow_experimental_apis = true
> +deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal']
> +sources = files(
> +	'mlx5_vdpa.c',
> +)
> +cflags_options = [
> +	'-std=c11',
> +	'-Wno-strict-prototypes',
> +	'-D_BSD_SOURCE',
> +	'-D_DEFAULT_SOURCE',
> +	'-D_XOPEN_SOURCE=600'
> +]
> +foreach option:cflags_options
> +	if cc.has_argument(option)
> +		cflags += option
> +	endif
> +endforeach
> +
> +cflags += [ '-DNDEBUG', '-UPEDANTIC' ]
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
> new file mode 100644
> index 0000000..6286d7a
> --- /dev/null
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> @@ -0,0 +1,227 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2019 Mellanox Technologies, Ltd
> + */
> +#include <rte_malloc.h>
> +#include <rte_log.h>
> +#include <rte_errno.h>
> +#include <rte_bus_pci.h>
> +#include <rte_vdpa.h>
> +
> +#include <mlx5_glue.h>
> +#include <mlx5_common.h>
> +
> +#include "mlx5_vdpa_utils.h"
> +
> +
> +struct mlx5_vdpa_priv {
> +	TAILQ_ENTRY(mlx5_vdpa_priv) next;
> +	int id; /* vDPA device id. */
> +	struct ibv_context *ctx; /* Device context. */
> +	struct rte_vdpa_dev_addr dev_addr;
> +};
> +
> +TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
> +					      TAILQ_HEAD_INITIALIZER(priv_list);
> +static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
> +int mlx5_vdpa_logtype;
> +
> +static struct rte_vdpa_dev_ops mlx5_vdpa_ops = {
> +	.get_queue_num = NULL,
> +	.get_features = NULL,
> +	.get_protocol_features = NULL,
> +	.dev_conf = NULL,
> +	.dev_close = NULL,
> +	.set_vring_state = NULL,
> +	.set_features = NULL,
> +	.migration_done = NULL,
> +	.get_vfio_group_fd = NULL,
> +	.get_vfio_device_fd = NULL,
> +	.get_notify_area = NULL,
> +};
> +
> +/**
> + * DPDK callback to register a PCI device.
> + *
> + * This function spawns vdpa device out of a given PCI device.
> + *
> + * @param[in] pci_drv
> + *   PCI driver structure (mlx5_vpda_driver).
> + * @param[in] pci_dev
> + *   PCI device information.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> +		    struct rte_pci_device *pci_dev __rte_unused)
> +{
> +	struct ibv_device **ibv_list;
> +	struct ibv_device *ibv_match = NULL;
> +	struct mlx5_vdpa_priv *priv = NULL;
> +	struct ibv_context *ctx = NULL;
> +	int ret;
> +
> +	if (mlx5_class_get(pci_dev->device.devargs) != MLX5_CLASS_VDPA) {
> +		DRV_LOG(DEBUG, "Skip probing - should be probed by other mlx5"
> +			" driver.");
> +		return 1;

Maybe the function doc should be implemented for return 1 case:

* @return
*   0 on success, a negative errno value otherwise and rte_errno is set.

> +	}
> +	errno = 0;
> +	ibv_list = mlx5_glue->get_device_list(&ret);
> +	if (!ibv_list) {
> +		rte_errno = errno;
> +		DRV_LOG(ERR, "Failed to get device list, is ib_uverbs loaded?");
> +		return -ENOSYS;

Shouldn't return -rte_errno to be consistent with the rest of the
function?

For the sake of consistency, you could also goto error instead.

> +	}
> +	while (ret-- > 0) {
> +		struct rte_pci_addr pci_addr;
> +
> +		DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[ret]->name);
> +		if (mlx5_dev_to_pci_addr(ibv_list[ret]->ibdev_path, &pci_addr))
> +			continue;
> +		if (pci_dev->addr.domain != pci_addr.domain ||
> +		    pci_dev->addr.bus != pci_addr.bus ||
> +		    pci_dev->addr.devid != pci_addr.devid ||
> +		    pci_dev->addr.function != pci_addr.function)
> +			continue;
> +		DRV_LOG(INFO, "PCI information matches for device \"%s\".",
> +			ibv_list[ret]->name);
> +		ibv_match = ibv_list[ret];
> +		break;
> +	}
> +	mlx5_glue->free_device_list(ibv_list);
> +	if (!ibv_match) {
> +		DRV_LOG(ERR, "No matching IB device for PCI slot "
> +			"%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 ".",
> +			pci_dev->addr.domain, pci_dev->addr.bus,
> +			pci_dev->addr.devid, pci_dev->addr.function);
> +		rte_errno = ENOENT;
> +		return -rte_errno;
> +	}
> +	ctx = mlx5_glue->dv_open_device(ibv_match);
> +	if (!ctx) {
> +		DRV_LOG(ERR, "Failed to open IB device \"%s\".",
> +			ibv_match->name);
> +		rte_errno = ENODEV;
> +		return -rte_errno;
> +	}
> +	priv = rte_zmalloc("mlx5 vDPA device private", sizeof(*priv),
> +			   RTE_CACHE_LINE_SIZE);
> +	if (!priv) {
> +		DRV_LOG(ERR, "Failed to allocate private memory.");
> +		rte_errno = ENOMEM;
> +		goto error;
> +	}
> +	priv->ctx = ctx;
> +	priv->dev_addr.pci_addr = pci_dev->addr;
> +	priv->dev_addr.type = PCI_ADDR;
> +	priv->id = rte_vdpa_register_device(&priv->dev_addr, &mlx5_vdpa_ops);
> +	if (priv->id < 0) {
> +		DRV_LOG(ERR, "Failed to register vDPA device.");
> +		rte_errno = rte_errno ? rte_errno : EINVAL;
> +		goto error;
> +	}
> +	pthread_mutex_lock(&priv_list_lock);
> +	TAILQ_INSERT_TAIL(&priv_list, priv, next);
> +	pthread_mutex_unlock(&priv_list_lock);
> +	return 0;
> +
> +error:
> +	if (priv)
> +		rte_free(priv);
> +	if (ctx)
> +		mlx5_glue->close_device(ctx);
> +	return -rte_errno;
> +}
> +

These are minor comments.
If directly fixed in v3, feel free to add my:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime
Matan Azrad Feb. 1, 2020, 5:53 p.m. UTC | #2
Hi Maxime

Thank you very much for the review.

Please see answers inline.

From: Maxime Coquelin
> On 1/29/20 11:08 AM, Matan Azrad wrote:
> > Add a new driver to support vDPA operations by Mellanox devices.
> >
> > The first Mellanox devices which support vDPA operations are
> > ConnectX6DX and Bluefield1 HCA for their PF ports and VF ports.
> >
> > This driver is depending on rdma-core like the mlx5 PMD, also it is
> > going to use mlx5 DevX to create HW objects directly by the FW.
> > Hence, the common/mlx5 library is linked to the mlx5_vdpa driver.
> 
> If possible, I would really appreciate to have the information on the versions
> required for the above dependencies. Better if it is also mentionned in the
> guide.
> 

It is already in the guide 😊

> > This driver will not be compiled by default due to the above
> > dependencies.
> >
> > Register a new log type for this driver.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > ---
> >  MAINTAINERS                                     |   7 +
> >  config/common_base                              |   5 +
> >  doc/guides/rel_notes/release_20_02.rst          |   5 +
> >  doc/guides/vdpadevs/features/mlx5.ini           |  14 ++
> >  doc/guides/vdpadevs/index.rst                   |   1 +
> >  doc/guides/vdpadevs/mlx5.rst                    | 111 ++++++++++++
> >  drivers/common/Makefile                         |   2 +-
> >  drivers/common/mlx5/Makefile                    |  17 +-
> >  drivers/meson.build                             |   8 +-
> >  drivers/vdpa/Makefile                           |   2 +
> >  drivers/vdpa/meson.build                        |   3 +-
> >  drivers/vdpa/mlx5/Makefile                      |  36 ++++
> >  drivers/vdpa/mlx5/meson.build                   |  29 +++
> >  drivers/vdpa/mlx5/mlx5_vdpa.c                   | 227
> ++++++++++++++++++++++++
> >  drivers/vdpa/mlx5/mlx5_vdpa_utils.h             |  20 +++
> >  drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map |   3 +
> >  mk/rte.app.mk                                   |  15 +-
> >  17 files changed, 488 insertions(+), 17 deletions(-)  create mode
> > 100644 doc/guides/vdpadevs/features/mlx5.ini
> >  create mode 100644 doc/guides/vdpadevs/mlx5.rst  create mode 100644
> > drivers/vdpa/mlx5/Makefile  create mode 100644
> > drivers/vdpa/mlx5/meson.build  create mode 100644
> > drivers/vdpa/mlx5/mlx5_vdpa.c  create mode 100644
> > drivers/vdpa/mlx5/mlx5_vdpa_utils.h
> >  create mode 100644
> drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS index 150d507..f697e9a 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1103,6 +1103,13 @@ F: drivers/vdpa/ifc/
> >  F: doc/guides/vdpadevs/ifc.rst
> >  F: doc/guides/vdpadevs/features/ifcvf.ini
> >
> > +Mellanox mlx5 vDPA
> > +M: Matan Azrad <matan@mellanox.com>
> > +M: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > +F: drivers/vdpa/mlx5/
> > +F: doc/guides/vdpadevs/mlx5.rst
> > +F: doc/guides/vdpadevs/features/mlx5.ini
> > +
> >
> >  Eventdev Drivers
> >  ----------------
> > diff --git a/config/common_base b/config/common_base index
> > c897dd0..6ea9c63 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -366,6 +366,11 @@ CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
> > CONFIG_RTE_LIBRTE_MLX5_PMD=n
> CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
> >
> > +#
> > +# Compile vdpa-oriented Mellanox ConnectX-6 & Bluefield (MLX5) PMD #
> > +CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=n
> > +
> >  # Linking method for mlx4/5 dependency on ibverbs and related
> > libraries  # Default linking is dynamic by linker.
> >  # Other options are: dynamic by dlopen at run-time, or statically
> embedded.
> > diff --git a/doc/guides/rel_notes/release_20_02.rst
> > b/doc/guides/rel_notes/release_20_02.rst
> > index 50e2c14..690e7db 100644
> > --- a/doc/guides/rel_notes/release_20_02.rst
> > +++ b/doc/guides/rel_notes/release_20_02.rst
> > @@ -113,6 +113,11 @@ New Features
> >    * Added support for RSS using L3/L4 source/destination only.
> >    * Added support for matching on GTP tunnel header item.
> >
> > +* **Add new vDPA PMD based on Mellanox devices**
> > +
> > +  Added a new Mellanox vDPA  (``mlx5_vdpa``) PMD.
> > +  See the :doc:`../vdpadevs/mlx5` guide for more details on this driver.
> > +
> >  * **Updated testpmd application.**
> >
> >    Added support for ESP and L2TPv3 over IP rte_flow patterns to the
> > testpmd diff --git a/doc/guides/vdpadevs/features/mlx5.ini
> > b/doc/guides/vdpadevs/features/mlx5.ini
> > new file mode 100644
> > index 0000000..d635bdf
> > --- /dev/null
> > +++ b/doc/guides/vdpadevs/features/mlx5.ini
> > @@ -0,0 +1,14 @@
> > +;
> > +; Supported features of the 'mlx5' VDPA driver.
> > +;
> > +; Refer to default.ini for the full list of available driver features.
> > +;
> > +[Features]
> > +Other kdrv           = Y
> > +ARMv8                = Y
> > +Power8               = Y
> > +x86-32               = Y
> > +x86-64               = Y
> > +Usage doc            = Y
> > +Design doc           = Y
> > +
> > diff --git a/doc/guides/vdpadevs/index.rst
> > b/doc/guides/vdpadevs/index.rst index 9657108..1a13efe 100644
> > --- a/doc/guides/vdpadevs/index.rst
> > +++ b/doc/guides/vdpadevs/index.rst
> > @@ -13,3 +13,4 @@ which can be used from an application through vhost
> API.
> >
> >      features_overview
> >      ifc
> > +    mlx5
> > diff --git a/doc/guides/vdpadevs/mlx5.rst
> > b/doc/guides/vdpadevs/mlx5.rst new file mode 100644 index
> > 0000000..1861e71
> > --- /dev/null
> > +++ b/doc/guides/vdpadevs/mlx5.rst
> > @@ -0,0 +1,111 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright 2019 Mellanox Technologies, Ltd
> > +
> > +MLX5 vDPA driver
> > +================
> > +
> > +The MLX5 vDPA (vhost data path acceleration) driver library
> > +(**librte_pmd_mlx5_vdpa**) provides support for **Mellanox
> > +ConnectX-6**, **Mellanox ConnectX-6DX** and **Mellanox BlueField**
> > +families of
> > +10/25/40/50/100/200 Gb/s adapters as well as their virtual functions
> > +(VF) in SR-IOV context.
> > +
> > +.. note::
> > +
> > +   Due to external dependencies, this driver is disabled in default
> > +   configuration of the "make" build. It can be enabled with
> > +   ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=y`` or by using "meson" build
> system which
> > +   will detect dependencies.
> > +
> > +
> > +Design
> > +------
> > +
> > +For security reasons and robustness, this driver only deals with
> > +virtual memory addresses. The way resources allocations are handled
> > +by the kernel, combined with hardware specifications that allow to
> > +handle virtual memory addresses directly, ensure that DPDK
> > +applications cannot access random physical memory (or memory that
> does not belong to the current process).
> > +
> > +The PMD can use libibverbs and libmlx5 to access the device firmware
> > +or directly the hardware components.
> > +There are different levels of objects and bypassing abilities to get
> > +the best performances:
> > +
> > +- Verbs is a complete high-level generic API
> > +- Direct Verbs is a device-specific API
> > +- DevX allows to access firmware objects
> > +- Direct Rules manages flow steering at low-level hardware layer
> > +
> > +Enabling librte_pmd_mlx5_vdpa causes DPDK applications to be linked
> > +against libibverbs.
> > +
> > +A Mellanox mlx5 PCI device can be probed by either net/mlx5 driver or
> > +vdpa/mlx5 driver but not in parallel. Hence, the user should decide
> > +the driver by the ``class`` parameter in the device argument list.
> > +By default, the mlx5 device will be probed by the net/mlx5 driver.
> > +
> > +Supported NICs
> > +--------------
> > +
> > +* Mellanox(R) ConnectX(R)-6 200G MCX654106A-HCAT (4x200G)
> > +* Mellanox(R) ConnectX(R)-6DX EN 100G MCX623106AN-CDAT (2*100G)
> > +* Mellanox(R) ConnectX(R)-6DX EN 200G MCX623105AN-VDAT (1*200G)
> > +* Mellanox(R) BlueField SmartNIC 25G MBF1M332A-ASCAT (2*25G)
> > +
> > +Prerequisites
> > +-------------
> > +
> > +- Mellanox OFED version: **4.7**
> > +  see :doc:`../../nics/mlx5` guide for more Mellanox OFED details.
> > +
> > +Compilation options
> > +~~~~~~~~~~~~~~~~~~~
> > +
> > +These options can be modified in the ``.config`` file.
> > +
> > +- ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` (default **n**)
> > +
> > +  Toggle compilation of librte_pmd_mlx5 itself.
> > +
> > +- ``CONFIG_RTE_IBVERBS_LINK_DLOPEN`` (default **n**)
> > +
> > +  Build PMD with additional code to make it loadable without hard
> > + dependencies on **libibverbs** nor **libmlx5**, which may not be
> > + installed  on the target system.
> > +
> > +  In this mode, their presence is still required for it to run
> > + properly,  however their absence won't prevent a DPDK application
> > + from starting (with  ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and
> > + they won't show up as  missing with ``ldd(1)``.
> > +
> > +  It works by moving these dependencies to a purpose-built rdma-core
> "glue"
> > +  plug-in which must either be installed in a directory whose name is
> > + based  on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if
> > + set, or in a  standard location for the dynamic linker (e.g.
> > + ``/lib``) if left to the  default empty string (``""``).
> > +
> > +  This option has no performance impact.
> > +
> > +- ``CONFIG_RTE_IBVERBS_LINK_STATIC`` (default **n**)
> > +
> > +  Embed static flavor of the dependencies **libibverbs** and
> > + **libmlx5**  in the PMD shared library or the executable static binary.
> > +
> > +.. note::
> > +
> > +   For BlueField, target should be set to ``arm64-bluefield-linux-gcc``. This
> > +   will enable ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` and set
> > +   ``RTE_CACHE_LINE_SIZE`` to 64. Default armv8a configuration of make
> build and
> > +   meson build set it to 128 then brings performance degradation.
> > +
> > +Run-time configuration
> > +~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +- **ethtool** operations on related kernel interfaces also affect the
> PMD.
> > +
> > +- ``class`` parameter [string]
> > +
> > +  Select the class of the driver that should probe the device.
> > +  `vdpa` for the mlx5 vDPA driver.
> > +
> > diff --git a/drivers/common/Makefile b/drivers/common/Makefile index
> > 4775d4b..96bd7ac 100644
> > --- a/drivers/common/Makefile
> > +++ b/drivers/common/Makefile
> > @@ -35,7 +35,7 @@ ifneq (,$(findstring y,$(IAVF-y)))  DIRS-y += iavf
> > endif
> >
> > -ifeq ($(CONFIG_RTE_LIBRTE_MLX5_PMD),y)
> > +ifeq ($(findstring
> >
> +y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA
> _PMD)),y)
> >  DIRS-y += mlx5
> >  endif
> >
> > diff --git a/drivers/common/mlx5/Makefile
> > b/drivers/common/mlx5/Makefile index 9d4d81f..c4b7999 100644
> > --- a/drivers/common/mlx5/Makefile
> > +++ b/drivers/common/mlx5/Makefile
> > @@ -10,15 +10,16 @@ LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
> > LIB_GLUE_VERSION = 20.02.0
> >
> >  # Sources.
> > +ifeq ($(findstring
> >
> +y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA
> _PMD)),y)
> >  ifneq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
> > -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue.c
> > +SRCS-y += mlx5_glue.c
> >  endif
> > -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
> > -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_common.c
> > -SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
> > -
> > +SRCS-y += mlx5_devx_cmds.c
> > +SRCS-y += mlx5_common.c
> > +SRCS-y += mlx5_nl.c
> >  ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
> > -INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
> > +INSTALL-y-lib += $(LIB_GLUE)
> > +endif
> >  endif
> >
> >  # Basic CFLAGS.
> > @@ -317,7 +318,9 @@ mlx5_autoconf.h: mlx5_autoconf.h.new
> >  		cmp '$<' '$@' $(AUTOCONF_OUTPUT) || \
> >  		mv '$<' '$@'
> >
> > -$(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
> > +ifeq ($(findstring
> >
> +y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA
> _PMD)),y)
> > +$(SRCS-y:.c=.o): mlx5_autoconf.h
> > +endif
> >
> >  # Generate dependency plug-in for rdma-core when the PMD must not be
> > linked  # directly, so that applications do not inherit this dependency.
> > diff --git a/drivers/meson.build b/drivers/meson.build index
> > 29708cc..bd154fa 100644
> > --- a/drivers/meson.build
> > +++ b/drivers/meson.build
> > @@ -42,6 +42,7 @@ foreach class:dpdk_driver_classes
> >  		build = true # set to false to disable, e.g. missing deps
> >  		reason = '<unknown reason>' # set if build == false to explain
> >  		name = drv
> > +		fmt_name = ''
> >  		allow_experimental_apis = false
> >  		sources = []
> >  		objs = []
> > @@ -98,8 +99,11 @@ foreach class:dpdk_driver_classes
> >  		else
> >  			class_drivers += name
> >
> > -
> 	dpdk_conf.set(config_flag_fmt.format(name.to_upper()),1)
> > -			lib_name = driver_name_fmt.format(name)
> > +			if fmt_name == ''
> > +				fmt_name = name
> > +			endif
> > +
> 	dpdk_conf.set(config_flag_fmt.format(fmt_name.to_upper()),1)
> > +			lib_name = driver_name_fmt.format(fmt_name)
> >
> >  			if allow_experimental_apis
> >  				cflags += '-DALLOW_EXPERIMENTAL_API'
> > diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile index
> > b5a7a11..6e88359 100644
> > --- a/drivers/vdpa/Makefile
> > +++ b/drivers/vdpa/Makefile
> > @@ -7,4 +7,6 @@ ifeq ($(CONFIG_RTE_EAL_VFIO),y)
> >  DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc  endif
> >
> > +DIRS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5
> > +
> >  include $(RTE_SDK)/mk/rte.subdir.mk
> > diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build index
> > 2f047b5..e3ed54a 100644
> > --- a/drivers/vdpa/meson.build
> > +++ b/drivers/vdpa/meson.build
> > @@ -1,7 +1,8 @@
> >  # SPDX-License-Identifier: BSD-3-Clause  # Copyright 2019 Mellanox
> > Technologies, Ltd
> >
> > -drivers = ['ifc']
> > +drivers = ['ifc',
> > +	   'mlx5',]
> >  std_deps = ['bus_pci', 'kvargs']
> >  std_deps += ['vhost']
> >  config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
> > diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile
> > new file mode 100644 index 0000000..c1c8cc0
> > --- /dev/null
> > +++ b/drivers/vdpa/mlx5/Makefile
> > @@ -0,0 +1,36 @@
> > +#   SPDX-License-Identifier: BSD-3-Clause
> > +#   Copyright 2019 Mellanox Technologies, Ltd
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# Library name.
> > +LIB = librte_pmd_mlx5_vdpa.a
> > +
> > +# Sources.
> > +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c
> > +
> > +# Basic CFLAGS.
> > +CFLAGS += -O3
> > +CFLAGS += -std=c11 -Wall -Wextra
> > +CFLAGS += -g
> > +CFLAGS += -I$(RTE_SDK)/drivers/common/mlx5 CFLAGS +=
> > +-I$(RTE_SDK)/drivers/net/mlx5_vdpa
> > +CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5 CFLAGS += -
> D_BSD_SOURCE
> > +CFLAGS += -D_DEFAULT_SOURCE CFLAGS += -D_XOPEN_SOURCE=600
> CFLAGS +=
> > +$(WERROR_FLAGS) CFLAGS += -Wno-strict-prototypes LDLIBS +=
> > +-lrte_common_mlx5 LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs
> > +-lrte_bus_pci
> > +
> > +# A few warnings cannot be avoided in external headers.
> > +CFLAGS += -Wno-error=cast-qual
> > +
> > +EXPORT_MAP := rte_pmd_mlx5_vdpa_version.map # memseg walk is not
> part
> > +of stable API CFLAGS += -DALLOW_EXPERIMENTAL_API
> > +
> > +CFLAGS += -DNDEBUG -UPEDANTIC
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/drivers/vdpa/mlx5/meson.build
> > b/drivers/vdpa/mlx5/meson.build new file mode 100644 index
> > 0000000..4bca6ea
> > --- /dev/null
> > +++ b/drivers/vdpa/mlx5/meson.build
> > @@ -0,0 +1,29 @@
> > +# SPDX-License-Identifier: BSD-3-Clause # Copyright 2019 Mellanox
> > +Technologies, Ltd
> > +
> > +if not is_linux
> > +	build = false
> > +	reason = 'only supported on Linux'
> > +	subdir_done()
> > +endif
> > +
> > +fmt_name = 'mlx5_vdpa'
> > +allow_experimental_apis = true
> > +deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal'] sources =
> > +files(
> > +	'mlx5_vdpa.c',
> > +)
> > +cflags_options = [
> > +	'-std=c11',
> > +	'-Wno-strict-prototypes',
> > +	'-D_BSD_SOURCE',
> > +	'-D_DEFAULT_SOURCE',
> > +	'-D_XOPEN_SOURCE=600'
> > +]
> > +foreach option:cflags_options
> > +	if cc.has_argument(option)
> > +		cflags += option
> > +	endif
> > +endforeach
> > +
> > +cflags += [ '-DNDEBUG', '-UPEDANTIC' ]
> > diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c
> > b/drivers/vdpa/mlx5/mlx5_vdpa.c new file mode 100644 index
> > 0000000..6286d7a
> > --- /dev/null
> > +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> > @@ -0,0 +1,227 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2019 Mellanox Technologies, Ltd  */ #include
> > +<rte_malloc.h> #include <rte_log.h> #include <rte_errno.h> #include
> > +<rte_bus_pci.h> #include <rte_vdpa.h>
> > +
> > +#include <mlx5_glue.h>
> > +#include <mlx5_common.h>
> > +
> > +#include "mlx5_vdpa_utils.h"
> > +
> > +
> > +struct mlx5_vdpa_priv {
> > +	TAILQ_ENTRY(mlx5_vdpa_priv) next;
> > +	int id; /* vDPA device id. */
> > +	struct ibv_context *ctx; /* Device context. */
> > +	struct rte_vdpa_dev_addr dev_addr;
> > +};
> > +
> > +TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
> > +
> TAILQ_HEAD_INITIALIZER(priv_list);
> > +static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
> > +int mlx5_vdpa_logtype;
> > +
> > +static struct rte_vdpa_dev_ops mlx5_vdpa_ops = {
> > +	.get_queue_num = NULL,
> > +	.get_features = NULL,
> > +	.get_protocol_features = NULL,
> > +	.dev_conf = NULL,
> > +	.dev_close = NULL,
> > +	.set_vring_state = NULL,
> > +	.set_features = NULL,
> > +	.migration_done = NULL,
> > +	.get_vfio_group_fd = NULL,
> > +	.get_vfio_device_fd = NULL,
> > +	.get_notify_area = NULL,
> > +};
> > +
> > +/**
> > + * DPDK callback to register a PCI device.
> > + *
> > + * This function spawns vdpa device out of a given PCI device.
> > + *
> > + * @param[in] pci_drv
> > + *   PCI driver structure (mlx5_vpda_driver).
> > + * @param[in] pci_dev
> > + *   PCI device information.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> > +		    struct rte_pci_device *pci_dev __rte_unused) {
> > +	struct ibv_device **ibv_list;
> > +	struct ibv_device *ibv_match = NULL;
> > +	struct mlx5_vdpa_priv *priv = NULL;
> > +	struct ibv_context *ctx = NULL;
> > +	int ret;
> > +
> > +	if (mlx5_class_get(pci_dev->device.devargs) != MLX5_CLASS_VDPA)
> {
> > +		DRV_LOG(DEBUG, "Skip probing - should be probed by other
> mlx5"
> > +			" driver.");
> > +		return 1;
> 
> Maybe the function doc should be implemented for return 1 case:
> 
> * @return
> *   0 on success, a negative errno value otherwise and rte_errno is set.
> 

Sure, will fix it.

> > +	}
> > +	errno = 0;
> > +	ibv_list = mlx5_glue->get_device_list(&ret);
> > +	if (!ibv_list) {
> > +		rte_errno = errno;
> > +		DRV_LOG(ERR, "Failed to get device list, is ib_uverbs
> loaded?");
> > +		return -ENOSYS;
> 
> Shouldn't return -rte_errno to be consistent with the rest of the function?
> For the sake of consistency, you could also goto error instead.

Ok, will save consistency.

> 
> > +	}
> > +	while (ret-- > 0) {
> > +		struct rte_pci_addr pci_addr;
> > +
> > +		DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[ret]-
> >name);
> > +		if (mlx5_dev_to_pci_addr(ibv_list[ret]->ibdev_path,
> &pci_addr))
> > +			continue;
> > +		if (pci_dev->addr.domain != pci_addr.domain ||
> > +		    pci_dev->addr.bus != pci_addr.bus ||
> > +		    pci_dev->addr.devid != pci_addr.devid ||
> > +		    pci_dev->addr.function != pci_addr.function)
> > +			continue;
> > +		DRV_LOG(INFO, "PCI information matches for device
> \"%s\".",
> > +			ibv_list[ret]->name);
> > +		ibv_match = ibv_list[ret];
> > +		break;
> > +	}
> > +	mlx5_glue->free_device_list(ibv_list);
> > +	if (!ibv_match) {
> > +		DRV_LOG(ERR, "No matching IB device for PCI slot "
> > +			"%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 ".",
> > +			pci_dev->addr.domain, pci_dev->addr.bus,
> > +			pci_dev->addr.devid, pci_dev->addr.function);
> > +		rte_errno = ENOENT;
> > +		return -rte_errno;
> > +	}
> > +	ctx = mlx5_glue->dv_open_device(ibv_match);
> > +	if (!ctx) {
> > +		DRV_LOG(ERR, "Failed to open IB device \"%s\".",
> > +			ibv_match->name);
> > +		rte_errno = ENODEV;
> > +		return -rte_errno;
> > +	}
> > +	priv = rte_zmalloc("mlx5 vDPA device private", sizeof(*priv),
> > +			   RTE_CACHE_LINE_SIZE);
> > +	if (!priv) {
> > +		DRV_LOG(ERR, "Failed to allocate private memory.");
> > +		rte_errno = ENOMEM;
> > +		goto error;
> > +	}
> > +	priv->ctx = ctx;
> > +	priv->dev_addr.pci_addr = pci_dev->addr;
> > +	priv->dev_addr.type = PCI_ADDR;
> > +	priv->id = rte_vdpa_register_device(&priv->dev_addr,
> &mlx5_vdpa_ops);
> > +	if (priv->id < 0) {
> > +		DRV_LOG(ERR, "Failed to register vDPA device.");
> > +		rte_errno = rte_errno ? rte_errno : EINVAL;
> > +		goto error;
> > +	}
> > +	pthread_mutex_lock(&priv_list_lock);
> > +	TAILQ_INSERT_TAIL(&priv_list, priv, next);
> > +	pthread_mutex_unlock(&priv_list_lock);
> > +	return 0;
> > +
> > +error:
> > +	if (priv)
> > +		rte_free(priv);
> > +	if (ctx)
> > +		mlx5_glue->close_device(ctx);
> > +	return -rte_errno;
> > +}
> > +
> 
> These are minor comments.
> If directly fixed in v3, feel free to add my:
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 
> Thanks,
> Maxime
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 150d507..f697e9a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1103,6 +1103,13 @@  F: drivers/vdpa/ifc/
 F: doc/guides/vdpadevs/ifc.rst
 F: doc/guides/vdpadevs/features/ifcvf.ini
 
+Mellanox mlx5 vDPA
+M: Matan Azrad <matan@mellanox.com>
+M: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
+F: drivers/vdpa/mlx5/
+F: doc/guides/vdpadevs/mlx5.rst
+F: doc/guides/vdpadevs/features/mlx5.ini
+
 
 Eventdev Drivers
 ----------------
diff --git a/config/common_base b/config/common_base
index c897dd0..6ea9c63 100644
--- a/config/common_base
+++ b/config/common_base
@@ -366,6 +366,11 @@  CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
 CONFIG_RTE_LIBRTE_MLX5_PMD=n
 CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
 
+#
+# Compile vdpa-oriented Mellanox ConnectX-6 & Bluefield (MLX5) PMD
+#
+CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=n
+
 # Linking method for mlx4/5 dependency on ibverbs and related libraries
 # Default linking is dynamic by linker.
 # Other options are: dynamic by dlopen at run-time, or statically embedded.
diff --git a/doc/guides/rel_notes/release_20_02.rst b/doc/guides/rel_notes/release_20_02.rst
index 50e2c14..690e7db 100644
--- a/doc/guides/rel_notes/release_20_02.rst
+++ b/doc/guides/rel_notes/release_20_02.rst
@@ -113,6 +113,11 @@  New Features
   * Added support for RSS using L3/L4 source/destination only.
   * Added support for matching on GTP tunnel header item.
 
+* **Add new vDPA PMD based on Mellanox devices**
+
+  Added a new Mellanox vDPA  (``mlx5_vdpa``) PMD.
+  See the :doc:`../vdpadevs/mlx5` guide for more details on this driver.
+
 * **Updated testpmd application.**
 
   Added support for ESP and L2TPv3 over IP rte_flow patterns to the testpmd
diff --git a/doc/guides/vdpadevs/features/mlx5.ini b/doc/guides/vdpadevs/features/mlx5.ini
new file mode 100644
index 0000000..d635bdf
--- /dev/null
+++ b/doc/guides/vdpadevs/features/mlx5.ini
@@ -0,0 +1,14 @@ 
+;
+; Supported features of the 'mlx5' VDPA driver.
+;
+; Refer to default.ini for the full list of available driver features.
+;
+[Features]
+Other kdrv           = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
+Design doc           = Y
+
diff --git a/doc/guides/vdpadevs/index.rst b/doc/guides/vdpadevs/index.rst
index 9657108..1a13efe 100644
--- a/doc/guides/vdpadevs/index.rst
+++ b/doc/guides/vdpadevs/index.rst
@@ -13,3 +13,4 @@  which can be used from an application through vhost API.
 
     features_overview
     ifc
+    mlx5
diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
new file mode 100644
index 0000000..1861e71
--- /dev/null
+++ b/doc/guides/vdpadevs/mlx5.rst
@@ -0,0 +1,111 @@ 
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright 2019 Mellanox Technologies, Ltd
+
+MLX5 vDPA driver
+================
+
+The MLX5 vDPA (vhost data path acceleration) driver library
+(**librte_pmd_mlx5_vdpa**) provides support for **Mellanox ConnectX-6**,
+**Mellanox ConnectX-6DX** and **Mellanox BlueField** families of
+10/25/40/50/100/200 Gb/s adapters as well as their virtual functions (VF) in
+SR-IOV context.
+
+.. note::
+
+   Due to external dependencies, this driver is disabled in default
+   configuration of the "make" build. It can be enabled with
+   ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD=y`` or by using "meson" build system which
+   will detect dependencies.
+
+
+Design
+------
+
+For security reasons and robustness, this driver only deals with virtual
+memory addresses. The way resources allocations are handled by the kernel,
+combined with hardware specifications that allow to handle virtual memory
+addresses directly, ensure that DPDK applications cannot access random
+physical memory (or memory that does not belong to the current process).
+
+The PMD can use libibverbs and libmlx5 to access the device firmware
+or directly the hardware components.
+There are different levels of objects and bypassing abilities
+to get the best performances:
+
+- Verbs is a complete high-level generic API
+- Direct Verbs is a device-specific API
+- DevX allows to access firmware objects
+- Direct Rules manages flow steering at low-level hardware layer
+
+Enabling librte_pmd_mlx5_vdpa causes DPDK applications to be linked against
+libibverbs.
+
+A Mellanox mlx5 PCI device can be probed by either net/mlx5 driver or vdpa/mlx5
+driver but not in parallel. Hence, the user should decide the driver by the
+``class`` parameter in the device argument list.
+By default, the mlx5 device will be probed by the net/mlx5 driver. 
+
+Supported NICs
+--------------
+
+* Mellanox(R) ConnectX(R)-6 200G MCX654106A-HCAT (4x200G)
+* Mellanox(R) ConnectX(R)-6DX EN 100G MCX623106AN-CDAT (2*100G)
+* Mellanox(R) ConnectX(R)-6DX EN 200G MCX623105AN-VDAT (1*200G)
+* Mellanox(R) BlueField SmartNIC 25G MBF1M332A-ASCAT (2*25G)
+
+Prerequisites
+-------------
+
+- Mellanox OFED version: **4.7**
+  see :doc:`../../nics/mlx5` guide for more Mellanox OFED details.
+
+Compilation options
+~~~~~~~~~~~~~~~~~~~
+
+These options can be modified in the ``.config`` file.
+
+- ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` (default **n**)
+
+  Toggle compilation of librte_pmd_mlx5 itself.
+
+- ``CONFIG_RTE_IBVERBS_LINK_DLOPEN`` (default **n**)
+
+  Build PMD with additional code to make it loadable without hard
+  dependencies on **libibverbs** nor **libmlx5**, which may not be installed
+  on the target system.
+
+  In this mode, their presence is still required for it to run properly,
+  however their absence won't prevent a DPDK application from starting (with
+  ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
+  missing with ``ldd(1)``.
+
+  It works by moving these dependencies to a purpose-built rdma-core "glue"
+  plug-in which must either be installed in a directory whose name is based
+  on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if set, or in a
+  standard location for the dynamic linker (e.g. ``/lib``) if left to the
+  default empty string (``""``).
+
+  This option has no performance impact.
+
+- ``CONFIG_RTE_IBVERBS_LINK_STATIC`` (default **n**)
+
+  Embed static flavor of the dependencies **libibverbs** and **libmlx5**
+  in the PMD shared library or the executable static binary.
+
+.. note::
+
+   For BlueField, target should be set to ``arm64-bluefield-linux-gcc``. This
+   will enable ``CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD`` and set
+   ``RTE_CACHE_LINE_SIZE`` to 64. Default armv8a configuration of make build and
+   meson build set it to 128 then brings performance degradation.
+
+Run-time configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+- **ethtool** operations on related kernel interfaces also affect the PMD.
+
+- ``class`` parameter [string]
+
+  Select the class of the driver that should probe the device.
+  `vdpa` for the mlx5 vDPA driver.
+
diff --git a/drivers/common/Makefile b/drivers/common/Makefile
index 4775d4b..96bd7ac 100644
--- a/drivers/common/Makefile
+++ b/drivers/common/Makefile
@@ -35,7 +35,7 @@  ifneq (,$(findstring y,$(IAVF-y)))
 DIRS-y += iavf
 endif
 
-ifeq ($(CONFIG_RTE_LIBRTE_MLX5_PMD),y)
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
 DIRS-y += mlx5
 endif
 
diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index 9d4d81f..c4b7999 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -10,15 +10,16 @@  LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
 LIB_GLUE_VERSION = 20.02.0
 
 # Sources.
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
 ifneq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue.c
+SRCS-y += mlx5_glue.c
 endif
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_common.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
-
+SRCS-y += mlx5_devx_cmds.c
+SRCS-y += mlx5_common.c
+SRCS-y += mlx5_nl.c
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
+INSTALL-y-lib += $(LIB_GLUE)
+endif
 endif
 
 # Basic CFLAGS.
@@ -317,7 +318,9 @@  mlx5_autoconf.h: mlx5_autoconf.h.new
 		cmp '$<' '$@' $(AUTOCONF_OUTPUT) || \
 		mv '$<' '$@'
 
-$(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
+$(SRCS-y:.c=.o): mlx5_autoconf.h
+endif
 
 # Generate dependency plug-in for rdma-core when the PMD must not be linked
 # directly, so that applications do not inherit this dependency.
diff --git a/drivers/meson.build b/drivers/meson.build
index 29708cc..bd154fa 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -42,6 +42,7 @@  foreach class:dpdk_driver_classes
 		build = true # set to false to disable, e.g. missing deps
 		reason = '<unknown reason>' # set if build == false to explain
 		name = drv
+		fmt_name = ''
 		allow_experimental_apis = false
 		sources = []
 		objs = []
@@ -98,8 +99,11 @@  foreach class:dpdk_driver_classes
 		else
 			class_drivers += name
 
-			dpdk_conf.set(config_flag_fmt.format(name.to_upper()),1)
-			lib_name = driver_name_fmt.format(name)
+			if fmt_name == ''
+				fmt_name = name
+			endif
+			dpdk_conf.set(config_flag_fmt.format(fmt_name.to_upper()),1)
+			lib_name = driver_name_fmt.format(fmt_name)
 
 			if allow_experimental_apis
 				cflags += '-DALLOW_EXPERIMENTAL_API'
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index b5a7a11..6e88359 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -7,4 +7,6 @@  ifeq ($(CONFIG_RTE_EAL_VFIO),y)
 DIRS-$(CONFIG_RTE_LIBRTE_IFC_PMD) += ifc
 endif
 
+DIRS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5
+
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/vdpa/meson.build b/drivers/vdpa/meson.build
index 2f047b5..e3ed54a 100644
--- a/drivers/vdpa/meson.build
+++ b/drivers/vdpa/meson.build
@@ -1,7 +1,8 @@ 
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright 2019 Mellanox Technologies, Ltd
 
-drivers = ['ifc']
+drivers = ['ifc',
+	   'mlx5',]
 std_deps = ['bus_pci', 'kvargs']
 std_deps += ['vhost']
 config_flag_fmt = 'RTE_LIBRTE_@0@_PMD'
diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile
new file mode 100644
index 0000000..c1c8cc0
--- /dev/null
+++ b/drivers/vdpa/mlx5/Makefile
@@ -0,0 +1,36 @@ 
+#   SPDX-License-Identifier: BSD-3-Clause
+#   Copyright 2019 Mellanox Technologies, Ltd
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# Library name.
+LIB = librte_pmd_mlx5_vdpa.a
+
+# Sources.
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD) += mlx5_vdpa.c
+
+# Basic CFLAGS.
+CFLAGS += -O3
+CFLAGS += -std=c11 -Wall -Wextra
+CFLAGS += -g
+CFLAGS += -I$(RTE_SDK)/drivers/common/mlx5
+CFLAGS += -I$(RTE_SDK)/drivers/net/mlx5_vdpa
+CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5
+CFLAGS += -D_BSD_SOURCE
+CFLAGS += -D_DEFAULT_SOURCE
+CFLAGS += -D_XOPEN_SOURCE=600
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -Wno-strict-prototypes
+LDLIBS += -lrte_common_mlx5
+LDLIBS += -lrte_eal -lrte_vhost -lrte_kvargs -lrte_bus_pci
+
+# A few warnings cannot be avoided in external headers.
+CFLAGS += -Wno-error=cast-qual
+
+EXPORT_MAP := rte_pmd_mlx5_vdpa_version.map
+# memseg walk is not part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
+CFLAGS += -DNDEBUG -UPEDANTIC
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/vdpa/mlx5/meson.build b/drivers/vdpa/mlx5/meson.build
new file mode 100644
index 0000000..4bca6ea
--- /dev/null
+++ b/drivers/vdpa/mlx5/meson.build
@@ -0,0 +1,29 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2019 Mellanox Technologies, Ltd
+
+if not is_linux
+	build = false
+	reason = 'only supported on Linux'
+	subdir_done()
+endif
+
+fmt_name = 'mlx5_vdpa'
+allow_experimental_apis = true
+deps += ['hash', 'common_mlx5', 'vhost', 'bus_pci', 'eal']
+sources = files(
+	'mlx5_vdpa.c',
+)
+cflags_options = [
+	'-std=c11',
+	'-Wno-strict-prototypes',
+	'-D_BSD_SOURCE',
+	'-D_DEFAULT_SOURCE',
+	'-D_XOPEN_SOURCE=600'
+]
+foreach option:cflags_options
+	if cc.has_argument(option)
+		cflags += option
+	endif
+endforeach
+
+cflags += [ '-DNDEBUG', '-UPEDANTIC' ]
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
new file mode 100644
index 0000000..6286d7a
--- /dev/null
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -0,0 +1,227 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+#include <rte_malloc.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_bus_pci.h>
+#include <rte_vdpa.h>
+
+#include <mlx5_glue.h>
+#include <mlx5_common.h>
+
+#include "mlx5_vdpa_utils.h"
+
+
+struct mlx5_vdpa_priv {
+	TAILQ_ENTRY(mlx5_vdpa_priv) next;
+	int id; /* vDPA device id. */
+	struct ibv_context *ctx; /* Device context. */
+	struct rte_vdpa_dev_addr dev_addr;
+};
+
+TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
+					      TAILQ_HEAD_INITIALIZER(priv_list);
+static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
+int mlx5_vdpa_logtype;
+
+static struct rte_vdpa_dev_ops mlx5_vdpa_ops = {
+	.get_queue_num = NULL,
+	.get_features = NULL,
+	.get_protocol_features = NULL,
+	.dev_conf = NULL,
+	.dev_close = NULL,
+	.set_vring_state = NULL,
+	.set_features = NULL,
+	.migration_done = NULL,
+	.get_vfio_group_fd = NULL,
+	.get_vfio_device_fd = NULL,
+	.get_notify_area = NULL,
+};
+
+/**
+ * DPDK callback to register a PCI device.
+ *
+ * This function spawns vdpa device out of a given PCI device.
+ *
+ * @param[in] pci_drv
+ *   PCI driver structure (mlx5_vpda_driver).
+ * @param[in] pci_dev
+ *   PCI device information.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+		    struct rte_pci_device *pci_dev __rte_unused)
+{
+	struct ibv_device **ibv_list;
+	struct ibv_device *ibv_match = NULL;
+	struct mlx5_vdpa_priv *priv = NULL;
+	struct ibv_context *ctx = NULL;
+	int ret;
+
+	if (mlx5_class_get(pci_dev->device.devargs) != MLX5_CLASS_VDPA) {
+		DRV_LOG(DEBUG, "Skip probing - should be probed by other mlx5"
+			" driver.");
+		return 1;
+	}
+	errno = 0;
+	ibv_list = mlx5_glue->get_device_list(&ret);
+	if (!ibv_list) {
+		rte_errno = errno;
+		DRV_LOG(ERR, "Failed to get device list, is ib_uverbs loaded?");
+		return -ENOSYS;
+	}
+	while (ret-- > 0) {
+		struct rte_pci_addr pci_addr;
+
+		DRV_LOG(DEBUG, "Checking device \"%s\"..", ibv_list[ret]->name);
+		if (mlx5_dev_to_pci_addr(ibv_list[ret]->ibdev_path, &pci_addr))
+			continue;
+		if (pci_dev->addr.domain != pci_addr.domain ||
+		    pci_dev->addr.bus != pci_addr.bus ||
+		    pci_dev->addr.devid != pci_addr.devid ||
+		    pci_dev->addr.function != pci_addr.function)
+			continue;
+		DRV_LOG(INFO, "PCI information matches for device \"%s\".",
+			ibv_list[ret]->name);
+		ibv_match = ibv_list[ret];
+		break;
+	}
+	mlx5_glue->free_device_list(ibv_list);
+	if (!ibv_match) {
+		DRV_LOG(ERR, "No matching IB device for PCI slot "
+			"%" SCNx32 ":%" SCNx8 ":%" SCNx8 ".%" SCNx8 ".",
+			pci_dev->addr.domain, pci_dev->addr.bus,
+			pci_dev->addr.devid, pci_dev->addr.function);
+		rte_errno = ENOENT;
+		return -rte_errno;
+	}
+	ctx = mlx5_glue->dv_open_device(ibv_match);
+	if (!ctx) {
+		DRV_LOG(ERR, "Failed to open IB device \"%s\".",
+			ibv_match->name);
+		rte_errno = ENODEV;
+		return -rte_errno;
+	}
+	priv = rte_zmalloc("mlx5 vDPA device private", sizeof(*priv),
+			   RTE_CACHE_LINE_SIZE);
+	if (!priv) {
+		DRV_LOG(ERR, "Failed to allocate private memory.");
+		rte_errno = ENOMEM;
+		goto error;
+	}
+	priv->ctx = ctx;
+	priv->dev_addr.pci_addr = pci_dev->addr;
+	priv->dev_addr.type = PCI_ADDR;
+	priv->id = rte_vdpa_register_device(&priv->dev_addr, &mlx5_vdpa_ops);
+	if (priv->id < 0) {
+		DRV_LOG(ERR, "Failed to register vDPA device.");
+		rte_errno = rte_errno ? rte_errno : EINVAL;
+		goto error;
+	}
+	pthread_mutex_lock(&priv_list_lock);
+	TAILQ_INSERT_TAIL(&priv_list, priv, next);
+	pthread_mutex_unlock(&priv_list_lock);
+	return 0;
+
+error:
+	if (priv)
+		rte_free(priv);
+	if (ctx)
+		mlx5_glue->close_device(ctx);
+	return -rte_errno;
+}
+
+/**
+ * DPDK callback to remove a PCI device.
+ *
+ * This function removes all vDPA devices belong to a given PCI device.
+ *
+ * @param[in] pci_dev
+ *   Pointer to the PCI device.
+ *
+ * @return
+ *   0 on success, the function cannot fail.
+ */
+static int
+mlx5_vdpa_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct mlx5_vdpa_priv *priv = NULL;
+	int found = 0;
+
+	pthread_mutex_lock(&priv_list_lock);
+	TAILQ_FOREACH(priv, &priv_list, next) {
+		if (memcmp(&priv->dev_addr.pci_addr, &pci_dev->addr,
+			   sizeof(pci_dev->addr)) == 0) {
+			found = 1;
+			break;
+		}
+	}
+	if (found) {
+		TAILQ_REMOVE(&priv_list, priv, next);
+		mlx5_glue->close_device(priv->ctx);
+		rte_free(priv);
+	}
+	pthread_mutex_unlock(&priv_list_lock);
+	return 0;
+}
+
+static const struct rte_pci_id mlx5_vdpa_pci_id_map[] = {
+	{
+		RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+			       PCI_DEVICE_ID_MELLANOX_CONNECTX5BF)
+	},
+	{
+		RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+			       PCI_DEVICE_ID_MELLANOX_CONNECTX5BFVF)
+	},
+	{
+		RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+				PCI_DEVICE_ID_MELLANOX_CONNECTX6)
+	},
+	{
+		RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+				PCI_DEVICE_ID_MELLANOX_CONNECTX6VF)
+	},
+	{
+		RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+				PCI_DEVICE_ID_MELLANOX_CONNECTX6DX)
+	},
+	{
+		RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+				PCI_DEVICE_ID_MELLANOX_CONNECTX6DXVF)
+	},
+	{
+		.vendor_id = 0
+	}
+};
+
+static struct rte_pci_driver mlx5_vdpa_driver = {
+	.driver = {
+		.name = "mlx5_vdpa",
+	},
+	.id_table = mlx5_vdpa_pci_id_map,
+	.probe = mlx5_vdpa_pci_probe,
+	.remove = mlx5_vdpa_pci_remove,
+	.drv_flags = 0,
+};
+
+/**
+ * Driver initialization routine.
+ */
+RTE_INIT(rte_mlx5_vdpa_init)
+{
+	/* Initialize common log type. */
+	mlx5_vdpa_logtype = rte_log_register("pmd.vdpa.mlx5");
+	if (mlx5_vdpa_logtype >= 0)
+		rte_log_set_level(mlx5_vdpa_logtype, RTE_LOG_NOTICE);
+	if (mlx5_glue)
+		rte_pci_register(&mlx5_vdpa_driver);
+}
+
+RTE_PMD_EXPORT_NAME(net_mlx5_vdpa, __COUNTER__);
+RTE_PMD_REGISTER_PCI_TABLE(net_mlx5_vdpa, mlx5_vdpa_pci_id_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_mlx5_vdpa, "* ib_uverbs & mlx5_core & mlx5_ib");
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_utils.h b/drivers/vdpa/mlx5/mlx5_vdpa_utils.h
new file mode 100644
index 0000000..a239df9
--- /dev/null
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_utils.h
@@ -0,0 +1,20 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+
+#ifndef RTE_PMD_MLX5_VDPA_UTILS_H_
+#define RTE_PMD_MLX5_VDPA_UTILS_H_
+
+#include <mlx5_common.h>
+
+
+extern int mlx5_vdpa_logtype;
+
+#define MLX5_VDPA_LOG_PREFIX "mlx5_vdpa"
+/* Generic printf()-like logging macro with automatic line feed. */
+#define DRV_LOG(level, ...) \
+	PMD_DRV_LOG_(level, mlx5_vdpa_logtype, MLX5_VDPA_LOG_PREFIX, \
+		__VA_ARGS__ PMD_DRV_LOG_STRIP PMD_DRV_LOG_OPAREN, \
+		PMD_DRV_LOG_CPAREN)
+
+#endif /* RTE_PMD_MLX5_VDPA_UTILS_H_ */
diff --git a/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map b/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map
new file mode 100644
index 0000000..143836e
--- /dev/null
+++ b/drivers/vdpa/mlx5/rte_pmd_mlx5_vdpa_version.map
@@ -0,0 +1,3 @@ 
+DPDK_20.02 {
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 45f4cad..b33cd8a 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -196,18 +196,21 @@  endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_common_mlx5
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
+_LDLIBS-y                                   += -lrte_common_mlx5
+endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5
+_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)  += -lrte_pmd_mlx5_vdpa
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -ldl
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -ldl
+_LDLIBS-y                                   += -ldl
 else ifeq ($(CONFIG_RTE_IBVERBS_LINK_STATIC),y)
 LIBS_IBVERBS_STATIC = $(shell $(RTE_SDK)/buildtools/options-ibverbs-static.sh)
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += $(LIBS_IBVERBS_STATIC)
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += $(LIBS_IBVERBS_STATIC)
+_LDLIBS-y                                   += $(LIBS_IBVERBS_STATIC)
 else
+ifeq ($(findstring y,$(CONFIG_RTE_LIBRTE_MLX5_PMD)$(CONFIG_RTE_LIBRTE_MLX5_VDPA_PMD)),y)
+_LDLIBS-y                                   += -libverbs -lmlx5
+endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -libverbs -lmlx4
-_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -libverbs -lmlx5
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MVPP2_PMD)      += -lrte_pmd_mvpp2
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD)     += -lrte_pmd_mvneta