[dpdk-dev,v3,3/3] doc: add librte_pmd_mlx4 documentation

Message ID 1424872326-17930-4-git-send-email-adrien.mazarguil@6wind.com (mailing list archive)
State Accepted, archived

Commit Message

Adrien Mazarguil Feb. 25, 2015, 1:52 p.m. UTC
This documentation covers implementation details, features and limitations,
configuration and prerequisites, and provides a usage example.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 MAINTAINERS                                  |   1 +
 doc/guides/prog_guide/index.rst              |   1 +
 doc/guides/prog_guide/mlx4_poll_mode_drv.rst | 326 +++++++++++++++++++++++++++
 doc/guides/prog_guide/source_org.rst         |   1 +
 4 files changed, 329 insertions(+)
 create mode 100644 doc/guides/prog_guide/mlx4_poll_mode_drv.rst
  

Comments

Siobhan Butler March 2, 2015, 5:45 p.m. UTC | #1
Thank you Adrien this is great.
Siobhan

  

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index d8b0fbc..ac61825 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -223,6 +223,7 @@  F: lib/librte_pmd_fm10k/
 Mellanox mlx4
 M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
 F: lib/librte_pmd_mlx4/
+F: doc/guides/prog_guide/mlx4_poll_mode_drv.rst
 
 RedHat virtio
 M: Changchun Ouyang <changchun.ouyang@intel.com>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index de69682..87f6b35 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -56,6 +56,7 @@  Programmer's Guide
     intel_dpdk_xen_based_packet_switch_sol
     libpcap_ring_based_poll_mode_drv
     link_bonding_poll_mode_drv_lib
+    mlx4_poll_mode_drv
     timer_lib
     hash_lib
     lpm_lib
diff --git a/doc/guides/prog_guide/mlx4_poll_mode_drv.rst b/doc/guides/prog_guide/mlx4_poll_mode_drv.rst
new file mode 100644
index 0000000..35570c3
--- /dev/null
+++ b/doc/guides/prog_guide/mlx4_poll_mode_drv.rst
@@ -0,0 +1,326 @@ 
+..  BSD LICENSE
+    Copyright 2012-2015 6WIND S.A.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of 6WIND S.A. nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+MLX4 poll mode driver library
+=============================
+
+The MLX4 poll mode driver library (**librte_pmd_mlx4**) implements support
+for **Mellanox ConnectX-3** 10/40 Gbps adapters (EN 40, EN 10, Pro EN 40) as
+well as their virtual functions (VF) in SR-IOV context.
+
+.. note::
+
+   Due to external dependencies, this driver is disabled by default. It must
+   be enabled manually by setting ``CONFIG_RTE_LIBRTE_MLX4_PMD=y`` and
+   recompiling DPDK.
+
+Implementation details
+----------------------
+
+Most Mellanox ConnectX-3 devices provide two ports but expose a single PCI
+bus address, thus unlike most drivers, librte_pmd_mlx4 registers itself as a
+PCI driver that allocates one Ethernet device per detected port.
+
+For this reason, one cannot white/blacklist a single port without also
+white/blacklisting the others on the same device.
+
+Besides its dependency on libibverbs (that implies libmlx4 and associated
+kernel support), librte_pmd_mlx4 relies heavily on system calls for control
+operations such as querying/updating the MTU and flow control parameters.
+
+For security reasons and robustness, this driver only deals with virtual
+memory addresses. The way resource allocations are handled by the kernel,
+combined with hardware specifications that allow it to handle virtual memory
+addresses directly, ensures that DPDK applications cannot access random
+physical memory (or memory that does not belong to the current process).
+
+This capability allows the PMD to coexist with kernel network interfaces
+which remain functional, although they stop receiving unicast packets as
+long as they share the same MAC address.
+
+Compiling librte_pmd_mlx4 causes DPDK to be linked against libibverbs.
+
+Features and limitations
+------------------------
+
+- RSS, also known as RCA, is supported. In this mode the number of
+  configured RX queues must be a power of two.
+- VLAN filtering is supported.
+- Link state information is provided.
+- Promiscuous mode is supported.
+- All multicast mode is supported.
+- Multiple MAC addresses (unicast, multicast) can be configured.
+- Scattered packets are supported for TX and RX.
+
+..
+
+- RSS hash key cannot be modified.
+- Hardware counters are not implemented (they are software counters).
+- Checksum offloads are not supported yet.
+
+Configuration
+-------------
+
+Compilation options
+~~~~~~~~~~~~~~~~~~~
+
+- ``CONFIG_RTE_LIBRTE_MLX4_PMD`` (default **n**)
+
+  Toggle compilation of librte_pmd_mlx4 itself.
+
+- ``CONFIG_RTE_LIBRTE_MLX4_DEBUG`` (default **n**)
+
+  Toggle debugging code and stricter compilation flags. Enabling this option
+  adds run-time checks and debugging messages at the cost of lower
+  performance.
+
+- ``CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N`` (default **4**)
+
+  Number of scatter/gather elements (SGEs) per work request (WR). Lowering
+  this number improves performance but also limits the ability to receive
+  scattered packets (packets that do not fit a single mbuf). The default
+  value is a safe tradeoff.
+
+- ``CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE`` (default **0**)
+
+  Amount of data to be inlined during TX operations. Improves latency but
+  lowers throughput.
+
+- ``CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE`` (default **8**)
+
+  Maximum number of cached memory pools (MPs) per TX queue. Each MP from
+  which buffers are to be transmitted must be associated with memory regions
+  (MRs). This is a slow operation, so the association is cached.
+
+  This value is always 1 for RX queues since they use a single MP.
+
+- ``CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS`` (default **1**)
+
+  Toggle software counters. No counters are available if this option is
+  disabled since hardware counters are not supported.
+
+- ``CONFIG_RTE_LIBRTE_MLX4_COMPAT_VMWARE`` (default **1**)
+
+  Toggle VMware compatibility code. It also requires the environment
+  variable ``MLX4_COMPAT_VMWARE`` set to a nonzero value at runtime.
+
+Environment variables
+~~~~~~~~~~~~~~~~~~~~~
+
+- ``MLX4_INLINE_RECV_SIZE``
+
+  A nonzero value enables inline receive for packets up to that size. May
+  significantly improve performance in some cases but lower it in
+  others. Requires careful testing.
+
+- ``MLX4_COMPAT_VMWARE``
+
+  Only supported when compiled with
+  ``CONFIG_RTE_LIBRTE_MLX4_COMPAT_VMWARE=1``. Adds workarounds to run in
+  VMware systems that do not support the flows API properly.
+
+Run-time configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+- The only constraint when RSS mode is requested is to make sure the number
+  of RX queues is a power of two. This is a hardware requirement.
+
+- librte_pmd_mlx4 brings kernel network interfaces up during initialization
+  because it is affected by their state. Forcing them down prevents packet
+  reception.
+
+- **ethtool** operations on related kernel interfaces also affect the PMD.
+
+Prerequisites
+-------------
+
+This driver relies on external libraries and kernel drivers for resource
+allocation and initialization. The following dependencies are not part of
+DPDK and must be installed separately:
+
+- **libibverbs**
+
+  User space verbs framework used by librte_pmd_mlx4. This library provides
+  a generic interface between the kernel and low-level user space drivers
+  such as libmlx4.
+
+  It allows slow and privileged operations (context initialization, hardware
+  resource allocation) to be managed by the kernel and fast operations to
+  never leave user space.
+
+- **libmlx4**
+
+  Low-level user space driver library for Mellanox ConnectX-3 devices. It is
+  automatically loaded by libibverbs.
+
+  This library basically implements send/receive calls to the hardware
+  queues.
+
+- **Kernel modules** (mlnx-ofed-kernel)
+
+  They provide the kernel-side verbs API and low-level device drivers that
+  manage actual hardware initialization and resource sharing with user
+  space processes.
+
+  Unlike most other PMDs, these modules must remain loaded and bound to
+  their devices:
+
+  - mlx4_core: hardware driver managing Mellanox ConnectX-3 devices.
+  - mlx4_en: Ethernet device driver that provides kernel network interfaces.
+  - mlx4_ib: InfiniBand device driver.
+  - ib_uverbs: user space driver for verbs (entry point for libibverbs).
+
+While these libraries and kernel modules are available on OpenFabrics
+Alliance's `website <https://www.openfabrics.org/>`_ and provided by package
+managers on most distributions, this PMD requires Ethernet extensions that
+may not be supported at the moment (this is a work in progress).
+
+`Mellanox OFED
+<http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers>`_
+includes the necessary support and should be used in the meantime. For DPDK,
+only libibverbs, libmlx4 and mlnx-ofed-kernel packages are required from
+that distribution.
+
+.. note::
+
+   Both libraries are BSD and GPL licensed. Linux kernel modules are GPL
+   licensed.
+
+Usage example
+-------------
+
+This section demonstrates how to launch **testpmd** with Mellanox ConnectX-3
+devices managed by librte_pmd_mlx4.
+
+#. Load the kernel modules:
+
+   .. code-block:: console
+
+      modprobe -a ib_uverbs mlx4_en mlx4_core mlx4_ib
+
+   .. note::
+
+      User space I/O kernel modules (uio and igb_uio) are not used and do
+      not have to be loaded.
+
+#. Make sure Ethernet interfaces are in working order and linked to kernel
+   verbs. Related sysfs entries should be present:
+
+   .. code-block:: console
+
+      ls -d /sys/class/net/*/device/infiniband_verbs/uverbs* | cut -d / -f 5
+
+   Example output:
+
+   .. code-block:: console
+
+      eth2
+      eth3
+      eth4
+      eth5
+
+#. Optionally, retrieve their PCI bus addresses for whitelisting:
+
+   .. code-block:: console
+
+      {
+          for intf in eth2 eth3 eth4 eth5;
+          do
+              (cd "/sys/class/net/${intf}/device/" && pwd -P);
+          done;
+      } |
+      sed -n 's,.*/\(.*\),-w \1,p'
+
+   Example output:
+
+   .. code-block:: console
+
+      -w 0000:83:00.0
+      -w 0000:83:00.0
+      -w 0000:84:00.0
+      -w 0000:84:00.0
+
+   .. note::
+
+      There are only two distinct PCI bus addresses because the Mellanox
+      ConnectX-3 adapters installed on this system are dual port.
+
+#. Request huge pages:
+
+   .. code-block:: console
+
+      echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+#. Start testpmd with basic parameters:
+
+   .. code-block:: console
+
+      testpmd -c 0xff00 -n 4 -w 0000:83:00.0 -w 0000:84:00.0 -- --rxq=2 --txq=2 -i
+
+   Example output:
+
+   .. code-block:: console
+
+      [...]
+      EAL: PCI device 0000:83:00.0 on NUMA socket 1
+      EAL:   probe driver: 15b3:1007 librte_pmd_mlx4
+      PMD: librte_pmd_mlx4: PCI information matches, using device "mlx4_0" (VF: false)
+      PMD: librte_pmd_mlx4: 2 port(s) detected
+      PMD: librte_pmd_mlx4: port 1 MAC address is 00:02:c9:b5:b7:50
+      PMD: librte_pmd_mlx4: port 2 MAC address is 00:02:c9:b5:b7:51
+      EAL: PCI device 0000:84:00.0 on NUMA socket 1
+      EAL:   probe driver: 15b3:1007 librte_pmd_mlx4
+      PMD: librte_pmd_mlx4: PCI information matches, using device "mlx4_1" (VF: false)
+      PMD: librte_pmd_mlx4: 2 port(s) detected
+      PMD: librte_pmd_mlx4: port 1 MAC address is 00:02:c9:b5:ba:b0
+      PMD: librte_pmd_mlx4: port 2 MAC address is 00:02:c9:b5:ba:b1
+      Interactive-mode selected
+      Configuring Port 0 (socket 0)
+      PMD: librte_pmd_mlx4: 0x867d60: TX queues number update: 0 -> 2
+      PMD: librte_pmd_mlx4: 0x867d60: RX queues number update: 0 -> 2
+      Port 0: 00:02:C9:B5:B7:50
+      Configuring Port 1 (socket 0)
+      PMD: librte_pmd_mlx4: 0x867da0: TX queues number update: 0 -> 2
+      PMD: librte_pmd_mlx4: 0x867da0: RX queues number update: 0 -> 2
+      Port 1: 00:02:C9:B5:B7:51
+      Configuring Port 2 (socket 0)
+      PMD: librte_pmd_mlx4: 0x867de0: TX queues number update: 0 -> 2
+      PMD: librte_pmd_mlx4: 0x867de0: RX queues number update: 0 -> 2
+      Port 2: 00:02:C9:B5:BA:B0
+      Configuring Port 3 (socket 0)
+      PMD: librte_pmd_mlx4: 0x867e20: TX queues number update: 0 -> 2
+      PMD: librte_pmd_mlx4: 0x867e20: RX queues number update: 0 -> 2
+      Port 3: 00:02:C9:B5:BA:B1
+      Checking link statuses...
+      Port 0 Link Up - speed 10000 Mbps - full-duplex
+      Port 1 Link Up - speed 40000 Mbps - full-duplex
+      Port 2 Link Up - speed 10000 Mbps - full-duplex
+      Port 3 Link Up - speed 40000 Mbps - full-duplex
+      Done
+      testpmd>
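Note on the usage example in mlx4_poll_mode_drv.rst: the whitelisting step prints each PCI address once per port, and the run-time configuration section requires a power-of-two RX queue count when RSS is requested. Below is a minimal shell sketch, not part of the patch, that combines both points. The function names (is_power_of_two, unique_whitelist, build_testpmd_cmd) are hypothetical. The sample addresses and testpmd options are copied from the example in the document.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: builds a testpmd command line
# from "-w <addr>" lines and a requested queue count.

# Succeeds when $1 is a positive power of two (RSS hardware requirement).
is_power_of_two() {
    n=$1
    # Rejects zero, negative and non-numeric input.
    [ "$n" -gt 0 ] 2>/dev/null || return 1
    while [ $((n % 2)) -eq 0 ]; do
        n=$((n / 2))
    done
    [ "$n" -eq 1 ]
}

# Reads "-w <addr>" lines on stdin, drops duplicates (dual-port adapters
# report the same address once per port) and joins them on one line.
unique_whitelist() {
    sort -u | tr '\n' ' ' | sed 's/ $//'
}

# Prints the testpmd invocation; $1 is the queue count, stdin the -w lines.
build_testpmd_cmd() {
    queues=$1
    if ! is_power_of_two "$queues"; then
        echo "error: RX queue count must be a power of two: $queues" >&2
        return 1
    fi
    wl=$(unique_whitelist)
    echo "testpmd -c 0xff00 -n 4 $wl -- --rxq=$queues --txq=$queues -i"
}

# Sample input copied from the example output in the documentation.
printf '%s\n' '-w 0000:83:00.0' '-w 0000:83:00.0' \
              '-w 0000:84:00.0' '-w 0000:84:00.0' |
    build_testpmd_cmd 2
```

On a real system, the sample printf could be replaced by the output of the sysfs loop from the whitelisting step. The sketch only prints the command, so it can be reviewed before being run.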
diff --git a/doc/guides/prog_guide/source_org.rst b/doc/guides/prog_guide/source_org.rst
index c8ca54f..c66ad16 100644
--- a/doc/guides/prog_guide/source_org.rst
+++ b/doc/guides/prog_guide/source_org.rst
@@ -83,6 +83,7 @@  The lib directory contains::
     +-- librte_pmd_e1000    # 1GbE poll mode drivers (igb and em)
     +-- librte_pmd_ixgbe    # 10GbE poll mode driver
     +-- librte_pmd_i40e     # 40GbE poll mode driver
+    +-- librte_pmd_mlx4     # Mellanox ConnectX-3 poll mode driver
     +-- librte_pmd_pcap     # PCAP poll mode driver
     +-- librte_pmd_ring     # ring poll mode driver
     +-- librte_pmd_virtio   # virtio poll mode driver
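
Note on enabling the new library: mlx4_poll_mode_drv.rst states that the driver is disabled by default and must be enabled by setting CONFIG_RTE_LIBRTE_MLX4_PMD=y before recompiling DPDK. A minimal sketch of flipping such an option follows. The enable_option helper is hypothetical, and the script works on a temporary file rather than a real DPDK configuration file, whose location depends on the build target.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: switch a "NAME=n" line to
# "NAME=y" in a DPDK-style configuration file.
enable_option() {
    file=$1
    name=$2
    # -i.bak edits in place and keeps a backup copy next to the file.
    sed -i.bak "s/^${name}=n\$/${name}=y/" "$file"
}

# Demonstration on a temporary file (stand-in for the real configuration).
cfg=$(mktemp)
printf '%s\n' 'CONFIG_RTE_LIBRTE_MLX4_PMD=n' \
              'CONFIG_RTE_LIBRTE_MLX4_DEBUG=n' > "$cfg"
enable_option "$cfg" CONFIG_RTE_LIBRTE_MLX4_PMD
cat "$cfg"
rm -f "$cfg" "$cfg.bak"
```

Only lines that match the option name exactly and are set to n are changed, so other options such as CONFIG_RTE_LIBRTE_MLX4_DEBUG keep their values.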