[v4,0/6] introduce DMA memory mapping for external memory

Message ID cover.1552206210.git.shahafs@mellanox.com (mailing list archive)

Shahaf Shuler March 10, 2019, 8:27 a.m. UTC
The DPDK APIs expose 3 different modes for working with memory used for DMA:

1. Use DPDK-owned memory (backed by DPDK-provided hugepages).
This memory is allocated by the DPDK libraries, included in the DPDK
memory system (memseg lists), and automatically DMA mapped by the DPDK
layers.

2. Use memory allocated by the user and registered with the DPDK memory
system. Upon registration, the DPDK layers DMA map the memory to all
needed devices. After registration, allocations from this memory are
done with the rte_*malloc APIs.

3. Use memory allocated by the user and not registered to the DPDK memory
system. This is for users who want tight control over this memory
(e.g. to avoid the rte_malloc header).
The user should allocate the memory, register it through the
rte_extmem_register API, and call the DMA map function to map the
memory to each device that needs it.

This series focuses on #3 above.

Currently the only way to map external memory is through VFIO
(rte_vfio_dma_map). While VFIO is common, some vendors map memory
in other ways (e.g. Mellanox and NXP).

This series moves DMA mapping to vendor-agnostic APIs.
Device-level DMA map and unmap APIs are added. These APIs are
currently implemented only for PCI devices.

For PCI bus devices, the PCI driver can expose its own map and unmap
functions to be used for the mapping. If the driver doesn't provide
any, the memory is mapped, if possible, to the IOMMU through the VFIO APIs.
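
The dispatch order described above can be sketched with a small standalone
model. This is not the DPDK source and does not use DPDK headers: every
model_* name below is hypothetical, the VFIO path is a stub that always
succeeds, and plain errno stands in for the error reporting. It only
illustrates the order "driver callback first, VFIO second, otherwise -1
with ENOTSUP".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real DPDK types; not the actual definitions. */
struct model_device;

typedef int (*model_dma_map_t)(struct model_device *dev, void *addr,
                               uint64_t iova, size_t len);

struct model_driver {
    model_dma_map_t dma_map;  /* NULL when the driver has no custom mapping */
};

struct model_device {
    const struct model_driver *driver;
    int bound_to_vfio;        /* non-zero when VFIO can map for this device */
    int bus_has_dma_map;      /* non-zero when the bus implements DMA map */
};

/* Example vendor callback: counts calls so the chosen path is observable. */
static int model_vendor_calls;

static int model_vendor_dma_map(struct model_device *dev, void *addr,
                                uint64_t iova, size_t len)
{
    (void)dev; (void)addr; (void)iova; (void)len;
    model_vendor_calls++;
    return 0;
}

/* Stub for the generic VFIO mapping path; always succeeds in this model. */
static int model_vfio_dma_map(void *addr, uint64_t iova, size_t len)
{
    (void)addr; (void)iova; (void)len;
    return 0;
}

/* Bus-level dispatch: driver callback first, VFIO second, else ENOTSUP. */
static int model_pci_dma_map(struct model_device *dev, void *addr,
                             uint64_t iova, size_t len)
{
    if (dev->driver != NULL && dev->driver->dma_map != NULL)
        return dev->driver->dma_map(dev, addr, iova, len);
    if (dev->bound_to_vfio)
        return model_vfio_dma_map(addr, iova, len);
    errno = ENOTSUP;
    return -1;
}

/* Device-level entry point: -1 with ENOTSUP when the bus has no DMA map op. */
static int model_dev_dma_map(struct model_device *dev, void *addr,
                             uint64_t iova, size_t len)
{
    assert(dev != NULL);
    if (!dev->bus_has_dma_map) {
        errno = ENOTSUP;
        return -1;
    }
    return model_pci_dma_map(dev, addr, iova, len);
}
```

In this model a device whose driver sets dma_map never reaches the VFIO
stub, and a device with neither a driver callback nor VFIO binding fails
with -1 and errno set to ENOTSUP.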

Application usage of these APIs is simple:
* allocate memory.
* call rte_extmem_register on the memory chunk.
* take a device and get its rte_device.
* call the device-level DMA map function (rte_dev_dma_map) for this device.

Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap
APIs, leaving the rte device APIs as the preferred option for the user.

On v4:
 - Changed rte_dev_dma_map errno to ENOTSUP when the bus doesn't support
   the DMA map API.

On v3:
 - Fixed compilation issue on FreeBSD.
 - Fixed a leftover rte_bus_dma_map that should have been rte_dev_dma_map.
 - Removed __rte_deprecated from the vfio functions until rte_dev_dma_map
   is no longer experimental.
 - Changed error return value to always be -1, with proper errno.
 - Used rte_mem_virt2memseg_list instead of rte_mem_virt2memseg to verify
   memory is registered.
 - Added the above check to dma_unmap calls as well.
 - Added a note in the API doc that the memory must be registered in advance.
 - Added a debug log to report when memory mapping to vfio was skipped.

On v2:
 - Added a warning in the release notes about the API change in vfio.
 - Moved function doc to prototype declaration.
 - Used dma_map and dma_unmap instead of map and unmap.
 - Used RTE_VFIO_DEFAULT_CONTAINER_FD instead of -1 fixed value.
 - Moved bus function to eal_common_dev.c and renamed them properly.
 - Changed eth device iterator to use RTE_DEV_FOREACH.
 - Enforced memory is registered with rte_extmem_* prior to mapping.
 - Used EEXIST as the only possible return value from type1 vfio IOMMU mapping.

[1] https://patches.dpdk.org/patch/47796/

Shahaf Shuler (6):
  vfio: allow DMA map of memory for the default vfio fd
  vfio: don't fail to DMA map if memory is already mapped
  bus: introduce device level DMA memory mapping
  net/mlx5: refactor external memory registration
  net/mlx5: support PCI device DMA map and unmap
  doc: deprecation notice for VFIO DMA map APIs

 doc/guides/prog_guide/env_abstraction_layer.rst |   2 +-
 doc/guides/rel_notes/deprecation.rst            |   4 +
 doc/guides/rel_notes/release_19_05.rst          |   3 +
 drivers/bus/pci/pci_common.c                    |  48 ++++
 drivers/bus/pci/rte_bus_pci.h                   |  40 ++++
 drivers/net/mlx5/mlx5.c                         |   2 +
 drivers/net/mlx5/mlx5_mr.c                      | 225 ++++++++++++++++---
 drivers/net/mlx5/mlx5_rxtx.h                    |   5 +
 lib/librte_eal/common/eal_common_dev.c          |  34 +++
 lib/librte_eal/common/include/rte_bus.h         |  44 ++++
 lib/librte_eal/common/include/rte_dev.h         |  47 ++++
 lib/librte_eal/common/include/rte_vfio.h        |   8 +-
 lib/librte_eal/linuxapp/eal/eal_vfio.c          |  42 +++-
 lib/librte_eal/rte_eal_version.map              |   2 +
 14 files changed, 468 insertions(+), 38 deletions(-)
  

Comments

Gaëtan Rivet March 11, 2019, 9:27 a.m. UTC | #1
Hello Shahaf,

Thanks for taking my remarks into account. You can add my Acked-by to
the series if you need it (not really for the mlx5 PMD, but you get the idea).

BR,

  
Thomas Monjalon March 30, 2019, 2:40 p.m. UTC | #2

Applied, thanks