[v4,00/20] Support externally allocated memory in DPDK

Message ID cover.1537546029.git.anatoly.burakov@intel.com (mailing list archive)

Message

Burakov, Anatoly Sept. 21, 2018, 4:13 p.m. UTC
  This is a proposal to enable using externally allocated memory
in DPDK.

In a nutshell, here is what is being done here:

- Index internal malloc heaps by NUMA node index, rather than by the
  NUMA node itself (external heaps will have IDs in order of creation)
- Add identifier string to malloc heap, to uniquely identify it
  - Each new heap will receive a unique socket ID that will be used by
    allocator to decide from which heap (internal or external) to
    allocate requested amount of memory
- Allow creating named heaps and add/remove memory to/from those heaps
- Allocate memseg lists at runtime, to keep track of IOVA addresses
  of externally allocated memory
  - If IOVA addresses aren't provided, use RTE_BAD_IOVA
- Allow malloc and memzones to allocate from external heaps
- Allow other data structures to allocate from external heaps
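
As a rough illustration of the indexing scheme in the list above, here
is a toy model in plain C. It is not DPDK code: every name in it
(toy_heap, toy_heaps_init, toy_heap_create, toy_heap_by_socket,
toy_socket_is_external) is hypothetical, and it assumes two NUMA nodes
whose internal heaps use their node number as socket ID, with external
heaps receiving fresh socket IDs in order of creation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_NUMA_NODES 2  /* assumption: two NUMA nodes, one internal heap each */
#define TOY_MAX_HEAPS  8  /* assumption: arbitrary table size */
#define TOY_NAME_LEN   32

struct toy_heap {
	char name[TOY_NAME_LEN]; /* identifier string, unique per heap */
	int socket_id;           /* what callers pass to the allocator */
	int in_use;
};

static struct toy_heap toy_heaps[TOY_MAX_HEAPS];
static int toy_next_socket;

/* Populate the first slots with the internal, per-node heaps. */
void toy_heaps_init(void)
{
	assert(TOY_NUMA_NODES <= TOY_MAX_HEAPS);
	memset(toy_heaps, 0, sizeof(toy_heaps));
	for (int i = 0; i < TOY_NUMA_NODES; i++) {
		snprintf(toy_heaps[i].name, TOY_NAME_LEN, "socket_%d", i);
		toy_heaps[i].socket_id = i;
		toy_heaps[i].in_use = 1;
	}
	toy_next_socket = TOY_NUMA_NODES; /* first external socket ID */
}

/* Create a named external heap; returns its socket ID, or -1 on error. */
int toy_heap_create(const char *name)
{
	int free_idx = -1;

	if (name == NULL || name[0] == '\0' || strlen(name) >= TOY_NAME_LEN)
		return -1;
	for (int i = 0; i < TOY_MAX_HEAPS; i++) {
		if (!toy_heaps[i].in_use) {
			if (free_idx < 0)
				free_idx = i;
		} else if (strcmp(toy_heaps[i].name, name) == 0) {
			return -1; /* heap names must be unique */
		}
	}
	if (free_idx < 0)
		return -1; /* table is full */
	strcpy(toy_heaps[free_idx].name, name);
	toy_heaps[free_idx].socket_id = toy_next_socket++;
	toy_heaps[free_idx].in_use = 1;
	return toy_heaps[free_idx].socket_id;
}

/* Pick the heap (internal or external) that serves a given socket ID. */
struct toy_heap *toy_heap_by_socket(int socket_id)
{
	for (int i = 0; i < TOY_MAX_HEAPS; i++)
		if (toy_heaps[i].in_use && toy_heaps[i].socket_id == socket_id)
			return &toy_heaps[i];
	return NULL;
}

/* A socket ID is external if it exists and is not a NUMA node number. */
int toy_socket_is_external(int socket_id)
{
	return socket_id >= TOY_NUMA_NODES &&
	       toy_heap_by_socket(socket_id) != NULL;
}
```

In this model, a caller would create a heap by name, keep the returned
socket ID, and pass that ID wherever a NUMA socket would otherwise be
given, in the same way a socket argument is passed to
rte_malloc_socket().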

The responsibility to ensure memory is accessible before using it is
on the shoulders of the user - there is no checking done with regards
to validity of the memory (nor could there be...).

The general approach is to create a heap and add memory to it. Any
other process wishing to use the same memory must first attach to it
(otherwise some things will not work).
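
To make the add-then-attach flow concrete, here is a small
self-contained model in C. It is only a sketch and not the patchset's
code: toy_chunk, toy_chunk_add, toy_chunk_attach and toy_chunk_usable
are hypothetical names, TOY_BAD_IOVA is a stand-in for RTE_BAD_IOVA,
and the rule that an attach must name the same address and length as
the original add is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BAD_IOVA  UINT64_MAX /* stand-in for RTE_BAD_IOVA */
#define TOY_MAX_PAGES 16         /* assumption: arbitrary limits */
#define TOY_MAX_PROCS 4

struct toy_chunk {
	void *va;                     /* start of the external memory area */
	size_t page_sz;
	unsigned int n_pages;
	uint64_t iova[TOY_MAX_PAGES]; /* one IOVA address per page */
	int attached[TOY_MAX_PROCS];  /* per-process attach state */
};

/* Add memory to a heap from process 'proc'. 'iova' may be NULL. */
int toy_chunk_add(struct toy_chunk *c, int proc, void *va, size_t len,
		  size_t page_sz, const uint64_t *iova)
{
	if (c == NULL || va == NULL || page_sz == 0 || len == 0 ||
	    len % page_sz != 0 || len / page_sz > TOY_MAX_PAGES ||
	    proc < 0 || proc >= TOY_MAX_PROCS)
		return -1;
	c->va = va;
	c->page_sz = page_sz;
	c->n_pages = (unsigned int)(len / page_sz);
	/* without caller-provided addresses, every page is marked unknown */
	for (unsigned int i = 0; i < c->n_pages; i++)
		c->iova[i] = iova != NULL ? iova[i] : TOY_BAD_IOVA;
	/* only the process that added the memory can use it right away */
	for (int p = 0; p < TOY_MAX_PROCS; p++)
		c->attached[p] = (p == proc);
	return 0;
}

/* Any other process has to attach explicitly before using the memory. */
int toy_chunk_attach(struct toy_chunk *c, int proc, void *va, size_t len)
{
	if (c == NULL || proc < 0 || proc >= TOY_MAX_PROCS)
		return -1;
	if (va != c->va || len != c->page_sz * c->n_pages)
		return -1; /* must describe the area that was added */
	c->attached[proc] = 1;
	return 0;
}

/* Whether 'proc' may touch the memory in this model. */
int toy_chunk_usable(const struct toy_chunk *c, int proc)
{
	assert(c != NULL);
	return proc >= 0 && proc < TOY_MAX_PROCS && c->attached[proc];
}
```

Note that nothing in the model checks that the memory behind 'va' is
actually mapped or accessible; as stated above, that remains the
caller's responsibility.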

A design decision was made to make multiprocess synchronization a
manual process. Due to underlying issues with attaching to fbarrays in
secondary processes, this design was deemed to be better: we don't
want creation of an external heap in the primary to fail because
something has failed in a secondary, when in fact we may not even have
wanted this memory to be accessible in the secondary in the first
place.

Using external memory in multiprocess is *hard*, because not only does
the memory space need to be preallocated, it also needs to be attached
in each process to allow other processes to access the page table. The
attach API call may or may not succeed, depending on memory layout, for
reasons similar to other multiprocess failures. This is treated as a
"known issue" for this release.

Creating and destroying heaps is currently restricted to primary
processes. We need to keep track of all socket IDs we've ever used to
prevent their reuse; different processes would keep different socket
ID counters, and the counter isn't important enough to put into shared
memory. This means that secondary processes will not be able to create
new heaps. If this use case is important enough, we can put the max
socket ID into shared memory, or allow socket ID reuse (which I do not
think is a good idea, because it has the potential to make things
harder to debug).

v4 -> v3 changes:
- Dropped sample application in favor of new testpmd flag
- Added new flag to testpmd, with four options of mempool allocation
- Added new API to check if a socket ID belongs to an external heap
- Adjusted malloc and mempool code to not make any assumptions about
  IOVA-contiguousness when dealing with externally allocated memory
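
For illustration, the kind of check that replaces an assumption of
IOVA-contiguousness might look like the following C sketch. The
function name sketch_iova_contiguous and the SKETCH_BAD_IOVA constant
(a stand-in for RTE_BAD_IOVA) are hypothetical and do not come from
the patches; the sketch treats pages with unknown IOVA addresses as
non-contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BAD_IOVA UINT64_MAX /* stand-in for RTE_BAD_IOVA */

/*
 * Returns 1 only if every page has a known IOVA address and each page
 * starts exactly page_sz bytes after the previous one; 0 otherwise.
 */
int sketch_iova_contiguous(const uint64_t *iova, size_t n_pages,
			   size_t page_sz)
{
	assert(iova != NULL && page_sz != 0);
	for (size_t i = 0; i < n_pages; i++) {
		if (iova[i] == SKETCH_BAD_IOVA)
			return 0; /* unknown address: assume nothing */
		if (i > 0 && iova[i] != iova[i - 1] + page_sz)
			return 0; /* gap or reordering between pages */
	}
	return 1;
}
```

A caller that gets 0 back would have to handle the area page by page
instead of treating it as one IOVA-contiguous block.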

v3 -> v2 changes:
- Rebase on top of latest master
- Clarifications added to mempool code as per Andrew Rybchenko's
  comments

v2 -> v1 changes:
- Fixed NULL dereference on heap socket ID lookup
- Fixed memseg offset calculation on adding memory to heap
- Improved unit test to test for above bugfixes
- Restricted heap creation to primary processes only
- Added sample application
- Added documentation

RFC -> v1 changes:
- Removed the "named heaps" API, allocate using fake socket ID instead
- Added multiprocess support
- Everything is now thread-safe
- Numerous bugfixes and API improvements

Anatoly Burakov (20):
  mem: add length to memseg list
  mem: allow memseg lists to be marked as external
  malloc: index heaps using heap ID rather than NUMA node
  mem: do not check for invalid socket ID
  flow_classify: do not check for invalid socket ID
  pipeline: do not check for invalid socket ID
  sched: do not check for invalid socket ID
  malloc: add name to malloc heaps
  malloc: add function to query socket ID of named heap
  malloc: add function to check if socket is external
  malloc: allow creating malloc heaps
  malloc: allow destroying heaps
  malloc: allow adding memory to named heaps
  malloc: allow removing memory from named heaps
  malloc: allow attaching to external memory chunks
  malloc: allow detaching from external memory
  test: add unit tests for external memory support
  app/testpmd: add support for external memory
  doc: add external memory feature to the release notes
  doc: add external memory feature to programmer's guide

 app/test-pmd/config.c                         |  21 +-
 app/test-pmd/parameters.c                     |  23 +-
 app/test-pmd/testpmd.c                        | 337 +++++++++++++-
 app/test-pmd/testpmd.h                        |  13 +-
 config/common_base                            |   1 +
 config/rte_config.h                           |   1 +
 .../prog_guide/env_abstraction_layer.rst      |  38 ++
 doc/guides/rel_notes/deprecation.rst          |  15 -
 doc/guides/rel_notes/release_18_11.rst        |  24 +-
 doc/guides/testpmd_app_ug/run_app.rst         |  12 +
 drivers/bus/fslmc/fslmc_vfio.c                |   7 +-
 drivers/bus/pci/linux/pci.c                   |   2 +-
 drivers/net/mlx4/mlx4_mr.c                    |   3 +
 drivers/net/mlx5/mlx5.c                       |   5 +-
 drivers/net/mlx5/mlx5_mr.c                    |   3 +
 drivers/net/virtio/virtio_user/vhost_kernel.c |   5 +-
 lib/librte_eal/bsdapp/eal/Makefile            |   2 +-
 lib/librte_eal/bsdapp/eal/eal.c               |   3 +
 lib/librte_eal/bsdapp/eal/eal_memory.c        |   9 +-
 lib/librte_eal/common/eal_common_memory.c     |   8 +-
 lib/librte_eal/common/eal_common_memzone.c    |   8 +-
 .../common/include/rte_eal_memconfig.h        |   6 +-
 lib/librte_eal/common/include/rte_malloc.h    | 198 +++++++++
 .../common/include/rte_malloc_heap.h          |   3 +
 lib/librte_eal/common/include/rte_memory.h    |   9 +
 lib/librte_eal/common/malloc_elem.c           |  10 +-
 lib/librte_eal/common/malloc_heap.c           | 300 +++++++++++--
 lib/librte_eal/common/malloc_heap.h           |  17 +
 lib/librte_eal/common/rte_malloc.c            | 420 +++++++++++++++++-
 lib/librte_eal/linuxapp/eal/Makefile          |   2 +-
 lib/librte_eal/linuxapp/eal/eal.c             |  10 +-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c    |  12 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c      |   4 +-
 lib/librte_eal/linuxapp/eal/eal_vfio.c        |  17 +-
 lib/librte_eal/meson.build                    |   2 +-
 lib/librte_eal/rte_eal_version.map            |   8 +
 lib/librte_flow_classify/rte_flow_classify.c  |   3 +-
 lib/librte_mempool/rte_mempool.c              |  57 ++-
 lib/librte_pipeline/rte_pipeline.c            |   3 +-
 lib/librte_sched/rte_sched.c                  |   2 +-
 test/test/Makefile                            |   1 +
 test/test/autotest_data.py                    |  14 +-
 test/test/meson.build                         |   1 +
 test/test/test_external_mem.c                 | 389 ++++++++++++++++
 test/test/test_malloc.c                       |   3 +
 test/test/test_memzone.c                      |   3 +
 46 files changed, 1897 insertions(+), 137 deletions(-)
 create mode 100644 test/test/test_external_mem.c
  

Comments

Thomas Monjalon Sept. 23, 2018, 9:21 p.m. UTC | #1
Hi Anatoly,

21/09/2018 18:13, Anatoly Burakov:
> This is a proposal to enable using externally allocated memory
> in DPDK.

About this change and previous ones, I think we may miss some
documentation about the usage and the internal design of the DPDK
memory allocation.
You already updated some doc recently:
	http://git.dpdk.org/dpdk/commit/?id=b31739328

This is what we have currently:
	http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#memory-segments-and-memory-zones-memzone
	http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#malloc
	http://doc.dpdk.org/guides/prog_guide/mempool_lib.html

This is probably a good time to check this doc again.
Do you think it deserves more explanations, or maybe some figures?
  
Burakov, Anatoly Sept. 24, 2018, 8:54 a.m. UTC | #2
On 23-Sep-18 10:21 PM, Thomas Monjalon wrote:
> Hi Anatoly,
> 
> 21/09/2018 18:13, Anatoly Burakov:
>> This is a proposal to enable using externally allocated memory
>> in DPDK.
> 
> About this change and previous ones, I think we may miss some
> documentation about the usage and the internal design of the DPDK
> memory allocation.
> You already updated some doc recently:
> 	http://git.dpdk.org/dpdk/commit/?id=b31739328
> 
> This is what we have currently:
> 	http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#memory-segments-and-memory-zones-memzone
> 	http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#malloc
> 	http://doc.dpdk.org/guides/prog_guide/mempool_lib.html
> 
> This is probably a good time to check this doc again.
> Do you think it deserves more explanations, or maybe some figures?
> 

Maybe this could be split into two sections: an explanation of the
user-facing API, and an explanation of its inner workings. However, I
don't want the DPDK documentation to become my personal soapbox, so I'm
open to suggestions on what is missing and how to organize the memory
docs better :)