[v3,00/20] Support externally allocated memory in DPDK

Message ID cover.1537443103.git.anatoly.burakov@intel.com

Burakov, Anatoly Sept. 20, 2018, 11:36 a.m. UTC
This is a proposal to enable using externally allocated memory
in DPDK.

In a nutshell, this series does the following:

- Index internal malloc heaps by NUMA node index, rather than NUMA
  node itself (external heaps will have IDs in order of creation)
- Add identifier string to malloc heap, to uniquely identify it
  - Each new heap will receive a unique socket ID that will be used by
    allocator to decide from which heap (internal or external) to
    allocate requested amount of memory
- Allow creating named heaps and add/remove memory to/from those heaps
- Allocate memseg lists at runtime, to keep track of IOVA addresses
  of externally allocated memory
  - If IOVA addresses aren't provided, use RTE_BAD_IOVA
- Allow malloc and memzones to allocate from external heaps
- Allow other data structures to allocate from external heaps
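
As a rough illustration of the first few items, here is a minimal
standalone sketch of a heap table in which internal heaps occupy the
slots matching their NUMA node index and each external heap receives a
fresh socket ID at creation time. This is not the code from the series:
every name in it (heaps_init, heap_create, heap_index_from_socket,
MAX_HEAPS, NUM_NUMA_NODES) is hypothetical, and the table sizes are
arbitrary.

```c
#include <stdio.h>
#include <string.h>

#define MAX_HEAPS      8   /* hypothetical table size */
#define NUM_NUMA_NODES 2   /* hypothetical: two internal heaps */
#define HEAP_NAME_LEN  32

struct heap {
	char name[HEAP_NAME_LEN]; /* empty name marks an unused slot */
	int socket_id;            /* ID a caller would pass to the allocator */
};

struct heap heaps[MAX_HEAPS];
int next_socket_id;

/* Internal heaps: slot i serves NUMA node i and uses socket ID i. */
void heaps_init(void)
{
	memset(heaps, 0, sizeof(heaps));
	for (int i = 0; i < NUM_NUMA_NODES; i++) {
		snprintf(heaps[i].name, HEAP_NAME_LEN, "socket_%d", i);
		heaps[i].socket_id = i;
	}
	next_socket_id = NUM_NUMA_NODES;
}

/* Create a named external heap; returns its socket ID, or -1 on error. */
int heap_create(const char *name)
{
	int free_slot = -1;

	if (name == NULL || name[0] == '\0' || strlen(name) >= HEAP_NAME_LEN)
		return -1;
	for (int i = 0; i < MAX_HEAPS; i++) {
		if (heaps[i].name[0] == '\0') {
			if (free_slot < 0)
				free_slot = i;
		} else if (strcmp(heaps[i].name, name) == 0) {
			return -1; /* heap names must be unique */
		}
	}
	if (free_slot < 0)
		return -1; /* table full */
	strcpy(heaps[free_slot].name, name);
	heaps[free_slot].socket_id = next_socket_id++;
	return heaps[free_slot].socket_id;
}

/* Map a socket ID (internal or external) to a heap index, or -1. */
int heap_index_from_socket(int socket_id)
{
	for (int i = 0; i < MAX_HEAPS; i++) {
		if (heaps[i].name[0] != '\0' && heaps[i].socket_id == socket_id)
			return i;
	}
	return -1;
}
```

In this model the allocator only ever sees a socket ID; whether that ID
resolves to an internal or an external heap is decided by the lookup.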

It is the user's responsibility to ensure memory is accessible before
using it - no checking is done with regard to the validity of the
memory (nor could there be...).

The general approach is to create a heap and add memory to it. For any
other process wishing to use the same memory, said memory must first
be attached (otherwise some things will not work).
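
The "add memory" step, together with the RTE_BAD_IOVA fallback
mentioned in the list above, can be sketched roughly as follows. This
is a standalone model rather than the EAL code: the struct, the
function name ext_seg_list_fill, the MAX_PAGES limit and the BAD_IOVA
stand-in for RTE_BAD_IOVA are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iova_t;
#define BAD_IOVA  ((iova_t)-1) /* stand-in for RTE_BAD_IOVA */
#define MAX_PAGES 64           /* hypothetical limit for this sketch */

/* Hypothetical per-chunk bookkeeping for externally allocated memory. */
struct ext_seg_list {
	void *base_va;
	size_t page_sz;
	size_t n_pages;
	iova_t iova[MAX_PAGES];
};

/*
 * Record a chunk of external memory. If the caller supplies no IOVA
 * table (iova_addrs == NULL), every page is marked with BAD_IOVA.
 * Returns 0 on success, -1 on invalid arguments.
 */
int ext_seg_list_fill(struct ext_seg_list *msl, void *va, size_t len,
		size_t page_sz, const iova_t *iova_addrs, size_t n_pages)
{
	size_t pages;

	if (msl == NULL || va == NULL || len == 0 || page_sz == 0)
		return -1;
	if (len % page_sz != 0)
		return -1; /* chunk must be a whole number of pages */
	pages = len / page_sz;
	if (pages > MAX_PAGES)
		return -1;
	if (iova_addrs != NULL && n_pages != pages)
		return -1; /* table must describe every page */

	msl->base_va = va;
	msl->page_sz = page_sz;
	msl->n_pages = pages;
	for (size_t i = 0; i < pages; i++)
		msl->iova[i] = iova_addrs != NULL ? iova_addrs[i] : BAD_IOVA;
	return 0;
}
```

Note that the sketch, like the series itself, never checks that the
memory behind va is actually valid; that remains up to the caller.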

A design decision was made to make multiprocess synchronization a
manual process. Due to underlying issues with attaching to fbarrays in
secondary processes, this design was deemed better: we don't want
creation of an external heap in the primary to fail because something
failed in the secondary, when in fact we may not even have wanted this
memory to be accessible in the secondary in the first place.

Using external memory in multiprocess is *hard*: not only does the
memory space need to be preallocated, it also needs to be attached in
each process to allow other processes to access the page table. The
attach API call may or may not succeed, depending on memory layout, for
reasons similar to other multiprocess failures. This is treated as a
"known issue" for this release.

Creating and destroying heaps is currently restricted to primary
processes. We need to keep track of all socket IDs we've ever used to
prevent their reuse, different processes would keep different socket
ID counters, and the feature isn't important enough to put the counter
into shared memory. This means that secondary processes will not be
able to create new heaps. If this use case is important enough, we can
put the max socket ID into shared memory, or allow socket ID reuse
(which I do not think is a good idea because it has the potential to
make things harder to debug).
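
To make the restriction concrete, here is a small standalone sketch of
a process-local socket ID counter that is never rewound, so IDs of
destroyed heaps are not handed out again, and that refuses to operate
in a secondary process. The names (proc_ctx, ext_heap_create,
ext_heap_destroy) and the starting value FIRST_EXT_SOCKET_ID are
hypothetical and not taken from the patches.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIRST_EXT_SOCKET_ID 256 /* hypothetical first external ID */

/* Hypothetical per-process state; the counter is NOT shared memory. */
struct proc_ctx {
	bool is_primary;
	int next_socket_id;
	int live_heaps;
};

/* Returns the new heap's socket ID, or -1 in a secondary process. */
int ext_heap_create(struct proc_ctx *p)
{
	if (p == NULL || !p->is_primary)
		return -1;
	p->live_heaps++;
	return p->next_socket_id++;
}

/* Returns 0 on success, -1 if not primary or no heap exists. */
int ext_heap_destroy(struct proc_ctx *p)
{
	if (p == NULL || !p->is_primary || p->live_heaps == 0)
		return -1;
	p->live_heaps--;
	/* next_socket_id is deliberately left alone: no ID reuse. */
	return 0;
}
```

Because the counter lives in ordinary process memory, a secondary
running the same code would start counting from its own copy, which is
why the sketch simply rejects creation there.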

v3 -> v2 changes:
- Rebase on top of latest master
- Clarifications added to mempool code as per Andrew Rybchenko's
  comments

v2 -> v1 changes:
- Fixed NULL dereference on heap socket ID lookup
- Fixed memseg offset calculation on adding memory to heap
- Improved unit test to test for above bugfixes
- Restricted heap creation to primary processes only
- Added sample application
- Added documentation

RFC -> v1 changes:
- Removed the "named heaps" API, allocate using fake socket ID instead
- Added multiprocess support
- Everything is now thread-safe
- Numerous bugfixes and API improvements

Anatoly Burakov (20):
  mem: add length to memseg list
  mem: allow memseg lists to be marked as external
  malloc: index heaps using heap ID rather than NUMA node
  mem: do not check for invalid socket ID
  flow_classify: do not check for invalid socket ID
  pipeline: do not check for invalid socket ID
  sched: do not check for invalid socket ID
  malloc: add name to malloc heaps
  malloc: add function to query socket ID of named heap
  malloc: allow creating malloc heaps
  malloc: allow destroying heaps
  malloc: allow adding memory to named heaps
  malloc: allow removing memory from named heaps
  malloc: allow attaching to external memory chunks
  malloc: allow detaching from external memory
  test: add unit tests for external memory support
  examples: add external memory example app
  doc: add external memory feature to the release notes
  doc: add external memory feature to programmer's guide
  doc: add external memory sample application guide

 config/common_base                            |   1 +
 config/rte_config.h                           |   1 +
 .../prog_guide/env_abstraction_layer.rst      |  38 ++
 doc/guides/rel_notes/deprecation.rst          |  15 -
 doc/guides/rel_notes/release_18_11.rst        |  24 +-
 doc/guides/sample_app_ug/external_mem.rst     | 115 +++++
 doc/guides/sample_app_ug/index.rst            |   1 +
 drivers/bus/fslmc/fslmc_vfio.c                |   7 +-
 drivers/bus/pci/linux/pci.c                   |   2 +-
 drivers/net/mlx4/mlx4_mr.c                    |   3 +
 drivers/net/mlx5/mlx5.c                       |   5 +-
 drivers/net/mlx5/mlx5_mr.c                    |   3 +
 drivers/net/virtio/virtio_user/vhost_kernel.c |   5 +-
 examples/external_mem/Makefile                |  62 +++
 examples/external_mem/extmem.c                | 461 ++++++++++++++++++
 examples/external_mem/meson.build             |  12 +
 lib/librte_eal/bsdapp/eal/Makefile            |   2 +-
 lib/librte_eal/bsdapp/eal/eal.c               |   3 +
 lib/librte_eal/bsdapp/eal/eal_memory.c        |   9 +-
 lib/librte_eal/common/eal_common_memory.c     |   8 +-
 lib/librte_eal/common/eal_common_memzone.c    |   8 +-
 .../common/include/rte_eal_memconfig.h        |   6 +-
 lib/librte_eal/common/include/rte_malloc.h    | 183 +++++++
 .../common/include/rte_malloc_heap.h          |   3 +
 lib/librte_eal/common/include/rte_memory.h    |   9 +
 lib/librte_eal/common/malloc_heap.c           | 300 ++++++++++--
 lib/librte_eal/common/malloc_heap.h           |  17 +
 lib/librte_eal/common/rte_malloc.c            | 393 ++++++++++++++-
 lib/librte_eal/linuxapp/eal/Makefile          |   2 +-
 lib/librte_eal/linuxapp/eal/eal.c             |  10 +-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c    |  12 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c      |   4 +-
 lib/librte_eal/linuxapp/eal/eal_vfio.c        |  17 +-
 lib/librte_eal/meson.build                    |   2 +-
 lib/librte_eal/rte_eal_version.map            |   7 +
 lib/librte_flow_classify/rte_flow_classify.c  |   3 +-
 lib/librte_mempool/rte_mempool.c              |  35 +-
 lib/librte_pipeline/rte_pipeline.c            |   3 +-
 lib/librte_sched/rte_sched.c                  |   2 +-
 test/test/Makefile                            |   1 +
 test/test/autotest_data.py                    |  14 +-
 test/test/meson.build                         |   1 +
 test/test/test_external_mem.c                 | 389 +++++++++++++++
 test/test/test_malloc.c                       |   3 +
 test/test/test_memzone.c                      |   3 +
 45 files changed, 2099 insertions(+), 105 deletions(-)
 create mode 100644 doc/guides/sample_app_ug/external_mem.rst
 create mode 100644 examples/external_mem/Makefile
 create mode 100644 examples/external_mem/extmem.c
 create mode 100644 examples/external_mem/meson.build
 create mode 100644 test/test/test_external_mem.c