From patchwork Wed Sep 26 11:21:59 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45378 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 7A7C71B213; Wed, 26 Sep 2018 13:22:43 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id A49D51B184 for ; Wed, 26 Sep 2018 13:22:29 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 26 Sep 2018 04:22:28 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,306,1534834800"; d="scan'208";a="93854864" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga001.jf.intel.com with ESMTP; 26 Sep 2018 04:22:22 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8QBMM6N011616; Wed, 26 Sep 2018 12:22:22 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8QBMM9S029198; Wed, 26 Sep 2018 12:22:22 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8QBMK2t029188; Wed, 26 Sep 2018 12:22:20 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Wed, 26 Sep 2018 12:21:59 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: Subject: [dpdk-dev] [PATCH v5 00/21] Support externally allocated memory in DPDK X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This is a proposal to enable using externally allocated memory in DPDK. In a nutshell, here is what is being done here: - Index internal malloc heaps by NUMA node index, rather than NUMA node itself (external heaps will have ID's in order of creation) - Add identifier string to malloc heap, to uniquely identify it - Each new heap will receive a unique socket ID that will be used by allocator to decide from which heap (internal or external) to allocate requested amount of memory - Allow creating named heaps and add/remove memory to/from those heaps - Allocate memseg lists at runtime, to keep track of IOVA addresses of externally allocated memory - If IOVA addresses aren't provided, use RTE_BAD_IOVA - Allow malloc and memzones to allocate from external heaps - Allow other data structures to allocate from externall heaps The responsibility to ensure memory is accessible before using it is on the shoulders of the user - there is no checking done with regards to validity of the memory (nor could there be...). The general approach is to create heap and add memory into it. For any other process wishing to use the same memory, said memory must first be attached (otherwise some things will not work). A design decision was made to make multiprocess synchronization a manual process. Due to underlying issues with attaching to fbarrays in secondary processes, this design was deemed to be better because we don't want to fail to create external heap in the primary because something in the secondary has failed when in fact we may not eve have wanted this memory to be accessible in the secondary in the first place. Using external memory in multiprocess is *hard*, because not only memory space needs to be preallocated, but it also needs to be attached in each process to allow other processes to access the page table. The attach API call may or may not succeed, depending on memory layout, for reasons similar to other multiprocess failures. This is treated as a "known issue" for this release. v5 -> v4 changes: - All processes are now able to create and destroy malloc heaps - Memory is automatically mapped for DMA on adding it to heap - Mem event callbacks are triggered on adding/removing memory - Fixed compile issues on FreeBSD - Better documentation on API/ABI changes v4 -> v3 changes: - Dropped sample application in favor of new testpmd flag - Added new flag to testpmd, with four options of mempool allocation - Added new API to check if a socket ID belongs to an external heap - Adjusted malloc and mempool code to not make any assumptions about IOVA-contiguousness when dealing with externally allocated memory v3 -> v2 changes: - Rebase on top of latest master - Clarifications added to mempool code as per Andrew Rynchenko's comments v2 -> v1 changes: - Fixed NULL dereference on heap socket ID lookup - Fixed memseg offset calculation on adding memory to heap - Improved unit test to test for above bugfixes - Restricted heap creation to primary processes only - Added sample application - Added documentation RFC -> v1 changes: - Removed the "named heaps" API, allocate using fake socket ID instead - Added multiprocess support - Everything is now thread-safe - Numerous bugfixes and API improvements Anatoly Burakov (21): mem: add length to memseg list mem: allow memseg lists to be marked as external malloc: index heaps using heap ID rather than NUMA node mem: do not check for invalid socket ID flow_classify: do not check for invalid socket ID pipeline: do not check for invalid socket ID sched: do not check for invalid socket ID malloc: add name to malloc heaps malloc: add function to query socket ID of named heap malloc: add function to check if socket is external malloc: allow creating malloc heaps malloc: allow destroying heaps malloc: allow adding memory to named heaps malloc: allow removing memory from named heaps malloc: allow attaching to external memory chunks malloc: allow detaching from external memory malloc: enable event callbacks for external memory test: add unit tests for external memory support app/testpmd: add support for external memory doc: add external memory feature to the release notes doc: add external memory feature to programmer's guide app/test-pmd/config.c | 21 +- app/test-pmd/parameters.c | 23 +- app/test-pmd/testpmd.c | 305 ++++++++++++- app/test-pmd/testpmd.h | 13 +- config/common_base | 1 + config/rte_config.h | 1 + .../prog_guide/env_abstraction_layer.rst | 37 ++ doc/guides/rel_notes/deprecation.rst | 15 - doc/guides/rel_notes/release_18_11.rst | 28 +- doc/guides/testpmd_app_ug/run_app.rst | 12 + drivers/bus/fslmc/fslmc_vfio.c | 14 +- drivers/bus/pci/linux/pci.c | 2 +- drivers/net/mlx4/mlx4_mr.c | 3 + drivers/net/mlx5/mlx5.c | 5 +- drivers/net/mlx5/mlx5_mr.c | 3 + drivers/net/virtio/virtio_user/vhost_kernel.c | 5 +- .../net/virtio/virtio_user/virtio_user_dev.c | 8 + lib/librte_eal/bsdapp/eal/Makefile | 2 +- lib/librte_eal/bsdapp/eal/eal.c | 3 + lib/librte_eal/bsdapp/eal/eal_memory.c | 9 +- lib/librte_eal/common/eal_common_memory.c | 8 +- lib/librte_eal/common/eal_common_memzone.c | 8 +- .../common/include/rte_eal_memconfig.h | 9 +- lib/librte_eal/common/include/rte_malloc.h | 192 ++++++++ .../common/include/rte_malloc_heap.h | 3 + lib/librte_eal/common/include/rte_memory.h | 9 + lib/librte_eal/common/malloc_elem.c | 10 +- lib/librte_eal/common/malloc_heap.c | 316 +++++++++++-- lib/librte_eal/common/malloc_heap.h | 17 + lib/librte_eal/common/rte_malloc.c | 429 +++++++++++++++++- lib/librte_eal/linuxapp/eal/Makefile | 2 +- lib/librte_eal/linuxapp/eal/eal.c | 10 +- lib/librte_eal/linuxapp/eal/eal_memalloc.c | 12 +- lib/librte_eal/linuxapp/eal/eal_memory.c | 4 +- lib/librte_eal/linuxapp/eal/eal_vfio.c | 27 +- lib/librte_eal/meson.build | 2 +- lib/librte_eal/rte_eal_version.map | 8 + lib/librte_flow_classify/rte_flow_classify.c | 3 +- lib/librte_mempool/rte_mempool.c | 57 ++- lib/librte_pipeline/rte_pipeline.c | 3 +- lib/librte_sched/rte_sched.c | 2 +- test/test/Makefile | 1 + test/test/autotest_data.py | 14 +- test/test/meson.build | 1 + test/test/test_external_mem.c | 389 ++++++++++++++++ test/test/test_malloc.c | 3 + test/test/test_memzone.c | 3 + 47 files changed, 1913 insertions(+), 139 deletions(-) create mode 100644 test/test/test_external_mem.c