From patchwork Mon Oct 1 12:56:08 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Burakov, Anatoly"
X-Patchwork-Id: 45776
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 24D7D1B17F;
 Mon, 1 Oct 2018 15:00:20 +0200 (CEST)
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43])
 by dpdk.org (Postfix) with ESMTP id 363BB1B13A
 for ; Mon, 1 Oct 2018 15:00:18 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 01 Oct 2018 06:00:17 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.54,327,1534834800"; d="scan'208";a="84801979"
Received: from irvmail001.ir.intel.com ([163.33.26.43])
 by FMSMGA003.fm.intel.com with ESMTP; 01 Oct 2018 05:56:30 -0700
Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45])
 by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w91CuUFp000575;
 Mon, 1 Oct 2018 13:56:30 +0100
Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1])
 by sivswdev01.ir.intel.com with ESMTP id w91CuU1t023786;
 Mon, 1 Oct 2018 13:56:30 +0100
Received: (from aburakov@localhost)
 by sivswdev01.ir.intel.com with LOCAL id w91CuTHH023778;
 Mon, 1 Oct 2018 13:56:29 +0100
From: Anatoly Burakov
To: dev@dpdk.org
Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com,
 andras.kovacs@ericsson.com, winnie.tian@ericsson.com,
 daniel.andrasi@ericsson.com, janos.kobor@ericsson.com,
 geza.koblo@ericsson.com, srinath.mannam@broadcom.com,
 scott.branden@broadcom.com, ajit.khaparde@broadcom.com,
 keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net,
 shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com,
 alejandro.lucero@netronome.com
Date: Mon, 1 Oct 2018
 13:56:08 +0100
Message-Id:
X-Mailer: git-send-email 1.7.0.7
In-Reply-To:
References:
Subject: [dpdk-dev] [PATCH v8 00/21] Support externally allocated memory in DPDK
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This is a proposal to enable using externally allocated memory in DPDK.

In a nutshell, here is what is being done here:

- Index internal malloc heaps by NUMA node index, rather than NUMA node itself
  (external heaps will have IDs in order of creation)
- Add identifier string to malloc heap, to uniquely identify it
- Each new heap will receive a unique socket ID that will be used by the
  allocator to decide from which heap (internal or external) to allocate the
  requested amount of memory
- Allow creating named heaps and add/remove memory to/from those heaps
- Allocate memseg lists at runtime, to keep track of IOVA addresses of
  externally allocated memory
- If IOVA addresses aren't provided, use RTE_BAD_IOVA
- Allow malloc and memzones to allocate from external heaps
- Allow other data structures to allocate from external heaps

The responsibility to ensure memory is accessible before using it is on the
shoulders of the user - there is no checking done with regards to validity of
the memory (nor could there be...).

The general approach is to create a heap and add memory to it. For any other
process wishing to use the same memory, said memory must first be attached
(otherwise some things will not work).

A design decision was made to make multiprocess synchronization a manual process.
Due to underlying issues with attaching to fbarrays in secondary processes,
this design was deemed to be better: we don't want creating an external heap
in the primary to fail because something in the secondary has failed, when in
fact we may not even have wanted this memory to be accessible in the secondary
in the first place.

Using external memory in multiprocess is *hard*, because not only does memory
space need to be preallocated, it also needs to be attached in each process to
allow other processes to access the page table. The attach API call may or may
not succeed, depending on memory layout, for reasons similar to other
multiprocess failures. This is treated as a "known issue" for this release.

v7 -> v8 changes:
- Rebase on latest master
- More documentation on ABI changes

v6 -> v7 changes:
- Fixed missing IOVA address setup in testpmd
- Fixed MLX drivers as per Yongseok's comments
- Added a check for invalid heap idx on adding memory to heap

v5 -> v6 changes:
- Fixed documentation formatting as per Marko's comments

v4 -> v5 changes:
- All processes are now able to create and destroy malloc heaps
- Memory is automatically mapped for DMA on adding it to heap
- Mem event callbacks are triggered on adding/removing memory
- Fixed compile issues on FreeBSD
- Better documentation on API/ABI changes

v3 -> v4 changes:
- Dropped sample application in favor of new testpmd flag
- Added new flag to testpmd, with four options of mempool allocation
- Added new API to check if a socket ID belongs to an external heap
- Adjusted malloc and mempool code to not make any assumptions about
  IOVA-contiguousness when dealing with externally allocated memory

v2 -> v3 changes:
- Rebase on top of latest master
- Clarifications added to mempool code as per Andrew Rybchenko's comments

v1 -> v2 changes:
- Fixed NULL dereference on heap socket ID lookup
- Fixed memseg offset calculation on adding memory to heap
- Improved unit test to test for above bugfixes
- Restricted heap creation
  to primary processes only
- Added sample application
- Added documentation

RFC -> v1 changes:
- Removed the "named heaps" API, allocate using fake socket ID instead
- Added multiprocess support
- Everything is now thread-safe
- Numerous bugfixes and API improvements

Anatoly Burakov (21):
  mem: add length to memseg list
  mem: allow memseg lists to be marked as external
  malloc: index heaps using heap ID rather than NUMA node
  mem: do not check for invalid socket ID
  flow_classify: do not check for invalid socket ID
  pipeline: do not check for invalid socket ID
  sched: do not check for invalid socket ID
  malloc: add name to malloc heaps
  malloc: add function to query socket ID of named heap
  malloc: add function to check if socket is external
  malloc: allow creating malloc heaps
  malloc: allow destroying heaps
  malloc: allow adding memory to named heaps
  malloc: allow removing memory from named heaps
  malloc: allow attaching to external memory chunks
  malloc: allow detaching from external memory
  malloc: enable event callbacks for external memory
  test: add unit tests for external memory support
  app/testpmd: add support for external memory
  doc: add external memory feature to the release notes
  doc: add external memory feature to programmer's guide

 app/test-pmd/config.c                         |  21 +-
 app/test-pmd/parameters.c                     |  23 +-
 app/test-pmd/testpmd.c                        | 320 ++++++++++++-
 app/test-pmd/testpmd.h                        |  13 +-
 config/common_base                            |   1 +
 config/rte_config.h                           |   1 +
 .../prog_guide/env_abstraction_layer.rst      |  37 ++
 doc/guides/rel_notes/deprecation.rst          |  15 -
 doc/guides/rel_notes/release_18_11.rst        |  37 +-
 doc/guides/testpmd_app_ug/run_app.rst         |  12 +
 drivers/bus/fslmc/fslmc_vfio.c                |  13 +-
 drivers/bus/pci/linux/pci.c                   |   2 +-
 drivers/net/mlx5/mlx5.c                       |   4 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c |   3 +
 .../net/virtio/virtio_user/virtio_user_dev.c  |   6 +
 lib/librte_eal/bsdapp/eal/Makefile            |   2 +-
 lib/librte_eal/bsdapp/eal/eal.c               |   3 +
 lib/librte_eal/bsdapp/eal/eal_memory.c        |   9 +-
 lib/librte_eal/common/eal_common_memory.c     |   8 +-
 lib/librte_eal/common/eal_common_memzone.c    |   8 +-
 .../common/include/rte_eal_memconfig.h        |   9 +-
 lib/librte_eal/common/include/rte_malloc.h    | 192 ++++++++
 .../common/include/rte_malloc_heap.h          |   3 +
 lib/librte_eal/common/include/rte_memory.h    |   9 +
 lib/librte_eal/common/malloc_elem.c           |  10 +-
 lib/librte_eal/common/malloc_heap.c           | 320 +++++++++++--
 lib/librte_eal/common/malloc_heap.h           |  17 +
 lib/librte_eal/common/rte_malloc.c            | 429 +++++++++++++++++-
 lib/librte_eal/linuxapp/eal/Makefile          |   2 +-
 lib/librte_eal/linuxapp/eal/eal.c             |  10 +-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c    |  12 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c      |   4 +-
 lib/librte_eal/linuxapp/eal/eal_vfio.c        |  27 +-
 lib/librte_eal/meson.build                    |   2 +-
 lib/librte_eal/rte_eal_version.map            |   8 +
 lib/librte_flow_classify/rte_flow_classify.c  |   3 +-
 lib/librte_mempool/rte_mempool.c              |  57 ++-
 lib/librte_pipeline/rte_pipeline.c            |   3 +-
 lib/librte_sched/rte_sched.c                  |   2 +-
 test/test/Makefile                            |   1 +
 test/test/autotest_data.py                    |  14 +-
 test/test/meson.build                         |   1 +
 test/test/test_external_mem.c                 | 389 ++++++++++++++++
 test/test/test_malloc.c                       |   3 +
 test/test/test_memzone.c                      |   3 +
 45 files changed, 1930 insertions(+), 138 deletions(-)
 create mode 100644 test/test/test_external_mem.c
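
For readers who want a picture of the heap indexing scheme summarised in the
cover letter (heaps indexed by heap ID, each carrying a unique name and a
unique socket ID, with a check for whether a socket ID is external), here is a
toy, self-contained C model. It is not the DPDK implementation and does not
call any DPDK API: every identifier in it (toy_heap, toy_heap_create,
toy_socket_is_external, EXT_SOCKET_BASE and so on), the table sizes, the
"socket_N" naming of internal heaps, and the choice of 256 as the first
external socket ID are assumptions made purely for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* All constants below are hypothetical values chosen for this toy model. */
#define MAX_HEAPS       8    /* capacity of the heap table */
#define NUM_NUMA_NODES  2    /* number of internal (per-NUMA-node) heaps */
#define EXT_SOCKET_BASE 256  /* first synthetic socket ID for external heaps */

struct toy_heap {
    char name[32];  /* identifier string, unique per heap */
    int socket_id;  /* NUMA node for internal heaps, synthetic ID otherwise */
    int in_use;
};

struct toy_heap_table {
    struct toy_heap heaps[MAX_HEAPS];  /* indexed by heap ID, not socket ID */
    int next_ext_socket;               /* next socket ID to hand out */
};

/* Set up one internal heap per NUMA node; heap ID equals node index here. */
static void toy_init(struct toy_heap_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
    t->next_ext_socket = EXT_SOCKET_BASE;
    for (int i = 0; i < NUM_NUMA_NODES; i++) {
        snprintf(t->heaps[i].name, sizeof(t->heaps[i].name), "socket_%d", i);
        t->heaps[i].socket_id = i;
        t->heaps[i].in_use = 1;
    }
}

/* Map a socket ID to a heap ID (table index); -1 if no heap owns it. */
static int toy_heap_lookup(const struct toy_heap_table *t, int socket_id)
{
    for (int i = 0; i < MAX_HEAPS; i++)
        if (t->heaps[i].in_use && t->heaps[i].socket_id == socket_id)
            return i;
    return -1;
}

/* Create a named external heap; returns its new socket ID, or -1 on error. */
static int toy_heap_create(struct toy_heap_table *t, const char *name)
{
    int free_idx = -1;

    for (int i = 0; i < MAX_HEAPS; i++) {
        if (!t->heaps[i].in_use) {
            if (free_idx < 0)
                free_idx = i;          /* remember the first free slot */
        } else if (strcmp(t->heaps[i].name, name) == 0) {
            return -1;                 /* heap names must be unique */
        }
    }
    if (free_idx < 0)
        return -1;                     /* table is full */

    struct toy_heap *h = &t->heaps[free_idx];
    snprintf(h->name, sizeof(h->name), "%s", name);
    h->socket_id = t->next_ext_socket++;
    h->in_use = 1;
    return h->socket_id;
}

/* 1 if the socket ID belongs to an external heap, 0 if internal, -1 if unknown. */
static int toy_socket_is_external(const struct toy_heap_table *t, int socket_id)
{
    if (toy_heap_lookup(t, socket_id) < 0)
        return -1;
    return socket_id >= EXT_SOCKET_BASE;
}
```

In this model a caller would run toy_init() once, call toy_heap_create() to
obtain a socket ID for a new external heap, and then pass that socket ID to
whatever socket-aware allocation routine it uses, which resolves the heap via
toy_heap_lookup(). Keeping the table indexed by heap ID rather than by socket
ID is what allows external socket IDs to live outside the NUMA node range.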