From patchwork Fri Jul 6 13:17:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 42502 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 03BCD1BEFF; Fri, 6 Jul 2018 15:17:47 +0200 (CEST) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id BD8D71BE41 for ; Fri, 6 Jul 2018 15:17:36 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 06 Jul 2018 06:17:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,316,1526367600"; d="scan'208";a="70160976" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga001.fm.intel.com with ESMTP; 06 Jul 2018 06:17:33 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w66DHX3a027469; Fri, 6 Jul 2018 14:17:33 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w66DHXn0003746; Fri, 6 Jul 2018 14:17:33 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w66DHXNN003742; Fri, 6 Jul 2018 14:17:33 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com Date: Fri, 6 Jul 2018 14:17:22 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [RFC 01/11] mem: allow memseg lists to be marked as external X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev"
When we allocate and use DPDK memory, we need to be able to differentiate between DPDK hugepage segments and segments that were made part of DPDK but are externally allocated. Add such a property to memseg lists. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/eal_common_memory.c | 51 ++++++++++++++++--- .../common/include/rte_eal_memconfig.h | 1 + lib/librte_eal/common/malloc_heap.c | 2 +- 3 files changed, 47 insertions(+), 7 deletions(-) diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c index 4f0688f9d..835bbffb6 100644 --- a/lib/librte_eal/common/eal_common_memory.c +++ b/lib/librte_eal/common/eal_common_memory.c @@ -24,6 +24,21 @@ #include "eal_private.h" #include "eal_internal_cfg.h" +/* forward declarations for memseg walk functions. we support external segments, + * but for some functionality to work, we need to either skip or not skip + * external segments. for example, while we expect virt2memseg to return a + * valid memseg even though it's an external memseg, for regular memseg walk we + * want to skip those because the expectation is that we will only walk the + * DPDK-allocated memory. + */ +static int +memseg_list_walk(rte_memseg_list_walk_t func, void *arg, bool skip_external); +static int +memseg_walk(rte_memseg_walk_t func, void *arg, bool skip_external); +static int +memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg, + bool skip_external); + /* * Try to mmap *size bytes in /dev/zero. If it is successful, return the * pointer to the mmap'd area and keep *size unmodified. 
Else, retry @@ -621,9 +636,9 @@ rte_mem_iova2virt(rte_iova_t iova) * as we know they are PA-contiguous as well */ if (internal_config.legacy_mem) - rte_memseg_contig_walk(find_virt_legacy, &vi); + memseg_contig_walk(find_virt_legacy, &vi, false); else - rte_memseg_walk(find_virt, &vi); + memseg_walk(find_virt, &vi, false); return vi.virt; } @@ -787,8 +802,8 @@ rte_mem_lock_page(const void *virt) return mlock((void *)aligned, page_size); } -int __rte_experimental -rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg) +static int +memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg, bool skip_external) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; int i, ms_idx, ret = 0; @@ -803,6 +818,8 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg) if (msl->memseg_arr.count == 0) continue; + if (skip_external && msl->external) + continue; arr = &msl->memseg_arr; @@ -837,7 +854,13 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg) } int __rte_experimental -rte_memseg_walk(rte_memseg_walk_t func, void *arg) +rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg) +{ + return memseg_contig_walk(func, arg, true); +} + +static int +memseg_walk(rte_memseg_walk_t func, void *arg, bool skip_external) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; int i, ms_idx, ret = 0; @@ -852,6 +875,8 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg) if (msl->memseg_arr.count == 0) continue; + if (skip_external && msl->external) + continue; arr = &msl->memseg_arr; @@ -875,7 +900,13 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg) } int __rte_experimental -rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg) +rte_memseg_walk(rte_memseg_walk_t func, void *arg) +{ + return memseg_walk(func, arg, true); +} + +static int +memseg_list_walk(rte_memseg_list_walk_t func, void *arg, bool skip_external) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; int i, 
ret = 0; @@ -888,6 +919,8 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg) if (msl->base_va == NULL) continue; + if (skip_external && msl->external) + continue; ret = func(msl, arg); if (ret < 0) { @@ -904,6 +937,12 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg) return ret; } +int __rte_experimental +rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg) +{ + return memseg_list_walk(func, arg, true); +} + /* init memory subsystem */ int rte_eal_memory_init(void) diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h index aff0688dd..4e8720ba6 100644 --- a/lib/librte_eal/common/include/rte_eal_memconfig.h +++ b/lib/librte_eal/common/include/rte_eal_memconfig.h @@ -30,6 +30,7 @@ struct rte_memseg_list { uint64_t addr_64; /**< Makes sure addr is always 64-bits */ }; + bool external; /**< true if this list points to external memory */ int socket_id; /**< Socket ID for all memsegs in this list. */ uint64_t page_sz; /**< Page size for all memsegs in this list. */ volatile uint32_t version; /**< version number for multiprocess sync. 
*/ diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index d6cf3af81..8a1f54905 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -631,7 +631,7 @@ malloc_heap_free(struct malloc_elem *elem) ret = 0; /* ...of which we can't avail if we are in legacy mode */ - if (internal_config.legacy_mem) + if (internal_config.legacy_mem || msl->external) goto free_unlock; /* check if we can free any memory back to the system */
From patchwork Fri Jul 6 13:17:23 2018 X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 42499 X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov To: dev@dpdk.org Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com Date: Fri, 6 Jul 2018 14:17:23 +0100 Message-Id: <603dcc865358cc669cba9ec042db93e32a4c8cd5.1530881548.git.anatoly.burakov@intel.com> Subject: [dpdk-dev] [RFC 02/11] eal: add function to retrieve socket index by socket ID
We are preparing to switch to indexing heaps by heap index rather than by NUMA node ID. The first few heap indexes will correspond to the indexes of the detected NUMA nodes. For example, on a machine with NUMA nodes [0, 8], heaps 0 and 8 are currently active, whereas we want heaps 0 and 1 to be the active ones. However, there is currently no function to map a specific NUMA node to its node index, so add one in this patch. 
Signed-off-by: Anatoly Burakov Acked-by: Alejandro Lucero --- lib/librte_eal/common/eal_common_lcore.c | 15 +++++++++++++++ lib/librte_eal/common/include/rte_lcore.h | 19 ++++++++++++++++++- lib/librte_eal/rte_eal_version.map | 1 + 3 files changed, 34 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c index 3167e9d79..579f5a0a1 100644 --- a/lib/librte_eal/common/eal_common_lcore.c +++ b/lib/librte_eal/common/eal_common_lcore.c @@ -132,3 +132,18 @@ rte_socket_id_by_idx(unsigned int idx) } return config->numa_nodes[idx]; } + +int __rte_experimental +rte_socket_idx_by_id(unsigned int socket) +{ + const struct rte_config *config = rte_eal_get_configuration(); + int i; + + for (i = 0; i < (int) config->numa_node_count; i++) { + unsigned int cur_socket = config->numa_nodes[i]; + if (cur_socket == socket) + return i; + } + rte_errno = EINVAL; + return -1; +} diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h index 6e09d9181..f58cda09a 100644 --- a/lib/librte_eal/common/include/rte_lcore.h +++ b/lib/librte_eal/common/include/rte_lcore.h @@ -156,11 +156,28 @@ rte_socket_count(void); * * @return * - physical socket id as recognized by EAL - * - -1 on error, with errno set to EINVAL + * - -1 on error, with rte_errno set to EINVAL */ int __rte_experimental rte_socket_id_by_idx(unsigned int idx); +/** + * Return index for a particular socket id. + * + * This will return position in list of all detected physical socket id's for a + * given socket. For example, on a machine with sockets [0, 8], passing + * 8 as a parameter will return 1. 
+ * + * @param socket + * physical socket id to return index for + * + * @return + * - index of physical socket id as recognized by EAL + * - -1 on error, with rte_errno set to EINVAL + */ +int __rte_experimental +rte_socket_idx_by_id(unsigned int socket); + /** * Get the ID of the physical socket of the specified lcore * diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index f7dd0e7bc..e7fb37b2a 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -296,6 +296,7 @@ EXPERIMENTAL { rte_mp_sendmsg; rte_socket_count; rte_socket_id_by_idx; + rte_socket_idx_by_id; rte_vfio_dma_map; rte_vfio_dma_unmap; rte_vfio_get_container_fd;
From patchwork Fri Jul 6 13:17:24 2018 X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 42509 X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov To: dev@dpdk.org Cc: Thomas Monjalon, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com Date: Fri, 6 Jul 2018 14:17:24 +0100 Subject: [dpdk-dev] [RFC 03/11] malloc: index heaps using heap ID rather than NUMA node
Switch over all parts of EAL to use heap ID instead of NUMA node ID to identify heaps. Heap ID for DPDK-internal heaps is the NUMA node's index within the detected NUMA node list. Signed-off-by: Anatoly Burakov --- config/common_base | 1 + lib/librte_eal/common/eal_common_memzone.c | 46 ++++++++++------ .../common/include/rte_eal_memconfig.h | 4 +- lib/librte_eal/common/malloc_heap.c | 53 ++++++++++++------- lib/librte_eal/common/rte_malloc.c | 28 ++++++---- 5 files changed, 84 insertions(+), 48 deletions(-) diff --git a/config/common_base b/config/common_base index fcf3a1f6f..b0e3937e0 100644 --- a/config/common_base +++ b/config/common_base @@ -61,6 +61,7 @@ CONFIG_RTE_CACHE_LINE_SIZE=64 CONFIG_RTE_LIBRTE_EAL=y CONFIG_RTE_MAX_LCORE=128 CONFIG_RTE_MAX_NUMA_NODES=8 +CONFIG_RTE_MAX_HEAPS=32 CONFIG_RTE_MAX_MEMSEG_LISTS=64 # each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages # or RTE_MAX_MEM_MB_PER_LIST megabytes worth of memory, whichever is smaller diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c index faa3b0615..25c56052c 100644 --- a/lib/librte_eal/common/eal_common_memzone.c +++ 
b/lib/librte_eal/common/eal_common_memzone.c @@ -52,6 +52,26 @@ memzone_lookup_thread_unsafe(const char *name) return NULL; } +static size_t +heap_max_free_elem(unsigned int heap_idx, unsigned int align) +{ + struct rte_malloc_socket_stats stats; + struct rte_mem_config *mcfg; + size_t len; + + /* get pointer to global configuration */ + mcfg = rte_eal_get_configuration()->mem_config; + + malloc_heap_get_stats(&mcfg->malloc_heaps[heap_idx], &stats); + + len = stats.greatest_free_size; + + if (len < MALLOC_ELEM_OVERHEAD + align) + return 0; + + return len - MALLOC_ELEM_OVERHEAD - align; +} + /* This function will return the greatest free block if a heap has been * specified. If no heap has been specified, it will return the heap and @@ -59,29 +79,23 @@ memzone_lookup_thread_unsafe(const char *name) static size_t find_heap_max_free_elem(int *s, unsigned align) { - struct rte_mem_config *mcfg; - struct rte_malloc_socket_stats stats; - int i, socket = *s; + unsigned int idx; + int socket = *s; size_t len = 0; - /* get pointer to global configuration */ - mcfg = rte_eal_get_configuration()->mem_config; - - for (i = 0; i < RTE_MAX_NUMA_NODES; i++) { - if ((socket != SOCKET_ID_ANY) && (socket != i)) + for (idx = 0; idx < rte_socket_count(); idx++) { + int cur_socket = rte_socket_id_by_idx(idx); + if ((socket != SOCKET_ID_ANY) && (socket != cur_socket)) continue; - malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats); - if (stats.greatest_free_size > len) { - len = stats.greatest_free_size; - *s = i; + size_t cur_len = heap_max_free_elem(idx, align); + if (cur_len > len) { + len = cur_len; + *s = cur_socket; } } - if (len < MALLOC_ELEM_OVERHEAD + align) - return 0; - - return len - MALLOC_ELEM_OVERHEAD - align; + return len; } static const struct rte_memzone * diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h index 4e8720ba6..7e03196a6 100644 --- a/lib/librte_eal/common/include/rte_eal_memconfig.h +++ 
b/lib/librte_eal/common/include/rte_eal_memconfig.h @@ -71,8 +71,8 @@ struct rte_mem_config { struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */ - /* Heaps of Malloc per socket */ - struct malloc_heap malloc_heaps[RTE_MAX_NUMA_NODES]; + /* Heaps of Malloc */ + struct malloc_heap malloc_heaps[RTE_MAX_HEAPS]; /* address of mem_config in primary process. used to map shared config into * exact same address the primary process maps it. diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 8a1f54905..e7e1838b1 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -93,9 +93,10 @@ malloc_add_seg(const struct rte_memseg_list *msl, struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; struct rte_memseg_list *found_msl; struct malloc_heap *heap; - int msl_idx; + int msl_idx, heap_idx; - heap = &mcfg->malloc_heaps[msl->socket_id]; + heap_idx = rte_socket_idx_by_id(msl->socket_id); + heap = &mcfg->malloc_heaps[heap_idx]; /* msl is const, so find it */ msl_idx = msl - mcfg->memsegs; @@ -494,14 +495,20 @@ alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket, /* this will try lower page sizes first */ static void * -heap_alloc_on_socket(const char *type, size_t size, int socket, - unsigned int flags, size_t align, size_t bound, bool contig) +malloc_heap_alloc_on_heap_id(const char *type, size_t size, + unsigned int heap_id, unsigned int flags, size_t align, + size_t bound, bool contig) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; - struct malloc_heap *heap = &mcfg->malloc_heaps[socket]; + struct malloc_heap *heap = &mcfg->malloc_heaps[heap_id]; unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY; + int socket_id; void *ret; + /* return NULL if size is 0 or alignment is not power-of-2 */ + if (size == 0 || (align && !rte_is_power_of_2(align))) + return NULL; + rte_spinlock_lock(&(heap->lock)); align = 
align == 0 ? 1 : align; @@ -521,8 +528,13 @@ heap_alloc_on_socket(const char *type, size_t size, int socket, if (ret != NULL) goto alloc_unlock; - if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound, - contig)) { + socket_id = rte_socket_id_by_idx(heap_id); + /* if socket ID is invalid, this is an external heap */ + if (socket_id < 0) + goto alloc_unlock; + + if (!alloc_more_mem_on_socket(heap, size, socket_id, flags, align, + bound, contig)) { ret = heap_alloc(heap, type, size, flags, align, bound, contig); /* this should have succeeded */ @@ -538,13 +550,9 @@ void * malloc_heap_alloc(const char *type, size_t size, int socket_arg, unsigned int flags, size_t align, size_t bound, bool contig) { - int socket, i, cur_socket; + int socket, heap_id, i; void *ret; - /* return NULL if size is 0 or alignment is not power-of-2 */ - if (size == 0 || (align && !rte_is_power_of_2(align))) - return NULL; - if (!rte_eal_has_hugepages()) socket_arg = SOCKET_ID_ANY; @@ -557,18 +565,23 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg, if (socket >= RTE_MAX_NUMA_NODES) return NULL; - ret = heap_alloc_on_socket(type, size, socket, flags, align, bound, - contig); + /* turn socket ID into heap ID */ + heap_id = rte_socket_idx_by_id(socket); + /* if heap id is invalid, it's a non-existent NUMA node */ + if (heap_id < 0) + return NULL; + + ret = malloc_heap_alloc_on_heap_id(type, size, heap_id, flags, align, + bound, contig); if (ret != NULL || socket_arg != SOCKET_ID_ANY) return ret; /* try other heaps */ for (i = 0; i < (int) rte_socket_count(); i++) { - cur_socket = rte_socket_id_by_idx(i); - if (cur_socket == socket) + if (i == heap_id) continue; - ret = heap_alloc_on_socket(type, size, cur_socket, flags, - align, bound, contig); + ret = malloc_heap_alloc_on_heap_id(type, size, i, flags, align, + bound, contig); if (ret != NULL) return ret; } @@ -788,7 +801,7 @@ malloc_heap_resize(struct malloc_elem *elem, size_t size) } /* - * Function to retrieve 
data for heap on given socket + * Function to retrieve data for a given heap */ int malloc_heap_get_stats(struct malloc_heap *heap, @@ -826,7 +839,7 @@ malloc_heap_get_stats(struct malloc_heap *heap, } /* - * Function to retrieve data for heap on given socket + * Function to retrieve data for a given heap */ void malloc_heap_dump(struct malloc_heap *heap, FILE *f) diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index b51a6d111..4387bc494 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -152,11 +152,17 @@ rte_malloc_get_socket_stats(int socket, struct rte_malloc_socket_stats *socket_stats) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + int heap_idx; if (socket >= RTE_MAX_NUMA_NODES || socket < 0) return -1; - return malloc_heap_get_stats(&mcfg->malloc_heaps[socket], socket_stats); + heap_idx = rte_socket_idx_by_id(socket); + if (heap_idx < 0) + return -1; + + return malloc_heap_get_stats(&mcfg->malloc_heaps[heap_idx], + socket_stats); } /* @@ -168,10 +174,9 @@ rte_malloc_dump_heaps(FILE *f) struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; unsigned int idx; - for (idx = 0; idx < rte_socket_count(); idx++) { - unsigned int socket = rte_socket_id_by_idx(idx); - fprintf(f, "Heap on socket %i:\n", socket); - malloc_heap_dump(&mcfg->malloc_heaps[socket], f); + for (idx = 0; idx < RTE_MAX_HEAPS; idx++) { + fprintf(f, "Heap id: %u\n", idx); + malloc_heap_dump(&mcfg->malloc_heaps[idx], f); } } @@ -182,14 +187,17 @@ rte_malloc_dump_heaps(FILE *f) void rte_malloc_dump_stats(FILE *f, __rte_unused const char *type) { - unsigned int socket; + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + unsigned int heap_id; struct rte_malloc_socket_stats sock_stats; + /* Iterate through all initialised heaps */ - for (socket=0; socket< RTE_MAX_NUMA_NODES; socket++) { - if ((rte_malloc_get_socket_stats(socket, &sock_stats) < 0)) - continue; 
+ for (heap_id = 0; heap_id < RTE_MAX_HEAPS; heap_id++) { + struct malloc_heap *heap = &mcfg->malloc_heaps[heap_id]; - fprintf(f, "Socket:%u\n", socket); + malloc_heap_get_stats(heap, &sock_stats); + + fprintf(f, "Heap id:%u\n", heap_id); fprintf(f, "\tHeap_size:%zu,\n", sock_stats.heap_totalsz_bytes); fprintf(f, "\tFree_size:%zu,\n", sock_stats.heap_freesz_bytes); fprintf(f, "\tAlloc_size:%zu,\n", sock_stats.heap_allocsz_bytes);
From patchwork Fri Jul 6 13:17:25 2018 X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 42506 X-Patchwork-Delegate: thomas@monjalon.net From: Anatoly Burakov To: dev@dpdk.org Cc: 
srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com Date: Fri, 6 Jul 2018 14:17:25 +0100 Subject: [dpdk-dev] [RFC 04/11] malloc: add name to malloc heaps
We will need to refer to external heaps in some way. While we use heap IDs internally, for the external API they have to be something more user-friendly, so we will use a string to uniquely identify a heap. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc_heap.h | 2 ++ lib/librte_eal/common/malloc_heap.c | 13 +++++++++++++ lib/librte_eal/common/rte_malloc.c | 1 + 3 files changed, 16 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h index d43fa9097..bd64dff03 100644 --- a/lib/librte_eal/common/include/rte_malloc_heap.h +++ b/lib/librte_eal/common/include/rte_malloc_heap.h @@ -12,6 +12,7 @@ /* Number of free lists per heap, grouped by size. 
*/ #define RTE_HEAP_NUM_FREELISTS 13 +#define RTE_HEAP_NAME_MAX_LEN 32 /* dummy definition, for pointers */ struct malloc_elem; @@ -27,6 +28,7 @@ struct malloc_heap { unsigned alloc_count; size_t total_size; + char name[RTE_HEAP_NAME_MAX_LEN]; } __rte_cache_aligned; #endif /* _RTE_MALLOC_HEAP_H_ */ diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index e7e1838b1..8f22c062b 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -848,6 +848,7 @@ malloc_heap_dump(struct malloc_heap *heap, FILE *f) rte_spinlock_lock(&heap->lock); + fprintf(f, "Heap name: %s\n", heap->name); fprintf(f, "Heap size: 0x%zx\n", heap->total_size); fprintf(f, "Heap alloc count: %u\n", heap->alloc_count); @@ -864,6 +865,18 @@ int rte_eal_malloc_heap_init(void) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + unsigned int i; + + /* assign names to default DPDK heaps */ + for (i = 0; i < rte_socket_count(); i++) { + struct malloc_heap *heap = &mcfg->malloc_heaps[i]; + char heap_name[RTE_HEAP_NAME_MAX_LEN]; + int socket_id = rte_socket_id_by_idx(i); + + snprintf(heap_name, sizeof(heap_name) - 1, + "socket_%i", socket_id); + strlcpy(heap->name, heap_name, RTE_HEAP_NAME_MAX_LEN); + } if (register_mp_requests()) { RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n"); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 4387bc494..75d6e0b4d 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -198,6 +198,7 @@ rte_malloc_dump_stats(FILE *f, __rte_unused const char *type) malloc_heap_get_stats(heap, &sock_stats); fprintf(f, "Heap id:%u\n", heap_id); + fprintf(f, "\tHeap name:%s\n", heap->name); fprintf(f, "\tHeap_size:%zu,\n", sock_stats.heap_totalsz_bytes); fprintf(f, "\tFree_size:%zu,\n", sock_stats.heap_freesz_bytes); fprintf(f, "\tAlloc_size:%zu,\n", sock_stats.heap_allocsz_bytes); From patchwork Fri Jul 6 
13:17:26 2018 X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 42505 X-Patchwork-Delegate: thomas@monjalon.net From: Anatoly Burakov To: dev@dpdk.org Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com Date: Fri, 6 Jul 2018 14:17:26 +0100 Message-Id: <9782368c814417e812260de99a5c4f83055d959d.1530881548.git.anatoly.burakov@intel.com> Subject: [dpdk-dev] [RFC 05/11] malloc: enable retrieving statistics from named heaps
Add internal functions to look up a heap by name, and enable dumping statistics for a specified named heap. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 19 +++++++++++-- lib/librte_eal/common/malloc_heap.c | 31 ++++++++++++++++++++++ lib/librte_eal/common/malloc_heap.h | 6 +++++ lib/librte_eal/common/rte_malloc.c | 17 ++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 5 files changed, 72 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index a9fb7e452..7cbcd3184 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -256,13 +256,28 @@ rte_malloc_validate(const void *ptr, size_t *size); * @param socket_stats * A structure which provides memory to store statistics * @return - * Null on error - * Pointer to structure storing statistics on success + * 0 on success + * -1 on error */ int rte_malloc_get_socket_stats(int socket, struct rte_malloc_socket_stats *socket_stats); +/** + * Get heap statistics for the specified heap. + * + * @param heap_name + * Name of the heap to retrieve statistics for + * @param socket_stats + * A structure which provides memory to store statistics + * @return + * 0 on success + * -1 on error + */ +int __rte_experimental +rte_malloc_get_stats_from_heap(const char *heap_name, + struct rte_malloc_socket_stats *socket_stats); + /** * Dump statistics. 
* diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 8f22c062b..8437d33b3 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -614,6 +614,37 @@ malloc_heap_free_pages(void *aligned_start, size_t aligned_len) return 0; } +int +malloc_heap_find_named_heap_idx(const char *heap_name) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + int heap_idx; + + if (heap_name == NULL) + return -1; + if (strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == RTE_HEAP_NAME_MAX_LEN) + return -1; + for (heap_idx = rte_socket_count(); heap_idx < RTE_MAX_HEAPS; + heap_idx++) { + struct malloc_heap *heap = &mcfg->malloc_heaps[heap_idx]; + if (strncmp(heap->name, heap_name, RTE_HEAP_NAME_MAX_LEN) == 0) + return heap_idx; + } + return -1; +} + +struct malloc_heap * +malloc_heap_find_named_heap(const char *heap_name) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + int heap_idx; + + heap_idx = malloc_heap_find_named_heap_idx(heap_name); + if (heap_idx < 0) + return NULL; + return &mcfg->malloc_heaps[heap_idx]; +} + int malloc_heap_free(struct malloc_elem *elem) { diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index 03b801415..4f3137260 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -29,6 +29,12 @@ void * malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags, size_t align, size_t bound, bool contig); +int +malloc_heap_find_named_heap_idx(const char *name); + +struct malloc_heap * +malloc_heap_find_named_heap(const char *name); + int malloc_heap_free(struct malloc_elem *elem); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 75d6e0b4d..2508abdb1 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -165,6 +165,23 @@ rte_malloc_get_socket_stats(int socket, socket_stats); } 
+/*
+ * Function to retrieve data for a given named heap
+ */
+int __rte_experimental
+rte_malloc_get_stats_from_heap(const char *heap_name,
+		struct rte_malloc_socket_stats *socket_stats)
+{
+	struct malloc_heap *heap;
+
+	heap = malloc_heap_find_named_heap(heap_name);
+
+	if (heap == NULL)
+		return -1;
+
+	return malloc_heap_get_stats(heap, socket_stats);
+}
+
 /*
  * Function to dump contents of all heaps
  */
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index e7fb37b2a..786df1e39 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -278,6 +278,7 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_malloc_get_stats_from_heap;
 	rte_mem_alloc_validator_register;
 	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;

From patchwork Fri Jul 6 13:17:27 2018
X-Patchwork-Submitter: "Burakov, Anatoly"
X-Patchwork-Id: 42501
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com
Date: Fri, 6 Jul 2018 14:17:27 +0100
Subject: [dpdk-dev] [RFC 06/11] malloc: enable allocating from named heaps

Add a new malloc API to allocate memory from a heap referred to by its
name.

Signed-off-by: Anatoly Burakov
---
 lib/librte_eal/common/include/rte_malloc.h | 25 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.c        |  2 +-
 lib/librte_eal/common/malloc_heap.h        |  6 ++++++
 lib/librte_eal/common/rte_malloc.c         | 11 ++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 5 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index 7cbcd3184..f1bcd9b65 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -213,6 +213,31 @@ rte_zmalloc_socket(const char *type, size_t size, unsigned align, int socket);
 void *
 rte_calloc_socket(const char *type, size_t num, size_t size, unsigned align, int socket);
 
+/**
+ * This function allocates memory from a specified named heap.
+ *
+ * @param heap_name
+ *   Name of the heap to allocate from.
+ * @param type + * A string identifying the type of allocated objects (useful for debug + * purposes, such as identifying the cause of a memory leak). Can be NULL. + * @param size + * Size (in bytes) to be allocated. + * @param align + * If 0, the return is a pointer that is suitably aligned for any kind of + * variable (in the same manner as malloc()). + * Otherwise, the return is a pointer that is a multiple of *align*. In + * this case, it must be a power of two. (Minimum alignment is the + * cacheline size, i.e. 64-bytes) + * @return + * - NULL on error. Not enough memory, or invalid arguments (size is 0, + * align is not a power of two). + * - Otherwise, the pointer to the allocated object. + */ +__rte_experimental void * +rte_malloc_from_heap(const char *heap_name, const char *type, size_t size, + unsigned int align); + /** * Frees the memory space pointed to by the provided pointer. * diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 8437d33b3..a33acc252 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -494,7 +494,7 @@ alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket, } /* this will try lower page sizes first */ -static void * +void * malloc_heap_alloc_on_heap_id(const char *type, size_t size, unsigned int heap_id, unsigned int flags, size_t align, size_t bound, bool contig) diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index 4f3137260..a7e408c99 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -29,6 +29,12 @@ void * malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags, size_t align, size_t bound, bool contig); +/* allocate from specified heap id */ +void * +malloc_heap_alloc_on_heap_id(const char *type, size_t size, + unsigned int heap_id, unsigned int flags, size_t align, + size_t bound, bool contig); + int 
malloc_heap_find_named_heap_idx(const char *name);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 2508abdb1..215876a59 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -55,6 +55,17 @@ rte_malloc_socket(const char *type, size_t size, unsigned int align,
 			align == 0 ? 1 : align, 0, false);
 }
 
+void *
+rte_malloc_from_heap(const char *heap_name, const char *type, size_t size,
+		unsigned int align)
+{
+	int heap_idx = malloc_heap_find_named_heap_idx(heap_name);
+	if (heap_idx < 0)
+		return NULL;
+	return malloc_heap_alloc_on_heap_id(type, size, heap_idx, 0,
+			align == 0 ? 1 : align, 0, false);
+}
+
 /*
  * Allocate memory on default heap.
  */
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 786df1e39..716a7585d 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -278,6 +278,7 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_malloc_from_heap;
 	rte_malloc_get_stats_from_heap;
 	rte_mem_alloc_validator_register;
 	rte_mem_alloc_validator_unregister;

From patchwork Fri Jul 6 13:17:28 2018
X-Patchwork-Submitter: "Burakov, Anatoly"
X-Patchwork-Id: 42504
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com
Date: Fri, 6 Jul 2018 14:17:28 +0100
Message-Id: <19df594b15fcecefc3d9790e7f7b426a4f609a10.1530881548.git.anatoly.burakov@intel.com>
Subject: [dpdk-dev] [RFC 07/11] malloc: enable creating new malloc heaps

Add API to allow creating new malloc heaps. They will be created with
indexes higher than heaps reserved for NUMA sockets, and up to
RTE_MAX_HEAPS.
Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 21 ++++++++++ lib/librte_eal/common/malloc_heap.c | 16 ++++++++ lib/librte_eal/common/malloc_heap.h | 3 ++ lib/librte_eal/common/rte_malloc.c | 46 ++++++++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 5 files changed, 87 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index f1bcd9b65..fa6de073a 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -253,6 +253,27 @@ rte_malloc_from_heap(const char *heap_name, const char *type, size_t size, void rte_free(void *ptr); +/** + * Creates a new empty malloc heap with a specified name. + * + * @note Concurrently creating or destroying heaps is not safe. + * + * @note This function does not need to be called in multiple processes, as + * multiprocess synchronization will happen automatically as far as heap data + * is concerned. However, before accessing pointers to memory in this heap, it + * is responsibility of the user to ensure that the heap memory is accessible + * in all processes. + * + * @param heap_name + * Name of the heap to create. + * + * @return + * - 0 on successful creation. + * - -1 on error. + */ +int __rte_experimental +rte_malloc_heap_create(const char *heap_name); + /** * If malloc debug is enabled, check a memory block for header * and trailer markers to indicate that all is well with the block. 
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index a33acc252..f5d103626 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -892,6 +892,22 @@ malloc_heap_dump(struct malloc_heap *heap, FILE *f)
 	rte_spinlock_unlock(&heap->lock);
 }
 
+int
+malloc_heap_create(struct malloc_heap *heap, const char *heap_name)
+{
+	/* initialize empty heap */
+	heap->alloc_count = 0;
+	heap->first = NULL;
+	heap->last = NULL;
+	LIST_INIT(heap->free_head);
+	rte_spinlock_init(&heap->lock);
+	heap->total_size = 0;
+
+	/* set up name */
+	strlcpy(heap->name, heap_name, RTE_HEAP_NAME_MAX_LEN);
+	return 0;
+}
+
 int
 rte_eal_malloc_heap_init(void)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index a7e408c99..aa819ef65 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -35,6 +35,9 @@ malloc_heap_alloc_on_heap_id(const char *type, size_t size,
 		unsigned int heap_id, unsigned int flags, size_t align,
 		size_t bound, bool contig);
 
+int
+malloc_heap_create(struct malloc_heap *heap, const char *heap_name);
+
 int
 malloc_heap_find_named_heap_idx(const char *name);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 215876a59..e000dc5b7 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -272,3 +273,48 @@ rte_malloc_virt2iova(const void *addr)
 
 	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
+
+int
+rte_malloc_heap_create(const char *heap_name)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = NULL;
+	int i;
+
+	/* heap name must be non-NULL, non-empty and within the length limit */
+	if (heap_name == NULL ||
+			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 ||
+			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) ==
+			RTE_HEAP_NAME_MAX_LEN) {
+		rte_errno = EINVAL;
		return -1;
+	}
+	/* check if there is space in the heap list, or if heap with this name
+	 * already exists. start from non-socket heaps.
+	 */
+	for (i = rte_socket_count(); i < RTE_MAX_HEAPS; i++) {
+		struct malloc_heap *tmp = &mcfg->malloc_heaps[i];
+		/* existing heap */
+		if (strncmp(heap_name, tmp->name,
+				RTE_HEAP_NAME_MAX_LEN) == 0) {
+			RTE_LOG(ERR, EAL, "Heap %s already exists\n",
+				heap_name);
+			rte_errno = EEXIST;
+			return -1;
+		}
+		/* empty heap */
+		if (strnlen(tmp->name, RTE_HEAP_NAME_MAX_LEN) == 0) {
+			heap = tmp;
+			break;
+		}
+	}
+	if (heap == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot create new heap: no space\n");
+		rte_errno = ENOSPC;
+		return -1;
+	}
+
+	/* we're sure that we can create a new heap, so do it */
+	return malloc_heap_create(heap, heap_name);
+}
+
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 716a7585d..f3c375156 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -280,6 +280,7 @@ EXPERIMENTAL {
 	rte_malloc_dump_heaps;
 	rte_malloc_from_heap;
 	rte_malloc_get_stats_from_heap;
+	rte_malloc_heap_create;
 	rte_mem_alloc_validator_register;
 	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;

From patchwork Fri Jul 6 13:17:29 2018
X-Patchwork-Submitter: "Burakov, Anatoly"
X-Patchwork-Id: 42503
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com
Date: Fri, 6 Jul 2018 14:17:29 +0100
Subject: [dpdk-dev] [RFC 08/11] malloc: allow adding memory to named heaps

Add an API to add externally allocated memory to malloc heap. The
memory will be stored in memseg lists like regular DPDK memory.
Multiple segments are allowed within a heap. If IOVA table is not
provided, IOVA addresses are filled in with RTE_BAD_IOVA.
Signed-off-by: Anatoly Burakov
---
 lib/librte_eal/common/include/rte_malloc.h | 44 ++++++++++++++
 lib/librte_eal/common/malloc_heap.c        | 70 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  4 ++
 lib/librte_eal/common/rte_malloc.c         | 39 ++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 5 files changed, 158 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index fa6de073a..5f933993b 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -274,6 +274,50 @@ rte_free(void *ptr);
 int __rte_experimental
 rte_malloc_heap_create(const char *heap_name);
 
+/**
+ * Add more memory to heap with specified name.
+ *
+ * @note Concurrently adding memory to or removing memory from different heaps
+ * is not safe.
+ *
+ * @note This function does not need to be called in multiple processes, as
+ * multiprocess synchronization will happen automatically as far as heap data
+ * is concerned. However, before accessing pointers to memory in this heap, it
+ * is the responsibility of the user to ensure that the heap memory is
+ * accessible in all processes.
+ *
+ * @note Memory must be previously allocated for DPDK to be able to use it as a
+ * malloc heap. Failing to do so will result in undefined behavior, up to and
+ * including crashes.
+ *
+ * @note Adding memory to a heap may fail in a multiprocess scenario, as
+ * attaching to ``rte_fbarray`` structures may not always be successful in
+ * secondary processes.
+ *
+ * @param heap_name
+ *   Name of the heap to add memory to.
+ * @param va_addr
+ *   Start of virtual area to add to the heap.
+ * @param len
+ *   Length of virtual area to add to the heap.
+ * @param iova_addrs
+ *   Array of page IOVA addresses corresponding to each page in this memory
+ *   area. Can be NULL, in which case page IOVA addresses will be set to
+ *   RTE_BAD_IOVA.
+ * @param n_pages
+ *   Number of elements in the iova_addrs array. Must be zero if ``iova_addrs``
+ *   is NULL.
+ * @param page_sz
+ *   Page size of the underlying memory.
+ *
+ * @return
+ *   - 0 on success.
+ *   - -1 on error.
+ */
+int __rte_experimental
+rte_malloc_heap_add_memory(const char *heap_name, void *va_addr, size_t len,
+		rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz);
+
 /**
  * If malloc debug is enabled, check a memory block for header
  * and trailer markers to indicate that all is well with the block.
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index f5d103626..29446cac9 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -892,6 +892,76 @@ malloc_heap_dump(struct malloc_heap *heap, FILE *f)
 	rte_spinlock_unlock(&heap->lock);
 }
 
+int
+malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr,
+		rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	char fbarray_name[RTE_FBARRAY_NAME_LEN];
+	struct rte_memseg_list *msl = NULL;
+	struct rte_fbarray *arr;
+	size_t seg_len = n_pages * page_sz;
+	unsigned int i;
+
+	/* first, find a free memseg list */
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *tmp = &mcfg->memsegs[i];
+		if (tmp->base_va == NULL) {
+			msl = tmp;
+			break;
+		}
+	}
+	if (msl == NULL) {
+		RTE_LOG(ERR, EAL, "Couldn't find empty memseg list\n");
+		rte_errno = ENOSPC;
+		return -1;
+	}
+
+	snprintf(fbarray_name, sizeof(fbarray_name) - 1, "%s_%p",
+			heap->name, va_addr);
+
+	/* create the backing fbarray */
+	if (rte_fbarray_init(&msl->memseg_arr, fbarray_name, n_pages,
+			sizeof(struct rte_memseg)) < 0) {
+		RTE_LOG(ERR, EAL, "Couldn't create fbarray backing the memseg list\n");
+		return -1;
+	}
+	arr = &msl->memseg_arr;
+
+	/* fbarray created, fill it up */
+	for (i = 0; i < n_pages; i++) {
+		struct rte_memseg *ms;
+
+		rte_fbarray_set_used(arr, i);
+		ms = rte_fbarray_get(arr, i);
+		ms->addr = RTE_PTR_ADD(va_addr, i * page_sz);
+		ms->iova = iova_addrs == NULL ? RTE_BAD_IOVA : iova_addrs[i];
+		ms->hugepage_sz = page_sz;
+		ms->len = page_sz;
+		ms->nchannel = rte_memory_get_nchannel();
+		ms->nrank = rte_memory_get_nrank();
+		ms->socket_id = -1; /* invalid socket ID */
+	}
+
+	/* set up the memseg list */
+	msl->base_va = va_addr;
+	msl->page_sz = page_sz;
+	msl->socket_id = -1; /* invalid socket ID */
+	msl->version = 0;
+	msl->external = true;
+
+	/* now, add newly minted memory to the malloc heap */
+	malloc_heap_add_memory(heap, msl, va_addr, seg_len);
+
+	heap->total_size += seg_len;
+
+	/* all done! */
+	RTE_LOG(DEBUG, EAL, "Added segment for heap %s starting at %p\n",
+			heap->name, va_addr);
+
+	return 0;
+}
+
 int
 malloc_heap_create(struct malloc_heap *heap, const char *heap_name)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index aa819ef65..3be4656d0 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -38,6 +38,10 @@ malloc_heap_alloc_on_heap_id(const char *type, size_t size,
 int
 malloc_heap_create(struct malloc_heap *heap, const char *heap_name);
 
+int
+malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr,
+		rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz);
+
 int
 malloc_heap_find_named_heap_idx(const char *name);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index e000dc5b7..db0f604ad 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -274,6 +274,45 @@ rte_malloc_virt2iova(const void *addr)
 	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
 
+int
+rte_malloc_heap_add_memory(const char *heap_name, void *va_addr, size_t len,
+		rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz)
+{
+	struct malloc_heap *heap = NULL;
+	unsigned int n;
+	int ret;
+
+	/* iova_addrs is allowed to be NULL */
+	if (heap_name == NULL || va_addr == NULL ||
+			/* n_pages can be 0 if iova_addrs is NULL */
+			((iova_addrs != NULL) == (n_pages == 0)) ||
+			page_sz == 0 || !rte_is_power_of_2(page_sz) ||
+			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 ||
+			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) ==
+			RTE_HEAP_NAME_MAX_LEN) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	/* find our heap */
+	heap = malloc_heap_find_named_heap(heap_name);
+	if (heap == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	n = len / page_sz;
+	if (n != n_pages && iova_addrs != NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	rte_spinlock_lock(&heap->lock);
+	ret = malloc_heap_add_external_memory(heap, va_addr, iova_addrs, n,
+			page_sz);
+	rte_spinlock_unlock(&heap->lock);
+
+	return ret;
+}
+
 int
 rte_malloc_heap_create(const char *heap_name)
 {
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index f3c375156..6290cc910 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -280,6 +280,7 @@ EXPERIMENTAL {
 	rte_malloc_dump_heaps;
 	rte_malloc_from_heap;
 	rte_malloc_get_stats_from_heap;
+	rte_malloc_heap_add_memory;
 	rte_malloc_heap_create;
 	rte_mem_alloc_validator_register;
 	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;

From patchwork Fri Jul 6 13:17:30 2018
X-Patchwork-Submitter: "Burakov, Anatoly"
X-Patchwork-Id: 42500
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com
Date: Fri, 6 Jul 2018 14:17:30 +0100
Message-Id: <1b2f86a255a36135a8d718b00d609d729acabee2.1530881548.git.anatoly.burakov@intel.com>
Subject: [dpdk-dev] [RFC 09/11] malloc: allow removing memory from named heaps

Add an API to remove memory from specified heaps. This will first check
if all elements within the region are free, and that the region is the
original region that was added to the heap (by comparing its length to
the length of memory addressed by the underlying memseg list).
Signed-off-by: Anatoly Burakov
---
 lib/librte_eal/common/include/rte_malloc.h | 28 ++++++++++
 lib/librte_eal/common/malloc_heap.c        | 61 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  4 ++
 lib/librte_eal/common/rte_malloc.c         | 28 ++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 5 files changed, 122 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index 5f933993b..25d8d3f11 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -318,6 +318,34 @@ int __rte_experimental
 rte_malloc_heap_add_memory(const char *heap_name, void *va_addr, size_t len,
 		rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz);
 
+/**
+ * Remove memory area from heap with specified name.
+ *
+ * @note Concurrently adding or removing memory from different heaps is not
+ * safe.
+ *
+ * @note This function does not need to be called in multiple processes, as
+ * multiprocess synchronization will happen automatically as far as heap data
+ * is concerned. However, before accessing pointers to memory in this heap, it
+ * is the responsibility of the user to ensure that the heap memory is
+ * accessible in all processes.
+ *
+ * @note Memory area must be empty to allow its removal from the heap.
+ *
+ * @param heap_name
+ *   Name of the heap to remove memory from.
+ * @param va_addr
+ *   Virtual address to remove from the heap.
+ * @param len
+ *   Length of virtual area to remove from the heap.
+ *
+ * @return
+ *   - 0 on success.
+ *   - -1 on error.
+ */
+int __rte_experimental
+rte_malloc_heap_remove_memory(const char *heap_name, void *va_addr, size_t len);
+
 /**
  * If malloc debug is enabled, check a memory block for header
  * and trailer markers to indicate that all is well with the block.
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 29446cac9..27dbf6e60 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -892,6 +892,44 @@ malloc_heap_dump(struct malloc_heap *heap, FILE *f) rte_spinlock_unlock(&heap->lock); } +static int +destroy_seg(struct malloc_elem *elem, size_t len) +{ + struct malloc_heap *heap = elem->heap; + struct rte_memseg_list *msl; + + /* check if the element is unused */ + if (elem->state != ELEM_FREE) { + rte_errno = EBUSY; + return -1; + } + + msl = elem->msl; + + /* check if element encompasses the entire memseg list */ + if (elem->size != len || len != (msl->page_sz * msl->memseg_arr.len)) { + rte_errno = EINVAL; + return -1; + } + + /* destroy the fbarray backing this memory */ + if (rte_fbarray_destroy(&msl->memseg_arr) < 0) + return -1; + + /* reset the memseg list */ + memset(msl, 0, sizeof(*msl)); + + /* this element can be removed */ + malloc_elem_free_list_remove(elem); + malloc_elem_hide_region(elem, elem, len); + + memset(elem, 0, sizeof(*elem)); + + heap->total_size -= len; + + return 0; +} + int malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz) @@ -962,6 +1000,29 @@ malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, return 0; } +int +malloc_heap_remove_external_memory(struct malloc_heap *heap, void *va_addr, + size_t len) +{ + struct malloc_elem *elem = heap->first; + + /* find element with specified va address */ + while (elem != NULL && elem != va_addr) { + elem = elem->next; + /* stop if we've blown past our VA */ + if (elem > (struct malloc_elem *)va_addr) { + elem = NULL; + break; + } + } + /* check if element was found */ + if (elem == NULL) { + rte_errno = EINVAL; + return -1; + } + return destroy_seg(elem, len); +} + int malloc_heap_create(struct malloc_heap *heap, const char *heap_name) { diff --git 
a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index 3be4656d0..000146365 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -42,6 +42,10 @@ int
 malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr,
 		rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz);
 
+int
+malloc_heap_remove_external_memory(struct malloc_heap *heap, void *va_addr,
+		size_t len);
+
 int
 malloc_heap_find_named_heap_idx(const char *name);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index db0f604ad..8d2eb7250 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -313,6 +313,34 @@ rte_malloc_heap_add_memory(const char *heap_name, void *va_addr, size_t len,
 	return ret;
 }
 
+int
+rte_malloc_heap_remove_memory(const char *heap_name, void *va_addr, size_t len)
+{
+	struct malloc_heap *heap = NULL;
+	int ret;
+
+	/* sanity-check input arguments */
+	if (heap_name == NULL || va_addr == NULL || len == 0 ||
+			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 ||
+			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) ==
+			RTE_HEAP_NAME_MAX_LEN) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	/* find our heap */
+	heap = malloc_heap_find_named_heap(heap_name);
+	if (heap == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	rte_spinlock_lock(&heap->lock);
+	ret = malloc_heap_remove_external_memory(heap, va_addr, len);
+	rte_spinlock_unlock(&heap->lock);
+
+	return ret;
+}
+
 int
 rte_malloc_heap_create(const char *heap_name)
 {
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 6290cc910..7ee79051f 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -282,6 +282,7 @@ EXPERIMENTAL {
 	rte_malloc_get_stats_from_heap;
 	rte_malloc_heap_add_memory;
 	rte_malloc_heap_create;
+	rte_malloc_heap_remove_memory;
 	rte_mem_alloc_validator_register;
 	rte_mem_alloc_validator_unregister;
rte_mem_event_callback_register; From patchwork Fri Jul 6 13:17:31 2018 X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 42507 From: Anatoly Burakov To: dev@dpdk.org Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com Date: Fri, 6 Jul 2018 14:17:31 +0100 Message-Id: <41225d71752906662b81fe939cd7fe47994133e1.1530881548.git.anatoly.burakov@intel.com> Subject: [dpdk-dev] [RFC 10/11] malloc: allow destroying heaps
Add an API to destroy a specified heap. Any memory regions still contained within the heap will be removed first. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 21 ++++++++++++++++ lib/librte_eal/common/malloc_heap.c | 29 ++++++++++++++++++++++ lib/librte_eal/common/malloc_heap.h | 3 +++ lib/librte_eal/common/rte_malloc.c | 27 ++++++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 5 files changed, 81 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index 25d8d3f11..239cda2ca 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -346,6 +346,27 @@ rte_malloc_heap_add_memory(const char *heap_name, void *va_addr, size_t len, int __rte_experimental rte_malloc_heap_remove_memory(const char *heap_name, void *va_addr, size_t len); +/** + * Destroys a previously created malloc heap with the specified name. + * + * @note Concurrently creating or destroying heaps is not thread-safe. + * + * @note This function does not deallocate the memory backing the heap - it only + * deregisters memory from DPDK. + * + * @note This function will return a failure result if not all memory allocated + * from the heap has been freed back to the heap. + * + * @param heap_name + * Name of the heap to destroy. + * + * @return + * - 0 on successful destruction. + * - -1 on error. + */ +int __rte_experimental +rte_malloc_heap_destroy(const char *heap_name); + /** * If malloc debug is enabled, check a memory block for header * and trailer markers to indicate that all is well with the block.
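Both the external-memory remove path and heap destruction funnel into destroy_seg(), which refuses to tear down an element that is still in use or that does not cover its entire backing memseg list. The following is a minimal standalone model of that validation - simplified structs and field names, not the actual DPDK types:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the DPDK structures (hypothetical field subset). */
enum elem_state { ELEM_FREE, ELEM_BUSY };

struct memseg_list {
	size_t page_sz;   /* size of each page backing the list */
	unsigned int len; /* number of pages in the list */
};

struct elem {
	enum elem_state state;
	size_t size;             /* total size covered by this element */
	struct memseg_list *msl; /* backing segment list */
};

/*
 * Mirror of destroy_seg()'s checks: the element must be free, and the
 * requested length must both match the element exactly and span the
 * entire backing memseg list. Returns 0 on success; on failure returns
 * -1 and stores EBUSY/EINVAL in *err (standing in for rte_errno).
 */
int
check_destroy(const struct elem *e, size_t len, int *err)
{
	if (e->state != ELEM_FREE) {
		*err = EBUSY;
		return -1;
	}
	if (e->size != len || len != e->msl->page_sz * e->msl->len) {
		*err = EINVAL;
		return -1;
	}
	return 0;
}
```

Only when all checks pass does the real function destroy the fbarray, hide the region and shrink the heap's total size.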
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 27dbf6e60..e447b6412 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -1039,6 +1039,35 @@ malloc_heap_create(struct malloc_heap *heap, const char *heap_name) return 0; } +int +malloc_heap_destroy(struct malloc_heap *heap) +{ + struct malloc_elem *elem; + + if (heap->alloc_count != 0) { + RTE_LOG(ERR, EAL, "Heap is still in use\n"); + rte_errno = EBUSY; + return -1; + } + elem = heap->first; + while (elem != NULL) { + struct malloc_elem *next = elem->next; + + if (destroy_seg(elem, elem->size) < 0) + return -1; + + elem = next; + } + if (heap->total_size != 0) + RTE_LOG(ERR, EAL, "Total size not zero, heap is likely corrupt\n"); + + /* we can't memset the entire thing as we're still holding the lock */ + LIST_INIT(heap->free_head); + memset(&heap->name, 0, sizeof(heap->name)); + + return 0; +} + int rte_eal_malloc_heap_init(void) { diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index 000146365..399c9a6b1 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -38,6 +38,9 @@ malloc_heap_alloc_on_heap_id(const char *type, size_t size, int malloc_heap_create(struct malloc_heap *heap, const char *heap_name); +int +malloc_heap_destroy(struct malloc_heap *heap); + int malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 8d2eb7250..b6beee7ce 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -385,3 +385,30 @@ rte_malloc_heap_create(const char *heap_name) return malloc_heap_create(heap, heap_name); } +int +rte_malloc_heap_destroy(const char *heap_name) +{ + struct malloc_heap *heap = NULL; + int ret; + + if (heap_name == NULL || + strnlen(heap_name, 
RTE_HEAP_NAME_MAX_LEN) == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == + RTE_HEAP_NAME_MAX_LEN) { + rte_errno = EINVAL; + return -1; + } + /* find our heap */ + heap = malloc_heap_find_named_heap(heap_name); + if (heap == NULL) { + RTE_LOG(ERR, EAL, "Heap %s not found\n", heap_name); + rte_errno = ENOENT; + return -1; + } + /* sanity checks done, now we can destroy the heap */ + rte_spinlock_lock(&heap->lock); + ret = malloc_heap_destroy(heap); + rte_spinlock_unlock(&heap->lock); + + return ret; +} diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index 7ee79051f..cdde7eb3b 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -282,6 +282,7 @@ EXPERIMENTAL { rte_malloc_get_stats_from_heap; rte_malloc_heap_add_memory; rte_malloc_heap_create; + rte_malloc_heap_destroy; rte_malloc_heap_remove_memory; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; From patchwork Fri Jul 6 13:17:32 2018 X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 42508 From: Anatoly Burakov To: dev@dpdk.org Cc: srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com Date: Fri, 6 Jul 2018 14:17:32 +0100 Message-Id: <3a31e2adf03569582e4ecd1acdec80c599ee884e.1530881548.git.anatoly.burakov@intel.com> Subject: [dpdk-dev] [RFC 11/11] memzone: enable reserving memory from named heaps Add the ability to allocate memory for memzones from named heaps. The semantics are kept similar to regular allocations, and as much of the code as possible is shared.
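The sharing is achieved by factoring the parameter validation common to the socket-based and named-heap paths into one common_checks() routine. Its core can be modeled standalone as follows - a simplified sketch, not the DPDK code itself, with a local CACHE_LINE_SIZE standing in for RTE_CACHE_LINE_SIZE (assumed 64 here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* stand-in for RTE_CACHE_LINE_SIZE */
#define CACHE_LINE_MASK (CACHE_LINE_SIZE - 1)

static bool
is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Model of the shared reserve-parameter checks: alignment must be a
 * power of two (and is bumped up to at least one cache line), length is
 * rounded up to a cache line with an overflow guard, and a non-zero
 * boundary must be a power of two no smaller than the rounded request.
 * Returns 0 on success, -1 on invalid parameters.
 */
int
check_reserve_params(size_t len, unsigned int *align, unsigned int bound,
		size_t *requested_len)
{
	if (!is_power_of_2(*align))
		return -1;
	if (*align < CACHE_LINE_SIZE)
		*align = CACHE_LINE_SIZE;
	/* check for overflow before rounding up to a cache line */
	if (len > SIZE_MAX - CACHE_LINE_MASK)
		return -1;
	*requested_len = (len + CACHE_LINE_MASK) & ~(size_t)CACHE_LINE_MASK;
	if (bound != 0 &&
			(*requested_len > bound || !is_power_of_2(bound)))
		return -1;
	return 0;
}
```

With the checks shared, the two reserve paths differ only in how they pick a heap to allocate from.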
Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/eal_common_memzone.c | 237 +++++++++++++++----- lib/librte_eal/common/include/rte_memzone.h | 183 +++++++++++++++ lib/librte_eal/rte_eal_version.map | 3 + 3 files changed, 373 insertions(+), 50 deletions(-) diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c index 25c56052c..d37e7ae1d 100644 --- a/lib/librte_eal/common/eal_common_memzone.c +++ b/lib/librte_eal/common/eal_common_memzone.c @@ -98,17 +98,14 @@ find_heap_max_free_elem(int *s, unsigned align) return len; } -static const struct rte_memzone * -memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, - int socket_id, unsigned int flags, unsigned int align, +static int +common_checks(const char *name, size_t len, unsigned int align, unsigned int bound) { struct rte_memzone *mz; struct rte_mem_config *mcfg; struct rte_fbarray *arr; size_t requested_len; - int mz_idx; - bool contig; /* get pointer to global configuration */ mcfg = rte_eal_get_configuration()->mem_config; @@ -118,14 +115,14 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, if (arr->count >= arr->len) { RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__); rte_errno = ENOSPC; - return NULL; + return -1; } if (strlen(name) > sizeof(mz->name) - 1) { RTE_LOG(DEBUG, EAL, "%s(): memzone <%s>: name too long\n", __func__, name); rte_errno = ENAMETOOLONG; - return NULL; + return -1; } /* zone already exist */ @@ -133,7 +130,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, RTE_LOG(DEBUG, EAL, "%s(): memzone <%s> already exists\n", __func__, name); rte_errno = EEXIST; - return NULL; + return -1; } /* if alignment is not a power of two */ @@ -141,7 +138,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, RTE_LOG(ERR, EAL, "%s(): Invalid alignment: %u\n", __func__, align); rte_errno = EINVAL; - return NULL; + return -1; } /* alignment less than cache size is not 
allowed */ @@ -151,7 +148,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, /* align length on cache boundary. Check for overflow before doing so */ if (len > SIZE_MAX - RTE_CACHE_LINE_MASK) { rte_errno = EINVAL; /* requested size too big */ - return NULL; + return -1; } len += RTE_CACHE_LINE_MASK; @@ -163,49 +160,23 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, /* check that boundary condition is valid */ if (bound != 0 && (requested_len > bound || !rte_is_power_of_2(bound))) { rte_errno = EINVAL; - return NULL; - } - - if ((socket_id != SOCKET_ID_ANY) && - (socket_id >= RTE_MAX_NUMA_NODES || socket_id < 0)) { - rte_errno = EINVAL; - return NULL; - } - - if (!rte_eal_has_hugepages()) - socket_id = SOCKET_ID_ANY; - - contig = (flags & RTE_MEMZONE_IOVA_CONTIG) != 0; - /* malloc only cares about size flags, remove contig flag from flags */ - flags &= ~RTE_MEMZONE_IOVA_CONTIG; - - if (len == 0) { - /* len == 0 is only allowed for non-contiguous zones */ - if (contig) { - RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n"); - rte_errno = EINVAL; - return NULL; - } - if (bound != 0) - requested_len = bound; - else { - requested_len = find_heap_max_free_elem(&socket_id, align); - if (requested_len == 0) { - rte_errno = ENOMEM; - return NULL; - } - } - } - - /* allocate memory on heap */ - void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags, - align, bound, contig); - if (mz_addr == NULL) { - rte_errno = ENOMEM; - return NULL; + return -1; } + return 0; +} +static const struct rte_memzone * +create_memzone(const char *name, void *mz_addr, size_t requested_len) +{ + struct rte_mem_config *mcfg; + struct rte_fbarray *arr; struct malloc_elem *elem = malloc_elem_from_data(mz_addr); + struct rte_memzone *mz; + int mz_idx; + + /* get pointer to global configuration */ + mcfg = rte_eal_get_configuration()->mem_config; + arr = &mcfg->memzones; /* fill the zone in config */ mz_idx = 
rte_fbarray_find_next_free(arr, 0); @@ -236,6 +207,134 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, return mz; } +static const struct rte_memzone * +memzone_reserve_from_heap_aligned_thread_unsafe(const char *name, size_t len, + const char *heap_name, unsigned int flags, unsigned int align, + unsigned int bound) +{ + size_t requested_len = len; + void *mz_addr; + int heap_idx; + bool contig; + + /* this function sets rte_errno */ + if (common_checks(name, len, align, bound) < 0) + return NULL; + + heap_idx = malloc_heap_find_named_heap_idx(heap_name); + if (heap_idx < 0) { + rte_errno = ENOENT; + return NULL; + } + + contig = (flags & RTE_MEMZONE_IOVA_CONTIG) != 0; + /* malloc only cares about size flags, remove contig flag from flags */ + flags &= ~RTE_MEMZONE_IOVA_CONTIG; + + if (len == 0) { + /* len == 0 is only allowed for non-contiguous zones */ + if (contig) { + RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n"); + rte_errno = EINVAL; + return NULL; + } + if (bound != 0) + requested_len = bound; + else { + requested_len = heap_max_free_elem(heap_idx, align); + if (requested_len == 0) { + rte_errno = ENOMEM; + return NULL; + } + } + } + + /* allocate memory on heap */ + mz_addr = malloc_heap_alloc_on_heap_id(NULL, requested_len, heap_idx, + flags, align, bound, contig); + if (mz_addr == NULL) { + rte_errno = ENOMEM; + return NULL; + } + return create_memzone(name, mz_addr, requested_len); +} + +static const struct rte_memzone * +memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, + int socket_id, unsigned int flags, unsigned int align, + unsigned int bound) +{ + size_t requested_len = len; + bool contig; + void *mz_addr; + + /* this function sets rte_errno */ + if (common_checks(name, len, align, bound) < 0) + return NULL; + + if ((socket_id != SOCKET_ID_ANY) && + (socket_id >= RTE_MAX_NUMA_NODES || socket_id < 0)) { + rte_errno = EINVAL; + return NULL; + } + + if 
(!rte_eal_has_hugepages()) + socket_id = SOCKET_ID_ANY; + + contig = (flags & RTE_MEMZONE_IOVA_CONTIG) != 0; + /* malloc only cares about size flags, remove contig flag from flags */ + flags &= ~RTE_MEMZONE_IOVA_CONTIG; + + if (len == 0) { + /* len == 0 is only allowed for non-contiguous zones */ + if (contig) { + RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n"); + rte_errno = EINVAL; + return NULL; + } + if (bound != 0) + requested_len = bound; + else { + requested_len = find_heap_max_free_elem(&socket_id, + align); + if (requested_len == 0) { + rte_errno = ENOMEM; + return NULL; + } + } + } + + /* allocate memory on heap */ + mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags, + align, bound, contig); + if (mz_addr == NULL) { + rte_errno = ENOMEM; + return NULL; + } + return create_memzone(name, mz_addr, requested_len); +} + +static const struct rte_memzone * +rte_memzone_reserve_from_heap_thread_safe(const char *name, size_t len, + const char *heap_name, unsigned int flags, unsigned int align, + unsigned int bound) +{ + struct rte_mem_config *mcfg; + const struct rte_memzone *mz = NULL; + + /* get pointer to global configuration */ + mcfg = rte_eal_get_configuration()->mem_config; + + rte_rwlock_write_lock(&mcfg->mlock); + + mz = memzone_reserve_from_heap_aligned_thread_unsafe(name, len, + heap_name, flags, align, bound); + + rte_rwlock_write_unlock(&mcfg->mlock); + + return mz; +} + static const struct rte_memzone * rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id, unsigned int flags, unsigned int align, unsigned int bound) @@ -293,6 +392,44 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id, flags, RTE_CACHE_LINE_SIZE, 0); } +/* + * Return a pointer to a correctly filled memzone descriptor (with a + * specified alignment and boundary). If the allocation cannot be done, + * return NULL. 
+ */ +const struct rte_memzone * +rte_memzone_reserve_from_heap_bounded(const char *name, size_t len, + const char *heap_name, unsigned int flags, unsigned int align, + unsigned int bound) +{ + return rte_memzone_reserve_from_heap_thread_safe(name, len, heap_name, + flags, align, bound); +} + +/* + * Return a pointer to a correctly filled memzone descriptor (with a + * specified alignment). If the allocation cannot be done, return NULL. + */ +const struct rte_memzone * +rte_memzone_reserve_from_heap_aligned(const char *name, size_t len, + const char *heap_name, unsigned int flags, unsigned int align) +{ + return rte_memzone_reserve_from_heap_thread_safe(name, len, heap_name, + flags, align, 0); +} + +/* + * Return a pointer to a correctly filled memzone descriptor. If the + * allocation cannot be done, return NULL. + */ +const struct rte_memzone * +rte_memzone_reserve_from_heap(const char *name, size_t len, + const char *heap_name, unsigned int flags) +{ + return rte_memzone_reserve_from_heap_thread_safe(name, len, heap_name, + flags, RTE_CACHE_LINE_SIZE, 0); +} + int rte_memzone_free(const struct rte_memzone *mz) { diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h index ef370fa6f..b27e5c421 100644 --- a/lib/librte_eal/common/include/rte_memzone.h +++ b/lib/librte_eal/common/include/rte_memzone.h @@ -258,6 +258,189 @@ const struct rte_memzone *rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id, unsigned flags, unsigned align, unsigned bound); +/** + * Reserve a portion of physical memory from a specified named heap. + * + * This function reserves some memory and returns a pointer to a + * correctly filled memzone descriptor. If the allocation cannot be + * done, return NULL. + * + * @note Reserving memzones with len set to 0 will only attempt to allocate + * memzones from memory that is already available. It will not trigger any + * new allocations. 
+ * + * @note Reserving IOVA-contiguous memzones with len set to 0 is not currently + * supported. + * + * @param name + * The name of the memzone. If it already exists, the function will + * fail and return NULL. + * @param len + * The size of the memory to be reserved. If it + * is 0, the biggest contiguous zone will be reserved. + * @param heap_name + * The name of the heap to reserve memory from. + * @param flags + * The flags parameter is used to request memzones to be + * taken from specifically sized hugepages. + * - RTE_MEMZONE_2MB - Reserved from 2MB pages + * - RTE_MEMZONE_1GB - Reserved from 1GB pages + * - RTE_MEMZONE_16MB - Reserved from 16MB pages + * - RTE_MEMZONE_16GB - Reserved from 16GB pages + * - RTE_MEMZONE_256KB - Reserved from 256KB pages + * - RTE_MEMZONE_256MB - Reserved from 256MB pages + * - RTE_MEMZONE_512MB - Reserved from 512MB pages + * - RTE_MEMZONE_4GB - Reserved from 4GB pages + * - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if + * the requested page size is unavailable. + * If this flag is not set, the function + * will return error on an unavailable size + * request. + * - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous. + * This option should be used when allocating + * memory intended for hardware rings etc. + * @return + * A pointer to a correctly-filled read-only memzone descriptor, or NULL + * on error. 
+ * On error case, rte_errno will be set appropriately: + * - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure + * - E_RTE_SECONDARY - function was called from a secondary process instance + * - ENOSPC - the maximum number of memzones has already been allocated + * - EEXIST - a memzone with the same name already exists + * - ENOMEM - no appropriate memory area found in which to create memzone + * - EINVAL - invalid parameters + */ +__rte_experimental const struct rte_memzone * +rte_memzone_reserve_from_heap(const char *name, size_t len, + const char *heap_name, unsigned int flags); + +/** + * Reserve a portion of physical memory from a specified named heap with + * alignment on a specified boundary. + * + * This function reserves some memory with alignment on a specified + * boundary, and returns a pointer to a correctly filled memzone + * descriptor. If the allocation cannot be done or if the alignment + * is not a power of 2, returns NULL. + * + * @note Reserving memzones with len set to 0 will only attempt to allocate + * memzones from memory that is already available. It will not trigger any + * new allocations. + * + * @note Reserving IOVA-contiguous memzones with len set to 0 is not currently + * supported. + * + * @param name + * The name of the memzone. If it already exists, the function will + * fail and return NULL. + * @param len + * The size of the memory to be reserved. If it + * is 0, the biggest contiguous zone will be reserved. + * @param heap_name + * The name of the heap to reserve memory from. + * @param flags + * The flags parameter is used to request memzones to be + * taken from specifically sized hugepages. 
+ * - RTE_MEMZONE_2MB - Reserved from 2MB pages + * - RTE_MEMZONE_1GB - Reserved from 1GB pages + * - RTE_MEMZONE_16MB - Reserved from 16MB pages + * - RTE_MEMZONE_16GB - Reserved from 16GB pages + * - RTE_MEMZONE_256KB - Reserved from 256KB pages + * - RTE_MEMZONE_256MB - Reserved from 256MB pages + * - RTE_MEMZONE_512MB - Reserved from 512MB pages + * - RTE_MEMZONE_4GB - Reserved from 4GB pages + * - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if + * the requested page size is unavailable. + * If this flag is not set, the function + * will return error on an unavailable size + * request. + * - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous. + * This option should be used when allocating + * memory intended for hardware rings etc. + * @param align + * Alignment for resulting memzone. Must be a power of 2. + * @return + * A pointer to a correctly-filled read-only memzone descriptor, or NULL + * on error. + * On error case, rte_errno will be set appropriately: + * - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure + * - E_RTE_SECONDARY - function was called from a secondary process instance + * - ENOSPC - the maximum number of memzones has already been allocated + * - EEXIST - a memzone with the same name already exists + * - ENOMEM - no appropriate memory area found in which to create memzone + * - EINVAL - invalid parameters + */ +__rte_experimental const struct rte_memzone * +rte_memzone_reserve_from_heap_aligned(const char *name, size_t len, + const char *heap_name, unsigned int flags, unsigned int align); + +/** + * Reserve a portion of physical memory from a specified named heap with + * specified alignment and boundary. + * + * This function reserves some memory with specified alignment and + * boundary, and returns a pointer to a correctly filled memzone + * descriptor. If the allocation cannot be done or if the alignment + * or boundary are not a power of 2, returns NULL. 
+ * The memory buffer is reserved in such a way that it won't cross the + * specified boundary. This implies that the requested length must be + * less than or equal to the boundary. + * + * @note Reserving memzones with len set to 0 will only attempt to allocate + * memzones from memory that is already available. It will not trigger any + * new allocations. + * + * @note Reserving IOVA-contiguous memzones with len set to 0 is not currently + * supported. + * + * @param name + * The name of the memzone. If it already exists, the function will + * fail and return NULL. + * @param len + * The size of the memory to be reserved. If it + * is 0, the biggest contiguous zone will be reserved. + * @param heap_name + * The name of the heap to reserve memory from. + * @param flags + * The flags parameter is used to request memzones to be + * taken from specifically sized hugepages. + * - RTE_MEMZONE_2MB - Reserved from 2MB pages + * - RTE_MEMZONE_1GB - Reserved from 1GB pages + * - RTE_MEMZONE_16MB - Reserved from 16MB pages + * - RTE_MEMZONE_16GB - Reserved from 16GB pages + * - RTE_MEMZONE_256KB - Reserved from 256KB pages + * - RTE_MEMZONE_256MB - Reserved from 256MB pages + * - RTE_MEMZONE_512MB - Reserved from 512MB pages + * - RTE_MEMZONE_4GB - Reserved from 4GB pages + * - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if + * the requested page size is unavailable. + * If this flag is not set, the function + * will return error on an unavailable size + * request. + * - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous. + * This option should be used when allocating + * memory intended for hardware rings etc. + * @param align + * Alignment for resulting memzone. Must be a power of 2. + * @param bound + * Boundary for resulting memzone. Must be a power of 2 or zero. + * Zero value implies no boundary condition. + * @return + * A pointer to a correctly-filled read-only memzone descriptor, or NULL + * on error.
+ * On error case, rte_errno will be set appropriately: + * - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure + * - E_RTE_SECONDARY - function was called from a secondary process instance + * - ENOSPC - the maximum number of memzones has already been allocated + * - EEXIST - a memzone with the same name already exists + * - ENOMEM - no appropriate memory area found in which to create memzone + * - EINVAL - invalid parameters + */ +__rte_experimental const struct rte_memzone * +rte_memzone_reserve_from_heap_bounded(const char *name, size_t len, + const char *heap_name, unsigned int flags, unsigned int align, + unsigned int bound); + /** * Free a memzone. * diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index cdde7eb3b..db1cfae6a 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -294,6 +294,9 @@ EXPERIMENTAL { rte_memseg_contig_walk; rte_memseg_list_walk; rte_memseg_walk; + rte_memzone_reserve_from_heap; + rte_memzone_reserve_from_heap_aligned; + rte_memzone_reserve_from_heap_bounded; rte_mp_action_register; rte_mp_action_unregister; rte_mp_reply;
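For reference, the element lookup in malloc_heap_remove_external_memory() earlier in the series relies on the heap's element list being sorted by address: it walks forward until it either finds an element starting exactly at the requested VA or has walked past that address. A standalone model of that lookup - simplified element type, not the DPDK struct malloc_elem:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal element carrying only what the lookup needs. */
struct elem {
	struct elem *next; /* next element, in increasing address order */
};

/*
 * Model of the lookup: return the element whose address equals va_addr,
 * or NULL. Because the list is address ordered, the walk gives up as
 * soon as it has overshot the target address.
 */
struct elem *
find_elem(struct elem *first, void *va_addr)
{
	struct elem *e = first;

	while (e != NULL && (void *)e != va_addr) {
		e = e->next;
		/* list is address ordered: past va_addr means not present */
		if ((void *)e > va_addr)
			return NULL;
	}
	return e;
}
```

The early cutoff keeps a failed removal from scanning the whole heap, and a NULL result maps to the EINVAL return in the patch.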