From patchwork Thu Sep 20 11:36:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45025 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 2B41A1B14D; Thu, 20 Sep 2018 13:37:29 +0200 (CEST) Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id 2316F5F65 for ; Thu, 20 Sep 2018 13:36:54 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="90428545" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga004.fm.intel.com with ESMTP; 20 Sep 2018 04:36:34 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaYmA022796; Thu, 20 Sep 2018 12:36:34 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaXDO011364; Thu, 20 Sep 2018 12:36:33 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaXKZ011359; Thu, 20 Sep 2018 12:36:33 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: Bruce Richardson , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:14 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 01/20] mem: add length to memseg list X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Previously, to calculate length of memory area covered by a memseg list, we would've needed to multiply page size by length of fbarray backing that memseg list. This is not obvious and unnecessarily low level, so store length in the memseg list itself. Signed-off-by: Anatoly Burakov --- drivers/bus/pci/linux/pci.c | 2 +- lib/librte_eal/bsdapp/eal/eal_memory.c | 2 ++ lib/librte_eal/common/eal_common_memory.c | 5 ++--- lib/librte_eal/common/include/rte_eal_memconfig.h | 1 + lib/librte_eal/linuxapp/eal/eal_memalloc.c | 3 ++- lib/librte_eal/linuxapp/eal/eal_memory.c | 4 +++- 6 files changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c index 04648ac93..d6e1027ab 100644 --- a/drivers/bus/pci/linux/pci.c +++ b/drivers/bus/pci/linux/pci.c @@ -119,7 +119,7 @@ rte_pci_unmap_device(struct rte_pci_device *dev) static int find_max_end_va(const struct rte_memseg_list *msl, void *arg) { - size_t sz = msl->memseg_arr.len * msl->page_sz; + size_t sz = msl->len; void *end_va = RTE_PTR_ADD(msl->base_va, sz); void **max_va = arg; diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c index 16d2bc7c3..65ea670f9 100644 --- a/lib/librte_eal/bsdapp/eal/eal_memory.c +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c @@ -79,6 +79,7 @@ rte_eal_hugepage_init(void) } msl->base_va = addr; msl->page_sz = page_sz; + msl->len = internal_config.memory; msl->socket_id = 0; /* populate memsegs. each memseg is 1 page long */ @@ -370,6 +371,7 @@ alloc_va_space(struct rte_memseg_list *msl) return -1; } msl->base_va = addr; + msl->len = mem_sz; return 0; } diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c index 0b69804ff..30d018209 100644 --- a/lib/librte_eal/common/eal_common_memory.c +++ b/lib/librte_eal/common/eal_common_memory.c @@ -171,7 +171,7 @@ virt2memseg(const void *addr, const struct rte_memseg_list *msl) /* a memseg list was specified, check if it's the right one */ start = msl->base_va; - end = RTE_PTR_ADD(start, (size_t)msl->page_sz * msl->memseg_arr.len); + end = RTE_PTR_ADD(start, msl->len); if (addr < start || addr >= end) return NULL; @@ -194,8 +194,7 @@ virt2memseg_list(const void *addr) msl = &mcfg->memsegs[msl_idx]; start = msl->base_va; - end = RTE_PTR_ADD(start, - (size_t)msl->page_sz * msl->memseg_arr.len); + end = RTE_PTR_ADD(start, msl->len); if (addr >= start && addr < end) break; } diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h index aff0688dd..1d8b0a6fe 100644 --- a/lib/librte_eal/common/include/rte_eal_memconfig.h +++ b/lib/librte_eal/common/include/rte_eal_memconfig.h @@ -30,6 +30,7 @@ struct rte_memseg_list { uint64_t addr_64; /**< Makes sure addr is always 64-bits */ }; + size_t len; /**< Length of memory area covered by this memseg list. */ int socket_id; /**< Socket ID for all memsegs in this list. */ uint64_t page_sz; /**< Page size for all memsegs in this list. */ volatile uint32_t version; /**< version number for multiprocess sync. */ diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c index b2e2a9599..71a6e0fd9 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c @@ -986,7 +986,7 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg) int msl_idx, seg_idx, ret, dir_fd = -1; start_addr = (uintptr_t) msl->base_va; - end_addr = start_addr + msl->memseg_arr.len * (size_t)msl->page_sz; + end_addr = start_addr + msl->len; if ((uintptr_t)wa->ms->addr < start_addr || (uintptr_t)wa->ms->addr >= end_addr) @@ -1472,6 +1472,7 @@ secondary_msl_create_walk(const struct rte_memseg_list *msl, return -1; } local_msl->base_va = primary_msl->base_va; + local_msl->len = primary_msl->len; return 0; } diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index e3ac24815..897d94179 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -861,6 +861,7 @@ alloc_va_space(struct rte_memseg_list *msl) return -1; } msl->base_va = addr; + msl->len = mem_sz; return 0; } @@ -1369,6 +1370,7 @@ eal_legacy_hugepage_init(void) msl->base_va = addr; msl->page_sz = page_sz; msl->socket_id = 0; + msl->len = internal_config.memory; /* populate memsegs. each memseg is one page long */ for (cur_seg = 0; cur_seg < n_segs; cur_seg++) { @@ -1615,7 +1617,7 @@ eal_legacy_hugepage_init(void) if (msl->memseg_arr.count > 0) continue; /* this is an unused list, deallocate it */ - mem_sz = (size_t)msl->page_sz * msl->memseg_arr.len; + mem_sz = msl->len; munmap(msl->base_va, mem_sz); msl->base_va = NULL; From patchwork Thu Sep 20 11:36:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45020 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5842B1B128; Thu, 20 Sep 2018 13:37:18 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 78EFF5F34 for ; Thu, 20 Sep 2018 13:36:46 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="234501480" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga004.jf.intel.com with ESMTP; 20 Sep 2018 04:36:39 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBadcD022801; Thu, 20 Sep 2018 12:36:39 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBacOE011433; Thu, 20 Sep 2018 12:36:38 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaYwT011371; Thu, 20 Sep 2018 12:36:34 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: Neil Horman , John McNamara , Marko Kovacevic , Hemant Agrawal , Shreyansh Jain , Matan Azrad , Shahaf Shuler , Yongseok Koh , Maxime Coquelin , Tiwei Bie , Zhihong Wang , Bruce Richardson , Olivier Matz , Andrew Rybchenko , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, thomas@monjalon.net Date: Thu, 20 Sep 2018 12:36:15 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 02/20] mem: allow memseg lists to be marked as external X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When we allocate and use DPDK memory, we need to be able to differentiate between DPDK hugepage segments and segments that were made part of DPDK but are externally allocated. Add such a property to memseg lists. This breaks the ABI, so bump the EAL library ABI version and document the change in release notes. All current calls for memseg walk functions were adjusted to ignore external segments where it made sense. Mempools is a special case, because we may be asked to allocate a mempool on a specific socket, and we need to ignore all page sizes on other heaps or other sockets. Previously, this assumption of knowing all page sizes was not a problem, but it will be now, so we have to match socket ID with page size when calculating minimum page size for a mempool. Signed-off-by: Anatoly Burakov Acked-by: Andrew Rybchenko --- Notes: v3: - Add comment to explain the process of picking up minimum page sizes for mempool v2: - Add documentation changes and ABI break v1: - Adjust all calls to memseg walk functions to ignore external segments where it made sense to do so doc/guides/rel_notes/deprecation.rst | 15 -------- doc/guides/rel_notes/release_18_11.rst | 12 ++++++- drivers/bus/fslmc/fslmc_vfio.c | 7 ++-- drivers/net/mlx4/mlx4_mr.c | 3 ++ drivers/net/mlx5/mlx5.c | 5 ++- drivers/net/mlx5/mlx5_mr.c | 3 ++ drivers/net/virtio/virtio_user/vhost_kernel.c | 5 ++- lib/librte_eal/bsdapp/eal/Makefile | 2 +- lib/librte_eal/bsdapp/eal/eal.c | 3 ++ lib/librte_eal/bsdapp/eal/eal_memory.c | 7 ++-- lib/librte_eal/common/eal_common_memory.c | 3 ++ .../common/include/rte_eal_memconfig.h | 1 + lib/librte_eal/common/include/rte_memory.h | 9 +++++ lib/librte_eal/common/malloc_heap.c | 9 +++-- lib/librte_eal/linuxapp/eal/Makefile | 2 +- lib/librte_eal/linuxapp/eal/eal.c | 10 +++++- lib/librte_eal/linuxapp/eal/eal_memalloc.c | 9 +++++ lib/librte_eal/linuxapp/eal/eal_vfio.c | 17 ++++++--- lib/librte_eal/meson.build | 2 +- lib/librte_mempool/rte_mempool.c | 35 ++++++++++++++----- test/test/test_malloc.c | 3 ++ test/test/test_memzone.c | 3 ++ 22 files changed, 125 insertions(+), 40 deletions(-) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index 138335dfb..d2aec64d1 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -11,21 +11,6 @@ API and ABI deprecation notices are to be posted here. Deprecation Notices ------------------- -* eal: certain structures will change in EAL on account of upcoming external - memory support. Aside from internal changes leading to an ABI break, the - following externally visible changes will also be implemented: - - - ``rte_memseg_list`` will change to include a boolean flag indicating - whether a particular memseg list is externally allocated. This will have - implications for any users of memseg-walk-related functions, as they will - now have to skip externally allocated segments in most cases if the intent - is to only iterate over internal DPDK memory. - - ``socket_id`` parameter across the entire DPDK will gain additional meaning, - as some socket ID's will now be representing externally allocated memory. No - changes will be required for existing code as backwards compatibility will - be kept, and those who do not use this feature will not see these extra - socket ID's. - * eal: both declaring and identifying devices will be streamlined in v18.11. New functions will appear to query a specific port from buses, classes of device and device drivers. Device declaration will be made coherent with the diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst index bc9b74ec4..e96ec9b43 100644 --- a/doc/guides/rel_notes/release_18_11.rst +++ b/doc/guides/rel_notes/release_18_11.rst @@ -91,6 +91,13 @@ API Changes flag the MAC can be properly configured in any case. This is particularly important for bonding. +* eal: The following API changes were made in 18.11: + + - ``rte_memseg_list`` structure now has an additional flag indicating whether + the memseg list is externally allocated. This will have implications for any + users of memseg-walk-related functions, as they will now have to skip + externally allocated segments in most cases if the intent is to only iterate + over internal DPDK memory. ABI Changes ----------- @@ -107,6 +114,9 @@ ABI Changes ========================================================= +* eal: EAL library ABI version was changed due to previously announced work on + supporting external memory in DPDK. + Removed Items ------------- @@ -152,7 +162,7 @@ The libraries prepended with a plus sign were incremented in this version. librte_compressdev.so.1 librte_cryptodev.so.5 librte_distributor.so.1 - librte_eal.so.8 + + librte_eal.so.9 librte_ethdev.so.10 librte_eventdev.so.4 librte_flow_classify.so.1 diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c index 4c2cd2a87..2e9244fb7 100644 --- a/drivers/bus/fslmc/fslmc_vfio.c +++ b/drivers/bus/fslmc/fslmc_vfio.c @@ -317,12 +317,15 @@ fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len) } static int -fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused, - const struct rte_memseg *ms, void *arg) +fslmc_dmamap_seg(const struct rte_memseg_list *msl, const struct rte_memseg *ms, + void *arg) { int *n_segs = arg; int ret; + if (msl->external) + return 0; + ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len); if (ret) DPAA2_BUS_ERR("Unable to VFIO map (addr=%p, len=%zu)", diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c index d23d3c613..9f5d790b6 100644 --- a/drivers/net/mlx4/mlx4_mr.c +++ b/drivers/net/mlx4/mlx4_mr.c @@ -496,6 +496,9 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl, { struct mr_find_contig_memsegs_data *data = arg; + if (msl->external) + return 0; + if (data->addr < ms->addr_64 || data->addr >= ms->addr_64 + len) return 0; /* Found, save it and stop walking. */ diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 30d4e70a7..c90e1d8ce 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -568,11 +568,14 @@ static struct rte_pci_driver mlx5_driver; static void *uar_base; static int -find_lower_va_bound(const struct rte_memseg_list *msl __rte_unused, +find_lower_va_bound(const struct rte_memseg_list *msl, const struct rte_memseg *ms, void *arg) { void **addr = arg; + if (msl->external) + return 0; + if (*addr == NULL) *addr = ms->addr; else diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c index 1d1bcb5fe..fd4345f9c 100644 --- a/drivers/net/mlx5/mlx5_mr.c +++ b/drivers/net/mlx5/mlx5_mr.c @@ -486,6 +486,9 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl, { struct mr_find_contig_memsegs_data *data = arg; + if (msl->external) + return 0; + if (data->addr < ms->addr_64 || data->addr >= ms->addr_64 + len) return 0; /* Found, save it and stop walking. */ diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c index d1be82162..91cd545b2 100644 --- a/drivers/net/virtio/virtio_user/vhost_kernel.c +++ b/drivers/net/virtio/virtio_user/vhost_kernel.c @@ -75,13 +75,16 @@ struct walk_arg { uint32_t region_nr; }; static int -add_memory_region(const struct rte_memseg_list *msl __rte_unused, +add_memory_region(const struct rte_memseg_list *msl, const struct rte_memseg *ms, size_t len, void *arg) { struct walk_arg *wa = arg; struct vhost_memory_region *mr; void *start_addr; + if (msl->external) + return 0; + if (wa->region_nr >= max_regions) return -1; diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile index d27da3d15..97bff4852 100644 --- a/lib/librte_eal/bsdapp/eal/Makefile +++ b/lib/librte_eal/bsdapp/eal/Makefile @@ -22,7 +22,7 @@ LDLIBS += -lrte_kvargs EXPORT_MAP := ../../rte_eal_version.map -LIBABIVER := 8 +LIBABIVER := 9 # specific to bsdapp exec-env SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c index d7ae9d686..7735194a3 100644 --- a/lib/librte_eal/bsdapp/eal/eal.c +++ b/lib/librte_eal/bsdapp/eal/eal.c @@ -502,6 +502,9 @@ check_socket(const struct rte_memseg_list *msl, void *arg) { int *socket_id = arg; + if (msl->external) + return 0; + if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0) return 1; diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c index 65ea670f9..4b092e1f2 100644 --- a/lib/librte_eal/bsdapp/eal/eal_memory.c +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c @@ -236,12 +236,15 @@ struct attach_walk_args { int seg_idx; }; static int -attach_segment(const struct rte_memseg_list *msl __rte_unused, - const struct rte_memseg *ms, void *arg) +attach_segment(const struct rte_memseg_list *msl, const struct rte_memseg *ms, + void *arg) { struct attach_walk_args *wa = arg; void *addr; + if (msl->external) + return 0; + addr = mmap(ms->addr, ms->len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, wa->fd_hugepage, wa->seg_idx * EAL_PAGE_SIZE); diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c index 30d018209..a2461ed79 100644 --- a/lib/librte_eal/common/eal_common_memory.c +++ b/lib/librte_eal/common/eal_common_memory.c @@ -272,6 +272,9 @@ physmem_size(const struct rte_memseg_list *msl, void *arg) { uint64_t *total_len = arg; + if (msl->external) + return 0; + *total_len += msl->memseg_arr.count * msl->page_sz; return 0; diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h index 1d8b0a6fe..6baa6854f 100644 --- a/lib/librte_eal/common/include/rte_eal_memconfig.h +++ b/lib/librte_eal/common/include/rte_eal_memconfig.h @@ -33,6 +33,7 @@ struct rte_memseg_list { size_t len; /**< Length of memory area covered by this memseg list. */ int socket_id; /**< Socket ID for all memsegs in this list. */ uint64_t page_sz; /**< Page size for all memsegs in this list. */ + unsigned int external; /**< 1 if this list points to external memory */ volatile uint32_t version; /**< version number for multiprocess sync. */ struct rte_fbarray memseg_arr; }; diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h index 14bd277a4..ffdd56bfb 100644 --- a/lib/librte_eal/common/include/rte_memory.h +++ b/lib/librte_eal/common/include/rte_memory.h @@ -215,6 +215,9 @@ typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl, * @note This function read-locks the memory hotplug subsystem, and thus cannot * be used within memory-related callback functions. * + * @note This function will also walk through externally allocated segments. It + * is up to the user to decide whether to skip through these segments. + * * @param func * Iterator function * @param arg @@ -233,6 +236,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg); * @note This function read-locks the memory hotplug subsystem, and thus cannot * be used within memory-related callback functions. * + * @note This function will also walk through externally allocated segments. It + * is up to the user to decide whether to skip through these segments. + * * @param func * Iterator function * @param arg @@ -251,6 +257,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg); * @note This function read-locks the memory hotplug subsystem, and thus cannot * be used within memory-related callback functions. * + * @note This function will also walk through externally allocated segments. It + * is up to the user to decide whether to skip through these segments. + * * @param func * Iterator function * @param arg diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index ac7bbb3ba..3c8e2063b 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -95,6 +95,9 @@ malloc_add_seg(const struct rte_memseg_list *msl, struct malloc_heap *heap; int msl_idx; + if (msl->external) + return 0; + heap = &mcfg->malloc_heaps[msl->socket_id]; /* msl is const, so find it */ @@ -754,8 +757,10 @@ malloc_heap_free(struct malloc_elem *elem) /* anything after this is a bonus */ ret = 0; - /* ...of which we can't avail if we are in legacy mode */ - if (internal_config.legacy_mem) + /* ...of which we can't avail if we are in legacy mode, or if this is an + * externally allocated segment. + */ + if (internal_config.legacy_mem || msl->external) goto free_unlock; /* check if we can free any memory back to the system */ diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile index fd92c75c2..5c16bc40f 100644 --- a/lib/librte_eal/linuxapp/eal/Makefile +++ b/lib/librte_eal/linuxapp/eal/Makefile @@ -10,7 +10,7 @@ ARCH_DIR ?= $(RTE_ARCH) EXPORT_MAP := ../../rte_eal_version.map VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR) -LIBABIVER := 8 +LIBABIVER := 9 VPATH += $(RTE_SDK)/lib/librte_eal/common diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c index e59ac6577..253a6aece 100644 --- a/lib/librte_eal/linuxapp/eal/eal.c +++ b/lib/librte_eal/linuxapp/eal/eal.c @@ -725,6 +725,9 @@ check_socket(const struct rte_memseg_list *msl, void *arg) { int *socket_id = arg; + if (msl->external) + return 0; + return *socket_id == msl->socket_id; } @@ -1059,7 +1062,12 @@ mark_freeable(const struct rte_memseg_list *msl, const struct rte_memseg *ms, void *arg __rte_unused) { /* ms is const, so find this memseg */ - struct rte_memseg *found = rte_mem_virt2memseg(ms->addr, msl); + struct rte_memseg *found; + + if (msl->external) + return 0; + + found = rte_mem_virt2memseg(ms->addr, msl); found->flags &= ~RTE_MEMSEG_FLAG_DO_NOT_FREE; diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c index 71a6e0fd9..f6a0098af 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c @@ -1408,6 +1408,9 @@ sync_walk(const struct rte_memseg_list *msl, void *arg __rte_unused) unsigned int i; int msl_idx; + if (msl->external) + return 0; + msl_idx = msl - mcfg->memsegs; primary_msl = &mcfg->memsegs[msl_idx]; local_msl = &local_memsegs[msl_idx]; @@ -1456,6 +1459,9 @@ secondary_msl_create_walk(const struct rte_memseg_list *msl, char name[PATH_MAX]; int msl_idx, ret; + if (msl->external) + return 0; + msl_idx = msl - mcfg->memsegs; primary_msl = &mcfg->memsegs[msl_idx]; local_msl = &local_memsegs[msl_idx]; @@ -1509,6 +1515,9 @@ fd_list_create_walk(const struct rte_memseg_list *msl, unsigned int len; int msl_idx; + if (msl->external) + return 0; + msl_idx = msl - mcfg->memsegs; len = msl->memseg_arr.len; diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c index c68dc38e0..fddbc3b54 100644 --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c @@ -1082,11 +1082,14 @@ rte_vfio_get_group_num(const char *sysfs_base, } static int -type1_map(const struct rte_memseg_list *msl __rte_unused, - const struct rte_memseg *ms, void *arg) +type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms, + void *arg) { int *vfio_container_fd = arg; + if (msl->external) + return 0; + return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova, ms->len, 1); } @@ -1196,11 +1199,14 @@ vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova, } static int -vfio_spapr_map_walk(const struct rte_memseg_list *msl __rte_unused, +vfio_spapr_map_walk(const struct rte_memseg_list *msl, const struct rte_memseg *ms, void *arg) { int *vfio_container_fd = arg; + if (msl->external) + return 0; + return vfio_spapr_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova, ms->len, 1); } @@ -1210,12 +1216,15 @@ struct spapr_walk_param { uint64_t hugepage_sz; }; static int -vfio_spapr_window_size_walk(const struct rte_memseg_list *msl __rte_unused, +vfio_spapr_window_size_walk(const struct rte_memseg_list *msl, const struct rte_memseg *ms, void *arg) { struct spapr_walk_param *param = arg; uint64_t max = ms->iova + ms->len; + if (msl->external) + return 0; + if (max > param->window_size) { param->hugepage_sz = ms->hugepage_sz; param->window_size = max; diff --git a/lib/librte_eal/meson.build b/lib/librte_eal/meson.build index e1fde15d1..62ef985b9 100644 --- a/lib/librte_eal/meson.build +++ b/lib/librte_eal/meson.build @@ -21,7 +21,7 @@ else error('unsupported system type "@0@"'.format(host_machine.system())) endif -version = 8 # the version of the EAL API +version = 9 # the version of the EAL API allow_experimental_apis = true deps += 'compat' deps += 'kvargs' diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 03e6b5f73..2ed539f01 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -99,25 +99,44 @@ static unsigned optimize_object_size(unsigned obj_size) return new_obj_size * RTE_MEMPOOL_ALIGN; } +struct pagesz_walk_arg { + int socket_id; + size_t min; +}; + static int find_min_pagesz(const struct rte_memseg_list *msl, void *arg) { - size_t *min = arg; + struct pagesz_walk_arg *wa = arg; + bool valid; - if (msl->page_sz < *min) - *min = msl->page_sz; + /* + * we need to only look at page sizes available for a particular socket + * ID. so, we either need an exact match on socket ID (can match both + * native and external memory), or, if SOCKET_ID_ANY was specified as a + * socket ID argument, we must only look at native memory and ignore any + * page sizes associated with external memory. + */ + valid = msl->socket_id == wa->socket_id; + valid |= wa->socket_id == SOCKET_ID_ANY && msl->external == 0; + + if (valid && msl->page_sz < wa->min) + wa->min = msl->page_sz; return 0; } static size_t -get_min_page_size(void) +get_min_page_size(int socket_id) { - size_t min_pagesz = SIZE_MAX; + struct pagesz_walk_arg wa; - rte_memseg_list_walk(find_min_pagesz, &min_pagesz); + wa.min = SIZE_MAX; + wa.socket_id = socket_id; - return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz; + rte_memseg_list_walk(find_min_pagesz, &wa); + + return wa.min == SIZE_MAX ? (size_t) getpagesize() : wa.min; } @@ -470,7 +489,7 @@ rte_mempool_populate_default(struct rte_mempool *mp) pg_sz = 0; pg_shift = 0; } else if (try_contig) { - pg_sz = get_min_page_size(); + pg_sz = get_min_page_size(mp->socket_id); pg_shift = rte_bsf32(pg_sz); } else { pg_sz = getpagesize(); diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c index 4b5abb4e0..5e5272419 100644 --- a/test/test/test_malloc.c +++ b/test/test/test_malloc.c @@ -711,6 +711,9 @@ check_socket_mem(const struct rte_memseg_list *msl, void *arg) { int32_t *socket = arg; + if (msl->external) + return 0; + return *socket == msl->socket_id; } diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c index 452d7cc5e..9fe465e62 100644 --- a/test/test/test_memzone.c +++ b/test/test/test_memzone.c @@ -115,6 +115,9 @@ find_available_pagesz(const struct rte_memseg_list *msl, void *arg) { struct walk_arg *wa = arg; + if (msl->external) + return 0; + if (msl->page_sz == RTE_PGSIZE_2M) wa->hugepage_2MB_avail = 1; if (msl->page_sz == RTE_PGSIZE_1G) From patchwork Thu Sep 20 11:36:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45011 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5026A5F48; Thu, 20 Sep 2018 13:36:47 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 100F05F27 for ; Thu, 20 Sep 2018 13:36:43 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="265202967" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga006.fm.intel.com with ESMTP; 20 Sep 2018 04:36:39 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBad4k022804; Thu, 20 Sep 2018 12:36:39 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBad9Q011444; Thu, 20 Sep 2018 12:36:39 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBadlr011440; Thu, 20 Sep 2018 12:36:39 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: Thomas Monjalon , Bruce Richardson , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:16 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 03/20] malloc: index heaps using heap ID rather than NUMA node X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Switch over all parts of EAL to use heap ID instead of NUMA node ID to identify heaps. Heap ID for DPDK-internal heaps is NUMA node's index within the detected NUMA node list. Heap ID for external heaps will be order of their creation. Signed-off-by: Anatoly Burakov --- config/common_base | 1 + config/rte_config.h | 1 + .../common/include/rte_eal_memconfig.h | 4 +- .../common/include/rte_malloc_heap.h | 1 + lib/librte_eal/common/malloc_heap.c | 98 +++++++++++++------ lib/librte_eal/common/malloc_heap.h | 3 + lib/librte_eal/common/rte_malloc.c | 41 +++++--- 7 files changed, 106 insertions(+), 43 deletions(-) diff --git a/config/common_base b/config/common_base index 155c7d40e..b52770b27 100644 --- a/config/common_base +++ b/config/common_base @@ -61,6 +61,7 @@ CONFIG_RTE_CACHE_LINE_SIZE=64 CONFIG_RTE_LIBRTE_EAL=y CONFIG_RTE_MAX_LCORE=128 CONFIG_RTE_MAX_NUMA_NODES=8 +CONFIG_RTE_MAX_HEAPS=32 CONFIG_RTE_MAX_MEMSEG_LISTS=64 # each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages # or RTE_MAX_MEM_MB_PER_LIST megabytes worth of memory, whichever is smaller diff --git a/config/rte_config.h b/config/rte_config.h index 567051b9c..5dd2ac1ad 100644 --- a/config/rte_config.h +++ b/config/rte_config.h @@ -24,6 +24,7 @@ #define RTE_BUILD_SHARED_LIB /* EAL defines */ +#define RTE_MAX_HEAPS 32 #define RTE_MAX_MEMSEG_LISTS 128 #define RTE_MAX_MEMSEG_PER_LIST 8192 #define RTE_MAX_MEM_MB_PER_LIST 32768 diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h index 6baa6854f..d7920a4e0 100644 --- a/lib/librte_eal/common/include/rte_eal_memconfig.h +++ b/lib/librte_eal/common/include/rte_eal_memconfig.h @@ -72,8 +72,8 @@ struct rte_mem_config { struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */ - /* Heaps of Malloc per socket */ - struct malloc_heap malloc_heaps[RTE_MAX_NUMA_NODES]; + /* Heaps of Malloc */ + struct malloc_heap malloc_heaps[RTE_MAX_HEAPS]; /* address of mem_config in primary process. used to map shared config into * exact same address the primary process maps it. diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h index d43fa9097..e7ac32d42 100644 --- a/lib/librte_eal/common/include/rte_malloc_heap.h +++ b/lib/librte_eal/common/include/rte_malloc_heap.h @@ -27,6 +27,7 @@ struct malloc_heap { unsigned alloc_count; size_t total_size; + unsigned int socket_id; } __rte_cache_aligned; #endif /* _RTE_MALLOC_HEAP_H_ */ diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 3c8e2063b..1d1e35708 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -66,6 +66,21 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz) return check_flag & flags; } +int +malloc_socket_to_heap_id(unsigned int socket_id) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + int i; + + for (i = 0; i < RTE_MAX_HEAPS; i++) { + struct malloc_heap *heap = &mcfg->malloc_heaps[i]; + + if (heap->socket_id == socket_id) + return i; + } + return -1; +} + /* * Expand the heap with a memory area. */ @@ -93,12 +108,13 @@ malloc_add_seg(const struct rte_memseg_list *msl, struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; struct rte_memseg_list *found_msl; struct malloc_heap *heap; - int msl_idx; + int msl_idx, heap_idx; if (msl->external) return 0; - heap = &mcfg->malloc_heaps[msl->socket_id]; + heap_idx = malloc_socket_to_heap_id(msl->socket_id); + heap = &mcfg->malloc_heaps[heap_idx]; /* msl is const, so find it */ msl_idx = msl - mcfg->memsegs; @@ -111,6 +127,7 @@ malloc_add_seg(const struct rte_memseg_list *msl, malloc_heap_add_memory(heap, found_msl, ms->addr, len); heap->total_size += len; + heap->socket_id = msl->socket_id; RTE_LOG(DEBUG, EAL, "Added %zuM to heap on socket %i\n", len >> 20, msl->socket_id); @@ -561,12 +578,14 @@ alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket, /* this will try lower page sizes first */ static void * -heap_alloc_on_socket(const char *type, size_t size, int socket, - unsigned int flags, size_t align, size_t bound, bool contig) +malloc_heap_alloc_on_heap_id(const char *type, size_t size, + unsigned int heap_id, unsigned int flags, size_t align, + size_t bound, bool contig) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; - struct malloc_heap *heap = &mcfg->malloc_heaps[socket]; + struct malloc_heap *heap = &mcfg->malloc_heaps[heap_id]; unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY; + int socket_id; void *ret; rte_spinlock_lock(&(heap->lock)); @@ -584,12 +603,28 @@ heap_alloc_on_socket(const char *type, size_t size, int socket, * we may still be able to allocate memory from appropriate page sizes, * we just need to request more memory first. */ + + socket_id = rte_socket_id_by_idx(heap_id); + /* + * if socket ID is negative, we cannot find a socket ID for this heap - + * which means it's an external heap. those can have unexpected page + * sizes, so if the user asked to allocate from there - assume user + * knows what they're doing, and allow allocating from there with any + * page size flags. + */ + if (socket_id < 0) + size_flags |= RTE_MEMZONE_SIZE_HINT_ONLY; + ret = heap_alloc(heap, type, size, size_flags, align, bound, contig); if (ret != NULL) goto alloc_unlock; - if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound, - contig)) { + /* if socket ID is invalid, this is an external heap */ + if (socket_id < 0) + goto alloc_unlock; + + if (!alloc_more_mem_on_socket(heap, size, socket_id, flags, align, + bound, contig)) { ret = heap_alloc(heap, type, size, flags, align, bound, contig); /* this should have succeeded */ @@ -605,7 +640,7 @@ void * malloc_heap_alloc(const char *type, size_t size, int socket_arg, unsigned int flags, size_t align, size_t bound, bool contig) { - int socket, i, cur_socket; + int socket, heap_id, i; void *ret; /* return NULL if size is 0 or alignment is not power-of-2 */ @@ -620,22 +655,25 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg, else socket = socket_arg; - /* Check socket parameter */ - if (socket >= RTE_MAX_NUMA_NODES) + /* turn socket ID into heap ID */ + heap_id = malloc_socket_to_heap_id(socket); + /* if heap id is negative, socket ID was invalid */ + if (heap_id < 0) return NULL; - ret = heap_alloc_on_socket(type, size, socket, flags, align, bound, - contig); + ret = malloc_heap_alloc_on_heap_id(type, size, heap_id, flags, align, + bound, contig); if (ret != NULL || socket_arg != SOCKET_ID_ANY) return ret; - /* try other heaps */ + /* try other heaps. we are only iterating through native DPDK sockets, + * so external heaps won't be included. + */ for (i = 0; i < (int) rte_socket_count(); i++) { - cur_socket = rte_socket_id_by_idx(i); - if (cur_socket == socket) + if (i == heap_id) continue; - ret = heap_alloc_on_socket(type, size, cur_socket, flags, - align, bound, contig); + ret = malloc_heap_alloc_on_heap_id(type, size, i, flags, align, + bound, contig); if (ret != NULL) return ret; } @@ -643,11 +681,11 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg, } static void * -heap_alloc_biggest_on_socket(const char *type, int socket, unsigned int flags, - size_t align, bool contig) +heap_alloc_biggest_on_heap_id(const char *type, unsigned int heap_id, + unsigned int flags, size_t align, bool contig) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; - struct malloc_heap *heap = &mcfg->malloc_heaps[socket]; + struct malloc_heap *heap = &mcfg->malloc_heaps[heap_id]; void *ret; rte_spinlock_lock(&(heap->lock)); @@ -665,7 +703,7 @@ void * malloc_heap_alloc_biggest(const char *type, int socket_arg, unsigned int flags, size_t align, bool contig) { - int socket, i, cur_socket; + int socket, i, cur_socket, heap_id; void *ret; /* return NULL if align is not power-of-2 */ @@ -680,11 +718,13 @@ malloc_heap_alloc_biggest(const char *type, int socket_arg, unsigned int flags, else socket = socket_arg; - /* Check socket parameter */ - if (socket >= RTE_MAX_NUMA_NODES) + /* turn socket ID into heap ID */ + heap_id = malloc_socket_to_heap_id(socket); + /* if heap id is negative, socket ID was invalid */ + if (heap_id < 0) return NULL; - ret = heap_alloc_biggest_on_socket(type, socket, flags, align, + ret = heap_alloc_biggest_on_heap_id(type, heap_id, flags, align, contig); if (ret != NULL || socket_arg != SOCKET_ID_ANY) return ret; @@ -694,8 +734,8 @@ malloc_heap_alloc_biggest(const char *type, int socket_arg, unsigned int flags, cur_socket = rte_socket_id_by_idx(i); if (cur_socket == socket) continue; - ret = heap_alloc_biggest_on_socket(type, cur_socket, flags, - align, contig); + ret = heap_alloc_biggest_on_heap_id(type, i, flags, align, + contig); if (ret != NULL) return ret; } @@ -760,7 +800,7 @@ malloc_heap_free(struct malloc_elem *elem) /* ...of which we can't avail if we are in legacy mode, or if this is an * externally allocated segment. */ - if (internal_config.legacy_mem || msl->external) + if (internal_config.legacy_mem || (msl->external > 0)) goto free_unlock; /* check if we can free any memory back to the system */ @@ -917,7 +957,7 @@ malloc_heap_resize(struct malloc_elem *elem, size_t size) } /* - * Function to retrieve data for heap on given socket + * Function to retrieve data for a given heap */ int malloc_heap_get_stats(struct malloc_heap *heap, @@ -955,7 +995,7 @@ malloc_heap_get_stats(struct malloc_heap *heap, } /* - * Function to retrieve data for heap on given socket + * Function to retrieve data for a given heap */ void malloc_heap_dump(struct malloc_heap *heap, FILE *f) diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index f52cb5559..61b844b6f 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -46,6 +46,9 @@ malloc_heap_get_stats(struct malloc_heap *heap, void malloc_heap_dump(struct malloc_heap *heap, FILE *f); +int +malloc_socket_to_heap_id(unsigned int socket_id); + int rte_eal_malloc_heap_init(void); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index b51a6d111..dfcdf380a 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -152,11 +152,20 @@ rte_malloc_get_socket_stats(int socket, struct rte_malloc_socket_stats *socket_stats) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + int heap_idx, ret = -1; - if (socket >= RTE_MAX_NUMA_NODES || socket < 0) - return -1; + rte_rwlock_read_lock(&mcfg->memory_hotplug_lock); - return malloc_heap_get_stats(&mcfg->malloc_heaps[socket], socket_stats); + heap_idx = malloc_socket_to_heap_id(socket); + if (heap_idx < 0) + goto unlock; + + ret = malloc_heap_get_stats(&mcfg->malloc_heaps[heap_idx], + socket_stats); +unlock: + rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock); + + return ret; } /* @@ -168,12 +177,14 @@ rte_malloc_dump_heaps(FILE *f) struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; unsigned int idx; - for (idx = 0; idx < rte_socket_count(); idx++) { - unsigned int socket = rte_socket_id_by_idx(idx); - fprintf(f, "Heap on socket %i:\n", socket); - malloc_heap_dump(&mcfg->malloc_heaps[socket], f); + rte_rwlock_read_lock(&mcfg->memory_hotplug_lock); + + for (idx = 0; idx < RTE_MAX_HEAPS; idx++) { + fprintf(f, "Heap id: %u\n", idx); + malloc_heap_dump(&mcfg->malloc_heaps[idx], f); } + rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock); } /* @@ -182,14 +193,19 @@ rte_malloc_dump_heaps(FILE *f) void rte_malloc_dump_stats(FILE *f, __rte_unused const char *type) { - unsigned int socket; + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + unsigned int heap_id; struct rte_malloc_socket_stats sock_stats; + + rte_rwlock_read_lock(&mcfg->memory_hotplug_lock); + /* Iterate through all initialised heaps */ - for (socket=0; socket< RTE_MAX_NUMA_NODES; socket++) { - if ((rte_malloc_get_socket_stats(socket, &sock_stats) < 0)) - continue; + for (heap_id = 0; heap_id < RTE_MAX_HEAPS; heap_id++) { + struct malloc_heap *heap = &mcfg->malloc_heaps[heap_id]; - fprintf(f, "Socket:%u\n", socket); + malloc_heap_get_stats(heap, &sock_stats); + + fprintf(f, "Heap id:%u\n", heap_id); fprintf(f, "\tHeap_size:%zu,\n", sock_stats.heap_totalsz_bytes); fprintf(f, "\tFree_size:%zu,\n", sock_stats.heap_freesz_bytes); fprintf(f, "\tAlloc_size:%zu,\n", sock_stats.heap_allocsz_bytes); @@ -198,6 +214,7 @@ rte_malloc_dump_stats(FILE *f, __rte_unused const char *type) fprintf(f, "\tAlloc_count:%u,\n",sock_stats.alloc_count); fprintf(f, "\tFree_count:%u,\n", sock_stats.free_count); } + rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock); return; } From patchwork Thu Sep 20 11:36:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45029 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E8CE91B13E; Thu, 20 Sep 2018 13:38:24 +0200 (CEST) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 942527CD2 for ; Thu, 20 Sep 2018 13:38:21 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:38:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="258849948" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga005.jf.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBadaZ022807; Thu, 20 Sep 2018 12:36:39 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBadV7011451; Thu, 20 Sep 2018 12:36:39 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBad2Z011447; Thu, 20 Sep 2018 12:36:39 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: John McNamara , Marko Kovacevic , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:17 +0100 Message-Id: <663775477c41a4ea6c1aedb5849154f3d0ad23af.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 04/20] mem: do not check for invalid socket ID X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" We will be assigning "invalid" socket ID's to external heap, and malloc will now be able to verify if a supplied socket ID is in fact a valid one, rendering parameter checks for sockets obsolete. This changes the semantics of what we understand by "socket ID", so document the change in the release notes. Signed-off-by: Anatoly Burakov --- doc/guides/rel_notes/release_18_11.rst | 7 +++++++ lib/librte_eal/common/eal_common_memzone.c | 8 +++++--- lib/librte_eal/common/malloc_heap.c | 2 +- lib/librte_eal/common/rte_malloc.c | 4 ---- 4 files changed, 13 insertions(+), 8 deletions(-) diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst index e96ec9b43..63bbb1b51 100644 --- a/doc/guides/rel_notes/release_18_11.rst +++ b/doc/guides/rel_notes/release_18_11.rst @@ -98,6 +98,13 @@ API Changes users of memseg-walk-related functions, as they will now have to skip externally allocated segments in most cases if the intent is to only iterate over internal DPDK memory. + - ``socket_id`` parameter across the entire DPDK has gained additional + meaning, as some socket ID's will now be representing externally allocated + memory. No changes will be required for existing code as backwards + compatibility will be kept, and those who do not use this feature will not + see these extra socket ID's. Any new API's must not check socket ID + parameters themselves, and must instead leave it to the memory subsystem to + decide whether socket ID is a valid one. ABI Changes ----------- diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c index 7300fe05d..b7081afbf 100644 --- a/lib/librte_eal/common/eal_common_memzone.c +++ b/lib/librte_eal/common/eal_common_memzone.c @@ -120,13 +120,15 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len, return NULL; } - if ((socket_id != SOCKET_ID_ANY) && - (socket_id >= RTE_MAX_NUMA_NODES || socket_id < 0)) { + if ((socket_id != SOCKET_ID_ANY) && socket_id < 0) { rte_errno = EINVAL; return NULL; } - if (!rte_eal_has_hugepages()) + /* only set socket to SOCKET_ID_ANY if we aren't allocating for an + * external heap. + */ + if (!rte_eal_has_hugepages() && socket_id < RTE_MAX_NUMA_NODES) socket_id = SOCKET_ID_ANY; contig = (flags & RTE_MEMZONE_IOVA_CONTIG) != 0; diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 1d1e35708..73e478076 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -647,7 +647,7 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg, if (size == 0 || (align && !rte_is_power_of_2(align))) return NULL; - if (!rte_eal_has_hugepages()) + if (!rte_eal_has_hugepages() && socket_arg < RTE_MAX_NUMA_NODES) socket_arg = SOCKET_ID_ANY; if (socket_arg == SOCKET_ID_ANY) diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index dfcdf380a..458c44ba6 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -47,10 +47,6 @@ rte_malloc_socket(const char *type, size_t size, unsigned int align, if (!rte_eal_has_hugepages()) socket_arg = SOCKET_ID_ANY; - /* Check socket parameter */ - if (socket_arg >= RTE_MAX_NUMA_NODES) - return NULL; - return malloc_heap_alloc(type, size, socket_arg, 0, align == 0 ? 1 : align, 0, false); } From patchwork Thu Sep 20 11:36:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45012 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 3C8AE7CD2; Thu, 20 Sep 2018 13:36:49 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id ABF245F34 for ; Thu, 20 Sep 2018 13:36:44 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="93379281" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga002.jf.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBadE8022810; Thu, 20 Sep 2018 12:36:39 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBad0m011458; Thu, 20 Sep 2018 12:36:39 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBadh2011454; Thu, 20 Sep 2018 12:36:39 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: Bernard Iremonger , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:18 +0100 Message-Id: <932326930206abf6242ab2b548aeb932fcc6ade6.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 05/20] flow_classify: do not check for invalid socket ID X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" We will be assigning "invalid" socket ID's to external heap, and malloc will now be able to verify if a supplied socket ID is in fact a valid one, rendering parameter checks for sockets obsolete. Signed-off-by: Anatoly Burakov --- lib/librte_flow_classify/rte_flow_classify.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/librte_flow_classify/rte_flow_classify.c b/lib/librte_flow_classify/rte_flow_classify.c index 4c3469da1..fb652a2b7 100644 --- a/lib/librte_flow_classify/rte_flow_classify.c +++ b/lib/librte_flow_classify/rte_flow_classify.c @@ -247,8 +247,7 @@ rte_flow_classifier_check_params(struct rte_flow_classifier_params *params) } /* socket */ - if ((params->socket_id < 0) || - (params->socket_id >= RTE_MAX_NUMA_NODES)) { + if (params->socket_id < 0) { RTE_FLOW_CLASSIFY_LOG(ERR, "%s: Incorrect value for parameter socket_id\n", __func__); From patchwork Thu Sep 20 11:36:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45022 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5FE291B135; Thu, 20 Sep 2018 13:37:22 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 168F01B0FF for ; Thu, 20 Sep 2018 13:36:53 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="81952066" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by FMSMGA003.fm.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBadVG022813; Thu, 20 Sep 2018 12:36:39 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBadlC011466; Thu, 20 Sep 2018 12:36:39 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBadCw011461; Thu, 20 Sep 2018 12:36:39 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: Cristian Dumitrescu , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:19 +0100 Message-Id: <0b0548460627e7abf100a1db874c5b2ae50c4ec6.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 06/20] pipeline: do not check for invalid socket ID X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" We will be assigning "invalid" socket ID's to external heap, and malloc will now be able to verify if a supplied socket ID is in fact a valid one, rendering parameter checks for sockets obsolete. Signed-off-by: Anatoly Burakov --- lib/librte_pipeline/rte_pipeline.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/librte_pipeline/rte_pipeline.c b/lib/librte_pipeline/rte_pipeline.c index 0cb8b804e..2c047a8a4 100644 --- a/lib/librte_pipeline/rte_pipeline.c +++ b/lib/librte_pipeline/rte_pipeline.c @@ -178,8 +178,7 @@ rte_pipeline_check_params(struct rte_pipeline_params *params) } /* socket */ - if ((params->socket_id < 0) || - (params->socket_id >= RTE_MAX_NUMA_NODES)) { + if (params->socket_id < 0) { RTE_LOG(ERR, PIPELINE, "%s: Incorrect value for parameter socket_id\n", __func__); From patchwork Thu Sep 20 11:36:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45013 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AE7987D26; Thu, 20 Sep 2018 13:36:51 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id E75AA5F48 for ; Thu, 20 Sep 2018 13:36:44 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="75882414" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga006.jf.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBadAa022816; Thu, 20 Sep 2018 12:36:39 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBadH6011473; Thu, 20 Sep 2018 12:36:39 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBad9T011469; Thu, 20 Sep 2018 12:36:39 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: Cristian Dumitrescu , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:20 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 07/20] sched: do not check for invalid socket ID X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" We will be assigning "invalid" socket ID's to external heap, and malloc will now be able to verify if a supplied socket ID is in fact a valid one, rendering parameter checks for sockets obsolete. Signed-off-by: Anatoly Burakov --- lib/librte_sched/rte_sched.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c index 9269e5c71..d4e2189c7 100644 --- a/lib/librte_sched/rte_sched.c +++ b/lib/librte_sched/rte_sched.c @@ -329,7 +329,7 @@ rte_sched_port_check_params(struct rte_sched_port_params *params) return -1; /* socket */ - if ((params->socket < 0) || (params->socket >= RTE_MAX_NUMA_NODES)) + if (params->socket < 0) return -3; /* rate */ From patchwork Thu Sep 20 11:36:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45030 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 637F81B153; Thu, 20 Sep 2018 13:38:28 +0200 (CEST) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 6BF4F7CD2 for ; Thu, 20 Sep 2018 13:38:22 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:38:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="258849960" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga005.jf.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBada5022819; Thu, 20 Sep 2018 12:36:39 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBadwI011480; Thu, 20 Sep 2018 12:36:39 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBadip011476; Thu, 20 Sep 2018 12:36:39 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:21 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 08/20] malloc: add name to malloc heaps X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" We will need to refer to external heaps in some way. While we use heap ID's internally, for external API use it has to be something more user-friendly. So, we will be using a string to uniquely identify a heap. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc_heap.h | 2 ++ lib/librte_eal/common/malloc_heap.c | 15 ++++++++++++++- lib/librte_eal/common/rte_malloc.c | 1 + 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h index e7ac32d42..1c08ef3e0 100644 --- a/lib/librte_eal/common/include/rte_malloc_heap.h +++ b/lib/librte_eal/common/include/rte_malloc_heap.h @@ -12,6 +12,7 @@ /* Number of free lists per heap, grouped by size. */ #define RTE_HEAP_NUM_FREELISTS 13 +#define RTE_HEAP_NAME_MAX_LEN 32 /* dummy definition, for pointers */ struct malloc_elem; @@ -28,6 +29,7 @@ struct malloc_heap { unsigned alloc_count; size_t total_size; unsigned int socket_id; + char name[RTE_HEAP_NAME_MAX_LEN]; } __rte_cache_aligned; #endif /* _RTE_MALLOC_HEAP_H_ */ diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 73e478076..2a5d2a381 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -127,7 +127,6 @@ malloc_add_seg(const struct rte_memseg_list *msl, malloc_heap_add_memory(heap, found_msl, ms->addr, len); heap->total_size += len; - heap->socket_id = msl->socket_id; RTE_LOG(DEBUG, EAL, "Added %zuM to heap on socket %i\n", len >> 20, msl->socket_id); @@ -1020,6 +1019,20 @@ int rte_eal_malloc_heap_init(void) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + unsigned int i; + + /* assign names to default DPDK heaps */ + for (i = 0; i < rte_socket_count(); i++) { + struct malloc_heap *heap = &mcfg->malloc_heaps[i]; + char heap_name[RTE_HEAP_NAME_MAX_LEN]; + int socket_id = rte_socket_id_by_idx(i); + + snprintf(heap_name, sizeof(heap_name) - 1, + "socket_%i", socket_id); + strlcpy(heap->name, heap_name, RTE_HEAP_NAME_MAX_LEN); + heap->socket_id = socket_id; + } + if (register_mp_requests()) { RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n"); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 458c44ba6..0515d47f3 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -202,6 +202,7 @@ rte_malloc_dump_stats(FILE *f, __rte_unused const char *type) malloc_heap_get_stats(heap, &sock_stats); fprintf(f, "Heap id:%u\n", heap_id); + fprintf(f, "\tHeap name:%s\n", heap->name); fprintf(f, "\tHeap_size:%zu,\n", sock_stats.heap_totalsz_bytes); fprintf(f, "\tFree_size:%zu,\n", sock_stats.heap_freesz_bytes); fprintf(f, "\tAlloc_size:%zu,\n", sock_stats.heap_allocsz_bytes); From patchwork Thu Sep 20 11:36:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45024 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id F2AD31B144; Thu, 20 Sep 2018 13:37:26 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id AEC455F4A for ; Thu, 20 Sep 2018 13:36:54 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="81952068" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by FMSMGA003.fm.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBadJp022822; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBadXq011488; Thu, 20 Sep 2018 12:36:39 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBade8011483; Thu, 20 Sep 2018 12:36:39 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:22 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 09/20] malloc: add function to query socket ID of named heap X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When we will be creating external heaps, they will have their own "fake" socket ID, so add a function that will map the heap name to its socket ID. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 14 ++++++++ lib/librte_eal/common/rte_malloc.c | 37 ++++++++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 3 files changed, 52 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index a9fb7e452..8870732a6 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -263,6 +263,20 @@ int rte_malloc_get_socket_stats(int socket, struct rte_malloc_socket_stats *socket_stats); +/** + * Find socket ID corresponding to a named heap. + * + * @param name + * Heap name to find socket ID for + * @return + * Socket ID in case of success (a non-negative number) + * -1 in case of error, with rte_errno set to one of the following: + * EINVAL - ``name`` was NULL + * ENOENT - heap identified by the name ``name`` was not found + */ +int __rte_experimental +rte_malloc_heap_get_socket(const char *name); + /** * Dump statistics. * diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 0515d47f3..ce18ac79c 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -8,6 +8,7 @@ #include #include +#include #include #include #include @@ -183,6 +184,42 @@ rte_malloc_dump_heaps(FILE *f) rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock); } +int +rte_malloc_heap_get_socket(const char *name) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + struct malloc_heap *heap = NULL; + unsigned int idx; + int ret; + + if (name == NULL || + strnlen(name, RTE_HEAP_NAME_MAX_LEN) == 0 || + strnlen(name, RTE_HEAP_NAME_MAX_LEN) == + RTE_HEAP_NAME_MAX_LEN) { + rte_errno = EINVAL; + return -1; + } + rte_rwlock_read_lock(&mcfg->memory_hotplug_lock); + for (idx = 0; idx < RTE_MAX_HEAPS; idx++) { + struct malloc_heap *tmp = &mcfg->malloc_heaps[idx]; + + if (!strncmp(name, tmp->name, RTE_HEAP_NAME_MAX_LEN)) { + heap = tmp; + break; + } + } + + if (heap != NULL) { + ret = heap->socket_id; + } else { + rte_errno = ENOENT; + ret = -1; + } + rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock); + + return ret; +} + /* * Print stats on memory type. If type is NULL, info on all types is printed */ diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index 73282bbb0..d8f9665b8 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -318,6 +318,7 @@ EXPERIMENTAL { rte_fbarray_set_used; rte_log_register_type_and_pick_level; rte_malloc_dump_heaps; + rte_malloc_heap_get_socket; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; rte_mem_event_callback_register; From patchwork Thu Sep 20 11:36:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45026 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 697121B14B; Thu, 20 Sep 2018 13:37:31 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 2F85C1B0F8 for ; Thu, 20 Sep 2018 13:36:55 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="81952070" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by FMSMGA003.fm.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaeKk022825; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaeZV011495; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaerM011491; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:23 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 10/20] malloc: allow creating malloc heaps X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add API to allow creating new malloc heaps. They will be created with socket ID's going above RTE_MAX_NUMA_NODES, to avoid clashing with internal heaps. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 19 ++++++++ lib/librte_eal/common/malloc_heap.c | 30 +++++++++++++ lib/librte_eal/common/malloc_heap.h | 3 ++ lib/librte_eal/common/rte_malloc.c | 52 ++++++++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 5 files changed, 105 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index 8870732a6..182afab1c 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -263,6 +263,25 @@ int rte_malloc_get_socket_stats(int socket, struct rte_malloc_socket_stats *socket_stats); +/** + * Creates a new empty malloc heap with a specified name. + * + * @note Heaps created via this call will automatically get assigned a unique + * socket ID, which can be found using ``rte_malloc_heap_get_socket()`` + * + * @param heap_name + * Name of the heap to create. + * + * @return + * - 0 on successful creation + * - -1 in case of error, with rte_errno set to one of the following: + * EINVAL - ``heap_name`` was NULL, empty or too long + * EEXIST - heap by name of ``heap_name`` already exists + * ENOSPC - no more space in internal config to store a new heap + */ +int __rte_experimental +rte_malloc_heap_create(const char *heap_name); + /** * Find socket ID corresponding to a named heap. * diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 2a5d2a381..1dd4ffcf9 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -29,6 +29,10 @@ #include "malloc_heap.h" #include "malloc_mp.h" +/* start external socket ID's at a very high number */ +#define CONST_MAX(a, b) (a > b ? a : b) /* RTE_MAX is not a constant */ +#define EXTERNAL_HEAP_MIN_SOCKET_ID (CONST_MAX((1 << 8), RTE_MAX_NUMA_NODES)) + static unsigned check_hugepage_sz(unsigned flags, uint64_t hugepage_sz) { @@ -1015,6 +1019,32 @@ malloc_heap_dump(struct malloc_heap *heap, FILE *f) rte_spinlock_unlock(&heap->lock); } +int +malloc_heap_create(struct malloc_heap *heap, const char *heap_name) +{ + static uint32_t next_socket_id = EXTERNAL_HEAP_MIN_SOCKET_ID; + + /* prevent overflow. did you really create 2 billion heaps??? */ + if (next_socket_id > INT32_MAX) { + RTE_LOG(ERR, EAL, "Cannot assign new socket ID's\n"); + rte_errno = ENOSPC; + return -1; + } + + /* initialize empty heap */ + heap->alloc_count = 0; + heap->first = NULL; + heap->last = NULL; + LIST_INIT(heap->free_head); + rte_spinlock_init(&heap->lock); + heap->total_size = 0; + heap->socket_id = next_socket_id++; + + /* set up name */ + strlcpy(heap->name, heap_name, RTE_HEAP_NAME_MAX_LEN); + return 0; +} + int rte_eal_malloc_heap_init(void) { diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index 61b844b6f..eebee16dc 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -33,6 +33,9 @@ void * malloc_heap_alloc_biggest(const char *type, int socket, unsigned int flags, size_t align, bool contig); +int +malloc_heap_create(struct malloc_heap *heap, const char *heap_name); + int malloc_heap_free(struct malloc_elem *elem); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index ce18ac79c..39875fe69 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -286,3 +287,54 @@ rte_malloc_virt2iova(const void *addr) return ms->iova + RTE_PTR_DIFF(addr, ms->addr); } + +int +rte_malloc_heap_create(const char *heap_name) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + struct malloc_heap *heap = NULL; + int i, ret; + + if (heap_name == NULL || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == + RTE_HEAP_NAME_MAX_LEN) { + rte_errno = EINVAL; + return -1; + } + /* check if there is space in the heap list, or if heap with this name + * already exists. + */ + rte_rwlock_write_lock(&mcfg->memory_hotplug_lock); + + for (i = 0; i < RTE_MAX_HEAPS; i++) { + struct malloc_heap *tmp = &mcfg->malloc_heaps[i]; + /* existing heap */ + if (strncmp(heap_name, tmp->name, + RTE_HEAP_NAME_MAX_LEN) == 0) { + RTE_LOG(ERR, EAL, "Heap %s already exists\n", + heap_name); + rte_errno = EEXIST; + ret = -1; + goto unlock; + } + /* empty heap */ + if (strnlen(tmp->name, RTE_HEAP_NAME_MAX_LEN) == 0) { + heap = tmp; + break; + } + } + if (heap == NULL) { + RTE_LOG(ERR, EAL, "Cannot create new heap: no space\n"); + rte_errno = ENOSPC; + ret = -1; + goto unlock; + } + + /* we're sure that we can create a new heap, so do it */ + ret = malloc_heap_create(heap, heap_name); +unlock: + rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock); + + return ret; +} diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index d8f9665b8..d1e92361b 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -318,6 +318,7 @@ EXPERIMENTAL { rte_fbarray_set_used; rte_log_register_type_and_pick_level; rte_malloc_dump_heaps; + rte_malloc_heap_create; rte_malloc_heap_get_socket; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; From patchwork Thu Sep 20 11:36:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45015 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B6F241B0FF; Thu, 20 Sep 2018 13:36:57 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 024965F27 for ; Thu, 20 Sep 2018 13:36:44 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="265202969" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga006.fm.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaeSb022828; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaev8011531; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaeZb011521; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:24 +0100 Message-Id: <9ff60fe7919d1dec01fdbda2f4f724c047eec2bd.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 11/20] malloc: allow destroying heaps X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add an API to destroy specified heap. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 23 +++++++++ lib/librte_eal/common/malloc_heap.c | 22 ++++++++ lib/librte_eal/common/malloc_heap.h | 3 ++ lib/librte_eal/common/rte_malloc.c | 58 ++++++++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 5 files changed, 107 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index 182afab1c..8a8cc1e6d 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -282,6 +282,29 @@ rte_malloc_get_socket_stats(int socket, int __rte_experimental rte_malloc_heap_create(const char *heap_name); +/** + * Destroys a previously created malloc heap with specified name. + * + * @note This function will return a failure result if not all memory allocated + * from the heap has been freed back to the heap + * + * @note This function will return a failure result if not all memory segments + * were removed from the heap prior to its destruction + * + * @param heap_name + * Name of the heap to create. + * + * @return + * - 0 on success + * - -1 in case of error, with rte_errno set to one of the following: + * EINVAL - ``heap_name`` was NULL, empty or too long + * ENOENT - heap by the name of ``heap_name`` was not found + * EPERM - attempting to destroy reserved heap + * EBUSY - heap still contains data + */ +int __rte_experimental +rte_malloc_heap_destroy(const char *heap_name); + /** * Find socket ID corresponding to a named heap. * diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 1dd4ffcf9..e98f720cb 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -1045,6 +1045,28 @@ malloc_heap_create(struct malloc_heap *heap, const char *heap_name) return 0; } +int +malloc_heap_destroy(struct malloc_heap *heap) +{ + if (heap->alloc_count != 0) { + RTE_LOG(ERR, EAL, "Heap is still in use\n"); + rte_errno = EBUSY; + return -1; + } + if (heap->first != NULL || heap->last != NULL) { + RTE_LOG(ERR, EAL, "Heap still contains memory segments\n"); + rte_errno = EBUSY; + return -1; + } + if (heap->total_size != 0) + RTE_LOG(ERR, EAL, "Total size not zero, heap is likely corrupt\n"); + + /* after this, the lock will be dropped */ + memset(heap, 0, sizeof(*heap)); + + return 0; +} + int rte_eal_malloc_heap_init(void) { diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index eebee16dc..75278da3c 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -36,6 +36,9 @@ malloc_heap_alloc_biggest(const char *type, int socket, unsigned int flags, int malloc_heap_create(struct malloc_heap *heap, const char *heap_name); +int +malloc_heap_destroy(struct malloc_heap *heap); + int malloc_heap_free(struct malloc_elem *elem); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 39875fe69..6734b0d09 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -288,6 +288,21 @@ rte_malloc_virt2iova(const void *addr) return ms->iova + RTE_PTR_DIFF(addr, ms->addr); } +static struct malloc_heap * +find_named_heap(const char *name) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + unsigned int i; + + for (i = 0; i < RTE_MAX_HEAPS; i++) { + struct malloc_heap *heap = &mcfg->malloc_heaps[i]; + + if (!strncmp(name, heap->name, RTE_HEAP_NAME_MAX_LEN)) + return heap; + } + return NULL; +} + int rte_malloc_heap_create(const char *heap_name) { @@ -338,3 +353,46 @@ rte_malloc_heap_create(const char *heap_name) return ret; } + +int +rte_malloc_heap_destroy(const char *heap_name) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + struct malloc_heap *heap = NULL; + int ret; + + if (heap_name == NULL || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == + RTE_HEAP_NAME_MAX_LEN) { + rte_errno = EINVAL; + return -1; + } + rte_rwlock_write_lock(&mcfg->memory_hotplug_lock); + + /* start from non-socket heaps */ + heap = find_named_heap(heap_name); + if (heap == NULL) { + RTE_LOG(ERR, EAL, "Heap %s not found\n", heap_name); + rte_errno = ENOENT; + ret = -1; + goto unlock; + } + /* we shouldn't be able to destroy internal heaps */ + if (heap->socket_id < RTE_MAX_NUMA_NODES) { + rte_errno = EPERM; + ret = -1; + goto unlock; + } + /* sanity checks done, now we can destroy the heap */ + rte_spinlock_lock(&heap->lock); + ret = malloc_heap_destroy(heap); + + /* if we failed, lock is still active */ + if (ret < 0) + rte_spinlock_unlock(&heap->lock); +unlock: + rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock); + + return ret; +} diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index d1e92361b..7db38c8ac 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -319,6 +319,7 @@ EXPERIMENTAL { rte_log_register_type_and_pick_level; rte_malloc_dump_heaps; rte_malloc_heap_create; + rte_malloc_heap_destroy; rte_malloc_heap_get_socket; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; From patchwork Thu Sep 20 11:36:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45027 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 8F9241B159; Thu, 20 Sep 2018 13:37:33 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id A86091B10A for ; Thu, 20 Sep 2018 13:36:58 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:57 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="87804401" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga002.fm.intel.com with ESMTP; 20 Sep 2018 04:36:40 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaewD022831; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaeW6011538; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaeU4011534; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:25 +0100 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 12/20] malloc: allow adding memory to named heaps X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add an API to add externally allocated memory to malloc heap. The memory will be stored in memseg lists like regular DPDK memory. Multiple segments are allowed within a heap. If IOVA table is not provided, IOVA addresses are filled in with RTE_BAD_IOVA. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 39 ++++++++++++ lib/librte_eal/common/malloc_heap.c | 74 ++++++++++++++++++++++ lib/librte_eal/common/malloc_heap.h | 4 ++ lib/librte_eal/common/rte_malloc.c | 51 +++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 5 files changed, 169 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index 8a8cc1e6d..47f867a05 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -263,6 +263,45 @@ int rte_malloc_get_socket_stats(int socket, struct rte_malloc_socket_stats *socket_stats); +/** + * Add memory chunk to a heap with specified name. + * + * @note Multiple memory chunks can be added to the same heap + * + * @note Memory must be previously allocated for DPDK to be able to use it as a + * malloc heap. Failing to do so will result in undefined behavior, up to and + * including segmentation faults. + * + * @note Calling this function will erase any contents already present at the + * supplied memory address. + * + * @param heap_name + * Name of the heap to add memory chunk to + * @param va_addr + * Start of virtual area to add to the heap + * @param len + * Length of virtual area to add to the heap + * @param iova_addrs + * Array of page IOVA addresses corresponding to each page in this memory + * area. Can be NULL, in which case page IOVA addresses will be set to + * RTE_BAD_IOVA. + * @param n_pages + * Number of elements in the iova_addrs array. Ignored if ``iova_addrs`` + * is NULL. + * @param page_sz + * Page size of the underlying memory + * + * @return + * - 0 on success + * - -1 in case of error, with rte_errno set to one of the following: + * EINVAL - one of the parameters was invalid + * EPERM - attempted to add memory to a reserved heap + * ENOSPC - no more space in internal config to store a new memory chunk + */ +int __rte_experimental +rte_malloc_heap_memory_add(const char *heap_name, void *va_addr, size_t len, + rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz); + /** * Creates a new empty malloc heap with a specified name. * diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index e98f720cb..2f6946f65 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -1019,6 +1019,80 @@ malloc_heap_dump(struct malloc_heap *heap, FILE *f) rte_spinlock_unlock(&heap->lock); } +int +malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, + rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + char fbarray_name[RTE_FBARRAY_NAME_LEN]; + struct rte_memseg_list *msl = NULL; + struct rte_fbarray *arr; + size_t seg_len = n_pages * page_sz; + unsigned int i; + + /* first, find a free memseg list */ + for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) { + struct rte_memseg_list *tmp = &mcfg->memsegs[i]; + if (tmp->base_va == NULL) { + msl = tmp; + break; + } + } + if (msl == NULL) { + RTE_LOG(ERR, EAL, "Couldn't find empty memseg list\n"); + rte_errno = ENOSPC; + return -1; + } + + snprintf(fbarray_name, sizeof(fbarray_name) - 1, "%s_%p", + heap->name, va_addr); + + /* create the backing fbarray */ + if (rte_fbarray_init(&msl->memseg_arr, fbarray_name, n_pages, + sizeof(struct rte_memseg)) < 0) { + RTE_LOG(ERR, EAL, "Couldn't create fbarray backing the memseg list\n"); + return -1; + } + arr = &msl->memseg_arr; + + /* fbarray created, fill it up */ + for (i = 0; i < n_pages; i++) { + struct rte_memseg *ms; + + rte_fbarray_set_used(arr, i); + ms = rte_fbarray_get(arr, i); + ms->addr = RTE_PTR_ADD(va_addr, i * page_sz); + ms->iova = iova_addrs == NULL ? RTE_BAD_IOVA : iova_addrs[i]; + ms->hugepage_sz = page_sz; + ms->len = page_sz; + ms->nchannel = rte_memory_get_nchannel(); + ms->nrank = rte_memory_get_nrank(); + ms->socket_id = heap->socket_id; + } + + /* set up the memseg list */ + msl->base_va = va_addr; + msl->page_sz = page_sz; + msl->socket_id = heap->socket_id; + msl->len = seg_len; + msl->version = 0; + msl->external = 1; + + /* erase contents of new memory */ + memset(va_addr, 0, seg_len); + + /* now, add newly minted memory to the malloc heap */ + malloc_heap_add_memory(heap, msl, va_addr, seg_len); + + heap->total_size += seg_len; + + /* all done! */ + RTE_LOG(DEBUG, EAL, "Added segment for heap %s starting at %p\n", + heap->name, va_addr); + + return 0; +} + int malloc_heap_create(struct malloc_heap *heap, const char *heap_name) { diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index 75278da3c..237ce9dc2 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -39,6 +39,10 @@ malloc_heap_create(struct malloc_heap *heap, const char *heap_name); int malloc_heap_destroy(struct malloc_heap *heap); +int +malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, + rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz); + int malloc_heap_free(struct malloc_elem *elem); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 6734b0d09..9d2041b7b 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -303,6 +303,57 @@ find_named_heap(const char *name) return NULL; } +int +rte_malloc_heap_memory_add(const char *heap_name, void *va_addr, size_t len, + rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + struct malloc_heap *heap = NULL; + unsigned int n; + int ret; + + if (heap_name == NULL || va_addr == NULL || + page_sz == 0 || !rte_is_power_of_2(page_sz) || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == + RTE_HEAP_NAME_MAX_LEN) { + rte_errno = EINVAL; + ret = -1; + goto unlock; + } + rte_rwlock_write_lock(&mcfg->memory_hotplug_lock); + + /* find our heap */ + heap = find_named_heap(heap_name); + if (heap == NULL) { + rte_errno = ENOENT; + ret = -1; + goto unlock; + } + if (heap->socket_id < RTE_MAX_NUMA_NODES) { + /* cannot add memory to internal heaps */ + rte_errno = EPERM; + ret = -1; + goto unlock; + } + n = len / page_sz; + if (n != n_pages && iova_addrs != NULL) { + rte_errno = EINVAL; + ret = -1; + goto unlock; + } + + rte_spinlock_lock(&heap->lock); + ret = malloc_heap_add_external_memory(heap, va_addr, iova_addrs, n, + page_sz); + rte_spinlock_unlock(&heap->lock); + +unlock: + rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock); + + return ret; +} + int rte_malloc_heap_create(const char *heap_name) { diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index 7db38c8ac..939124753 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -321,6 +321,7 @@ EXPERIMENTAL { rte_malloc_heap_create; rte_malloc_heap_destroy; rte_malloc_heap_get_socket; + rte_malloc_heap_memory_add; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; rte_mem_event_callback_register; From patchwork Thu Sep 20 11:36:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45014 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D23E91B0FC; Thu, 20 Sep 2018 13:36:53 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 0E6B05F65 for ; Thu, 20 Sep 2018 13:36:44 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="74809187" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga008.jf.intel.com with ESMTP; 20 Sep 2018 04:36:41 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaea4022834; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaebg011545; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBae12011541; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:26 +0100 Message-Id: <262843097d9068d573e9a678b1f164b2a0a82276.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 13/20] malloc: allow removing memory from named heaps X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add an API to remove memory from specified heaps. This will first check if all elements within the region are free, and that the region is the original region that was added to the heap (by comparing its length to length of memory addressed by the underlying memseg list). Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 27 +++++++++++ lib/librte_eal/common/malloc_heap.c | 54 ++++++++++++++++++++++ lib/librte_eal/common/malloc_heap.h | 4 ++ lib/librte_eal/common/rte_malloc.c | 39 ++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 5 files changed, 125 insertions(+) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index 47f867a05..9bbe8e3af 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -302,6 +302,33 @@ int __rte_experimental rte_malloc_heap_memory_add(const char *heap_name, void *va_addr, size_t len, rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz); +/** + * Remove memory chunk from heap with specified name. + * + * @note Memory chunk being removed must be the same as one that was added; + * partially removing memory chunks is not supported + * + * @note Memory area must not contain any allocated elements to allow its + * removal from the heap + * + * @param heap_name + * Name of the heap to remove memory from + * @param va_addr + * Virtual address to remove from the heap + * @param len + * Length of virtual area to remove from the heap + * + * @return + * - 0 on success + * - -1 in case of error, with rte_errno set to one of the following: + * EINVAL - one of the parameters was invalid + * EPERM - attempted to remove memory from a reserved heap + * ENOENT - heap or memory chunk was not found + * EBUSY - memory chunk still contains data + */ +int __rte_experimental +rte_malloc_heap_memory_remove(const char *heap_name, void *va_addr, size_t len); + /** * Creates a new empty malloc heap with a specified name. * diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c index 2f6946f65..3ac3b06de 100644 --- a/lib/librte_eal/common/malloc_heap.c +++ b/lib/librte_eal/common/malloc_heap.c @@ -1019,6 +1019,32 @@ malloc_heap_dump(struct malloc_heap *heap, FILE *f) rte_spinlock_unlock(&heap->lock); } +static int +destroy_seg(struct malloc_elem *elem, size_t len) +{ + struct malloc_heap *heap = elem->heap; + struct rte_memseg_list *msl; + + msl = elem->msl; + + /* this element can be removed */ + malloc_elem_free_list_remove(elem); + malloc_elem_hide_region(elem, elem, len); + + heap->total_size -= len; + + memset(elem, 0, sizeof(*elem)); + + /* destroy the fbarray backing this memory */ + if (rte_fbarray_destroy(&msl->memseg_arr) < 0) + return -1; + + /* reset the memseg list */ + memset(msl, 0, sizeof(*msl)); + + return 0; +} + int malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz) @@ -1093,6 +1119,34 @@ malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, return 0; } +int +malloc_heap_remove_external_memory(struct malloc_heap *heap, void *va_addr, + size_t len) +{ + struct malloc_elem *elem = heap->first; + + /* find element with specified va address */ + while (elem != NULL && elem != va_addr) { + elem = elem->next; + /* stop if we've blown past our VA */ + if (elem > (struct malloc_elem *)va_addr) { + rte_errno = ENOENT; + return -1; + } + } + /* check if element was found */ + if (elem == NULL || elem->msl->len != len) { + rte_errno = ENOENT; + return -1; + } + /* if element's size is not equal to segment len, segment is busy */ + if (elem->state == ELEM_BUSY || elem->size != len) { + rte_errno = EBUSY; + return -1; + } + return destroy_seg(elem, len); +} + int malloc_heap_create(struct malloc_heap *heap, const char *heap_name) { diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h index 237ce9dc2..e48996d52 100644 --- a/lib/librte_eal/common/malloc_heap.h +++ b/lib/librte_eal/common/malloc_heap.h @@ -43,6 +43,10 @@ int malloc_heap_add_external_memory(struct malloc_heap *heap, void *va_addr, rte_iova_t iova_addrs[], unsigned int n_pages, size_t page_sz); +int +malloc_heap_remove_external_memory(struct malloc_heap *heap, void *va_addr, + size_t len); + int malloc_heap_free(struct malloc_elem *elem); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 9d2041b7b..aed066882 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -354,6 +354,45 @@ rte_malloc_heap_memory_add(const char *heap_name, void *va_addr, size_t len, return ret; } +int +rte_malloc_heap_memory_remove(const char *heap_name, void *va_addr, size_t len) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + struct malloc_heap *heap = NULL; + int ret; + + if (heap_name == NULL || va_addr == NULL || len == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == + RTE_HEAP_NAME_MAX_LEN) { + rte_errno = EINVAL; + return -1; + } + rte_rwlock_write_lock(&mcfg->memory_hotplug_lock); + /* find our heap */ + heap = find_named_heap(heap_name); + if (heap == NULL) { + rte_errno = ENOENT; + ret = -1; + goto unlock; + } + if (heap->socket_id < RTE_MAX_NUMA_NODES) { + /* cannot remove memory from internal heaps */ + rte_errno = EPERM; + ret = -1; + goto unlock; + } + + rte_spinlock_lock(&heap->lock); + ret = malloc_heap_remove_external_memory(heap, va_addr, len); + rte_spinlock_unlock(&heap->lock); + +unlock: + rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock); + + return ret; +} + int rte_malloc_heap_create(const char *heap_name) { diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index 939124753..c0a8220d0 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -322,6 +322,7 @@ EXPERIMENTAL { rte_malloc_heap_destroy; rte_malloc_heap_get_socket; rte_malloc_heap_memory_add; + rte_malloc_heap_memory_remove; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; rte_mem_event_callback_register; From patchwork Thu Sep 20 11:36:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45017 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id EE3211B106; Thu, 20 Sep 2018 13:37:10 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 8EF5A5F34 for ; Thu, 20 Sep 2018 13:36:45 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="234501478" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga004.jf.intel.com with ESMTP; 20 Sep 2018 04:36:41 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaeeX022837; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaevr011556; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaeOL011548; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:27 +0100 Message-Id: <249e1f7dfe9d49243d019b8b7d8f8f71e9165a94.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 14/20] malloc: allow attaching to external memory chunks X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" In order to use external memory in multiple processes, we need to attach to primary process's memseg lists, so add a new API to do that. It is the responsibility of the user to ensure that memory is accessible and that it has been previously added to the malloc heap by another process. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 48 +++++++++-- lib/librte_eal/common/rte_malloc.c | 93 ++++++++++++++++++++++ lib/librte_eal/rte_eal_version.map | 1 + 3 files changed, 135 insertions(+), 7 deletions(-) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index 9bbe8e3af..440496cd9 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -268,6 +268,10 @@ rte_malloc_get_socket_stats(int socket, * * @note Multiple memory chunks can be added to the same heap * + * @note Before accessing this memory in other processes, it needs to be + * attached in each of those processes by calling + * ``rte_malloc_heap_memory_attach`` in each other process. + * * @note Memory must be previously allocated for DPDK to be able to use it as a * malloc heap. Failing to do so will result in undefined behavior, up to and * including segmentation faults. @@ -329,21 +333,48 @@ rte_malloc_heap_memory_add(const char *heap_name, void *va_addr, size_t len, int __rte_experimental rte_malloc_heap_memory_remove(const char *heap_name, void *va_addr, size_t len); +/** + * Attach to an already existing chunk of external memory in another process. + * + * @note This function must be called before any attempt is made to use an + * already existing external memory chunk. This function does *not* need to + * be called if a call to ``rte_malloc_heap_memory_add`` was made in the + * current process. + * + * @param heap_name + * Heap name to which this chunk of memory belongs + * @param va_addr + * Start address of memory chunk to attach to + * @param len + * Length of memory chunk to attach to + * @return + * 0 on successful attach + * -1 on unsuccessful attach, with rte_errno set to indicate cause for error: + * EINVAL - one of the parameters was invalid + * EPERM - attempted to attach memory to a reserved heap + * ENOENT - heap or memory chunk was not found + */ +int __rte_experimental +rte_malloc_heap_memory_attach(const char *heap_name, void *va_addr, size_t len); + /** * Creates a new empty malloc heap with a specified name. * * @note Heaps created via this call will automatically get assigned a unique * socket ID, which can be found using ``rte_malloc_heap_get_socket()`` * + * @note This function can only be called in primary process. + * * @param heap_name * Name of the heap to create. * * @return * - 0 on successful creation * - -1 in case of error, with rte_errno set to one of the following: - * EINVAL - ``heap_name`` was NULL, empty or too long - * EEXIST - heap by name of ``heap_name`` already exists - * ENOSPC - no more space in internal config to store a new heap + * EINVAL - ``heap_name`` was NULL, empty or too long + * EEXIST - heap by name of ``heap_name`` already exists + * ENOSPC - no more space in internal config to store a new heap + * E_RTE_SECONDARY - attempted to create a heap in secondary process */ int __rte_experimental rte_malloc_heap_create(const char *heap_name); @@ -357,16 +388,19 @@ rte_malloc_heap_create(const char *heap_name); * @note This function will return a failure result if not all memory segments * were removed from the heap prior to its destruction * + * @note This function can only be called in primary process. + * * @param heap_name * Name of the heap to create. * * @return * - 0 on success * - -1 in case of error, with rte_errno set to one of the following: - * EINVAL - ``heap_name`` was NULL, empty or too long - * ENOENT - heap by the name of ``heap_name`` was not found - * EPERM - attempting to destroy reserved heap - * EBUSY - heap still contains data + * EINVAL - ``heap_name`` was NULL, empty or too long + * ENOENT - heap by the name of ``heap_name`` was not found + * EPERM - attempting to destroy reserved heap + * EBUSY - heap still contains data + * E_RTE_SECONDARY - attempted to destroy a heap in secondary process */ int __rte_experimental rte_malloc_heap_destroy(const char *heap_name); diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index aed066882..bc22d21e4 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -393,6 +393,89 @@ rte_malloc_heap_memory_remove(const char *heap_name, void *va_addr, size_t len) return ret; } +struct sync_mem_walk_arg { + void *va_addr; + size_t len; + int result; +}; + +static int +attach_mem_walk(const struct rte_memseg_list *msl, void *arg) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + struct sync_mem_walk_arg *wa = arg; + size_t len = msl->page_sz * msl->memseg_arr.len; + + if (msl->base_va == wa->va_addr && + len == wa->len) { + struct rte_memseg_list *found_msl; + int msl_idx, ret; + + /* msl is const */ + msl_idx = msl - mcfg->memsegs; + found_msl = &mcfg->memsegs[msl_idx]; + + ret = rte_fbarray_attach(&found_msl->memseg_arr); + + if (ret < 0) + wa->result = -rte_errno; + else + wa->result = 0; + return 1; + } + return 0; +} + +int +rte_malloc_heap_memory_attach(const char *heap_name, void *va_addr, size_t len) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + struct malloc_heap *heap = NULL; + struct sync_mem_walk_arg wa; + int ret; + + if (heap_name == NULL || va_addr == NULL || len == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 || + strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == + RTE_HEAP_NAME_MAX_LEN) { + rte_errno = EINVAL; + return -1; + } + rte_rwlock_read_lock(&mcfg->memory_hotplug_lock); + + /* find our heap */ + heap = find_named_heap(heap_name); + if (heap == NULL) { + rte_errno = ENOENT; + ret = -1; + goto unlock; + } + /* we shouldn't be able to attach to internal heaps */ + if (heap->socket_id < RTE_MAX_NUMA_NODES) { + rte_errno = EPERM; + ret = -1; + goto unlock; + } + + /* find corresponding memseg list to attach to */ + wa.va_addr = va_addr; + wa.len = len; + wa.result = -ENOENT; /* fail unless explicitly told to succeed */ + + /* we're already holding a read lock */ + rte_memseg_list_walk_thread_unsafe(attach_mem_walk, &wa); + + if (wa.result < 0) { + rte_errno = -wa.result; + ret = -1; + } else { + ret = 0; + } +unlock: + rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock); + return ret; +} + int rte_malloc_heap_create(const char *heap_name) { @@ -400,6 +483,11 @@ rte_malloc_heap_create(const char *heap_name) struct malloc_heap *heap = NULL; int i, ret; + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { + rte_errno = E_RTE_SECONDARY; + return -1; + } + if (heap_name == NULL || strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 || strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == @@ -451,6 +539,11 @@ rte_malloc_heap_destroy(const char *heap_name) struct malloc_heap *heap = NULL; int ret; + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { + rte_errno = E_RTE_SECONDARY; + return -1; + } + if (heap_name == NULL || strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 || strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index c0a8220d0..0a2e46767 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -322,6 +322,7 @@ EXPERIMENTAL { rte_malloc_heap_destroy; rte_malloc_heap_get_socket; rte_malloc_heap_memory_add; + rte_malloc_heap_memory_attach; rte_malloc_heap_memory_remove; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; From patchwork Thu Sep 20 11:36:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45028 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 31D837CD9; Thu, 20 Sep 2018 13:37:59 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 59565695D for ; Thu, 20 Sep 2018 13:37:57 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:37:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="72348611" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga008.fm.intel.com with ESMTP; 20 Sep 2018 04:36:41 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBae5Q022840; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaeAi011564; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaeuO011559; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:28 +0100 Message-Id: <577fecef12871f801d8cc31e0a021afecd0f188f.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 15/20] malloc: allow detaching from external memory X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add API to detach from existing chunk of external memory in a process. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/include/rte_malloc.h | 27 ++++++++++++++++++++++ lib/librte_eal/common/rte_malloc.c | 27 ++++++++++++++++++---- lib/librte_eal/rte_eal_version.map | 1 + 3 files changed, 50 insertions(+), 5 deletions(-) diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h index 440496cd9..d2236c421 100644 --- a/lib/librte_eal/common/include/rte_malloc.h +++ b/lib/librte_eal/common/include/rte_malloc.h @@ -315,6 +315,9 @@ rte_malloc_heap_memory_add(const char *heap_name, void *va_addr, size_t len, * @note Memory area must not contain any allocated elements to allow its * removal from the heap * + * @note All other processes must detach from the memory chunk prior to it being + * removed from the heap. + * * @param heap_name * Name of the heap to remove memory from * @param va_addr @@ -357,6 +360,30 @@ rte_malloc_heap_memory_remove(const char *heap_name, void *va_addr, size_t len); int __rte_experimental rte_malloc_heap_memory_attach(const char *heap_name, void *va_addr, size_t len); +/** + * Detach from a chunk of external memory in secondary process. + * + * @note This function must be called in before any attempt is made to remove + * external memory from the heap in another process. This function does *not* + * need to be called if a call to ``rte_malloc_heap_memory_remove`` will be + * called in current process. + * + * @param heap_name + * Heap name to which this chunk of memory belongs + * @param va_addr + * Start address of memory chunk to attach to + * @param len + * Length of memory chunk to attach to + * @return + * 0 on successful detach + * -1 on unsuccessful detach, with rte_errno set to indicate cause for error: + * EINVAL - one of the parameters was invalid + * EPERM - attempted to detach memory from a reserved heap + * ENOENT - heap or memory chunk was not found + */ +int __rte_experimental +rte_malloc_heap_memory_detach(const char *heap_name, void *va_addr, size_t len); + /** * Creates a new empty malloc heap with a specified name. * diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index bc22d21e4..e9be179d5 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -397,10 +397,11 @@ struct sync_mem_walk_arg { void *va_addr; size_t len; int result; + bool attach; }; static int -attach_mem_walk(const struct rte_memseg_list *msl, void *arg) +sync_mem_walk(const struct rte_memseg_list *msl, void *arg) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; struct sync_mem_walk_arg *wa = arg; @@ -415,7 +416,10 @@ attach_mem_walk(const struct rte_memseg_list *msl, void *arg) msl_idx = msl - mcfg->memsegs; found_msl = &mcfg->memsegs[msl_idx]; - ret = rte_fbarray_attach(&found_msl->memseg_arr); + if (wa->attach) + ret = rte_fbarray_attach(&found_msl->memseg_arr); + else + ret = rte_fbarray_detach(&found_msl->memseg_arr); if (ret < 0) wa->result = -rte_errno; @@ -426,8 +430,8 @@ attach_mem_walk(const struct rte_memseg_list *msl, void *arg) return 0; } -int -rte_malloc_heap_memory_attach(const char *heap_name, void *va_addr, size_t len) +static int +sync_memory(const char *heap_name, void *va_addr, size_t len, bool attach) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; struct malloc_heap *heap = NULL; @@ -461,9 +465,10 @@ rte_malloc_heap_memory_attach(const char *heap_name, void *va_addr, size_t len) wa.va_addr = va_addr; wa.len = len; wa.result = -ENOENT; /* fail unless explicitly told to succeed */ + wa.attach = attach; /* we're already holding a read lock */ - rte_memseg_list_walk_thread_unsafe(attach_mem_walk, &wa); + rte_memseg_list_walk_thread_unsafe(sync_mem_walk, &wa); if (wa.result < 0) { rte_errno = -wa.result; @@ -476,6 +481,18 @@ rte_malloc_heap_memory_attach(const char *heap_name, void *va_addr, size_t len) return ret; } +int +rte_malloc_heap_memory_attach(const char *heap_name, void *va_addr, size_t len) +{ + return sync_memory(heap_name, va_addr, len, true); +} + +int +rte_malloc_heap_memory_detach(const char *heap_name, void *va_addr, size_t len) +{ + return sync_memory(heap_name, va_addr, len, false); +} + int rte_malloc_heap_create(const char *heap_name) { diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map index 0a2e46767..a535c4da8 100644 --- a/lib/librte_eal/rte_eal_version.map +++ b/lib/librte_eal/rte_eal_version.map @@ -323,6 +323,7 @@ EXPERIMENTAL { rte_malloc_heap_get_socket; rte_malloc_heap_memory_add; rte_malloc_heap_memory_attach; + rte_malloc_heap_memory_detach; rte_malloc_heap_memory_remove; rte_mem_alloc_validator_register; rte_mem_alloc_validator_unregister; From patchwork Thu Sep 20 11:36:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45018 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 6D7711B11B; Thu, 20 Sep 2018 13:37:13 +0200 (CEST) Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id BF3925F27 for ; Thu, 20 Sep 2018 13:36:45 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="91758643" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga001.fm.intel.com with ESMTP; 20 Sep 2018 04:36:41 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaegs022843; Thu, 20 Sep 2018 12:36:40 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaeSn011579; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBaejG011571; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:29 +0100 Message-Id: <5bd5c4233413d3c78f7540bfd6d0e5d5e13ceb53.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 16/20] test: add unit tests for external memory support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add simple unit tests to test external memory support. The tests are pretty basic and mostly consist of checking if invalid API calls are handled correctly, plus a simple allocation/deallocation test for malloc and memzone. Signed-off-by: Anatoly Burakov --- test/test/Makefile | 1 + test/test/autotest_data.py | 14 +- test/test/meson.build | 1 + test/test/test_external_mem.c | 389 ++++++++++++++++++++++++++++++++++ 4 files changed, 401 insertions(+), 4 deletions(-) create mode 100644 test/test/test_external_mem.c diff --git a/test/test/Makefile b/test/test/Makefile index e6967bab6..074ac6e03 100644 --- a/test/test/Makefile +++ b/test/test/Makefile @@ -71,6 +71,7 @@ SRCS-y += test_bitmap.c SRCS-y += test_reciprocal_division.c SRCS-y += test_reciprocal_division_perf.c SRCS-y += test_fbarray.c +SRCS-y += test_external_mem.c SRCS-y += test_ring.c SRCS-y += test_ring_perf.c diff --git a/test/test/autotest_data.py b/test/test/autotest_data.py index f68d9b111..51f8e1689 100644 --- a/test/test/autotest_data.py +++ b/test/test/autotest_data.py @@ -477,10 +477,16 @@ "Report": None, }, { - "Name": "Fbarray autotest", - "Command": "fbarray_autotest", - "Func": default_autotest, - "Report": None, + "Name": "Fbarray autotest", + "Command": "fbarray_autotest", + "Func": default_autotest, + "Report": None, + }, + { + "Name": "External memory autotest", + "Command": "external_mem_autotest", + "Func": default_autotest, + "Report": None, }, # #Please always keep all dump tests at the end and together! diff --git a/test/test/meson.build b/test/test/meson.build index b1dd6eca2..3abf02b71 100644 --- a/test/test/meson.build +++ b/test/test/meson.build @@ -155,6 +155,7 @@ test_names = [ 'eventdev_common_autotest', 'eventdev_octeontx_autotest', 'eventdev_sw_autotest', + 'external_mem_autotest', 'func_reentrancy_autotest', 'flow_classify_autotest', 'hash_scaling_autotest', diff --git a/test/test/test_external_mem.c b/test/test/test_external_mem.c new file mode 100644 index 000000000..d0837aa35 --- /dev/null +++ b/test/test/test_external_mem.c @@ -0,0 +1,389 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2018 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include "test.h" + +#define EXTERNAL_MEM_SZ (RTE_PGSIZE_4K << 10) /* 4M of data */ + +static int +test_invalid_param(void *addr, size_t len, size_t pgsz, rte_iova_t *iova, + int n_pages) +{ + static const char * const names[] = { + NULL, /* NULL name */ + "", /* empty name */ + "this heap name is definitely way too long to be valid" + }; + const char *valid_name = "valid heap name"; + unsigned int i; + + /* check invalid name handling */ + for (i = 0; i < RTE_DIM(names); i++) { + const char *name = names[i]; + + /* these calls may fail for other reasons, so check errno */ + if (rte_malloc_heap_create(name) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Created heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_destroy(name) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Destroyed heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_get_socket(name) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Found socket for heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_memory_add(name, addr, len, + NULL, 0, pgsz) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Added memory to heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_remove(name, addr, len) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Removed memory from heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_memory_attach(name, addr, len) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Attached memory to heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_detach(name, addr, len) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Detached memory from heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + } + + /* do same as above, but with a valid heap name */ + + /* skip create call */ + if (rte_malloc_heap_destroy(valid_name) >= 0 || rte_errno != ENOENT) { + printf("%s():%i: Destroyed heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_get_socket(valid_name) >= 0 || + rte_errno != ENOENT) { + printf("%s():%i: Found socket for heap with invalid name\n", + __func__, __LINE__); + goto fail; + } + + /* these calls may fail for other reasons, so check errno */ + if (rte_malloc_heap_memory_add(valid_name, addr, len, + NULL, 0, pgsz) >= 0 || rte_errno != ENOENT) { + printf("%s():%i: Added memory to non-existent heap\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_remove(valid_name, addr, len) >= 0 || + rte_errno != ENOENT) { + printf("%s():%i: Removed memory from non-existent heap\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_memory_attach(valid_name, addr, len) >= 0 || + rte_errno != ENOENT) { + printf("%s():%i: Attached memory to non-existent heap\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_detach(valid_name, addr, len) >= 0 || + rte_errno != ENOENT) { + printf("%s():%i: Detached memory from non-existent heap\n", + __func__, __LINE__); + goto fail; + } + + /* create a valid heap but test other invalid parameters */ + if (rte_malloc_heap_create(valid_name) != 0) { + printf("%s():%i: Failed to create valid heap\n", + __func__, __LINE__); + goto fail; + } + + /* zero length */ + if (rte_malloc_heap_memory_add(valid_name, addr, 0, + NULL, 0, pgsz) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Added memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_memory_remove(valid_name, addr, 0) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Removed memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_memory_attach(valid_name, addr, 0) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Attached memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_detach(valid_name, addr, 0) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Detached memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + + /* zero address */ + if (rte_malloc_heap_memory_add(valid_name, NULL, len, + NULL, 0, pgsz) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Added memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_memory_remove(valid_name, NULL, len) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Removed memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + + if (rte_malloc_heap_memory_attach(valid_name, NULL, len) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Attached memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_detach(valid_name, NULL, len) >= 0 || + rte_errno != EINVAL) { + printf("%s():%i: Detached memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + + /* wrong page count */ + if (rte_malloc_heap_memory_add(valid_name, addr, len, + iova, 0, pgsz) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Added memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_add(valid_name, addr, len, + iova, n_pages - 1, pgsz) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Added memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_memory_add(valid_name, addr, len, + iova, n_pages + 1, pgsz) >= 0 || rte_errno != EINVAL) { + printf("%s():%i: Added memory with invalid parameters\n", + __func__, __LINE__); + goto fail; + } + + /* tests passed, destroy heap */ + if (rte_malloc_heap_destroy(valid_name) != 0) { + printf("%s():%i: Failed to destroy valid heap\n", + __func__, __LINE__); + goto fail; + } + return 0; +fail: + rte_malloc_heap_destroy(valid_name); + return -1; +} + +static int +test_basic(void *addr, size_t len, size_t pgsz, rte_iova_t *iova, int n_pages) +{ + const char *heap_name = "heap"; + void *ptr = NULL; + int socket_id, i; + const struct rte_memzone *mz = NULL; + + /* create heap */ + if (rte_malloc_heap_create(heap_name) != 0) { + printf("%s():%i: Failed to create malloc heap\n", + __func__, __LINE__); + goto fail; + } + + /* get socket ID corresponding to this heap */ + socket_id = rte_malloc_heap_get_socket(heap_name); + if (socket_id < 0) { + printf("%s():%i: cannot find socket for external heap\n", + __func__, __LINE__); + goto fail; + } + + /* heap is empty, so any allocation should fail */ + ptr = rte_malloc_socket("EXTMEM", 64, 0, socket_id); + if (ptr != NULL) { + printf("%s():%i: Allocated from empty heap\n", __func__, + __LINE__); + goto fail; + } + + /* add memory to heap */ + if (rte_malloc_heap_memory_add(heap_name, addr, len, + iova, n_pages, pgsz) != 0) { + printf("%s():%i: Failed to add memory to heap\n", + __func__, __LINE__); + goto fail; + } + + /* check that we can get this memory from EAL now */ + for (i = 0; i < n_pages; i++) { + const struct rte_memseg *ms; + void *cur = RTE_PTR_ADD(addr, pgsz * i); + + ms = rte_mem_virt2memseg(cur, NULL); + if (ms == NULL) { + printf("%s():%i: Failed to retrieve memseg for external mem\n", + __func__, __LINE__); + goto fail; + } + if (ms->addr != cur) { + printf("%s():%i: VA mismatch\n", __func__, __LINE__); + goto fail; + } + if (ms->iova != iova[i]) { + printf("%s():%i: IOVA mismatch\n", __func__, __LINE__); + goto fail; + } + } + + /* allocate - this now should succeed */ + ptr = rte_malloc_socket("EXTMEM", 64, 0, socket_id); + if (ptr == NULL) { + printf("%s():%i: Failed to allocate from external heap\n", + __func__, __LINE__); + goto fail; + } + + /* check if address is in expected range */ + if (ptr < addr || ptr >= RTE_PTR_ADD(addr, len)) { + printf("%s():%i: Allocated from unexpected address space\n", + __func__, __LINE__); + goto fail; + } + + /* we've allocated something - removing memory should fail */ + if (rte_malloc_heap_memory_remove(heap_name, addr, len) >= 0 || + rte_errno != EBUSY) { + printf("%s():%i: Removing memory succeeded when memory is not free\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_destroy(heap_name) >= 0 || rte_errno != EBUSY) { + printf("%s():%i: Destroying heap succeeded when memory is not free\n", + __func__, __LINE__); + goto fail; + } + + /* try allocating an IOVA-contiguous memzone - this should succeed + * because we've set up a contiguous IOVA table. + */ + mz = rte_memzone_reserve("heap_test", pgsz * 2, socket_id, + RTE_MEMZONE_IOVA_CONTIG); + if (mz == NULL) { + printf("%s():%i: Failed to reserve memzone\n", + __func__, __LINE__); + goto fail; + } + + rte_malloc_dump_stats(stdout, NULL); + rte_malloc_dump_heaps(stdout); + + /* free memory - removing it should now succeed */ + rte_free(ptr); + ptr = NULL; + + rte_memzone_free(mz); + mz = NULL; + + if (rte_malloc_heap_memory_remove(heap_name, addr, len) != 0) { + printf("%s():%i: Removing memory from heap failed\n", + __func__, __LINE__); + goto fail; + } + if (rte_malloc_heap_destroy(heap_name) != 0) { + printf("%s():%i: Destroying heap failed\n", + __func__, __LINE__); + goto fail; + } + + return 0; +fail: + rte_memzone_free(mz); + rte_free(ptr); + /* even if something failed, attempt to clean up */ + rte_malloc_heap_memory_remove(heap_name, addr, len); + rte_malloc_heap_destroy(heap_name); + + return -1; +} + +/* we need to test attach/detach in secondary processes. */ +static int +test_external_mem(void) +{ + size_t len = EXTERNAL_MEM_SZ; + size_t pgsz = RTE_PGSIZE_4K; + rte_iova_t iova[len / pgsz]; + void *addr; + int ret, n_pages; + + /* create external memory area */ + n_pages = RTE_DIM(iova); + addr = mmap(NULL, len, PROT_WRITE | PROT_READ, + MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + if (addr == MAP_FAILED) { + printf("%s():%i: Failed to create dummy memory area\n", + __func__, __LINE__); + return -1; + } + for (int i = 0; i < n_pages; i++) { + /* arbitrary IOVA */ + rte_iova_t tmp = 0x100000000 + i * pgsz; + iova[i] = tmp; + } + + ret = test_invalid_param(addr, len, pgsz, iova, n_pages); + ret |= test_basic(addr, len, pgsz, iova, n_pages); + + munmap(addr, len); + + return ret; +} + +REGISTER_TEST_COMMAND(external_mem_autotest, test_external_mem); From patchwork Thu Sep 20 11:36:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45023 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 850481B13D; Thu, 20 Sep 2018 13:37:24 +0200 (CEST) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id 352BC7CCA for ; Thu, 20 Sep 2018 13:36:53 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="92199771" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga001.jf.intel.com with ESMTP; 20 Sep 2018 04:36:41 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBafUO022846; Thu, 20 Sep 2018 12:36:41 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBaec4011586; Thu, 20 Sep 2018 12:36:40 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBae2p011582; Thu, 20 Sep 2018 12:36:40 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:30 +0100 Message-Id: <665a207254fdcf2a86dd371f78e7c15c3f9ed298.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 17/20] examples: add external memory example app X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Introduce an example application demonstrating the use of external memory support. This is a simple application based on skeleton app, but instead of using internal DPDK memory, it is using externally allocated memory. The RX/TX and init path is a carbon-copy of skeleton app, with no modifications whatseoever. The only difference is an additional init stage to allocate memory and create a heap for it, and the socket ID supplied to the mempool initialization function. The memory used by this app is hugepage memory allocated anonymously. Anonymous hugepage memory will not be allocated in a NUMA-aware fashion, so there is a chance of performance degradation when using this app, but given that kernel usually gives hugepages on local socket first, this should not be a problem in most cases. Signed-off-by: Anatoly Burakov --- examples/external_mem/Makefile | 62 ++++ examples/external_mem/extmem.c | 461 ++++++++++++++++++++++++++++++ examples/external_mem/meson.build | 12 + 3 files changed, 535 insertions(+) create mode 100644 examples/external_mem/Makefile create mode 100644 examples/external_mem/extmem.c create mode 100644 examples/external_mem/meson.build diff --git a/examples/external_mem/Makefile b/examples/external_mem/Makefile new file mode 100644 index 000000000..3b6ab3b2f --- /dev/null +++ b/examples/external_mem/Makefile @@ -0,0 +1,62 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2010-2018 Intel Corporation + +# binary name +APP = extmem + +# all source are stored in SRCS-y +SRCS-y := extmem.c + +# Build using pkg-config variables if possible +$(shell pkg-config --exists libdpdk) +ifeq ($(.SHELLSTATUS),0) + +all: shared +.PHONY: shared static +shared: build/$(APP)-shared + ln -sf $(APP)-shared build/$(APP) +static: build/$(APP)-static + ln -sf $(APP)-static build/$(APP) + +PC_FILE := $(shell pkg-config --path libdpdk) +CFLAGS += -O3 $(shell pkg-config --cflags libdpdk) +CFLAGS += -DALLOW_EXPERIMENTAL_API +LDFLAGS_SHARED = $(shell pkg-config --libs libdpdk) +LDFLAGS_STATIC = -Wl,-Bstatic $(shell pkg-config --static --libs libdpdk) + +build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build + $(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED) + +build/$(APP)-static: $(SRCS-y) Makefile $(PC_FILE) | build + $(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_STATIC) + +build: + @mkdir -p $@ + +.PHONY: clean +clean: + rm -f build/$(APP) build/$(APP)-static build/$(APP)-shared + rmdir --ignore-fail-on-non-empty build + +else # Build using legacy build system + +ifeq ($(RTE_SDK),) +$(error "Please define RTE_SDK environment variable") +endif + +# Default target, can be overridden by command line or environment +RTE_TARGET ?= x86_64-native-linuxapp-gcc + +include $(RTE_SDK)/mk/rte.vars.mk + +CFLAGS += $(WERROR_FLAGS) +CFLAGS += -DALLOW_EXPERIMENTAL_API + +# workaround for a gcc bug with noreturn attribute +# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603 +ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y) +CFLAGS_main.o += -Wno-return-type +endif + +include $(RTE_SDK)/mk/rte.extapp.mk +endif diff --git a/examples/external_mem/extmem.c b/examples/external_mem/extmem.c new file mode 100644 index 000000000..818a02171 --- /dev/null +++ b/examples/external_mem/extmem.c @@ -0,0 +1,461 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2018 Intel Corporation + */ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#define RX_RING_SIZE 1024 +#define TX_RING_SIZE 1024 + +#define NUM_MBUFS 8191 +#define MBUF_CACHE_SIZE 250 +#define BURST_SIZE 32 +#define EXTMEM_HEAP_NAME "extmem" + +static const struct rte_eth_conf port_conf_default = { + .rxmode = { + .max_rx_pkt_len = ETHER_MAX_LEN, + }, +}; + +/* extmem.c: Basic DPDK skeleton forwarding example using external memory. */ + +/* + * Initializes a given port using global settings and with the RX buffers + * coming from the mbuf_pool passed as a parameter. + */ +static inline int +port_init(uint16_t port, struct rte_mempool *mbuf_pool) +{ + struct rte_eth_conf port_conf = port_conf_default; + const uint16_t rx_rings = 1, tx_rings = 1; + uint16_t nb_rxd = RX_RING_SIZE; + uint16_t nb_txd = TX_RING_SIZE; + int retval; + uint16_t q; + struct rte_eth_dev_info dev_info; + struct rte_eth_txconf txconf; + + if (!rte_eth_dev_is_valid_port(port)) + return -1; + + rte_eth_dev_info_get(port, &dev_info); + if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE) + port_conf.txmode.offloads |= + DEV_TX_OFFLOAD_MBUF_FAST_FREE; + + /* Configure the Ethernet device. */ + retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf); + if (retval != 0) + return retval; + + retval = rte_eth_dev_adjust_nb_rx_tx_desc(port, &nb_rxd, &nb_txd); + if (retval != 0) + return retval; + + /* Allocate and set up 1 RX queue per Ethernet port. */ + for (q = 0; q < rx_rings; q++) { + retval = rte_eth_rx_queue_setup(port, q, nb_rxd, + rte_eth_dev_socket_id(port), NULL, mbuf_pool); + if (retval < 0) + return retval; + } + + txconf = dev_info.default_txconf; + txconf.offloads = port_conf.txmode.offloads; + /* Allocate and set up 1 TX queue per Ethernet port. */ + for (q = 0; q < tx_rings; q++) { + retval = rte_eth_tx_queue_setup(port, q, nb_txd, + rte_eth_dev_socket_id(port), &txconf); + if (retval < 0) + return retval; + } + + /* Start the Ethernet port. */ + retval = rte_eth_dev_start(port); + if (retval < 0) + return retval; + + /* Display the port MAC address. */ + struct ether_addr addr; + rte_eth_macaddr_get(port, &addr); + printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8 + " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n", + port, + addr.addr_bytes[0], addr.addr_bytes[1], + addr.addr_bytes[2], addr.addr_bytes[3], + addr.addr_bytes[4], addr.addr_bytes[5]); + + /* Enable RX in promiscuous mode for the Ethernet device. */ + rte_eth_promiscuous_enable(port); + + return 0; +} + +/* + * The lcore main. This is the main thread that does the work, reading from + * an input port and writing to an output port. + */ +static __attribute__((noreturn)) void +lcore_main(void) +{ + uint16_t port; + + /* + * Check that the port is on the same NUMA node as the polling thread + * for best performance. + */ + RTE_ETH_FOREACH_DEV(port) + if (rte_eth_dev_socket_id(port) > 0 && + rte_eth_dev_socket_id(port) != + (int)rte_socket_id()) + printf("WARNING, port %u is on remote NUMA node to " + "polling thread.\n\tPerformance will " + "not be optimal.\n", port); + + printf("\nCore %u forwarding packets. [Ctrl+C to quit]\n", + rte_lcore_id()); + + /* Run until the application is quit or killed. */ + for (;;) { + /* + * Receive packets on a port and forward them on the paired + * port. The mapping is 0 -> 1, 1 -> 0, 2 -> 3, 3 -> 2, etc. + */ + RTE_ETH_FOREACH_DEV(port) { + + /* Get burst of RX packets, from first port of pair. */ + struct rte_mbuf *bufs[BURST_SIZE]; + const uint16_t nb_rx = rte_eth_rx_burst(port, 0, + bufs, BURST_SIZE); + + if (unlikely(nb_rx == 0)) + continue; + + /* Send burst of TX packets, to second port of pair. */ + const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0, + bufs, nb_rx); + + /* Free any unsent packets. */ + if (unlikely(nb_tx < nb_rx)) { + uint16_t buf; + for (buf = nb_tx; buf < nb_rx; buf++) + rte_pktmbuf_free(bufs[buf]); + } + } + } +} + +/* extremely pessimistic estimation of memory required to create a mempool */ +static int +calc_mem_size(uint32_t nb_ports, uint32_t nb_mbufs_per_port, + uint32_t mbuf_sz, size_t pgsz, size_t *out) +{ + uint32_t nb_mbufs = nb_ports * nb_mbufs_per_port; + uint64_t total_mem, mbuf_mem, obj_sz; + + /* there is no good way to predict how much space the mempool will + * occupy because it will allocate chunks on the fly, and some of those + * will come from default DPDK memory while some will come from our + * external memory, so just assume 16MB will be enough for everyone. + */ + uint64_t hdr_mem = 16 << 20; + + obj_sz = rte_mempool_calc_obj_size(mbuf_sz, 0, NULL); + if (rte_eal_iova_mode() == RTE_IOVA_VA) { + /* contiguous - no need to account for page boundaries */ + mbuf_mem = nb_mbufs * obj_sz; + } else { + /* account for possible non-contiguousness */ + unsigned int n_pages, mbuf_per_pg, leftover; + + mbuf_per_pg = pgsz / obj_sz; + leftover = (nb_mbufs % mbuf_per_pg) > 0; + n_pages = (nb_mbufs / mbuf_per_pg) + leftover; + + mbuf_mem = n_pages * pgsz; + } + + total_mem = RTE_ALIGN(hdr_mem + mbuf_mem, pgsz); + + if (total_mem > SIZE_MAX) { + printf("Memory size too big\n"); + return -1; + } + *out = (size_t)total_mem; + + return 0; +} + +static inline uint32_t +bsf64(uint64_t v) +{ + return (uint32_t)__builtin_ctzll(v); +} + +static inline uint32_t +log2_u64(uint64_t v) +{ + if (v == 0) + return 0; + v = rte_align64pow2(v); + return bsf64(v); +} + +#ifndef MAP_HUGE_SHIFT +#define HUGE_SHIFT 26 +#else +#define HUGE_SHIFT MAP_HUGE_SHIFT +#endif + +static int +pagesz_flags(uint64_t page_sz) +{ + /* as per mmap() manpage, all page sizes are log2 of page size + * shifted by MAP_HUGE_SHIFT + */ + int log2 = log2_u64(page_sz); + return log2 << HUGE_SHIFT; +} + +static void * +alloc_mem(size_t memsz, size_t pgsz) +{ + void *addr; + int flags; + + /* allocate anonymous hugepages */ + flags = MAP_ANONYMOUS | MAP_PRIVATE | MAP_HUGETLB | pagesz_flags(pgsz); + + addr = mmap(NULL, memsz, PROT_READ | PROT_WRITE, flags, -1, 0); + if (addr == MAP_FAILED) + return NULL; + + return addr; +} + +struct extmem_param { + void *addr; + size_t len; + size_t pgsz; + rte_iova_t *iova_table; + unsigned int iova_table_len; +}; + +static int +create_extmem(uint32_t nb_ports, uint32_t nb_mbufs_per_port, uint32_t mbuf_sz, + struct extmem_param *param) +{ + uint64_t pgsizes[] = {RTE_PGSIZE_2M, RTE_PGSIZE_1G, /* x86_64, ARM */ + RTE_PGSIZE_16M, RTE_PGSIZE_16G}; /* POWER */ + unsigned int n_pages, cur_page, pgsz_idx; + size_t mem_sz, offset, cur_pgsz; + bool vfio_supported = true; + rte_iova_t *iovas = NULL; + void *addr; + int ret; + + for (pgsz_idx = 0; pgsz_idx < RTE_DIM(pgsizes); pgsz_idx++) { + /* skip anything that is too big */ + if (pgsizes[pgsz_idx] > SIZE_MAX) + continue; + + cur_pgsz = pgsizes[pgsz_idx]; + + ret = calc_mem_size(nb_ports, nb_mbufs_per_port, + mbuf_sz, cur_pgsz, &mem_sz); + if (ret < 0) { + printf("Cannot calculate memory size\n"); + return -1; + } + + /* allocate our memory */ + addr = alloc_mem(mem_sz, cur_pgsz); + + /* if we couldn't allocate memory with a specified page size, + * that doesn't mean we can't do it with other page sizes, so + * try another one. + */ + if (addr == NULL) + continue; + + /* store IOVA addresses for every page in this memory area */ + n_pages = mem_sz / cur_pgsz; + + iovas = malloc(sizeof(*iovas) * n_pages); + + if (iovas == NULL) { + printf("Cannot allocate memory for iova addresses\n"); + goto fail; + } + + /* populate IOVA table */ + for (cur_page = 0; cur_page < n_pages; cur_page++) { + rte_iova_t iova; + void *cur; + + offset = cur_pgsz * cur_page; + cur = RTE_PTR_ADD(addr, offset); + + iova = (uintptr_t)rte_mem_virt2iova(cur); + + iovas[cur_page] = iova; + + if (vfio_supported) { + /* map memory for DMA */ + ret = rte_vfio_dma_map((uintptr_t)addr, + iova, cur_pgsz); + if (ret < 0) { + /* + * ENODEV means VFIO is not initialized + * ENOTSUP means current IOMMU mode + * doesn't support mapping + * both cases are not an error + */ + if (rte_errno == ENOTSUP || + rte_errno == ENODEV) + /* VFIO is unsupported, don't + * try again. + */ + vfio_supported = false; + else + /* this is an actual error */ + goto fail; + } + } + } + + break; + } + /* if we couldn't allocate anything */ + if (iovas == NULL) + return -1; + + param->addr = addr; + param->len = mem_sz; + param->pgsz = cur_pgsz; + param->iova_table = iovas; + param->iova_table_len = n_pages; + + return 0; +fail: + if (iovas) + free(iovas); + if (addr) + munmap(addr, mem_sz); + + return -1; +} + +static int +setup_extmem(uint32_t nb_ports, uint32_t nb_mbufs_per_port, uint32_t mbuf_sz) +{ + struct extmem_param param; + int ret; + + /* create our heap */ + ret = rte_malloc_heap_create(EXTMEM_HEAP_NAME); + if (ret < 0) { + printf("Cannot create heap\n"); + return -1; + } + + ret = create_extmem(nb_ports, nb_mbufs_per_port, mbuf_sz, ¶m); + if (ret < 0) { + printf("Cannot create memory area\n"); + return -1; + } + + /* we now have a valid memory area, so add it to heap */ + ret = rte_malloc_heap_memory_add(EXTMEM_HEAP_NAME, + param.addr, param.len, param.iova_table, + param.iova_table_len, param.pgsz); + + /* not needed any more */ + free(param.iova_table); + + if (ret < 0) { + printf("Cannot add memory to heap\n"); + munmap(param.addr, param.len); + return -1; + } + + printf("Allocated %zuMB of memory\n", param.len >> 20); + + /* success */ + return 0; +} + + +/* + * The main function, which does initialization and calls the per-lcore + * functions. + */ +int +main(int argc, char *argv[]) +{ + struct rte_mempool *mbuf_pool; + unsigned int nb_ports; + int socket_id; + uint16_t portid; + uint32_t nb_mbufs_per_port, mbuf_sz; + + /* Initialize the Environment Abstraction Layer (EAL). */ + int ret = rte_eal_init(argc, argv); + if (ret < 0) + rte_exit(EXIT_FAILURE, "Error with EAL initialization\n"); + + argc -= ret; + argv += ret; + + /* Check that there is an even number of ports to send/receive on. */ + nb_ports = rte_eth_dev_count_avail(); + if (nb_ports < 2 || (nb_ports & 1)) + rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n"); + + nb_mbufs_per_port = NUM_MBUFS; + mbuf_sz = RTE_MBUF_DEFAULT_BUF_SIZE; + + if (setup_extmem(nb_ports, nb_mbufs_per_port, mbuf_sz) < 0) + rte_exit(EXIT_FAILURE, "Error: cannot set up external memory\n"); + + /* retrieve socket ID for our heap */ + socket_id = rte_malloc_heap_get_socket(EXTMEM_HEAP_NAME); + if (socket_id < 0) + rte_exit(EXIT_FAILURE, "Invalid socket for external heap\n"); + + /* Creates a new mempool in memory to hold the mbufs. */ + mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", + nb_mbufs_per_port * nb_ports, MBUF_CACHE_SIZE, 0, + mbuf_sz, socket_id); + + if (mbuf_pool == NULL) + rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n"); + + /* Initialize all ports. */ + RTE_ETH_FOREACH_DEV(portid) + if (port_init(portid, mbuf_pool) != 0) + rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu16 "\n", + portid); + + if (rte_lcore_count() > 1) + printf("\nWARNING: Too many lcores enabled. Only 1 used.\n"); + + /* Call lcore_main on the master core only. */ + lcore_main(); + + return 0; +} diff --git a/examples/external_mem/meson.build b/examples/external_mem/meson.build new file mode 100644 index 000000000..17a363ad2 --- /dev/null +++ b/examples/external_mem/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2017 Intel Corporation + +# meson file, for building this example as part of a main DPDK build. +# +# To build this example as a standalone application with an already-installed +# DPDK instance, use 'make' + +allow_experimental_apis = true +sources = files( + 'extmem.c' +) From patchwork Thu Sep 20 11:36:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45016 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id C98071B10A; Thu, 20 Sep 2018 13:37:00 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 9BA595F48 for ; Thu, 20 Sep 2018 13:36:45 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="265202981" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga006.fm.intel.com with ESMTP; 20 Sep 2018 04:36:41 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBafpN022849; Thu, 20 Sep 2018 12:36:41 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBafdd011593; Thu, 20 Sep 2018 12:36:41 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBafDS011589; Thu, 20 Sep 2018 12:36:41 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: John McNamara , Marko Kovacevic , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:31 +0100 Message-Id: <7492be07118c7a90a1803d983c586934f33695d6.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 18/20] doc: add external memory feature to the release notes X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Document the addition of external memory support to DPDK. Signed-off-by: Anatoly Burakov --- doc/guides/rel_notes/release_18_11.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst index 63bbb1b51..9a05c9980 100644 --- a/doc/guides/rel_notes/release_18_11.rst +++ b/doc/guides/rel_notes/release_18_11.rst @@ -67,6 +67,11 @@ New Features SR-IOV option in Hyper-V and Azure. This is an alternative to the previous vdev_netvsc, tap, and failsafe drivers combination. +* **Added support for using externally allocated memory in DPDK.** + + DPDK has gained support for creating new ``rte_malloc`` heaps referencing + memory that was created outside of DPDK's own page allocator, and using that + memory natively with any other DPDK library or data structure. API Changes ----------- From patchwork Thu Sep 20 11:36:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45019 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0FD3D1B124; Thu, 20 Sep 2018 13:37:16 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 3E7A85F48 for ; Thu, 20 Sep 2018 13:36:46 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="265202984" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga006.fm.intel.com with ESMTP; 20 Sep 2018 04:36:41 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaf5P022852; Thu, 20 Sep 2018 12:36:41 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBafap011602; Thu, 20 Sep 2018 12:36:41 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBafAB011597; Thu, 20 Sep 2018 12:36:41 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: John McNamara , Marko Kovacevic , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:32 +0100 Message-Id: <669da6a6e84804087bdf48bd0f04cc8eb6748d84.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 19/20] doc: add external memory feature to programmer's guide X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add a short chapter on usage of external memory in DPDK to the Programmer's Guide. Signed-off-by: Anatoly Burakov --- .../prog_guide/env_abstraction_layer.rst | 38 +++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst index d362c9209..37de8d63d 100644 --- a/doc/guides/prog_guide/env_abstraction_layer.rst +++ b/doc/guides/prog_guide/env_abstraction_layer.rst @@ -213,6 +213,44 @@ Normally, these options do not need to be changed. can later be mapped into that preallocated VA space (if dynamic memory mode is enabled), and can optionally be mapped into it at startup. +Support for Externally Allocated Memory +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +It is possible to use externally allocated memory in DPDK, using a set of malloc +heap API's. Support for externally allocated memory is implemented through +overloading the socket ID - externally allocated heaps will have socket ID's +that would be considered invalid under normal circumstances. Requesting an +allocation to take place from a specified externally allocated memory is a +matter of supplying the correct socket ID to DPDK allocator, either directly +(e.g. through a call to ``rte_malloc``) or indirectly (through data +structure-specific allocation API's such as ``rte_ring_create``). + +Since there is no way DPDK can verify whether memory are is available or valid, +this responsibility falls on the shoulders of the user. All multiprocess +synchronization is also user's responsibility, as well as ensuring that all +calls to add/attach/detach/remove memory are done in the correct order. It is +not required to attach to a memory area in all processes - only attach to memory +areas as needed. + +The expected workflow is as follows: + +* Get a pointer to memory area +* Create a named heap +* Add memory area(s) to the heap + * If IOVA table is not specified, IOVA addresses will be assumed to be + unavailable + * Any DMA mappings for the external area are responsibility of the user + * Other processes must attach to the memory area before they can use it +* Get socket ID used for the heap +* Use normal DPDK allocation procedures, using supplied socket ID +* If memory area is no longer needed, it can be removed from the heap + * Other processes must detach from this memory area before it can be removed +* If heap is no longer needed, remove it + * Socket ID will become invalid and will not be reused + +For more information, please refer to ``rte_malloc`` API documentation, +specifically the ``rte_malloc_heap_*`` family of function calls. + PCI Access ~~~~~~~~~~ From patchwork Thu Sep 20 11:36:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Burakov, Anatoly" X-Patchwork-Id: 45021 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 6A9481B12F; Thu, 20 Sep 2018 13:37:20 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id BDAC45F27 for ; Thu, 20 Sep 2018 13:36:46 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Sep 2018 04:36:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,398,1531810800"; d="scan'208";a="265202986" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga006.fm.intel.com with ESMTP; 20 Sep 2018 04:36:42 -0700 Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com [10.237.217.45]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id w8KBaf6s022855; Thu, 20 Sep 2018 12:36:41 +0100 Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1]) by sivswdev01.ir.intel.com with ESMTP id w8KBafNU011609; Thu, 20 Sep 2018 12:36:41 +0100 Received: (from aburakov@localhost) by sivswdev01.ir.intel.com with LOCAL id w8KBafFS011605; Thu, 20 Sep 2018 12:36:41 +0100 From: Anatoly Burakov To: dev@dpdk.org Cc: John McNamara , Marko Kovacevic , laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com, winnie.tian@ericsson.com, daniel.andrasi@ericsson.com, janos.kobor@ericsson.com, geza.koblo@ericsson.com, srinath.mannam@broadcom.com, scott.branden@broadcom.com, ajit.khaparde@broadcom.com, keith.wiles@intel.com, bruce.richardson@intel.com, thomas@monjalon.net, shreyansh.jain@nxp.com, shahafs@mellanox.com, arybchenko@solarflare.com Date: Thu, 20 Sep 2018 12:36:33 +0100 Message-Id: <3e00aba3b6baa83b585bd8b450a87d15d0a1b5b7.1537443103.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 20/20] doc: add external memory sample application guide X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add a guide for external memory sample application. The application is identical to Basic Forwarding example in everything except parts of initialization code, so the bits that are identical will not be described. It is also not necessary to describe how external memory is being allocated due to the expectation being that user will have their own mechanisms to allocate memory outside of DPDK, and will only be interested in how to integrate said memory into DPDK. Signed-off-by: Anatoly Burakov --- doc/guides/sample_app_ug/external_mem.rst | 115 ++++++++++++++++++++++ doc/guides/sample_app_ug/index.rst | 1 + 2 files changed, 116 insertions(+) create mode 100644 doc/guides/sample_app_ug/external_mem.rst diff --git a/doc/guides/sample_app_ug/external_mem.rst b/doc/guides/sample_app_ug/external_mem.rst new file mode 100644 index 000000000..594c3397a --- /dev/null +++ b/doc/guides/sample_app_ug/external_mem.rst @@ -0,0 +1,115 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2015-2018 Intel Corporation. + +External Memory Sample Application +================================== + +The External Memory sample application is a simple *skeleton* example of a +forwarding application using externally allocated memory. + +It is intended as a demonstration of the basic workflow of using externally +allocated memory in DPDK. This application is based on Basic Forwarding sample +application, and differs only in its initialization path. For more detailed +explanation of port initialization and packet forwarding code, please refer to +*Basic Forwarding sample application user guide*. + +Compiling the Application +------------------------- + +To compile the sample application see :doc:`compiling`. + +The application is located in the ``external_mem`` sub-directory. + +Running the Application +----------------------- + +To run the example in a ``linuxapp`` environment: + +.. code-block:: console + + ./build/extmem -l 1 -n 4 + +Refer to *DPDK Getting Started Guide* for general information on running +applications and the Environment Abstraction Layer (EAL) options. + + +Explanation +----------- + +For general overview of the code and explanation of the main components of this +application, please refer to *Basic Forwarding sample application user guide*. +This guide will only explain sections of the code relevant to using external +memory in DPDK. + +All DPDK library functions used in the sample code are prefixed with ``rte_`` +and are explained in detail in the *DPDK API Documentation*. + + +External Memory Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``main()`` function performs the initialization and calls the execution +threads for each lcore. + +After initializing the Environment Abstraction Layer, the application also +initializes external memory (in this case, it's allocating a chunk of memory +using anonymous hugepages) inside the ``setup_extmem()`` local function. + +The first step in this process is to create an external heap: + +.. code-block:: c + + ret = rte_malloc_heap_create(EXTMEM_HEAP_NAME); + if (ret < 0) { + printf("Cannot create heap\n"); + return -1; + } + +Once the heap is created, ``create_extmem`` function is called to create the +actual external memory area the application will be using. While the details of +that process will not be described as they are not pertinent to the external +memory API (it is expected that the user will have their own procedures to +create external memory), there are a few important things to note. + +In order to add an externally allocated memory area to the newly created heap, +the application needs the following pieces of information: + +* Pointer to start address of external memory area +* Length of this area +* Page size of memory backing this memory area +* Optionally, a per-page IOVA table + +All of this information is to be provided by the user. Additionally, if VFIO is +in use and if application intends to do DMA using the memory area, VFIO DMA +mapping must also be performed using ``rte_vfio_dma_map`` function. + +Once the external memory is created and mapped for DMA, the application also has +to add this memory to the heap that was created earlier: + +.. code-block:: c + + ret = rte_malloc_heap_memory_add(EXTMEM_HEAP_NAME, + param.addr, param.len, param.iova_table, + param.iova_table_len, param.pgsz); + +If return value indicates success, the memory area has been successfully added +to the heap. The next step is to retrieve the socket ID of this heap: + +.. code-block:: c + + socket_id = rte_malloc_heap_get_socket(EXTMEM_HEAP_NAME); + if (socket_id < 0) + rte_exit(EXIT_FAILURE, "Invalid socket for external heap\n"); + +After that, the socket ID has to be supplied to the mempool creation function: + +.. code-block:: c + + mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", + nb_mbufs_per_port * nb_ports, MBUF_CACHE_SIZE, 0, + mbuf_sz, socket_id); + +The rest of the application is identical to the Basic Forwarding example. + +The forwarding loop can be interrupted and the application closed using +``Ctrl-C``. diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst index 5bedf4f6f..0536edb8e 100644 --- a/doc/guides/sample_app_ug/index.rst +++ b/doc/guides/sample_app_ug/index.rst @@ -15,6 +15,7 @@ Sample Applications User Guides exception_path hello_world skeleton + external_mem rxtx_callbacks flow_classify flow_filtering