From patchwork Mon Oct  1 11:04:59 2018
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 45741
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: Olivier Matz, Andrew Rybchenko, laszlo.madarassy@ericsson.com,
	laszlo.vadkerti@ericsson.com, andras.kovacs@ericsson.com,
	winnie.tian@ericsson.com, daniel.andrasi@ericsson.com,
	janos.kobor@ericsson.com, geza.koblo@ericsson.com,
	srinath.mannam@broadcom.com, scott.branden@broadcom.com,
	ajit.khaparde@broadcom.com, keith.wiles@intel.com,
	bruce.richardson@intel.com, thomas@monjalon.net,
	shreyansh.jain@nxp.com, shahafs@mellanox.com,
	alejandro.lucero@netronome.com
Date: Mon,  1 Oct 2018 12:04:59 +0100
X-Mailer: git-send-email 1.7.0.7
Subject: [dpdk-dev] [PATCH v7 10/21] malloc: add function to check if socket is external

An API is needed to check whether a particular socket ID belongs to an
internal or external heap. The prime user of this will be the mempool
allocator, because the usual assumption of IOVA contiguousness in
IOVA-as-VA mode does not hold for externally allocated memory.

Signed-off-by: Anatoly Burakov
---
 lib/librte_eal/common/include/rte_malloc.h | 15 +++++++++++++
 lib/librte_eal/common/rte_malloc.c         | 25 ++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 lib/librte_mempool/rte_mempool.c           | 22 ++++++++++++++++---
 4 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index 8870732a6..403271ddc 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -277,6 +277,21 @@ rte_malloc_get_socket_stats(int socket,
 int __rte_experimental
 rte_malloc_heap_get_socket(const char *name);
 
+/**
+ * Check if a given socket ID refers to externally allocated memory.
+ *
+ * @note Passing SOCKET_ID_ANY will return 0.
+ *
+ * @param socket_id
+ *   Socket ID to check
+ * @return
+ *   1 if socket ID refers to externally allocated memory
+ *   0 if socket ID refers to internal DPDK memory
+ *   -1 if socket ID is invalid
+ */
+int __rte_experimental
+rte_malloc_heap_socket_is_external(int socket_id);
+
 /**
  * Dump statistics.
  *
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index b807dfe09..fa81d7862 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -220,6 +220,31 @@ rte_malloc_heap_get_socket(const char *name)
 	return ret;
 }
 
+int
+rte_malloc_heap_socket_is_external(int socket_id)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int idx;
+	int ret = -1;
+
+	if (socket_id == SOCKET_ID_ANY)
+		return 0;
+
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+	for (idx = 0; idx < RTE_MAX_HEAPS; idx++) {
+		struct malloc_heap *tmp = &mcfg->malloc_heaps[idx];
+
+		if ((int)tmp->socket_id == socket_id) {
+			/* external memory always has large socket ID's */
+			ret = tmp->socket_id >= RTE_MAX_NUMA_NODES;
+			break;
+		}
+	}
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+
+	return ret;
+}
+
 /*
  * Print stats on memory type. If type is NULL, info on all types is printed
  */
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d8f9665b8..bd60506af 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -319,6 +319,7 @@ EXPERIMENTAL {
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_malloc_heap_get_socket;
+	rte_malloc_heap_socket_is_external;
 	rte_mem_alloc_validator_register;
 	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 2ed539f01..683b216f9 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -428,12 +428,18 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	rte_iova_t iova;
 	unsigned mz_id, n;
 	int ret;
-	bool no_contig, try_contig, no_pageshift;
+	bool no_contig, try_contig, no_pageshift, external;
 
 	ret = mempool_ops_alloc_once(mp);
 	if (ret != 0)
 		return ret;
 
+	/* check if we can retrieve a valid socket ID */
+	ret = rte_malloc_heap_socket_is_external(mp->socket_id);
+	if (ret < 0)
+		return -EINVAL;
+	external = ret;
+
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
@@ -481,9 +487,19 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	 * in one contiguous chunk as well (otherwise we might end up wasting a
 	 * 1G page on a 10MB memzone). If we fail to get enough contiguous
 	 * memory, then we'll go and reserve space page-by-page.
+	 *
+	 * We also have to take into account the fact that memory that we're
+	 * going to allocate from can belong to an externally allocated memory
+	 * area, in which case the assumption of IOVA as VA mode being
+	 * synonymous with IOVA contiguousness will not hold. We should also try
+	 * to go for contiguous memory even if we're in no-huge mode, because
+	 * external memory may in fact be IOVA-contiguous.
 	 */
-	no_pageshift = no_contig || rte_eal_iova_mode() == RTE_IOVA_VA;
-	try_contig = !no_contig && !no_pageshift && rte_eal_has_hugepages();
+	external = rte_malloc_heap_socket_is_external(mp->socket_id) == 1;
+	no_pageshift = no_contig ||
+			(!external && rte_eal_iova_mode() == RTE_IOVA_VA);
+	try_contig = !no_contig && !no_pageshift &&
+			(rte_eal_has_hugepages() || external);
 
 	if (no_pageshift) {
 		pg_sz = 0;