From patchwork Thu Dec 13 11:43:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anatoly Burakov X-Patchwork-Id: 48763 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 88A781B503; Thu, 13 Dec 2018 12:43:29 +0100 (CET) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 663AF2BC7; Thu, 13 Dec 2018 12:43:24 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2018 03:43:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,348,1539673200"; d="scan'208";a="259162199" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga004.jf.intel.com with ESMTP; 13 Dec 2018 03:43:20 -0800 Received: from sivswdev05.ir.intel.com (sivswdev05.ir.intel.com [10.243.17.64]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id wBDBhJMD032151; Thu, 13 Dec 2018 11:43:19 GMT Received: from sivswdev05.ir.intel.com (localhost [127.0.0.1]) by sivswdev05.ir.intel.com with ESMTP id wBDBhJ47012537; Thu, 13 Dec 2018 11:43:19 GMT Received: (from aburakov@localhost) by sivswdev05.ir.intel.com with LOCAL id wBDBhJ9c012533; Thu, 13 Dec 2018 11:43:19 GMT From: Anatoly Burakov To: dev@dpdk.org Cc: John McNamara , Marko Kovacevic , przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com, stable@dpdk.org Date: Thu, 13 Dec 2018 11:43:15 +0000 Message-Id: <633b56ce7ee5551bfc456bd2f7968fd2f61b7529.1544701282.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 1/5] mem: fix error code for segment fd API for external segs X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Segment fd API does not support getting segment fd's from externally allocated memory, so return proper error code on any attempts to do so. This changes API behavior, so document the change as well. Fixes: 5282bb1c3695 ("mem: allow memseg lists to be marked as external") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov Acked-by: Tiwei Bie Reviewed-by: Maxime Coquelin --- Notes: The API is experimental, no deprecation notice needed. doc/guides/rel_notes/release_19_02.rst | 6 ++++++ lib/librte_eal/common/eal_common_memory.c | 12 ++++++++++++ 2 files changed, 18 insertions(+) diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst index a94fa86a7..ade41b9c8 100644 --- a/doc/guides/rel_notes/release_19_02.rst +++ b/doc/guides/rel_notes/release_19_02.rst @@ -84,6 +84,12 @@ API Changes ========================================================= +* eal: segment fd API on Linux now sets error code to ``ENOTSUP`` in more cases + where segment fd API is not expected to be supported: + + - On attempt to get segment fd for an externally allocated memory segment + + ABI Changes ----------- diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c index d47ea4938..999ba24b4 100644 --- a/lib/librte_eal/common/eal_common_memory.c +++ b/lib/librte_eal/common/eal_common_memory.c @@ -704,6 +704,12 @@ rte_memseg_get_fd_thread_unsafe(const struct rte_memseg *ms) return -1; } + /* segment fd API is not supported for external segments */ + if (msl->external) { + rte_errno = ENOTSUP; + return -1; + } + ret = eal_memalloc_get_seg_fd(msl_idx, seg_idx); if (ret < 0) { rte_errno = -ret; @@ -754,6 +760,12 @@ rte_memseg_get_fd_offset_thread_unsafe(const struct rte_memseg *ms, return -1; } + /* segment fd API is not supported for external segments */ + if (msl->external) { + rte_errno = ENOTSUP; + return -1; + } + ret = eal_memalloc_get_seg_fd_offset(msl_idx, seg_idx, offset); if (ret < 0) { rte_errno = -ret; From patchwork Thu Dec 13 11:43:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anatoly Burakov X-Patchwork-Id: 48766 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id EAC601B520; Thu, 13 Dec 2018 12:43:33 +0100 (CET) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id D33E11B4F3; Thu, 13 Dec 2018 12:43:24 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2018 03:43:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,348,1539673200"; d="scan'208";a="303508190" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga005.fm.intel.com with ESMTP; 13 Dec 2018 03:43:20 -0800 Received: from sivswdev05.ir.intel.com (sivswdev05.ir.intel.com [10.243.17.64]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id wBDBhJ6g032154; Thu, 13 Dec 2018 11:43:20 GMT Received: from sivswdev05.ir.intel.com (localhost [127.0.0.1]) by sivswdev05.ir.intel.com with ESMTP id wBDBhJUo012548; Thu, 13 Dec 2018 11:43:19 GMT Received: (from aburakov@localhost) by sivswdev05.ir.intel.com with LOCAL id wBDBhJaY012540; Thu, 13 Dec 2018 11:43:19 GMT From: Anatoly Burakov To: dev@dpdk.org Cc: John McNamara , Marko Kovacevic , przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com, stable@dpdk.org Date: Thu, 13 Dec 2018 11:43:16 +0000 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 2/5] memalloc: check for memfd support in segment fd API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" If memfd support was not compiled, or hugepage memfd support is not available at runtime, the API will now return proper error code, indicating that this API is unsupported. This changes the API, so document the changes. Fixes: 41dbdb68723b ("mem: add external API to retrieve page fd") Fixes: 3a44687139eb ("mem: allow querying offset into segment fd") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov Acked-by: Tiwei Bie Reviewed-by: Maxime Coquelin --- Notes: The API is experimental, no deprecation notice needed. doc/guides/rel_notes/release_19_02.rst | 2 ++ lib/librte_eal/linuxapp/eal/eal_memalloc.c | 40 +++++++++++++++++----- 2 files changed, 34 insertions(+), 8 deletions(-) diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst index ade41b9c8..960098582 100644 --- a/doc/guides/rel_notes/release_19_02.rst +++ b/doc/guides/rel_notes/release_19_02.rst @@ -88,6 +88,8 @@ API Changes where segment fd API is not expected to be supported: - On attempt to get segment fd for an externally allocated memory segment + - In cases where memfd support would have been required to provide segment + fd's (such as in-memory or no-huge mode) ABI Changes diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c index 784939566..a93548b8c 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c @@ -23,6 +23,10 @@ #include #include #include +#ifdef F_ADD_SEALS /* if file sealing is supported, so is memfd */ +#include +#define MEMFD_SUPPORTED +#endif #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES #include #include @@ -53,8 +57,8 @@ const int anonymous_hugepages_supported = #endif /* - * we don't actually care if memfd itself is supported - we only need to check - * if memfd supports hugetlbfs, as that already implies memfd support. + * we've already checked memfd support at compile-time, but we also need to + * check if we can create hugepage files with memfd. * * also, this is not a constant, because while we may be *compiled* with memfd * hugetlbfs support, we might not be *running* on a system that supports memfd @@ -63,10 +67,11 @@ const int anonymous_hugepages_supported = */ static int memfd_create_supported = #ifdef MFD_HUGETLB -#define MEMFD_SUPPORTED 1; +#define RTE_MFD_HUGETLB MFD_HUGETLB #else 0; +#define RTE_MFD_HUGETLB 4U #endif /* @@ -338,12 +343,12 @@ get_seg_memfd(struct hugepage_info *hi __rte_unused, int fd; char segname[250]; /* as per manpage, limit is 249 bytes plus null */ + int flags = RTE_MFD_HUGETLB | pagesz_flags(hi->hugepage_sz); + if (internal_config.single_file_segments) { fd = fd_list[list_idx].memseg_list_fd; if (fd < 0) { - int flags = MFD_HUGETLB | pagesz_flags(hi->hugepage_sz); - snprintf(segname, sizeof(segname), "seg_%i", list_idx); fd = memfd_create(segname, flags); if (fd < 0) { @@ -357,8 +362,6 @@ get_seg_memfd(struct hugepage_info *hi __rte_unused, fd = fd_list[list_idx].fds[seg_idx]; if (fd < 0) { - int flags = MFD_HUGETLB | pagesz_flags(hi->hugepage_sz); - snprintf(segname, sizeof(segname), "seg_%i-%i", list_idx, seg_idx); fd = memfd_create(segname, flags); @@ -1542,6 +1545,17 @@ int eal_memalloc_get_seg_fd(int list_idx, int seg_idx) { int fd; + + if (internal_config.in_memory || internal_config.no_hugetlbfs) { +#ifndef MEMFD_SUPPORTED + /* in in-memory or no-huge mode, we rely on memfd support */ + return -ENOTSUP; +#endif + /* memfd supported, but hugetlbfs memfd may not be */ + if (!internal_config.no_hugetlbfs && !memfd_create_supported) + return -ENOTSUP; + } + if (internal_config.single_file_segments) { fd = fd_list[list_idx].memseg_list_fd; } else if (fd_list[list_idx].len == 0) { @@ -1565,7 +1579,7 @@ test_memfd_create(void) int pagesz_flag = pagesz_flags(pagesz); int flags; - flags = pagesz_flag | MFD_HUGETLB; + flags = pagesz_flag | RTE_MFD_HUGETLB; int fd = memfd_create("test", flags); if (fd < 0) { /* we failed - let memalloc know this isn't working */ @@ -1589,6 +1603,16 @@ eal_memalloc_get_seg_fd_offset(int list_idx, int seg_idx, size_t *offset) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + if (internal_config.in_memory || internal_config.no_hugetlbfs) { +#ifndef MEMFD_SUPPORTED + /* in in-memory or no-huge mode, we rely on memfd support */ + return -ENOTSUP; +#endif + /* memfd supported, but hugetlbfs memfd may not be */ + if (!internal_config.no_hugetlbfs && !memfd_create_supported) + return -ENOTSUP; + } + /* fd_list not initialized? */ if (fd_list[list_idx].len == 0) return -ENODEV; From patchwork Thu Dec 13 11:43:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anatoly Burakov X-Patchwork-Id: 48762 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E3D3A1B4FB; Thu, 13 Dec 2018 12:43:27 +0100 (CET) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 15A731B4F3 for ; Thu, 13 Dec 2018 12:43:23 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2018 03:43:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,348,1539673200"; d="scan'208";a="110066378" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga003.jf.intel.com with ESMTP; 13 Dec 2018 03:43:20 -0800 Received: from sivswdev05.ir.intel.com (sivswdev05.ir.intel.com [10.243.17.64]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id wBDBhKRY032157; Thu, 13 Dec 2018 11:43:20 GMT Received: from sivswdev05.ir.intel.com (localhost [127.0.0.1]) by sivswdev05.ir.intel.com with ESMTP id wBDBhKZt012555; Thu, 13 Dec 2018 11:43:20 GMT Received: (from aburakov@localhost) by sivswdev05.ir.intel.com with LOCAL id wBDBhKWQ012551; Thu, 13 Dec 2018 11:43:20 GMT From: Anatoly Burakov To: dev@dpdk.org Cc: Bruce Richardson , przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com Date: Thu, 13 Dec 2018 11:43:17 +0000 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 3/5] memalloc: allow setting up segment list fd's X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Currently, only segment fd's for multi-file segments are supported, while for memfd-backed no-huge memory we need single-file segments mode. Add support for single-file segments in the internal API. Signed-off-by: Anatoly Burakov Acked-by: Tiwei Bie Reviewed-by: Maxime Coquelin --- Notes: v2: - Add missing fd list allocation on setting segment list fd lib/librte_eal/bsdapp/eal/eal_memalloc.c | 6 +++++ lib/librte_eal/common/eal_memalloc.h | 4 ++++ lib/librte_eal/linuxapp/eal/eal_memalloc.c | 26 ++++++++++++++++++++++ 3 files changed, 36 insertions(+) diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c index a5847f0bd..6893448db 100644 --- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c +++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c @@ -61,6 +61,12 @@ eal_memalloc_set_seg_fd(int list_idx __rte_unused, int seg_idx __rte_unused, return -ENOTSUP; } +int +eal_memalloc_set_seg_list_fd(int list_idx __rte_unused, int fd __rte_unused) +{ + return -ENOTSUP; +} + int eal_memalloc_get_seg_fd_offset(int list_idx __rte_unused, int seg_idx __rte_unused, size_t *offset __rte_unused) diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h index af917c2f9..b96c9c512 100644 --- a/lib/librte_eal/common/eal_memalloc.h +++ b/lib/librte_eal/common/eal_memalloc.h @@ -84,6 +84,10 @@ eal_memalloc_get_seg_fd(int list_idx, int seg_idx); int eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd); +/* returns 0 or -errno */ +int +eal_memalloc_set_seg_list_fd(int list_idx, int fd); + int eal_memalloc_get_seg_fd_offset(int list_idx, int seg_idx, size_t *offset); diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c index a93548b8c..eef140b33 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c @@ -1529,6 +1529,10 @@ eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd) { struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + /* single file segments mode doesn't support individual segment fd's */ + if (internal_config.single_file_segments) + return -ENOTSUP; + /* if list is not allocated, allocate it */ if (fd_list[list_idx].len == 0) { int len = mcfg->memsegs[list_idx].memseg_arr.len; @@ -1541,6 +1545,28 @@ eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd) return 0; } +int +eal_memalloc_set_seg_list_fd(int list_idx, int fd) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + + /* non-single file segment mode doesn't support segment list fd's */ + if (!internal_config.single_file_segments) + return -ENOTSUP; + + /* if list is not allocated, allocate it */ + if (fd_list[list_idx].len == 0) { + int len = mcfg->memsegs[list_idx].memseg_arr.len; + + if (alloc_list(list_idx, len) < 0) + return -ENOMEM; + } + + fd_list[list_idx].memseg_list_fd = fd; + + return 0; +} + int eal_memalloc_get_seg_fd(int list_idx, int seg_idx) { From patchwork Thu Dec 13 11:43:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anatoly Burakov X-Patchwork-Id: 48764 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 885D71B50F; Thu, 13 Dec 2018 12:43:31 +0100 (CET) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 935471B4F4 for ; Thu, 13 Dec 2018 12:43:24 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2018 03:43:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,348,1539673200"; d="scan'208";a="118038671" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga002.jf.intel.com with ESMTP; 13 Dec 2018 03:43:21 -0800 Received: from sivswdev05.ir.intel.com (sivswdev05.ir.intel.com [10.243.17.64]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id wBDBhKJc032160; Thu, 13 Dec 2018 11:43:20 GMT Received: from sivswdev05.ir.intel.com (localhost [127.0.0.1]) by sivswdev05.ir.intel.com with ESMTP id wBDBhKjN012562; Thu, 13 Dec 2018 11:43:20 GMT Received: (from aburakov@localhost) by sivswdev05.ir.intel.com with LOCAL id wBDBhKbs012558; Thu, 13 Dec 2018 11:43:20 GMT From: Anatoly Burakov To: dev@dpdk.org Cc: John McNamara , Marko Kovacevic , przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com Date: Thu, 13 Dec 2018 11:43:18 +0000 Message-Id: X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 4/5] mem: use memfd for no-huge mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When running in no-huge mode, we anonymously allocate our memory. While this works for regular NICs and vdev's, it's not suitable for memory sharing scenarios such as virtio with vhost_user backend. To fix this, allocate no-huge memory using memfd, and register it with memalloc just like any other memseg fd. This will enable using rte_memseg_get_fd() API with --no-huge EAL flag. Signed-off-by: Anatoly Burakov Acked-by: Tiwei Bie Reviewed-by: Maxime Coquelin --- Notes: v3: - Clarify release notes to state that the changes apply to virtio-user NICs rather than virtio in general v2: - Detect memfd support at compile time - Change memfd-related log level to debug doc/guides/rel_notes/release_19_02.rst | 5 +++ lib/librte_eal/linuxapp/eal/eal_memory.c | 54 +++++++++++++++++++++++- 2 files changed, 57 insertions(+), 2 deletions(-) diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst index 960098582..f733ad139 100644 --- a/doc/guides/rel_notes/release_19_02.rst +++ b/doc/guides/rel_notes/release_19_02.rst @@ -23,6 +23,11 @@ DPDK Release 19.02 New Features ------------ +* **Support for using VirtIO without hugepages** + + The --no-huge mode was augmented to use memfd-backed memory (on systems that + support memfd), to allow using virtio-user-based NICs without hugepages. + .. This section should contain new features added in this release. Sample format: diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 32feb415d..7d922a965 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -25,6 +25,10 @@ #include #include #include +#ifdef F_ADD_SEALS /* if file sealing is supported, so is memfd */ +#include +#define MEMFD_SUPPORTED +#endif #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES #include #include @@ -1341,12 +1345,18 @@ eal_legacy_hugepage_init(void) /* hugetlbfs can be disabled */ if (internal_config.no_hugetlbfs) { struct rte_memseg_list *msl; + int n_segs, cur_seg, fd, flags; +#ifdef MEMFD_SUPPORTED + int memfd; +#endif uint64_t page_sz; - int n_segs, cur_seg; /* nohuge mode is legacy mode */ internal_config.legacy_mem = 1; + /* nohuge mode is single-file segments mode */ + internal_config.single_file_segments = 1; + /* create a memseg list */ msl = &mcfg->memsegs[0]; @@ -1359,8 +1369,38 @@ eal_legacy_hugepage_init(void) return -1; } + /* set up parameters for anonymous mmap */ + fd = -1; + flags = MAP_PRIVATE | MAP_ANONYMOUS; + +#ifdef MEMFD_SUPPORTED + /* create a memfd and store it in the segment fd table */ + memfd = memfd_create("nohuge", 0); + if (memfd < 0) { + RTE_LOG(DEBUG, EAL, "Cannot create memfd: %s\n", + strerror(errno)); + RTE_LOG(DEBUG, EAL, "Falling back to anonymous map\n"); + } else { + /* we got an fd - now resize it */ + if (ftruncate(memfd, internal_config.memory) < 0) { + RTE_LOG(ERR, EAL, "Cannot resize memfd: %s\n", + strerror(errno)); + RTE_LOG(ERR, EAL, "Falling back to anonymous map\n"); + close(memfd); + } else { + /* creating memfd-backed file was successful. + * we want changes to memfd to be visible to + * other processes (such as vhost backend), so + * map it as shared memory. + */ + RTE_LOG(DEBUG, EAL, "Using memfd for anonymous memory\n"); + fd = memfd; + flags = MAP_SHARED; + } + } +#endif addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + flags, fd, 0); if (addr == MAP_FAILED) { RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__, strerror(errno)); @@ -1371,6 +1411,16 @@ eal_legacy_hugepage_init(void) msl->socket_id = 0; msl->len = internal_config.memory; + /* we're in single-file segments mode, so only the segment list + * fd needs to be set up. + */ + if (fd != -1) { + if (eal_memalloc_set_seg_list_fd(0, fd) < 0) { + RTE_LOG(ERR, EAL, "Cannot set up segment list fd\n"); + /* not a serious error, proceed */ + } + } + /* populate memsegs. each memseg is one page long */ for (cur_seg = 0; cur_seg < n_segs; cur_seg++) { arr = &msl->memseg_arr; From patchwork Thu Dec 13 11:43:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anatoly Burakov X-Patchwork-Id: 48765 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id A95E71B517; Thu, 13 Dec 2018 12:43:32 +0100 (CET) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id 9CF2F1B4F5 for ; Thu, 13 Dec 2018 12:43:24 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2018 03:43:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,348,1539673200"; d="scan'208";a="98448780" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga007.jf.intel.com with ESMTP; 13 Dec 2018 03:43:21 -0800 Received: from sivswdev05.ir.intel.com (sivswdev05.ir.intel.com [10.243.17.64]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id wBDBhKZs032163; Thu, 13 Dec 2018 11:43:20 GMT Received: from sivswdev05.ir.intel.com (localhost [127.0.0.1]) by sivswdev05.ir.intel.com with ESMTP id wBDBhKFd012569; Thu, 13 Dec 2018 11:43:20 GMT Received: (from aburakov@localhost) by sivswdev05.ir.intel.com with LOCAL id wBDBhKv4012565; Thu, 13 Dec 2018 11:43:20 GMT From: Anatoly Burakov To: dev@dpdk.org Cc: przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com Date: Thu, 13 Dec 2018 11:43:19 +0000 Message-Id: <98395a6100625b09e0477a665c687ba69c6f9e99.1544701282.git.anatoly.burakov@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v3 5/5] test: add segment fd API test X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Use memory autotest to also test segment fd API. This will not do any checks - just see if the relevant API's return success or indicate that the API is not supported. Signed-off-by: Anatoly Burakov Acked-by: Tiwei Bie Reviewed-by: Maxime Coquelin --- test/test/test_memory.c | 43 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/test/test/test_memory.c b/test/test/test_memory.c index b96bca771..3da803e4e 100644 --- a/test/test/test_memory.c +++ b/test/test/test_memory.c @@ -37,10 +37,44 @@ check_mem(const struct rte_memseg_list *msl __rte_unused, return 0; } +static int +check_seg_fds(const struct rte_memseg_list *msl, const struct rte_memseg *ms, + void *arg __rte_unused) +{ + size_t offset; + int ret; + + /* skip external segments */ + if (msl->external) + return 0; + + /* try segment fd first. we're in a callback, so thread-unsafe */ + ret = rte_memseg_get_fd_thread_unsafe(ms); + if (ret < 0) { + /* ENOTSUP means segment is valid, but there is not support for + * segment fd API (e.g. on FreeBSD). + */ + if (errno == ENOTSUP) + return 1; + /* all other errors are treated as failures */ + return -1; + } + + /* we're able to get memseg fd - try getting its offset */ + ret = rte_memseg_get_fd_offset_thread_unsafe(ms, &offset); + if (ret < 0) { + if (errno == ENOTSUP) + return 1; + return -1; + } + return 0; +} + static int test_memory(void) { uint64_t s; + int ret; /* * dump the mapped memory: the python-expect script checks @@ -59,6 +93,15 @@ test_memory(void) /* try to read memory (should not segfault) */ rte_memseg_walk(check_mem, NULL); + /* check segment fd support */ + ret = rte_memseg_walk(check_seg_fds, NULL); + if (ret == 1) { + printf("Segment fd API is unsupported\n"); + } else if (ret == -1) { + printf("Error getting segment fd's\n"); + return -1; + } + return 0; }