From patchwork Tue Dec 11 16:43:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 48646
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: John McNamara, Marko Kovacevic, przemyslawx.lal@intel.com,
 kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com,
 tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com,
 stable@dpdk.org
Date: Tue, 11 Dec 2018 16:43:28 +0000
X-Mailer: git-send-email 1.7.0.7
Subject: [dpdk-dev] [PATCH v2 1/5] mem: fix error code for segment fd API for
 external segs
List-Id: DPDK patches and discussions

Segment fd API does not support getting segment fd's from externally
allocated memory, so return a proper error code on any attempt to do so.
This changes API behavior, so document the change as well.

Fixes: 5282bb1c3695 ("mem: allow memseg lists to be marked as external")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov
---

Notes:
    The API is experimental, no deprecation notice needed.

 doc/guides/rel_notes/release_19_02.rst    |  6 ++++++
 lib/librte_eal/common/eal_common_memory.c | 12 ++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
index a94fa86a7..ade41b9c8 100644
--- a/doc/guides/rel_notes/release_19_02.rst
+++ b/doc/guides/rel_notes/release_19_02.rst
@@ -84,6 +84,12 @@ API Changes
    =========================================================
 
 
+* eal: segment fd API on Linux now sets error code to ``ENOTSUP`` in more cases
+  where segment fd API is not expected to be supported:
+
+  - On attempt to get segment fd for an externally allocated memory segment
+
+
 ABI Changes
 -----------
 
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index d47ea4938..999ba24b4 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -704,6 +704,12 @@ rte_memseg_get_fd_thread_unsafe(const struct rte_memseg *ms)
 		return -1;
 	}
 
+	/* segment fd API is not supported for external segments */
+	if (msl->external) {
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+
 	ret = eal_memalloc_get_seg_fd(msl_idx, seg_idx);
 	if (ret < 0) {
 		rte_errno = -ret;
@@ -754,6 +760,12 @@ rte_memseg_get_fd_offset_thread_unsafe(const struct rte_memseg *ms,
 		return -1;
 	}
 
+	/* segment fd API is not supported for external segments */
+	if (msl->external) {
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+
 	ret = eal_memalloc_get_seg_fd_offset(msl_idx, seg_idx, offset);
 	if (ret < 0) {
 		rte_errno = -ret;

From patchwork Tue Dec 11 16:43:29 2018
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 48648
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: John McNamara, Marko Kovacevic, przemyslawx.lal@intel.com,
 kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com,
 tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com,
 stable@dpdk.org
Date: Tue, 11 Dec 2018 16:43:29 +0000
Subject: [dpdk-dev] [PATCH v2 2/5] memalloc: check for memfd support in
 segment fd API

If memfd support was not compiled in, or hugepage memfd support is not
available at runtime, the API will now return a proper error code,
indicating that this API is unsupported. This changes the API, so
document the changes.

Fixes: 41dbdb68723b ("mem: add external API to retrieve page fd")
Fixes: 3a44687139eb ("mem: allow querying offset into segment fd")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov
---

Notes:
    The API is experimental, no deprecation notice needed.

 doc/guides/rel_notes/release_19_02.rst     |  2 ++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 40 +++++++++++++++++-----
 2 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
index ade41b9c8..960098582 100644
--- a/doc/guides/rel_notes/release_19_02.rst
+++ b/doc/guides/rel_notes/release_19_02.rst
@@ -88,6 +88,8 @@ API Changes
   where segment fd API is not expected to be supported:
 
   - On attempt to get segment fd for an externally allocated memory segment
+  - In cases where memfd support would have been required to provide segment
+    fd's (such as in-memory or no-huge mode)
 
 
 ABI Changes
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 784939566..a93548b8c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -23,6 +23,10 @@
 #include
 #include
 #include
+#ifdef F_ADD_SEALS /* if file sealing is supported, so is memfd */
+#include
+#define MEMFD_SUPPORTED
+#endif
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 #include
 #include
@@ -53,8 +57,8 @@ const int anonymous_hugepages_supported =
 #endif
 
 /*
- * we don't actually care if memfd itself is supported - we only need to check
- * if memfd supports hugetlbfs, as that already implies memfd support.
+ * we've already checked memfd support at compile-time, but we also need to
+ * check if we can create hugepage files with memfd.
  *
  * also, this is not a constant, because while we may be *compiled* with memfd
  * hugetlbfs support, we might not be *running* on a system that supports memfd
@@ -63,10 +67,11 @@ const int anonymous_hugepages_supported =
  */
 static int memfd_create_supported =
 #ifdef MFD_HUGETLB
-#define MEMFD_SUPPORTED
 		1;
+#define RTE_MFD_HUGETLB MFD_HUGETLB
 #else
 		0;
+#define RTE_MFD_HUGETLB 4U
 #endif
 
 /*
@@ -338,12 +343,12 @@ get_seg_memfd(struct hugepage_info *hi __rte_unused,
 	int fd;
 	char segname[250]; /* as per manpage, limit is 249 bytes plus null */
 
+	int flags = RTE_MFD_HUGETLB | pagesz_flags(hi->hugepage_sz);
+
 	if (internal_config.single_file_segments) {
 		fd = fd_list[list_idx].memseg_list_fd;
 
 		if (fd < 0) {
-			int flags = MFD_HUGETLB | pagesz_flags(hi->hugepage_sz);
-
 			snprintf(segname, sizeof(segname), "seg_%i", list_idx);
 			fd = memfd_create(segname, flags);
 			if (fd < 0) {
@@ -357,8 +362,6 @@ get_seg_memfd(struct hugepage_info *hi __rte_unused,
 		fd = fd_list[list_idx].fds[seg_idx];
 
 		if (fd < 0) {
-			int flags = MFD_HUGETLB | pagesz_flags(hi->hugepage_sz);
-
 			snprintf(segname, sizeof(segname), "seg_%i-%i",
 					list_idx, seg_idx);
 			fd = memfd_create(segname, flags);
@@ -1542,6 +1545,17 @@ int
 eal_memalloc_get_seg_fd(int list_idx, int seg_idx)
 {
 	int fd;
+
+	if (internal_config.in_memory || internal_config.no_hugetlbfs) {
+#ifndef MEMFD_SUPPORTED
+		/* in in-memory or no-huge mode, we rely on memfd support */
+		return -ENOTSUP;
+#endif
+		/* memfd supported, but hugetlbfs memfd may not be */
+		if (!internal_config.no_hugetlbfs && !memfd_create_supported)
+			return -ENOTSUP;
+	}
+
 	if (internal_config.single_file_segments) {
 		fd = fd_list[list_idx].memseg_list_fd;
 	} else if (fd_list[list_idx].len == 0) {
@@ -1565,7 +1579,7 @@ test_memfd_create(void)
 		int pagesz_flag = pagesz_flags(pagesz);
 		int flags;
 
-		flags = pagesz_flag | MFD_HUGETLB;
+		flags = pagesz_flag | RTE_MFD_HUGETLB;
 		int fd = memfd_create("test", flags);
 		if (fd < 0) {
 			/* we failed - let memalloc know this isn't working */
@@ -1589,6 +1603,16 @@ eal_memalloc_get_seg_fd_offset(int list_idx, int seg_idx, size_t *offset)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 
+	if (internal_config.in_memory || internal_config.no_hugetlbfs) {
+#ifndef MEMFD_SUPPORTED
+		/* in in-memory or no-huge mode, we rely on memfd support */
+		return -ENOTSUP;
+#endif
+		/* memfd supported, but hugetlbfs memfd may not be */
+		if (!internal_config.no_hugetlbfs && !memfd_create_supported)
+			return -ENOTSUP;
+	}
+
 	/* fd_list not initialized? */
 	if (fd_list[list_idx].len == 0)
 		return -ENODEV;

From patchwork Tue Dec 11 16:43:30 2018
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 48644
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: Bruce Richardson, przemyslawx.lal@intel.com,
 kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com,
 tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com
Date: Tue, 11 Dec 2018 16:43:30 +0000
Message-Id: <9e1ab9ed659f3565e4fe655db7f8643516ea9bc0.1544546363.git.anatoly.burakov@intel.com>
Subject: [dpdk-dev] [PATCH v2 3/5] memalloc: allow setting up segment list
 fd's

Currently, only segment fd's for multi-file segments are supported,
while for memfd-backed no-huge memory we need single-file segments mode.
Add support for single-file segments in the internal API.

Signed-off-by: Anatoly Burakov
---

Notes:
    v2:
    - Add missing fd list allocation on setting segment list fd

 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  6 +++++
 lib/librte_eal/common/eal_memalloc.h       |  4 ++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 26 ++++++++++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index a5847f0bd..6893448db 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -61,6 +61,12 @@ eal_memalloc_set_seg_fd(int list_idx __rte_unused, int seg_idx __rte_unused,
 	return -ENOTSUP;
 }
 
+int
+eal_memalloc_set_seg_list_fd(int list_idx __rte_unused, int fd __rte_unused)
+{
+	return -ENOTSUP;
+}
+
 int
 eal_memalloc_get_seg_fd_offset(int list_idx __rte_unused,
 		int seg_idx __rte_unused, size_t *offset __rte_unused)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index af917c2f9..b96c9c512 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -84,6 +84,10 @@ eal_memalloc_get_seg_fd(int list_idx, int seg_idx);
 int
 eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd);
 
+/* returns 0 or -errno */
+int
+eal_memalloc_set_seg_list_fd(int list_idx, int fd);
+
 int
 eal_memalloc_get_seg_fd_offset(int list_idx, int seg_idx, size_t *offset);
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index a93548b8c..eef140b33 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -1529,6 +1529,10 @@ eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 
+	/* single file segments mode doesn't support individual segment fd's */
+	if (internal_config.single_file_segments)
+		return -ENOTSUP;
+
 	/* if list is not allocated, allocate it */
 	if (fd_list[list_idx].len == 0) {
 		int len = mcfg->memsegs[list_idx].memseg_arr.len;
@@ -1541,6 +1545,28 @@ eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd)
 	return 0;
 }
 
+int
+eal_memalloc_set_seg_list_fd(int list_idx, int fd)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+
+	/* non-single file segment mode doesn't support segment list fd's */
+	if (!internal_config.single_file_segments)
+		return -ENOTSUP;
+
+	/* if list is not allocated, allocate it */
+	if (fd_list[list_idx].len == 0) {
+		int len = mcfg->memsegs[list_idx].memseg_arr.len;
+
+		if (alloc_list(list_idx, len) < 0)
+			return -ENOMEM;
+	}
+
+	fd_list[list_idx].memseg_list_fd = fd;
+
+	return 0;
+}
+
 int
 eal_memalloc_get_seg_fd(int list_idx, int seg_idx)
 {

From patchwork Tue Dec 11 16:43:31 2018
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 48647
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: John McNamara, Marko Kovacevic, przemyslawx.lal@intel.com,
 kuralamudhan.ramakrishnan@intel.com, ivan.coughlan@intel.com,
 tiwei.bie@intel.com, ray.kinsella@intel.com, maxime.coquelin@redhat.com
Date: Tue, 11 Dec 2018 16:43:31 +0000
Message-Id: <28e25797ef95a0a74fd264388ab63b9cd980c265.1544546363.git.anatoly.burakov@intel.com>
Subject: [dpdk-dev] [PATCH v2 4/5] mem: use memfd for no-huge mode

When running in no-huge mode, we anonymously allocate our memory. While
this works for regular NICs and vdev's, it's not suitable for memory
sharing scenarios such as virtio with a vhost_user backend.

To fix this, allocate no-huge memory using memfd, and register it with
memalloc just like any other memseg fd. This will enable using the
rte_memseg_get_fd() API with the --no-huge EAL flag.

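For readers who want to see the fallback described above in isolation, here
is a minimal standalone sketch. It is not the code from this patch: the
helper name map_nohuge_sketch() is hypothetical, and it calls the raw
memfd_create syscall (an assumption made here so the sketch does not depend
on a libc wrapper being present). It prefers a memfd-backed MAP_SHARED
mapping and quietly falls back to a private anonymous mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a DPDK function): map `len` bytes of memory.
 * If a memfd can be created and resized, map it MAP_SHARED so another
 * process holding the fd would see the same pages. Otherwise fall back
 * to a private anonymous mapping.
 * On success returns the mapping and stores the memfd (or -1 when the
 * anonymous fallback was used) in *out_fd. Returns NULL on failure.
 */
static void *
map_nohuge_sketch(size_t len, int *out_fd)
{
	int fd = -1;
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	void *addr;

#ifdef SYS_memfd_create /* compile-time check, analogous to MEMFD_SUPPORTED */
	int memfd = (int)syscall(SYS_memfd_create, "nohuge", 0U);

	if (memfd >= 0) {
		if (ftruncate(memfd, (off_t)len) < 0) {
			/* cannot resize: give up on memfd, keep anonymous map */
			close(memfd);
		} else {
			fd = memfd;
			flags = MAP_SHARED;
		}
	}
#endif
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
	if (addr == MAP_FAILED) {
		if (fd >= 0)
			close(fd);
		*out_fd = -1;
		return NULL;
	}
	*out_fd = fd;
	return addr;
}
```

A caller would check *out_fd after the call: a value of -1 means the memory
is anonymous and cannot be handed to another process by fd, which is the
situation the segment fd API reports as ENOTSUP.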
Signed-off-by: Anatoly Burakov
---

Notes:
    v2:
    - Detect memfd support at compile time
    - Change memfd-related log level to debug

 doc/guides/rel_notes/release_19_02.rst   |  5 +++
 lib/librte_eal/linuxapp/eal/eal_memory.c | 54 +++++++++++++++++++++++-
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
index 960098582..420d51b5b 100644
--- a/doc/guides/rel_notes/release_19_02.rst
+++ b/doc/guides/rel_notes/release_19_02.rst
@@ -23,6 +23,11 @@ DPDK Release 19.02
 New Features
 ------------
 
+* **Support for using VirtIO without hugepages**
+
+  The --no-huge mode was augmented to use memfd-backed memory (on systems that
+  support memfd), to allow using VirtIO-based NICs without hugepages.
+
 .. This section should contain new features added in this release.
    Sample format:
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 32feb415d..7d922a965 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -25,6 +25,10 @@
 #include
 #include
 #include
+#ifdef F_ADD_SEALS /* if file sealing is supported, so is memfd */
+#include
+#define MEMFD_SUPPORTED
+#endif
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 #include
 #include
@@ -1341,12 +1345,18 @@ eal_legacy_hugepage_init(void)
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
 		struct rte_memseg_list *msl;
+		int n_segs, cur_seg, fd, flags;
+#ifdef MEMFD_SUPPORTED
+		int memfd;
+#endif
 		uint64_t page_sz;
-		int n_segs, cur_seg;
 
 		/* nohuge mode is legacy mode */
 		internal_config.legacy_mem = 1;
 
+		/* nohuge mode is single-file segments mode */
+		internal_config.single_file_segments = 1;
+
 		/* create a memseg list */
 		msl = &mcfg->memsegs[0];
 
@@ -1359,8 +1369,38 @@ eal_legacy_hugepage_init(void)
 			return -1;
 		}
 
+		/* set up parameters for anonymous mmap */
+		fd = -1;
+		flags = MAP_PRIVATE | MAP_ANONYMOUS;
+
+#ifdef MEMFD_SUPPORTED
+		/* create a memfd and store it in the segment fd table */
+		memfd = memfd_create("nohuge", 0);
+		if (memfd < 0) {
+			RTE_LOG(DEBUG, EAL, "Cannot create memfd: %s\n",
+					strerror(errno));
+			RTE_LOG(DEBUG, EAL, "Falling back to anonymous map\n");
+		} else {
+			/* we got an fd - now resize it */
+			if (ftruncate(memfd, internal_config.memory) < 0) {
+				RTE_LOG(ERR, EAL, "Cannot resize memfd: %s\n",
+						strerror(errno));
+				RTE_LOG(ERR, EAL, "Falling back to anonymous map\n");
+				close(memfd);
+			} else {
+				/* creating memfd-backed file was successful.
+				 * we want changes to memfd to be visible to
+				 * other processes (such as vhost backend), so
+				 * map it as shared memory.
+				 */
+				RTE_LOG(DEBUG, EAL, "Using memfd for anonymous memory\n");
+				fd = memfd;
+				flags = MAP_SHARED;
+			}
+		}
+#endif
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
-				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+				flags, fd, 0);
 		if (addr == MAP_FAILED) {
 			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
 					strerror(errno));
@@ -1371,6 +1411,16 @@ eal_legacy_hugepage_init(void)
 		msl->socket_id = 0;
 		msl->len = internal_config.memory;
 
+		/* we're in single-file segments mode, so only the segment list
+		 * fd needs to be set up.
+		 */
+		if (fd != -1) {
+			if (eal_memalloc_set_seg_list_fd(0, fd) < 0) {
+				RTE_LOG(ERR, EAL, "Cannot set up segment list fd\n");
+				/* not a serious error, proceed */
+			}
+		}
+
 		/* populate memsegs. each memseg is one page long */
 		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
 			arr = &msl->memseg_arr;

From patchwork Tue Dec 11 16:43:32 2018
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 48645
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: przemyslawx.lal@intel.com, kuralamudhan.ramakrishnan@intel.com,
 ivan.coughlan@intel.com, tiwei.bie@intel.com, ray.kinsella@intel.com,
 maxime.coquelin@redhat.com
Date: Tue, 11 Dec 2018 16:43:32 +0000
Subject: [dpdk-dev] [PATCH v2 5/5] test: add segment fd API test

Use memory autotest to also test segment fd API. This will not do any
checks - just see if the relevant APIs return success or indicate that
the API is not supported.

Signed-off-by: Anatoly Burakov
---
 test/test/test_memory.c | 43 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index b96bca771..3da803e4e 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -37,10 +37,44 @@ check_mem(const struct rte_memseg_list *msl __rte_unused,
 	return 0;
 }
 
+static int
+check_seg_fds(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg __rte_unused)
+{
+	size_t offset;
+	int ret;
+
+	/* skip external segments */
+	if (msl->external)
+		return 0;
+
+	/* try segment fd first. we're in a callback, so thread-unsafe */
+	ret = rte_memseg_get_fd_thread_unsafe(ms);
+	if (ret < 0) {
+		/* ENOTSUP means segment is valid, but there is no support for
+		 * segment fd API (e.g. on FreeBSD).
+		 */
+		if (errno == ENOTSUP)
+			return 1;
+		/* all other errors are treated as failures */
+		return -1;
+	}
+
+	/* we're able to get memseg fd - try getting its offset */
+	ret = rte_memseg_get_fd_offset_thread_unsafe(ms, &offset);
+	if (ret < 0) {
+		if (errno == ENOTSUP)
+			return 1;
+		return -1;
+	}
+	return 0;
+}
+
 static int
 test_memory(void)
 {
 	uint64_t s;
+	int ret;
 
 	/*
 	 * dump the mapped memory: the python-expect script checks
@@ -59,6 +93,15 @@ test_memory(void)
 	/* try to read memory (should not segfault) */
 	rte_memseg_walk(check_mem, NULL);
 
+	/* check segment fd support */
+	ret = rte_memseg_walk(check_seg_fds, NULL);
+	if (ret == 1) {
+		printf("Segment fd API is unsupported\n");
+	} else if (ret == -1) {
+		printf("Error getting segment fd's\n");
+		return -1;
+	}
+
 	return 0;
 }
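
The test above relies on the walk callback convention: return 0 to keep
walking, 1 to stop early without error, and -1 to stop with an error, with
the walk function handing that value back to its caller. The sketch below
illustrates that convention on its own, without DPDK. Every name in it
(seg_sketch, walk_sketch, check_fd_sketch) is a hypothetical stand-in and
not a DPDK type or function.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory segment; not a DPDK type.
 * fd_errno is 0 when an fd is available, otherwise the errno value
 * that a lookup of this segment's fd would report. */
struct seg_sketch {
	int fd_errno;
};

typedef int (*seg_cb_sketch_t)(const struct seg_sketch *seg, void *arg);

/* Walk all segments. Stop at the first non-zero callback result and
 * report it: 1 means stopped by the callback, -1 means it failed.
 * Returns 0 when every segment was visited. */
static int
walk_sketch(const struct seg_sketch *segs, size_t n, seg_cb_sketch_t cb,
		void *arg)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = cb(&segs[i], arg);

		if (ret != 0)
			return ret < 0 ? -1 : 1;
	}
	return 0;
}

/* Callback mirroring the shape of check_seg_fds(): ENOTSUP is treated
 * as "API unsupported" (stop, no failure), any other error as failure. */
static int
check_fd_sketch(const struct seg_sketch *seg, void *arg)
{
	(void)arg;

	if (seg->fd_errno == 0)
		return 0;
	if (seg->fd_errno == ENOTSUP)
		return 1;
	return -1;
}
```

With this shape, the caller can tell "unsupported" (1) apart from a real
failure (-1) using only the walk's return value, which is what
test_memory() does when it prints one message or the other.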