Message ID | 28e25797ef95a0a74fd264388ab63b9cd980c265.1544546363.git.anatoly.burakov@intel.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Thomas Monjalon |
Series | Allow using virtio without hugepages |
Context | Check | Description |
---|---|---|
ci/checkpatch | success | coding style OK |
ci/Intel-compilation | success | Compilation OK |

On Tue, Dec 11, 2018 at 04:43:31PM +0000, Anatoly Burakov wrote:
> When running in no-huge mode, we anonymously allocate our memory.
> While this works for regular NICs and vdev's, it's not suitable
> for memory sharing scenarios such as virtio with vhost_user
> backend.
>
> To fix this, allocate no-huge memory using memfd, and register
> it with memalloc just like any other memseg fd. This will enable
> using rte_memseg_get_fd() API with --no-huge EAL flag.
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>
> Notes:
>     v2:
>     - Detect memfd support at compile time
>     - Change memfd-related log level to debug
>
>  doc/guides/rel_notes/release_19_02.rst   |  5 +++
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 54 +++++++++++++++++++++++-
>  2 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
> index 960098582..420d51b5b 100644
> --- a/doc/guides/rel_notes/release_19_02.rst
> +++ b/doc/guides/rel_notes/release_19_02.rst
> @@ -23,6 +23,11 @@ DPDK Release 19.02
>  New Features
>  ------------
>
> +* **Support for using VirtIO without hugepages**
> +
> +  The --no-huge mode was augmented to use memfd-backed memory (on systems that
> +  support memfd), to allow using VirtIO-based NICs without hugepages.

It would be better to say virtio-user here, because virtio NICs
e.g. the one emulated by QEMU, could be something quite different.

> +
>  .. This section should contain new features added in this release.
>     Sample format:
>
[...]

On 13-Dec-18 4:59 AM, Tiwei Bie wrote:
> On Tue, Dec 11, 2018 at 04:43:31PM +0000, Anatoly Burakov wrote:
>> When running in no-huge mode, we anonymously allocate our memory.
>> While this works for regular NICs and vdev's, it's not suitable
>> for memory sharing scenarios such as virtio with vhost_user
>> backend.
>>
>> To fix this, allocate no-huge memory using memfd, and register
>> it with memalloc just like any other memseg fd. This will enable
>> using rte_memseg_get_fd() API with --no-huge EAL flag.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>>
>> Notes:
>>     v2:
>>     - Detect memfd support at compile time
>>     - Change memfd-related log level to debug
>>
>>  doc/guides/rel_notes/release_19_02.rst   |  5 +++
>>  lib/librte_eal/linuxapp/eal/eal_memory.c | 54 +++++++++++++++++++++++-
>>  2 files changed, 57 insertions(+), 2 deletions(-)
>>
>> diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
>> index 960098582..420d51b5b 100644
>> --- a/doc/guides/rel_notes/release_19_02.rst
>> +++ b/doc/guides/rel_notes/release_19_02.rst
>> @@ -23,6 +23,11 @@ DPDK Release 19.02
>>  New Features
>>  ------------
>>
>> +* **Support for using VirtIO without hugepages**
>> +
>> +  The --no-huge mode was augmented to use memfd-backed memory (on systems that
>> +  support memfd), to allow using VirtIO-based NICs without hugepages.
>
> It would be better to say virtio-user here, because virtio NICs
> e.g. the one emulated by QEMU, could be something quite different.

Thanks, will fix!

>
>> +
>>  .. This section should contain new features added in this release.
>>     Sample format:
>>
> [...]
>

diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
index 960098582..420d51b5b 100644
--- a/doc/guides/rel_notes/release_19_02.rst
+++ b/doc/guides/rel_notes/release_19_02.rst
@@ -23,6 +23,11 @@ DPDK Release 19.02
 New Features
 ------------
 
+* **Support for using VirtIO without hugepages**
+
+  The --no-huge mode was augmented to use memfd-backed memory (on systems that
+  support memfd), to allow using VirtIO-based NICs without hugepages.
+
 .. This section should contain new features added in this release.
    Sample format:
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 32feb415d..7d922a965 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -25,6 +25,10 @@
 #include <sys/time.h>
 #include <signal.h>
 #include <setjmp.h>
+#ifdef F_ADD_SEALS /* if file sealing is supported, so is memfd */
+#include <linux/memfd.h>
+#define MEMFD_SUPPORTED
+#endif
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 #include <numa.h>
 #include <numaif.h>
@@ -1341,12 +1345,18 @@ eal_legacy_hugepage_init(void)
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
 		struct rte_memseg_list *msl;
+		int n_segs, cur_seg, fd, flags;
+#ifdef MEMFD_SUPPORTED
+		int memfd;
+#endif
 		uint64_t page_sz;
-		int n_segs, cur_seg;
 
 		/* nohuge mode is legacy mode */
 		internal_config.legacy_mem = 1;
 
+		/* nohuge mode is single-file segments mode */
+		internal_config.single_file_segments = 1;
+
 		/* create a memseg list */
 		msl = &mcfg->memsegs[0];
 
@@ -1359,8 +1369,38 @@ eal_legacy_hugepage_init(void)
 			return -1;
 		}
 
+		/* set up parameters for anonymous mmap */
+		fd = -1;
+		flags = MAP_PRIVATE | MAP_ANONYMOUS;
+
+#ifdef MEMFD_SUPPORTED
+		/* create a memfd and store it in the segment fd table */
+		memfd = memfd_create("nohuge", 0);
+		if (memfd < 0) {
+			RTE_LOG(DEBUG, EAL, "Cannot create memfd: %s\n",
+					strerror(errno));
+			RTE_LOG(DEBUG, EAL, "Falling back to anonymous map\n");
+		} else {
+			/* we got an fd - now resize it */
+			if (ftruncate(memfd, internal_config.memory) < 0) {
+				RTE_LOG(ERR, EAL, "Cannot resize memfd: %s\n",
+						strerror(errno));
+				RTE_LOG(ERR, EAL, "Falling back to anonymous map\n");
+				close(memfd);
+			} else {
+				/* creating memfd-backed file was successful.
+				 * we want changes to memfd to be visible to
+				 * other processes (such as vhost backend), so
+				 * map it as shared memory.
+				 */
+				RTE_LOG(DEBUG, EAL, "Using memfd for anonymous memory\n");
+				fd = memfd;
+				flags = MAP_SHARED;
+			}
+		}
+#endif
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
-				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+				flags, fd, 0);
 		if (addr == MAP_FAILED) {
 			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
 					strerror(errno));
@@ -1371,6 +1411,16 @@ eal_legacy_hugepage_init(void)
 		msl->socket_id = 0;
 		msl->len = internal_config.memory;
 
+		/* we're in single-file segments mode, so only the segment list
+		 * fd needs to be set up.
+		 */
+		if (fd != -1) {
+			if (eal_memalloc_set_seg_list_fd(0, fd) < 0) {
+				RTE_LOG(ERR, EAL, "Cannot set up segment list fd\n");
+				/* not a serious error, proceed */
+			}
+		}
+
 		/* populate memsegs. each memseg is one page long */
 		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
 			arr = &msl->memseg_arr;
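
For readers who want to try the allocation strategy outside of EAL, the following is a minimal standalone sketch of the same fallback order: try a memfd, resize it, map it shared, and otherwise fall back to a private anonymous mapping. It is not the DPDK code. The names nohuge_region, try_memfd, nohuge_alloc and nohuge_free are hypothetical, and the sketch assumes Linux. It also calls the raw memfd_create system call through syscall() and keys the compile-time check on SYS_memfd_create, whereas the patch tests F_ADD_SEALS and calls memfd_create() directly.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() and MAP_ANONYMOUS under strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this sketch, not part of DPDK. */
struct nohuge_region {
	void *addr; /* mapped address, NULL on failure */
	size_t len; /* length of the mapping in bytes */
	int fd;     /* backing memfd, or -1 for an anonymous map */
};

/* Create a memfd and resize it to len bytes; return -1 on any failure. */
static int
try_memfd(const char *name, size_t len)
{
#ifdef SYS_memfd_create
	int fd = (int)syscall(SYS_memfd_create, name, 0U);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
#else
	/* assumption: without the syscall number, memfd is unavailable */
	(void)name;
	(void)len;
	return -1;
#endif
}

/* Map len bytes, memfd-backed and shared if possible, else anonymous. */
static struct nohuge_region
nohuge_alloc(size_t len)
{
	struct nohuge_region r = { NULL, len, -1 };
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	void *addr;

	assert(len > 0);

	r.fd = try_memfd("nohuge", len);
	if (r.fd >= 0)
		flags = MAP_SHARED; /* writes must reach other users of the fd */

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, r.fd, 0);
	if (addr == MAP_FAILED) {
		if (r.fd >= 0)
			close(r.fd);
		r.fd = -1;
		return r;
	}
	r.addr = addr;
	return r;
}

/* Unmap the region and close the memfd, if there is one. */
static void
nohuge_free(struct nohuge_region *r)
{
	if (r->addr != NULL)
		munmap(r->addr, r->len);
	if (r->fd >= 0)
		close(r->fd);
	r->addr = NULL;
	r->fd = -1;
}
```

A caller can inspect the fd field afterwards: a value of -1 means the region is plain anonymous memory and there is nothing to hand to another process, which corresponds to the case where the patch skips eal_memalloc_set_seg_list_fd().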

When running in no-huge mode, we anonymously allocate our memory.
While this works for regular NICs and vdev's, it's not suitable
for memory sharing scenarios such as virtio with vhost_user
backend.

To fix this, allocate no-huge memory using memfd, and register
it with memalloc just like any other memseg fd. This will enable
using rte_memseg_get_fd() API with --no-huge EAL flag.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Detect memfd support at compile time
    - Change memfd-related log level to debug

 doc/guides/rel_notes/release_19_02.rst   |  5 +++
 lib/librte_eal/linuxapp/eal/eal_memory.c | 54 +++++++++++++++++++++++-
 2 files changed, 57 insertions(+), 2 deletions(-)
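
To illustrate why an fd-backed MAP_SHARED mapping suits the memory sharing case described in the commit message, here is a small standalone sketch. It maps the same memfd twice and checks that a write made through one mapping is readable through the other, which is roughly what happens when a backend maps an fd it has been handed. Both mappings live in one process here for simplicity; the function name memfd_write_is_shared and the "nohuge-demo" label are hypothetical, Linux is assumed, and the raw system call is used instead of the memfd_create() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() under strict compiler modes */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if a write made through one MAP_SHARED mapping of a memfd is
 * visible through a second mapping of the same fd, 0 if it is not, and
 * -1 if a memfd could not be created or mapped on this system.
 */
static int
memfd_write_is_shared(size_t len)
{
#ifdef SYS_memfd_create
	static const char msg[] = "descriptor ring";
	char *writer, *reader;
	int fd, ret;

	assert(len >= sizeof(msg));

	fd = (int)syscall(SYS_memfd_create, "nohuge-demo", 0U);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	/* first mapping plays the application, second one the backend */
	writer = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	reader = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (writer == MAP_FAILED || reader == MAP_FAILED) {
		if (writer != MAP_FAILED)
			munmap(writer, len);
		if (reader != MAP_FAILED)
			munmap(reader, len);
		close(fd);
		return -1;
	}

	memcpy(writer, msg, sizeof(msg));
	ret = memcmp(reader, msg, sizeof(msg)) == 0;

	munmap(writer, len);
	munmap(reader, len);
	close(fd);
	return ret;
#else
	/* assumption: without the syscall number, memfd is unavailable */
	(void)len;
	return -1;
#endif
}
```

With MAP_PRIVATE | MAP_ANONYMOUS, as in the pre-patch code, there is no file descriptor to map a second time, so this kind of sharing is not available.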