mem: close rtemap files

Message ID 20221006100405.2809898-1-huzaifa.rahman@emumba.com (mailing list archive)
State Rejected, archived
Delegated to: David Marchand
Series mem: close rtemap files

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-aarch64-unit-testing      fail     Testing issues
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-x86_64-unit-testing       fail     Testing issues
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/github-robot: build           fail     github build: failed
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 fail     Testing issues

Commit Message

Huzaifa Rahman Oct. 6, 2022, 10:04 a.m. UTC
  Bugzilla ID: 560

The memory subsystem leaves a file descriptor open for each
rtemap file. This can lead to hundreds of extra open file descriptors,
which has negative side effects. For example, the application may
exceed its maximum file descriptor limit, or it may be using
limited APIs like select() that only allow 1024 file descriptors.
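
The two limits mentioned above can be checked from C. The following is a minimal sketch, not part of the patch; the helper names `fd_usable_with_select` and `get_fd_soft_limit` are made up for illustration. It relies only on `FD_SETSIZE` (commonly 1024 on Linux) and `getrlimit(RLIMIT_NOFILE)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper: FD_SET() is only defined for descriptors
 * in the range [0, FD_SETSIZE), so anything above cannot be
 * watched with select(). */
static bool
fd_usable_with_select(int fd)
{
	return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: store the soft limit on open descriptors
 * for this process in *out. Returns 0 on success, -1 on error. */
static int
get_fd_soft_limit(rlim_t *out)
{
	struct rlimit rl;

	assert(out != NULL);
	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	*out = rl.rlim_cur;
	return 0;
}
```

With several hundred descriptors already held for memory segments, a socket opened later can easily receive a number that `fd_usable_with_select()` rejects.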

The EAL memory subsystem does not need to hold the file open.
The original intention was probably to keep the file locked, but that is
not necessary. The Linux kernel keeps a reference count on the file,
and the mmap counts as a reference and therefore maintains the file
as locked.

The fix is just to close the file after it is set up.
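
The part of this argument about the mapping can be illustrated outside DPDK. The sketch below is an assumption-laden stand-in: it uses an ordinary temporary file under /tmp instead of a hugepage-backed rtemap file, and the function name `map_then_close` is made up. It shows only that a mapping remains usable after `close()`; it does not demonstrate anything about file locks, and `flock()` locks in particular are released once the last descriptor for the open file is closed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a temporary file, close its descriptor, then keep using
 * the mapping. Returns 0 on success, -1 if setup fails. */
static int
map_then_close(void)
{
	char path[] = "/tmp/mapdemo-XXXXXX";
	const char msg[] = "still mapped";
	const size_t len = 4096;
	char *addr;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path); /* the file lives on while it is open or mapped */
	if (ftruncate(fd, len) != 0) {
		close(fd);
		return -1;
	}
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd); /* the mapping holds its own reference to the file */
	if (addr == MAP_FAILED)
		return -1;

	/* Reads and writes through the mapping still work. */
	memcpy(addr, msg, sizeof(msg));
	assert(memcmp(addr, msg, sizeof(msg)) == 0);

	munmap(addr, len);
	return 0;
}
```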

Signed-off-by: huzaifa.rahman <huzaifa.rahman@emumba.com>
---
 lib/eal/linux/eal_memalloc.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Dmitry Kozlyuk Oct. 6, 2022, 10:42 a.m. UTC | #1
2022-10-06 10:04 (UTC+0000), huzaifa.rahman:
> Bugzilla ID: 560
> 
> The memory subsystem leaves a file descriptor open for each
> rtemap file. This can lead to hundreds of extra open file descriptors,
> which has negative side effects. For example, the application may
> exceed its maximum file descriptor limit, or it may be using
> limited APIs like select() that only allow 1024 file descriptors.
>
> The EAL memory subsystem does not need to hold the file open.
> The original intention was probably to keep the file locked, but that is
> not necessary. The Linux kernel keeps a reference count on the file,
> and the mmap counts as a reference and therefore maintains the file
> as locked.
>
> The fix is just to close the file after it is set up.
> 
> Signed-off-by: huzaifa.rahman <huzaifa.rahman@emumba.com>
> ---
>  lib/eal/linux/eal_memalloc.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
> index f8b1588cae..955c4e4f95 100644
> --- a/lib/eal/linux/eal_memalloc.c
> +++ b/lib/eal/linux/eal_memalloc.c
> @@ -679,6 +679,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>  
>  	huge_recover_sigbus();
>  
> +	close(fd);
> +	fd_list[list_idx].fds[seg_idx] = -1;
> +
>  	ms->addr = addr;
>  	ms->hugepage_sz = alloc_sz;
>  	ms->len = alloc_sz;

This breaks rte_memseg_get_fd().
If memfd_create() was used to open the file descriptor,
there seems to be no way to reopen it once closed.

--single-file-segments may be used to reduce the FD count.
Does using it solve your issue?
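
The memfd_create() concern can be reproduced in isolation. The following is a rough sketch under stated assumptions: it needs Linux with a glibc that provides the `memfd_create()` wrapper, it uses a plain 4 KiB memfd rather than a hugepage one, and the function name `memfd_close_demo` is invented here. A memfd has no path in any filesystem, so once its descriptor is closed the program is left with the mapping only and nothing it could hand out as a file descriptor.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create an anonymous memfd, map it, close the descriptor.
 * Returns 0 on success, -1 if setup fails. */
static int
memfd_close_demo(void)
{
	const size_t len = 4096;
	char *addr;
	int fd;

	fd = memfd_create("demo", MFD_CLOEXEC);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, len) != 0) {
		close(fd);
		return -1;
	}
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return -1;
	}
	close(fd);

	/* The number no longer refers to an open file, and there is
	 * no filesystem name to open() again... */
	assert(fcntl(fd, F_GETFD) == -1 && errno == EBADF);

	/* ...while the memory itself is still usable. */
	addr[0] = 'x';
	assert(addr[0] == 'x');

	munmap(addr, len);
	return 0;
}
```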
  
Huzaifa Rahman Oct. 20, 2022, 10:12 a.m. UTC | #2
Yes, it reduces the number of open files from 450 to 2.
Is there any downside to using the --single-file-segments flag? If it
is better, why not make it the default behaviour?

  
Dmitry Kozlyuk Oct. 20, 2022, 10:46 a.m. UTC | #3
2022-10-20 15:12 (UTC+0500), Huzaifa Rahman:
> Yes, it reduces the number of open files from 450 to 2.
> Is there any downside to using the --single-file-segments flag? If it
> is better, why not make it the default behaviour?

The single-file segments option needs fallocate() support in the kernel
to release pages back to it (see the `resize_hugefile_in_filesystem` function).
I wonder if DPDK still targets Linux/FreeBSD versions that lack this support.
If not, I don't see an issue with changing the behavior and removing the option.
However, it may be too late in this release, and the change is a breaking one.
Adding some people who might know more about DPDK usage.

P.S. Moved the reply to the bottom to keep context.
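
For readers unfamiliar with the fallocate() dependency mentioned above, here is a minimal sketch of releasing part of a file back to the kernel. It assumes the support in question is hole punching (`FALLOC_FL_PUNCH_HOLE`), which the thread does not state explicitly, and it uses a small memfd as a stand-in for a hugepage-backed file. The function name `punch_hole_demo` is hypothetical and is not DPDK code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write two 4 KiB blocks, then punch out the first one.
 * Returns 1 if the hole was punched, 0 if the kernel or filesystem
 * reports EOPNOTSUPP, -1 on any other failure. */
static int
punch_hole_demo(void)
{
	const off_t page = 4096;
	char buf[4096];
	struct stat st;
	int ret = -1;
	int fd;

	fd = memfd_create("punch-demo", MFD_CLOEXEC);
	if (fd < 0)
		return -1;
	memset(buf, 0xAB, sizeof(buf));
	if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf) ||
	    pwrite(fd, buf, sizeof(buf), page) != (ssize_t)sizeof(buf))
		goto out;

	/* Release the first block; KEEP_SIZE leaves the file length alone. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      0, page) != 0) {
		ret = (errno == EOPNOTSUPP) ? 0 : -1;
		goto out;
	}
	if (fstat(fd, &st) != 0)
		goto out;
	assert(st.st_size == 2 * page);

	/* The punched range now reads back as zeros. */
	if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
		goto out;
	ret = (buf[0] == 0 && buf[sizeof(buf) - 1] == 0) ? 1 : -1;
out:
	close(fd);
	return ret;
}
```

A return value of 0 corresponds to the case raised in the reply: a kernel or filesystem that cannot release pages from the middle of a single shared file.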
  

Patch

diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index f8b1588cae..955c4e4f95 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -679,6 +679,9 @@  alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 	huge_recover_sigbus();
 
+	close(fd);
+	fd_list[list_idx].fds[seg_idx] = -1;
+
 	ms->addr = addr;
 	ms->hugepage_sz = alloc_sz;
 	ms->len = alloc_sz;