[00/14] cleanup resources on shutdown

Message ID 20200104013341.19809-1-stephen@networkplumber.org

Message

Stephen Hemminger Jan. 4, 2020, 1:33 a.m. UTC
I recently started using valgrind with DPDK, and the results
are not clean.

DPDK has a function that applications can call to clean up
resources on shutdown (rte_eal_cleanup), but the current coverage
of that API is spotty. Many internal parts of DPDK leave files
and allocated memory behind.

This patch set is a start at getting the sub-parts of DPDK to
clean up after themselves. These are the easier ones; the harder
and more critical ones are in the drivers and the memory subsystem.

There are no visible API or ABI changes here.
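
As a rough illustration of the pattern the series applies to each sub-part, here is a minimal standalone sketch of an init/cleanup pair. It is not DPDK code: `demo_log`, `demo_log_init`, `demo_log_cleanup` and `demo_cleanup` are hypothetical names, and `demo_cleanup` only plays the role that rte_eal_cleanup plays for the EAL. The idea is that every descriptor and heap block acquired at init is released again at cleanup, so a leak checker has nothing left to report for that sub-part.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sub-part state; not a DPDK structure. */
struct demo_log {
    int fd;      /* descriptor acquired at init time, -1 when unused */
    char *name;  /* heap buffer acquired at init time, NULL when unused */
};

static struct demo_log demo_log_state = { .fd = -1, .name = NULL };

/* Acquire the sub-part's resources. Returns 0 on success, -1 on failure. */
int demo_log_init(const char *name)
{
    size_t len = strlen(name) + 1;

    /* Assumption for the sketch: the log writes to a duplicate of stderr. */
    demo_log_state.fd = dup(STDERR_FILENO);
    if (demo_log_state.fd < 0)
        return -1;

    demo_log_state.name = malloc(len);
    if (demo_log_state.name == NULL) {
        /* Undo the partial init so nothing is left behind on failure. */
        close(demo_log_state.fd);
        demo_log_state.fd = -1;
        return -1;
    }
    memcpy(demo_log_state.name, name, len);
    return 0;
}

/* Release everything init acquired. Safe to call more than once. */
void demo_log_cleanup(void)
{
    if (demo_log_state.fd >= 0) {
        close(demo_log_state.fd);
        demo_log_state.fd = -1;
    }
    free(demo_log_state.name);   /* free(NULL) is a no-op */
    demo_log_state.name = NULL;
}

/* Top-level shutdown hook: calls each sub-part's cleanup in turn. */
int demo_cleanup(void)
{
    demo_log_cleanup();
    return 0;
}
```

An application would call `demo_log_init()` once at startup and `demo_cleanup()` just before exiting. Resetting the fields after release is what makes a repeated cleanup call harmless.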

Stephen Hemminger (14):
  eal: log: close on cleanup
  eal: log: free dynamic state on cleanup
  eal: alarm: close timerfd on eal cleanup
  eal: cleanup threads
  eal: intr: cleanup resources
  eal: mp: end the multiprocess thread during cleanup
  eal: interrupts close epoll fd on shutdown
  eal: vfio: cleanup the mp sync handle
  eal: close mem config on cleanup
  tap: close netlink socket on device close
  eal: cleanup plugins data
  ethdev: raise priority of old driver warning
  eal: hotplug: cleanup multiprocess resources
  eal: malloc: cleanup mp resources

 drivers/net/tap/rte_eth_tap.c               |  7 ++++-
 lib/librte_eal/common/eal_common_log.c      | 30 +++++++++++++++++-
 lib/librte_eal/common/eal_common_options.c  | 12 +++++++
 lib/librte_eal/common/eal_common_proc.c     | 17 +++++++---
 lib/librte_eal/common/eal_options.h         |  1 +
 lib/librte_eal/common/eal_private.h         | 30 ++++++++++++++++++
 lib/librte_eal/common/hotplug_mp.c          |  5 +++
 lib/librte_eal/common/hotplug_mp.h          |  6 ++++
 lib/librte_eal/common/malloc_heap.c         |  6 ++++
 lib/librte_eal/common/malloc_heap.h         |  3 ++
 lib/librte_eal/common/malloc_mp.c           | 12 +++++++
 lib/librte_eal/common/malloc_mp.h           |  3 ++
 lib/librte_eal/linux/eal/eal.c              | 28 +++++++++++++++++
 lib/librte_eal/linux/eal/eal_alarm.c        | 11 +++++++
 lib/librte_eal/linux/eal/eal_interrupts.c   | 35 ++++++++++++++++++---
 lib/librte_eal/linux/eal/eal_log.c          | 14 +++++++++
 lib/librte_eal/linux/eal/eal_vfio.h         |  1 +
 lib/librte_eal/linux/eal/eal_vfio_mp_sync.c |  8 +++++
 lib/librte_ethdev/rte_ethdev.c              |  2 +-
 19 files changed, 218 insertions(+), 13 deletions(-)

Comments

David Marchand Feb. 5, 2020, 9:32 a.m. UTC | #1
Hello Stephen,

On Sat, Jan 4, 2020 at 2:34 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> Recently started using valgrind with DPDK, and the results
> are not clean.
>
> The DPDK has a function that applications can use to tell it
> to cleanup resources on shutdown (rte_eal_cleanup). But the
> current coverage of that API is spotty. Many internal parts of
> DPDK leave files and allocated memory behind.
>
> This patch set is a start at getting the sub-parts of
> DPDK to cleanup after themselves. These are the easier ones,
> the harder and more critical ones are in the drivers
> and the memory subsystem.
>
> There are no visible API or ABI changes here.

Could you share what you did to run a dpdk application with valgrind?

I tried with testpmd and a 3.15 valgrind (fc30), but I get an init
failure on the cpu flags.

$ LD_LIBRARY_PATH=/home/dmarchan/builds/build-gcc-shared/install/usr/local/lib64
valgrind /home/dmarchan/builds/build-gcc-shared/install/usr/local/bin/dpdk-testpmd
-c 3 --no-huge -m 20 -d librte_mempool_ring.so -d librte_pmd_null.so
-w 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall
--total-num-mbufs=2048 -ia
==10258== Memcheck, a memory error detector
==10258== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==10258== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==10258== Command:
/home/dmarchan/builds/build-gcc-shared/install/usr/local/bin/dpdk-testpmd
-c 3 --no-huge -m 20 -d librte_mempool_ring.so -d librte_pmd_null.so
-w 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall
--total-num-mbufs=2048 -ia
==10258==
ERROR: This system does not support "RDSEED".
Please check that RTE_MACHINE is set correctly.
EAL: FATAL: unsupported cpu type.
EAL: unsupported cpu type.
EAL: Error - exiting with code: 1
  Cause: Cannot init EAL: Operation not supported
==10258==
==10258== HEAP SUMMARY:
==10258==     in use at exit: 1,388 bytes in 49 blocks
==10258==   total heap usage: 97 allocs, 48 frees, 89,426 bytes allocated
==10258==
==10258== LEAK SUMMARY:
==10258==    definitely lost: 0 bytes in 0 blocks
==10258==    indirectly lost: 0 bytes in 0 blocks
==10258==      possibly lost: 0 bytes in 0 blocks
==10258==    still reachable: 1,388 bytes in 49 blocks
==10258==         suppressed: 0 bytes in 0 blocks
==10258== Rerun with --leak-check=full to see details of leaked memory
==10258==
==10258== For lists of detected and suppressed errors, rerun with: -s
==10258== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


Thanks.

--
David Marchand

Stephen Hemminger Feb. 5, 2020, 12:07 p.m. UTC | #2
On Wed, 5 Feb 2020 10:32:49 +0100
David Marchand <david.marchand@redhat.com> wrote:

> Hello Stephen,
> 
> On Sat, Jan 4, 2020 at 2:34 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > Recently started using valgrind with DPDK, and the results
> > are not clean.
> >
> > The DPDK has a function that applications can use to tell it
> > to cleanup resources on shutdown (rte_eal_cleanup). But the
> > current coverage of that API is spotty. Many internal parts of
> > DPDK leave files and allocated memory behind.
> >
> > This patch set is a start at getting the sub-parts of
> > DPDK to cleanup after themselves. These are the easier ones,
> > the harder and more critical ones are in the drivers
> > and the memory subsystem.
> >
> > There are no visible API or ABI changes here.  
> 
> Could you share what you did to run a dpdk application with valgrind?
> 
> I tried with testpmd and a 3.15 valgrind (fc30), but I get an init
> failure on the cpu flags.
> 
> $ LD_LIBRARY_PATH=/home/dmarchan/builds/build-gcc-shared/install/usr/local/lib64
> valgrind /home/dmarchan/builds/build-gcc-shared/install/usr/local/bin/dpdk-testpmd
> -c 3 --no-huge -m 20 -d librte_mempool_ring.so -d librte_pmd_null.so
> -w 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall
> --total-num-mbufs=2048 -ia
> ==10258== Memcheck, a memory error detector
> ==10258== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==10258== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
> ==10258== Command:
> /home/dmarchan/builds/build-gcc-shared/install/usr/local/bin/dpdk-testpmd
> -c 3 --no-huge -m 20 -d librte_mempool_ring.so -d librte_pmd_null.so
> -w 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall
> --total-num-mbufs=2048 -ia
> ==10258==
> ERROR: This system does not support "RDSEED".
> Please check that RTE_MACHINE is set correctly.
> EAL: FATAL: unsupported cpu type.
> EAL: unsupported cpu type.
> EAL: Error - exiting with code: 1
>   Cause: Cannot init EAL: Operation not supported
> ==10258==
> ==10258== HEAP SUMMARY:
> ==10258==     in use at exit: 1,388 bytes in 49 blocks
> ==10258==   total heap usage: 97 allocs, 48 frees, 89,426 bytes allocated
> ==10258==
> ==10258== LEAK SUMMARY:
> ==10258==    definitely lost: 0 bytes in 0 blocks
> ==10258==    indirectly lost: 0 bytes in 0 blocks
> ==10258==      possibly lost: 0 bytes in 0 blocks
> ==10258==    still reachable: 1,388 bytes in 49 blocks
> ==10258==         suppressed: 0 bytes in 0 blocks
> ==10258== Rerun with --leak-check=full to see details of leaked memory
> ==10258==
> ==10258== For lists of detected and suppressed errors, rerun with: -s
> ==10258== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> 

I am testing with valgrind on ARM.
It should be possible on x86 but you need to dial down the RTE_MACHINE
choice to something valgrind understands.
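
One way to picture why lowering RTE_MACHINE helps: the binary is built expecting a set of CPU features, and at startup each expected feature is compared with what the CPU (here, the CPU that valgrind presents) reports. The sketch below is a simplified model of that comparison, not the EAL's actual code; `first_missing_flag` and the string-based flag lists are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of a startup CPU-feature check (hypothetical helper).
 * Returns the first flag in `required` that is absent from `reported`,
 * or NULL when every required flag is present.
 */
const char *first_missing_flag(const char *const *required, size_t n_required,
                               const char *const *reported, size_t n_reported)
{
    for (size_t i = 0; i < n_required; i++) {
        bool found = false;

        for (size_t j = 0; j < n_reported; j++) {
            if (strcmp(required[i], reported[j]) == 0) {
                found = true;
                break;
            }
        }
        if (!found)
            return required[i];  /* init would fail on this flag */
    }
    return NULL;  /* all build-time requirements are satisfied */
}
```

In this model, a build whose required list includes "RDSEED" fails as soon as the reported list lacks it, which matches the error in the quoted log. A build for a more conservative machine target has a shorter required list, so the same reported list passes.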

David Marchand Feb. 5, 2020, 12:32 p.m. UTC | #3
On Wed, Feb 5, 2020 at 1:07 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 5 Feb 2020 10:32:49 +0100
> David Marchand <david.marchand@redhat.com> wrote:
>
> > Hello Stephen,
> >
> > On Sat, Jan 4, 2020 at 2:34 AM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > Recently started using valgrind with DPDK, and the results
> > > are not clean.
> > >
> > > The DPDK has a function that applications can use to tell it
> > > to cleanup resources on shutdown (rte_eal_cleanup). But the
> > > current coverage of that API is spotty. Many internal parts of
> > > DPDK leave files and allocated memory behind.
> > >
> > > This patch set is a start at getting the sub-parts of
> > > DPDK to cleanup after themselves. These are the easier ones,
> > > the harder and more critical ones are in the drivers
> > > and the memory subsystem.
> > >
> > > There are no visible API or ABI changes here.
> >
> > Could you share what you did to run a dpdk application with valgrind?
> >
> > I tried with testpmd and a 3.15 valgrind (fc30), but I get an init
> > failure on the cpu flags.
> >
> > $ LD_LIBRARY_PATH=/home/dmarchan/builds/build-gcc-shared/install/usr/local/lib64
> > valgrind /home/dmarchan/builds/build-gcc-shared/install/usr/local/bin/dpdk-testpmd
> > -c 3 --no-huge -m 20 -d librte_mempool_ring.so -d librte_pmd_null.so
> > -w 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall
> > --total-num-mbufs=2048 -ia
> > ==10258== Memcheck, a memory error detector
> > ==10258== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> > ==10258== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
> > ==10258== Command:
> > /home/dmarchan/builds/build-gcc-shared/install/usr/local/bin/dpdk-testpmd
> > -c 3 --no-huge -m 20 -d librte_mempool_ring.so -d librte_pmd_null.so
> > -w 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall
> > --total-num-mbufs=2048 -ia
> > ==10258==
> > ERROR: This system does not support "RDSEED".
> > Please check that RTE_MACHINE is set correctly.
> > EAL: FATAL: unsupported cpu type.
> > EAL: unsupported cpu type.
> > EAL: Error - exiting with code: 1
> >   Cause: Cannot init EAL: Operation not supported
> > ==10258==
> > ==10258== HEAP SUMMARY:
> > ==10258==     in use at exit: 1,388 bytes in 49 blocks
> > ==10258==   total heap usage: 97 allocs, 48 frees, 89,426 bytes allocated
> > ==10258==
> > ==10258== LEAK SUMMARY:
> > ==10258==    definitely lost: 0 bytes in 0 blocks
> > ==10258==    indirectly lost: 0 bytes in 0 blocks
> > ==10258==      possibly lost: 0 bytes in 0 blocks
> > ==10258==    still reachable: 1,388 bytes in 49 blocks
> > ==10258==         suppressed: 0 bytes in 0 blocks
> > ==10258== Rerun with --leak-check=full to see details of leaked memory
> > ==10258==
> > ==10258== For lists of detected and suppressed errors, rerun with: -s
> > ==10258== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> >
>
> I am testing with valgrind on ARM.
> It should be possible on x86 but you need to dial down the RTE_MACHINE
> choice to something valgrind understands.
>

Ok, so no black magic in valgrind :-)
Yeah I managed to run with the x86-default target we have in
test-meson-builds.sh.


Thanks.

--
David Marchand

David Marchand Feb. 6, 2020, 2:06 p.m. UTC | #4
On Sat, Jan 4, 2020 at 2:34 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> Recently started using valgrind with DPDK, and the results
> are not clean.
>
> The DPDK has a function that applications can use to tell it
> to cleanup resources on shutdown (rte_eal_cleanup). But the
> current coverage of that API is spotty. Many internal parts of
> DPDK leave files and allocated memory behind.
>
> This patch set is a start at getting the sub-parts of
> DPDK to cleanup after themselves. These are the easier ones,
> the harder and more critical ones are in the drivers
> and the memory subsystem.

I am too short on time to check and integrate this big series in rc2,
and from now it would be too risky to take in 20.02.
Can you respin it for 20.05 with FreeBSD fixes too?


Thanks.

Stephen Hemminger Feb. 7, 2020, 6:24 p.m. UTC | #5
On Thu, 6 Feb 2020 15:06:56 +0100
David Marchand <david.marchand@redhat.com> wrote:

> On Sat, Jan 4, 2020 at 2:34 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > Recently started using valgrind with DPDK, and the results
> > are not clean.
> >
> > The DPDK has a function that applications can use to tell it
> > to cleanup resources on shutdown (rte_eal_cleanup). But the
> > current coverage of that API is spotty. Many internal parts of
> > DPDK leave files and allocated memory behind.
> >
> > This patch set is a start at getting the sub-parts of
> > DPDK to cleanup after themselves. These are the easier ones,
> > the harder and more critical ones are in the drivers
> > and the memory subsystem.  
> 
> I am too short on time to check and integrate this big series in rc2,
> and from now it would be too risky to take in 20.02.
> Can you respin it for 20.05 with FreeBSD fixes too?

OK, but if this kind of patch can't be reviewed then
the DPDK process is still broken.

I don't see how FreeBSD matters here. It can be leaky but that
is ok.  I split it out to get review, then you complain it
is too big :-(