[v2] eal: add madvise to avoid dump memory

Message ID: 20200423154302.2217041-1-fengli@smartx.com
State: Superseded, archived
Delegated to: David Marchand
Series: [v2] eal: add madvise to avoid dump memory

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/iol-intel-Performance     success  Performance Testing PASS
ci/iol-nxp-Performance       success  Performance Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS
ci/travis-robot              success  Travis build: passed
ci/Intel-compilation         fail     Compilation issues
ci/iol-testing               fail     Testing issues

Commit Message

Li Feng April 23, 2020, 3:43 p.m. UTC
Avoid dumping all mapped memory to the core dump file when the application
crashes. Otherwise the file becomes very large and is hard to analyze with gdb.

In my test, 128GiB of memory was dumped to the core dump file when DPDK was
integrated into SPDK with the default configuration.

Signed-off-by: Li Feng <fengli@smartx.com>
---
 lib/librte_eal/common/eal_common_memory.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
  

Comments

Burakov, Anatoly April 23, 2020, 4:33 p.m. UTC | #1
On 23-Apr-20 4:43 PM, Li Feng wrote:
> Avoid dumping all mapped memory to the core dump file when the application
> crashes. Otherwise the file becomes very large and is hard to analyze with gdb.
>
> In my test, 128GiB of memory was dumped to the core dump file when DPDK was
> integrated into SPDK with the default configuration.

Suggested rewording:

Currently, even though memory is mapped with PROT_NONE, this does not 
cause it to be excluded from core dumps. This is counter-productive, 
because in a lot of cases, this memory will go unused (e.g. when the 
memory subsystem preallocates VA space but hasn't yet mapped physical 
pages into it).

Use `madvise()` call with MADV_DONTDUMP parameter to exclude the 
unmapped memory from being dumped.
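
As an illustration of the rewording above, here is a minimal standalone sketch, assuming Linux (where `MADV_DONTDUMP` is available). The function name `reserve_va_nodump` is hypothetical and is not part of EAL; it only shows the reserve-then-advise sequence outside of DPDK.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MADV_DONTDUMP and MAP_ANONYMOUS in glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve len bytes of address space with PROT_NONE, then ask the kernel
 * to leave the range out of core dumps.
 * Returns the reserved address, or NULL if the reservation itself failed.
 * The madvise() result is reported through advise_ret so that the caller
 * can log a warning instead of treating it as fatal, as the patch does.
 * reserve_va_nodump is a hypothetical name, not an EAL function.
 */
static void *
reserve_va_nodump(size_t len, int *advise_ret)
{
	void *addr;

	assert(len > 0);
	addr = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return NULL;

	/* Per the discussion above, PROT_NONE by itself is not enough. */
	*advise_ret = madvise(addr, len, MADV_DONTDUMP);
	return addr;
}
```

A caller would check the returned pointer first and only then look at `advise_ret`, since a failed `madvise()` leaves the reservation usable.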

> 
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
>   lib/librte_eal/common/eal_common_memory.c | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
> 
> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> index cc7d54e0c..2d9564b28 100644
> --- a/lib/librte_eal/common/eal_common_memory.c
> +++ b/lib/librte_eal/common/eal_common_memory.c
> @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
>   		after_len = RTE_PTR_DIFF(map_end, aligned_end);
>   		if (after_len > 0)
>   			munmap(aligned_end, after_len);
> +
> +		/*
> +		 * Exclude this pages from a core dump.
> +		 */
> +		if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> +			RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> +				strerror(errno));
> +	} else {
> +		/*
> +		 * Exclude this pages from a core dump.
> +		 */
> +		if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> +			RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> +				strerror(errno));
>   	}
>   
>   	return aligned_addr;
> 

For the contents of this patch,

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

However, even though this is good to have, after some more thought, i 
believe the fix is incomplete, because this is not the only place we're 
reserving anonymous memory. We're also doing so in 
`eal_memalloc.c:free_seg()`, so an `madvise()` call should also be added 
there.

@David, now that i think of it, the PROT_NONE patch also was incomplete, 
as we only set PROT_NONE to memory that's initially reserved, but not 
when it's unmapped and returned back to the pool of anonymous memory. 
So, eal_memalloc.c should also remap anonymous memory with PROT_NONE.
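
The free path described here could look roughly like the following sketch, assuming Linux. `release_to_reserved` is a hypothetical name used only for illustration; the real change would live in `eal_memalloc.c:free_seg()` and is not shown in this thread.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MADV_DONTDUMP and MAP_ANONYMOUS in glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Return a previously used range to the pool of reserved address space:
 * replace whatever is mapped there with inaccessible anonymous memory,
 * then exclude the range from core dumps again.
 * Returns 0 on success, -1 on failure (errno is set by mmap or madvise).
 * release_to_reserved is a hypothetical name, not an EAL function.
 */
static int
release_to_reserved(void *addr, size_t len)
{
	void *p;

	assert(addr != NULL && len > 0);
	/* MAP_FIXED replaces the existing mapping at exactly this address. */
	p = mmap(addr, len, PROT_NONE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Apply the dump exclusion to the new mapping as well. */
	if (madvise(addr, len, MADV_DONTDUMP) != 0)
		return -1;
	return 0;
}
```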

@Li Feng, would you be so kind as to provide a patch replacing PROT_READ 
with PROT_NONE in eal_memalloc.c as well? Thank you very much!
  
David Marchand April 23, 2020, 8:04 p.m. UTC | #2
On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
> > diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> > index cc7d54e0c..2d9564b28 100644
> > --- a/lib/librte_eal/common/eal_common_memory.c
> > +++ b/lib/librte_eal/common/eal_common_memory.c
> > @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
> >               after_len = RTE_PTR_DIFF(map_end, aligned_end);
> >               if (after_len > 0)
> >                       munmap(aligned_end, after_len);
> > +
> > +             /*
> > +              * Exclude this pages from a core dump.
> > +              */
> > +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> > +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> > +                             strerror(errno));
> > +     } else {
> > +             /*
> > +              * Exclude this pages from a core dump.
> > +              */
> > +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> > +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> > +                             strerror(errno));
> >       }
> >
> >       return aligned_addr;
> >
>
> For the contents of this patch,

MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
be a MADV_NOCORE option on FreeBSD.
1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/


>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> However, even though this is good to have, after some more thought, i
> believe the fix is incomplete, because this is not the only place we're
> reserving anonymous memory. We're also doing so in
> `eal_memalloc.c:free_seg()`, so an `madvise()` call should also be added
> there.
>
> @David, now that i think of it, the PROT_NONE patch also was incomplete,
> as we only set PROT_NONE to memory that's initially reserved, but not
> when it's unmapped and returned back to the pool of anonymous memory.
> So, eal_memalloc.c should also remap anonymous memory with PROT_NONE.

I can't disagree if you say so :-).

>
> @Li Feng, would you be so kind as to provide a patch replacing PROT_READ
> with PROT_NONE in eal_memalloc.c as well? Thank you very much!
>

Once we have the proper fixes, I'd like to get this Cc: stable@dpdk.org.
Thanks.
  
Burakov, Anatoly April 24, 2020, 9:12 a.m. UTC | #3
On 23-Apr-20 9:04 PM, David Marchand wrote:
> On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
> <anatoly.burakov@intel.com> wrote:
>>> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
>>> index cc7d54e0c..2d9564b28 100644
>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>> @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
>>>                after_len = RTE_PTR_DIFF(map_end, aligned_end);
>>>                if (after_len > 0)
>>>                        munmap(aligned_end, after_len);
>>> +
>>> +             /*
>>> +              * Exclude this pages from a core dump.
>>> +              */
>>> +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
>>> +                             strerror(errno));
>>> +     } else {
>>> +             /*
>>> +              * Exclude this pages from a core dump.
>>> +              */
>>> +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
>>> +                             strerror(errno));
>>>        }
>>>
>>>        return aligned_addr;
>>>
>>
>> For the contents of this patch,
> 
> MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
> be a MADV_NOCORE option on FreeBSD.
> 1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/
> 
> 

Oh, right, so this would probably not compile on FreeBSD. Perhaps this 
function would have to be OS-specific after all (or call into an 
OS-specific madvise() after reserving the memory area).

>>
>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>
>> However, even though this is good to have, after some more thought, i
>> believe the fix is incomplete, because this is not the only place we're
>> reserving anonymous memory. We're also doing so in
>> `eal_memalloc.c:free_seg()`, so an `madvise()` call should also be added
>> there.
>>
>> @David, now that i think of it, the PROT_NONE patch also was incomplete,
>> as we only set PROT_NONE to memory that's initially reserved, but not
>> when it's unmapped and returned back to the pool of anonymous memory.
>> So, eal_memalloc.c should also remap anonymous memory with PROT_NONE.
> 
> I can't disagree if you say so :-).

Nice to have that kind of power! *evil laugh*

> 
>>
>> @Li Feng, would you be so kind as to provide a patch replacing PROT_READ
>> with PROT_NONE in eal_memalloc.c as well? Thank you very much!
>>
> 
> Once we have the proper fixes, I'd like to get this Cc: stable@dpdk.org.
> Thanks.
> 
>
  
Bruce Richardson April 24, 2020, 9:14 a.m. UTC | #4
On Fri, Apr 24, 2020 at 10:12:10AM +0100, Burakov, Anatoly wrote:
> On 23-Apr-20 9:04 PM, David Marchand wrote:
> > On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
> > <anatoly.burakov@intel.com> wrote:
> > > > diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> > > > index cc7d54e0c..2d9564b28 100644
> > > > --- a/lib/librte_eal/common/eal_common_memory.c
> > > > +++ b/lib/librte_eal/common/eal_common_memory.c
> > > > @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
> > > >                after_len = RTE_PTR_DIFF(map_end, aligned_end);
> > > >                if (after_len > 0)
> > > >                        munmap(aligned_end, after_len);
> > > > +
> > > > +             /*
> > > > +              * Exclude this pages from a core dump.
> > > > +              */
> > > > +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> > > > +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> > > > +                             strerror(errno));
> > > > +     } else {
> > > > +             /*
> > > > +              * Exclude this pages from a core dump.
> > > > +              */
> > > > +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> > > > +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> > > > +                             strerror(errno));
> > > >        }
> > > > 
> > > >        return aligned_addr;
> > > > 
> > > 
> > > For the contents of this patch,
> > 
> > MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
> > be a MADV_NOCORE option on FreeBSD.
> > 1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/
> > 
> > 
> 
> Oh, right, so this would probably not compile on FreeBSD. Perhaps this
> function would have to be OS-specific after all (or call into an OS-specific
> madvise() after reserving the memory area).
> 

Is it just a differently named flag? If so, I think a single #ifdef macro
won't kill us in the common code.
  
Feng Li April 24, 2020, 9:33 a.m. UTC | #5
On Fri, Apr 24, 2020 at 5:14 PM Bruce Richardson <bruce.richardson@intel.com> wrote:
>
> On Fri, Apr 24, 2020 at 10:12:10AM +0100, Burakov, Anatoly wrote:
> > On 23-Apr-20 9:04 PM, David Marchand wrote:
> > > On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
> > > <anatoly.burakov@intel.com> wrote:
> > > > > diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> > > > > index cc7d54e0c..2d9564b28 100644
> > > > > --- a/lib/librte_eal/common/eal_common_memory.c
> > > > > +++ b/lib/librte_eal/common/eal_common_memory.c
> > > > > @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
> > > > >                after_len = RTE_PTR_DIFF(map_end, aligned_end);
> > > > >                if (after_len > 0)
> > > > >                        munmap(aligned_end, after_len);
> > > > > +
> > > > > +             /*
> > > > > +              * Exclude this pages from a core dump.
> > > > > +              */
> > > > > +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> > > > > +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> > > > > +                             strerror(errno));
> > > > > +     } else {
> > > > > +             /*
> > > > > +              * Exclude this pages from a core dump.
> > > > > +              */
> > > > > +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> > > > > +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> > > > > +                             strerror(errno));
> > > > >        }
> > > > >
> > > > >        return aligned_addr;
> > > > >
> > > >
> > > > For the contents of this patch,
> > >
> > > MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
> > > be a MADV_NOCORE option on FreeBSD.
> > > 1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/
> > >
> > >
> >
> > Oh, right, so this would probably not compile on FreeBSD. Perhaps this
> > function would have to be OS-specific after all (or call into an OS-specific
> > madvise() after reserving the memory area).
> >
>
> Is it just a differently named flag? If so, I think a single #ifdef macro
> won't kill us in the common code.
>
Just the flag name is different.
I should use RTE_EXEC_ENV_FREEBSD and RTE_EXEC_ENV_LINUX, right?

Another question: in `eal_memalloc.c:alloc_seg`, should I undo the
DONTDUMP of the memory region? @Anatoly

I have quickly prepared a patch for the OS-specific code:
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -443,4 +443,20 @@ rte_option_usage(void);
 uint64_t
 eal_get_baseaddr(void);

+/**
+ * @internal
+ * Exclude this pages from a core dump.
+ *
+ * @param addr
+ *  The memory region starts.
+ *
+ * @param len
+ *  The memory region length..
+ *
+ * @return
+ * returns 0 or -errno
+ */
+int
+eal_madvise_dontdump(void* addr, size_t len);
+
 #endif /* _EAL_PRIVATE_H_ */
diff --git a/lib/librte_eal/freebsd/eal_memory.c
b/lib/librte_eal/freebsd/eal_memory.c
index a97d8f0f0..585042dde 100644
--- a/lib/librte_eal/freebsd/eal_memory.c
+++ b/lib/librte_eal/freebsd/eal_memory.c
@@ -534,3 +534,9 @@ rte_eal_memseg_init(void)
  memseg_primary_init() :
  memseg_secondary_init();
 }
+
+int
+eal_madvise_dontdump(void* addr, size_t len)
+{
+ return madvise(addr, len, MADV_NOCORE);
+}
diff --git a/lib/librte_eal/linux/eal_memory.c
b/lib/librte_eal/linux/eal_memory.c
index 7a9c97ff8..cfdbfccfe 100644
--- a/lib/librte_eal/linux/eal_memory.c
+++ b/lib/librte_eal/linux/eal_memory.c
@@ -2479,3 +2479,9 @@ rte_eal_memseg_init(void)
 #endif
  memseg_secondary_init();
 }
+
+int
+eal_madvise_dontdump(void* addr, size_t len)
+{
+ return madvise(addr, len, MADV_DONTDUMP);
+}
  
Burakov, Anatoly April 24, 2020, 11 a.m. UTC | #6
On 24-Apr-20 10:33 AM, Feng Li wrote:
> On Fri, Apr 24, 2020 at 5:14 PM Bruce Richardson <bruce.richardson@intel.com> wrote:
>>
>> On Fri, Apr 24, 2020 at 10:12:10AM +0100, Burakov, Anatoly wrote:
>>> On 23-Apr-20 9:04 PM, David Marchand wrote:
>>>> On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
>>>> <anatoly.burakov@intel.com> wrote:
>>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
>>>>>> index cc7d54e0c..2d9564b28 100644
>>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>>>> @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
>>>>>>                 after_len = RTE_PTR_DIFF(map_end, aligned_end);
>>>>>>                 if (after_len > 0)
>>>>>>                         munmap(aligned_end, after_len);
>>>>>> +
>>>>>> +             /*
>>>>>> +              * Exclude this pages from a core dump.
>>>>>> +              */
>>>>>> +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
>>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
>>>>>> +                             strerror(errno));
>>>>>> +     } else {
>>>>>> +             /*
>>>>>> +              * Exclude this pages from a core dump.
>>>>>> +              */
>>>>>> +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
>>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
>>>>>> +                             strerror(errno));
>>>>>>         }
>>>>>>
>>>>>>         return aligned_addr;
>>>>>>
>>>>>
>>>>> For the contents of this patch,
>>>>
>>>> MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
>>>> be a MADV_NOCORE option on FreeBSD.
>>>> 1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/
>>>>
>>>>
>>>
>>> Oh, right, so this would probably not compile on FreeBSD. Perhaps this
>>> function would have to be OS-specific after all (or call into an OS-specific
>>> madvise() after reserving the memory area).
>>>
>>
>> Is it just a differently named flag? If so, I think a single #ifdef macro
>> won't kill us in the common code.
>>
> Just the flag name is different.
> I should use RTE_EXEC_ENV_FREEBSD and RTE_EXEC_ENV_LINUX, right?

Yes, but we need this in two places, so a function call is still necessary.
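
A single helper with the `#ifdef` inside it might look like the sketch below. It is a standalone approximation: because `RTE_EXEC_ENV_LINUX` and `RTE_EXEC_ENV_FREEBSD` only exist inside a DPDK build, the sketch keys on the flag macros themselves, and the name `mem_exclude_from_dump` is hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MADV_DONTDUMP and MAP_ANONYMOUS in glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Exclude [addr, addr + len) from core dumps using whichever flag the
 * platform provides. Returns the madvise() result (0 on success, -1 with
 * errno set on failure), or 0 if no equivalent flag is known.
 * mem_exclude_from_dump is a hypothetical name, not an EAL function.
 */
static int
mem_exclude_from_dump(void *addr, size_t len)
{
	assert(addr != NULL && len > 0);
#if defined(MADV_DONTDUMP)
	/* Linux spelling of the flag. */
	return madvise(addr, len, MADV_DONTDUMP);
#elif defined(MADV_NOCORE)
	/* FreeBSD spelling of the flag. */
	return madvise(addr, len, MADV_NOCORE);
#else
	/* Assumption: silently do nothing where neither flag exists. */
	(void)addr;
	(void)len;
	return 0;
#endif
}
```

Both call sites (the initial reservation and the free path) could then share this one function, which keeps the conditional compilation in a single place.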

> 
> Another question: in `eal_memalloc.c:alloc_seg`, should I undo the
> DONTDUMP of the memory region? @Anatoly

I don't think it's necessary. When you map different memory into that 
region, madvise() flags no longer apply. To be sure, i just tested this 
by adding another mmap() call after madvise() (in your test app) and 
remapping the same memory with MAP_FIXED, and the core dump was back to 
1GB of size. So, no, i don't think you should undo anything - the system 
does so automatically.
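
The experiment described above can be approximated without producing a core dump. The sketch below is Linux-only and rests on one assumption: that the kernel reports per-mapping flags on the `VmFlags` line of `/proc/self/smaps`, where `dd` marks a range excluded from core dumps. Both function names are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MADV_DONTDUMP and MAP_ANONYMOUS in glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Return 1 if the mapping containing addr carries the "dd" flag in
 * /proc/self/smaps, 0 if it does not, -1 if this cannot be determined.
 */
static int
range_has_dontdump(const void *addr)
{
	FILE *f = fopen("/proc/self/smaps", "r");
	char line[1024];
	unsigned long start, end, a = (unsigned long)(uintptr_t)addr;
	int in_target = 0, found = -1;

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			/* Header line of a mapping: "start-end perms ..." */
			in_target = (a >= start && a < end);
		} else if (in_target && strncmp(line, "VmFlags:", 8) == 0) {
			/* Flags are two-letter tokens separated by spaces. */
			found = (strstr(line, " dd") != NULL);
			break;
		}
	}
	fclose(f);
	return found;
}

/*
 * Reserve len bytes, mark them MADV_DONTDUMP, then map fresh memory over
 * the same range with MAP_FIXED, recording the flag state before and after.
 * Returns 0 on success, -1 if any mapping call failed.
 */
static int
dontdump_across_remap(size_t len, int *before, int *after)
{
	void *addr;

	assert(len > 0);
	addr = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -1;
	if (madvise(addr, len, MADV_DONTDUMP) != 0) {
		munmap(addr, len);
		return -1;
	}
	*before = range_has_dontdump(addr);

	/* Map different memory into the same region, as in the test above. */
	if (mmap(addr, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED) {
		munmap(addr, len);
		return -1;
	}
	*after = range_has_dontdump(addr);
	return munmap(addr, len);
}
```

If the flag is set before the remap and gone afterwards, that is consistent with the observation that nothing needs to be undone in `alloc_seg`.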

> 
> I have quickly prepared a patch for the OS-specific code:
> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> @@ -443,4 +443,20 @@ rte_option_usage(void);
>   uint64_t
>   eal_get_baseaddr(void);
> 
> +/**
> + * @internal
> + * Exclude this pages from a core dump.
> + *
> + * @param addr
> + *  The memory region starts.
> + *
> + * @param len
> + *  The memory region length..
> + *
> + * @return
> + * returns 0 or -errno
> + */
> +int
> +eal_madvise_dontdump(void* addr, size_t len);
> +
>   #endif /* _EAL_PRIVATE_H_ */
> diff --git a/lib/librte_eal/freebsd/eal_memory.c
> b/lib/librte_eal/freebsd/eal_memory.c
> index a97d8f0f0..585042dde 100644
> --- a/lib/librte_eal/freebsd/eal_memory.c
> +++ b/lib/librte_eal/freebsd/eal_memory.c
> @@ -534,3 +534,9 @@ rte_eal_memseg_init(void)
>    memseg_primary_init() :
>    memseg_secondary_init();
>   }
> +
> +int
> +eal_madvise_dontdump(void* addr, size_t len)
> +{
> + return madvise(addr, len, MADV_NOCORE);
> +}
> diff --git a/lib/librte_eal/linux/eal_memory.c
> b/lib/librte_eal/linux/eal_memory.c
> index 7a9c97ff8..cfdbfccfe 100644
> --- a/lib/librte_eal/linux/eal_memory.c
> +++ b/lib/librte_eal/linux/eal_memory.c
> @@ -2479,3 +2479,9 @@ rte_eal_memseg_init(void)
>   #endif
>    memseg_secondary_init();
>   }
> +
> +int
> +eal_madvise_dontdump(void* addr, size_t len)
> +{
> + return madvise(addr, len, MADV_DONTDUMP);
> +}
> 

That would work as well (with added FreeBSD code of course), however if 
everyone else is OK with it, i'll settle for an #ifdef in common code.
  
Li Feng April 24, 2020, 12:03 p.m. UTC | #7
Thanks,

Feng Li

On Fri, Apr 24, 2020 at 7:00 PM Burakov, Anatoly <anatoly.burakov@intel.com> wrote:
>
> On 24-Apr-20 10:33 AM, Feng Li wrote:
> > On Fri, Apr 24, 2020 at 5:14 PM Bruce Richardson <bruce.richardson@intel.com> wrote:
> >>
> >> On Fri, Apr 24, 2020 at 10:12:10AM +0100, Burakov, Anatoly wrote:
> >>> On 23-Apr-20 9:04 PM, David Marchand wrote:
> >>>> On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
> >>>> <anatoly.burakov@intel.com> wrote:
> >>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> >>>>>> index cc7d54e0c..2d9564b28 100644
> >>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
> >>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
> >>>>>> @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
> >>>>>>                 after_len = RTE_PTR_DIFF(map_end, aligned_end);
> >>>>>>                 if (after_len > 0)
> >>>>>>                         munmap(aligned_end, after_len);
> >>>>>> +
> >>>>>> +             /*
> >>>>>> +              * Exclude this pages from a core dump.
> >>>>>> +              */
> >>>>>> +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> >>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> >>>>>> +                             strerror(errno));
> >>>>>> +     } else {
> >>>>>> +             /*
> >>>>>> +              * Exclude this pages from a core dump.
> >>>>>> +              */
> >>>>>> +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> >>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> >>>>>> +                             strerror(errno));
> >>>>>>         }
> >>>>>>
> >>>>>>         return aligned_addr;
> >>>>>>
> >>>>>
> >>>>> For the contents of this patch,
> >>>>
> >>>> MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
> >>>> be a MADV_NOCORE option on FreeBSD.
> >>>> 1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/
> >>>>
> >>>>
> >>>
> >>> Oh, right, so this would probably not compile on FreeBSD. Perhaps this
> >>> function would have to be OS-specific after all (or call into an OS-specific
> >>> madvise() after reserving the memory area).
> >>>
> >>
> >> Is it just a differently named flag? If so, I think a single #ifdef macro
> >> won't kill us in the common code.
> >>
> > Just the flag name is different.
> > I should use RTE_EXEC_ENV_FREEBSD and RTE_EXEC_ENV_LINUX, right?
>
> Yes, but we need this in two places, so a function call is still necessary.
>
> >
> > Another question: in `eal_memalloc.c:alloc_seg`, should I undo the
> > DONTDUMP of the memory region? @Anatoly
>
> I don't think it's necessary. When you map different memory into that
> region, madvise() flags no longer apply. To be sure, i just tested this
> by adding another mmap() call after madvise() (in your test app) and
> remapping the same memory with MAP_FIXED, and the core dump was back to
> 1GB of size. So, no, i don't think you should undo anything - the system
> does so automatically.
Got it.
>
> >
> > I have quickly prepared a patch for the OS-specific code:
> > --- a/lib/librte_eal/common/eal_private.h
> > +++ b/lib/librte_eal/common/eal_private.h
> > @@ -443,4 +443,20 @@ rte_option_usage(void);
> >   uint64_t
> >   eal_get_baseaddr(void);
> >
> > +/**
> > + * @internal
> > + * Exclude this pages from a core dump.
> > + *
> > + * @param addr
> > + *  The memory region starts.
> > + *
> > + * @param len
> > + *  The memory region length..
> > + *
> > + * @return
> > + * returns 0 or -errno
> > + */
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len);
> > +
> >   #endif /* _EAL_PRIVATE_H_ */
> > diff --git a/lib/librte_eal/freebsd/eal_memory.c
> > b/lib/librte_eal/freebsd/eal_memory.c
> > index a97d8f0f0..585042dde 100644
> > --- a/lib/librte_eal/freebsd/eal_memory.c
> > +++ b/lib/librte_eal/freebsd/eal_memory.c
> > @@ -534,3 +534,9 @@ rte_eal_memseg_init(void)
> >    memseg_primary_init() :
> >    memseg_secondary_init();
> >   }
> > +
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len)
> > +{
> > + return madvise(addr, len, MADV_NOCORE);
> > +}
> > diff --git a/lib/librte_eal/linux/eal_memory.c
> > b/lib/librte_eal/linux/eal_memory.c
> > index 7a9c97ff8..cfdbfccfe 100644
> > --- a/lib/librte_eal/linux/eal_memory.c
> > +++ b/lib/librte_eal/linux/eal_memory.c
> > @@ -2479,3 +2479,9 @@ rte_eal_memseg_init(void)
> >   #endif
> >    memseg_secondary_init();
> >   }
> > +
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len)
> > +{
> > + return madvise(addr, len, MADV_DONTDUMP);
> > +}
> >
>
> That would work as well (with added FreeBSD code of course), however if
> everyone else is OK with it, i'll settle for an #ifdef in common code.
>
> --
> Thanks,
> Anatoly
  

Patch

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index cc7d54e0c..2d9564b28 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -177,6 +177,20 @@  eal_get_virtual_area(void *requested_addr, size_t *size,
 		after_len = RTE_PTR_DIFF(map_end, aligned_end);
 		if (after_len > 0)
 			munmap(aligned_end, after_len);
+
+		/*
+		 * Exclude this pages from a core dump.
+		 */
+		if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
+			RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
+				strerror(errno));
+	} else {
+		/*
+		 * Exclude this pages from a core dump.
+		 */
+		if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
+			RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
+				strerror(errno));
 	}
 
 	return aligned_addr;