[v2,1/6] eal: introduce oops handling API

Message ID 20210817032723.3997054-2-jerinj@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series support oops handling |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Jerin Jacob Kollanukkaran Aug. 17, 2021, 3:27 a.m. UTC
  From: Jerin Jacob <jerinj@marvell.com>

Introducing oops handling API with following specification
and enable stub implementation for Linux and FreeBSD.

On rte_eal_init() invocation, the EAL library installs the
oops handler for the essential signals.
The rte_oops_signals_enabled() API provides the list
of signals the library installed by the EAL.

The default EAL oops handler decodes the oops message using
rte_oops_decode() and then calls the signal handler
installed by the application before invoking the rte_eal_init().
This scheme will also enable the use of the default coredump
handler(for gdb etc.) provided by OS if the application does
not install any specific signal handler.

The second case where the application installs the signal
handler after the rte_eal_init() invocation, rte_oops_decode()
provides the means of decoding the oops message in
the application's fault handler.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 doc/api/doxy-api-index.md    |   3 +-
 lib/eal/common/eal_private.h |   3 ++
 lib/eal/freebsd/eal.c        |   6 +++
 lib/eal/include/meson.build  |   1 +
 lib/eal/include/rte_oops.h   | 100 +++++++++++++++++++++++++++++++++++
 lib/eal/linux/eal.c          |   6 +++
 lib/eal/unix/eal_oops.c      |  36 +++++++++++++
 lib/eal/unix/meson.build     |   1 +
 lib/eal/version.map          |   4 ++
 9 files changed, 159 insertions(+), 1 deletion(-)
 create mode 100644 lib/eal/include/rte_oops.h
 create mode 100644 lib/eal/unix/eal_oops.c
  

Comments

Stephen Hemminger Aug. 17, 2021, 3:53 a.m. UTC | #1
On Tue, 17 Aug 2021 08:57:18 +0530
<jerinj@marvell.com> wrote:

> From: Jerin Jacob <jerinj@marvell.com>
> 
> Introducing oops handling API with following specification
> and enable stub implementation for Linux and FreeBSD.
> 
> On rte_eal_init() invocation, the EAL library installs the
> oops handler for the essential signals.
> The rte_oops_signals_enabled() API provides the list
> of signals the library installed by the EAL.

This is a big change, and many applications already handle these
signals themselves. Therefore adding this needs to be opt-in
and not enabled by default.
  
Jerin Jacob Aug. 17, 2021, 7:38 a.m. UTC | #2
On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 17 Aug 2021 08:57:18 +0530
> <jerinj@marvell.com> wrote:
>
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introducing oops handling API with following specification
> > and enable stub implementation for Linux and FreeBSD.
> >
> > On rte_eal_init() invocation, the EAL library installs the
> > oops handler for the essential signals.
> > The rte_oops_signals_enabled() API provides the list
> > of signals the library installed by the EAL.
>
> This is a big change, and many applications already handle these
> signals themselves. Therefore adding this needs to be opt-in
> and not enabled by default.

In order to avoid every application explicitly register this
sighandler and to cater to the
co-existing application-specific signal-hander usage.
The following design has been chosen. (It is mentioned in the commit log,
I will describe here for more clarity)

Case 1:
a) The application installs the signal handler prior to rte_eal_init().
b) Implementation stores the application-specific signal and replace a
signal handler as oops eal handler
c) when application/DPDK get the segfault, the default EAL oops
handler gets invoked
d) Then it dumps the EAL specific message, it calls the
application-specific signal handler
installed in step 1 by application. This avoids breaking any contract
with the application.
i.e Behavior is the same current EAL now.
That is the reason for not using SA_RESETHAND(which call SIG_DFL after
eal oops handler instead
application-specific handler)

Case 2:
a) The application install the signal handler after rte_eal_init(),
b) EAL hander get replaced with application handle then the application can call
rte_oops_decode() to decode.

In order to cater the above use case, rte_oops_signals_enabled() and
rte_oops_decode()
provided.

Here we are not breaking any contract with the application.
Do you have concerns about this design?
  
Stephen Hemminger Aug. 17, 2021, 3:09 p.m. UTC | #3
On Tue, 17 Aug 2021 13:08:46 +0530
Jerin Jacob <jerinjacobk@gmail.com> wrote:

> On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Tue, 17 Aug 2021 08:57:18 +0530
> > <jerinj@marvell.com> wrote:
> >  
> > > From: Jerin Jacob <jerinj@marvell.com>
> > >
> > > Introducing oops handling API with following specification
> > > and enable stub implementation for Linux and FreeBSD.
> > >
> > > On rte_eal_init() invocation, the EAL library installs the
> > > oops handler for the essential signals.
> > > The rte_oops_signals_enabled() API provides the list
> > > of signals the library installed by the EAL.  
> >
> > This is a big change, and many applications already handle these
> > signals themselves. Therefore adding this needs to be opt-in
> > and not enabled by default.  
> 
> In order to avoid every application explicitly register this
> sighandler and to cater to the
> co-existing application-specific signal-hander usage.
> The following design has been chosen. (It is mentioned in the commit log,
> I will describe here for more clarity)
> 
> Case 1:
> a) The application installs the signal handler prior to rte_eal_init().
> b) Implementation stores the application-specific signal and replace a
> signal handler as oops eal handler
> c) when application/DPDK get the segfault, the default EAL oops
> handler gets invoked
> d) Then it dumps the EAL specific message, it calls the
> application-specific signal handler
> installed in step 1 by application. This avoids breaking any contract
> with the application.
> i.e Behavior is the same current EAL now.
> That is the reason for not using SA_RESETHAND(which call SIG_DFL after
> eal oops handler instead
> application-specific handler)
> 
> Case 2:
> a) The application install the signal handler after rte_eal_init(),
> b) EAL hander get replaced with application handle then the application can call
> rte_oops_decode() to decode.
> 
> In order to cater the above use case, rte_oops_signals_enabled() and
> rte_oops_decode()
> provided.
> 
> Here we are not breaking any contract with the application.
> Do you have concerns about this design?

In our application as a service it is important not to do any backtrace
in production. We rely on other infrastructure to process coredumps.

This should be controlled enabled by a command line argument.
  
Jerin Jacob Aug. 17, 2021, 3:27 p.m. UTC | #4
On Tue, Aug 17, 2021 at 8:39 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 17 Aug 2021 13:08:46 +0530
> Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> > On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Tue, 17 Aug 2021 08:57:18 +0530
> > > <jerinj@marvell.com> wrote:
> > >
> > > > From: Jerin Jacob <jerinj@marvell.com>
> > > >
> > > > Introducing oops handling API with following specification
> > > > and enable stub implementation for Linux and FreeBSD.
> > > >
> > > > On rte_eal_init() invocation, the EAL library installs the
> > > > oops handler for the essential signals.
> > > > The rte_oops_signals_enabled() API provides the list
> > > > of signals the library installed by the EAL.
> > >
> > > This is a big change, and many applications already handle these
> > > signals themselves. Therefore adding this needs to be opt-in
> > > and not enabled by default.
> >
> > In order to avoid every application explicitly register this
> > sighandler and to cater to the
> > co-existing application-specific signal-hander usage.
> > The following design has been chosen. (It is mentioned in the commit log,
> > I will describe here for more clarity)
> >
> > Case 1:
> > a) The application installs the signal handler prior to rte_eal_init().
> > b) Implementation stores the application-specific signal and replace a
> > signal handler as oops eal handler
> > c) when application/DPDK get the segfault, the default EAL oops
> > handler gets invoked
> > d) Then it dumps the EAL specific message, it calls the
> > application-specific signal handler
> > installed in step 1 by application. This avoids breaking any contract
> > with the application.
> > i.e Behavior is the same current EAL now.
> > That is the reason for not using SA_RESETHAND(which call SIG_DFL after
> > eal oops handler instead
> > application-specific handler)
> >
> > Case 2:
> > a) The application install the signal handler after rte_eal_init(),
> > b) EAL hander get replaced with application handle then the application can call
> > rte_oops_decode() to decode.
> >
> > In order to cater the above use case, rte_oops_signals_enabled() and
> > rte_oops_decode()
> > provided.
> >
> > Here we are not breaking any contract with the application.
> > Do you have concerns about this design?
>
> In our application as a service it is important not to do any backtrace
> in production. We rely on other infrastructure to process coredumps.

Other infrastructure will work. For example, If we are using standard coredump
using linux infra. In Current implementation,
- EAL handler dump the DPDK OOPS like kernel on stderr
- Implementation calls SIG_DFL in eal oops handler
- The above step creates the coredump or re-directs any other
infrastructure you are using for coredump.

>
> This should be controlled enabled by a command line argument.

If we allow other infrastructure coredump to work as-is, why
enable/disable required from eal?
  
Stephen Hemminger Aug. 17, 2021, 3:52 p.m. UTC | #5
On Tue, 17 Aug 2021 20:57:50 +0530
Jerin Jacob <jerinjacobk@gmail.com> wrote:

> On Tue, Aug 17, 2021 at 8:39 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Tue, 17 Aug 2021 13:08:46 +0530
> > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> >  
> > > On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:  
> > > >
> > > > On Tue, 17 Aug 2021 08:57:18 +0530
> > > > <jerinj@marvell.com> wrote:
> > > >  
> > > > > From: Jerin Jacob <jerinj@marvell.com>
> > > > >
> > > > > Introducing oops handling API with following specification
> > > > > and enable stub implementation for Linux and FreeBSD.
> > > > >
> > > > > On rte_eal_init() invocation, the EAL library installs the
> > > > > oops handler for the essential signals.
> > > > > The rte_oops_signals_enabled() API provides the list
> > > > > of signals the library installed by the EAL.  
> > > >
> > > > This is a big change, and many applications already handle these
> > > > signals themselves. Therefore adding this needs to be opt-in
> > > > and not enabled by default.  
> > >
> > > In order to avoid every application explicitly register this
> > > sighandler and to cater to the
> > > co-existing application-specific signal-hander usage.
> > > The following design has been chosen. (It is mentioned in the commit log,
> > > I will describe here for more clarity)
> > >
> > > Case 1:
> > > a) The application installs the signal handler prior to rte_eal_init().
> > > b) Implementation stores the application-specific signal and replace a
> > > signal handler as oops eal handler
> > > c) when application/DPDK get the segfault, the default EAL oops
> > > handler gets invoked
> > > d) Then it dumps the EAL specific message, it calls the
> > > application-specific signal handler
> > > installed in step 1 by application. This avoids breaking any contract
> > > with the application.
> > > i.e Behavior is the same current EAL now.
> > > That is the reason for not using SA_RESETHAND(which call SIG_DFL after
> > > eal oops handler instead
> > > application-specific handler)
> > >
> > > Case 2:
> > > a) The application install the signal handler after rte_eal_init(),
> > > b) EAL hander get replaced with application handle then the application can call
> > > rte_oops_decode() to decode.
> > >
> > > In order to cater the above use case, rte_oops_signals_enabled() and
> > > rte_oops_decode()
> > > provided.
> > >
> > > Here we are not breaking any contract with the application.
> > > Do you have concerns about this design?  
> >
> > In our application as a service it is important not to do any backtrace
> > in production. We rely on other infrastructure to process coredumps.  
> 
> Other infrastructure will work. For example, If we are using standard coredump
> using linux infra. In Current implementation,
> - EAL handler dump the DPDK OOPS like kernel on stderr
> - Implementation calls SIG_DFL in eal oops handler
> - The above step creates the coredump or re-directs any other
> infrastructure you are using for coredump.
> 
> >
> > This should be controlled enabled by a command line argument.  
> 
> If we allow other infrastructure coredump to work as-is, why
> enable/disable required from eal?

The addition of DPDK OOPS adds additional steps which make all
faults be identified as the oops code.
  
Jerin Jacob Aug. 18, 2021, 9:37 a.m. UTC | #6
On Tue, Aug 17, 2021 at 9:22 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 17 Aug 2021 20:57:50 +0530
> Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> > On Tue, Aug 17, 2021 at 8:39 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Tue, 17 Aug 2021 13:08:46 +0530
> > > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > >
> > > > On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> > > > <stephen@networkplumber.org> wrote:
> > > > >
> > > > > On Tue, 17 Aug 2021 08:57:18 +0530
> > > > > <jerinj@marvell.com> wrote:
> > > > >
> > > > > > From: Jerin Jacob <jerinj@marvell.com>
> > > > > >
> > > > > > Introducing oops handling API with following specification
> > > > > > and enable stub implementation for Linux and FreeBSD.
> > > > > >
> > > > > > On rte_eal_init() invocation, the EAL library installs the
> > > > > > oops handler for the essential signals.
> > > > > > The rte_oops_signals_enabled() API provides the list
> > > > > > of signals the library installed by the EAL.
> > > > >
> > > > > This is a big change, and many applications already handle these
> > > > > signals themselves. Therefore adding this needs to be opt-in
> > > > > and not enabled by default.
> > > >
> > > > In order to avoid every application explicitly register this
> > > > sighandler and to cater to the
> > > > co-existing application-specific signal-hander usage.
> > > > The following design has been chosen. (It is mentioned in the commit log,
> > > > I will describe here for more clarity)
> > > >
> > > > Case 1:
> > > > a) The application installs the signal handler prior to rte_eal_init().
> > > > b) Implementation stores the application-specific signal and replace a
> > > > signal handler as oops eal handler
> > > > c) when application/DPDK get the segfault, the default EAL oops
> > > > handler gets invoked
> > > > d) Then it dumps the EAL specific message, it calls the
> > > > application-specific signal handler
> > > > installed in step 1 by application. This avoids breaking any contract
> > > > with the application.
> > > > i.e Behavior is the same current EAL now.
> > > > That is the reason for not using SA_RESETHAND(which call SIG_DFL after
> > > > eal oops handler instead
> > > > application-specific handler)
> > > >
> > > > Case 2:
> > > > a) The application install the signal handler after rte_eal_init(),
> > > > b) EAL hander get replaced with application handle then the application can call
> > > > rte_oops_decode() to decode.
> > > >
> > > > In order to cater the above use case, rte_oops_signals_enabled() and
> > > > rte_oops_decode()
> > > > provided.
> > > >
> > > > Here we are not breaking any contract with the application.
> > > > Do you have concerns about this design?
> > >
> > > In our application as a service it is important not to do any backtrace
> > > in production. We rely on other infrastructure to process coredumps.
> >
> > Other infrastructure will work. For example, If we are using standard coredump
> > using linux infra. In Current implementation,
> > - EAL handler dump the DPDK OOPS like kernel on stderr
> > - Implementation calls SIG_DFL in eal oops handler
> > - The above step creates the coredump or re-directs any other
> > infrastructure you are using for coredump.
> >
> > >
> > > This should be controlled enabled by a command line argument.
> >
> > If we allow other infrastructure coredump to work as-is, why
> > enable/disable required from eal?
>
> The addition of DPDK OOPS adds additional steps which make all
> faults be identified as the oops code.

Since we are using SA_ONSTACK it is not losing the original segfault
info.

I verified like this, Please find below the steps.

0) Enable coredump infra in Linux using coredumpctl or so
1) Apply this series
2) Apply for the following patch to create a segfault from the library.
This will test, segfault caught by eal and forward to default Linux singal
handler.

[main]dell[dpdk.org] $ git diff
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3438a96b75..b935c32c98 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1338,6 +1338,8 @@ rte_eal_init(int argc, char **argv)

        eal_mcfg_complete();

+       /* Generate a segfault */
+       *(volatile int *)0x05 = 0;
        return fctret;

 }
3)Build
meson --buildtype debug build
ninja -C build

4) Run
$ ./build/app/test/dpdk-test --no-huge  -c 0x2

Please find oops dump[1] and gdb core dump backtrace[2].
Gdb core dump trace preserves the original segfault cause and trace.

Any other concerns?


[1]
[main]dell[dpdk.org] $ ./build/app/test/dpdk-test --no-huge  -c 0x2
EAL: Detected 56 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Static memory layout is selected, amount of reserved memory can
be adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: WARNING: Main core has no memory on local socket!
Signal info:
------------
PID:           2666512
Signal number: 11
Fault address: 0x5

Backtrace:
----------
[  0x5582acd1e08a]: rte_eal_init()+0xe18
[  0x5582ac086f4e]: main()+0x298
[  0x7f0facf1fb25]: __libc_start_main()+0xd5
[  0x5582ac079c9e]: _start()+0x2e

Arch info:
----------
R8 : 0x0000000000000002  R9 : 0x00007ffe9273c590
R10: 0x0000000000000000  R11: 0x0000000000000246
R12: 0x00005582bc3ce7a0  R13: 0x00000000000000ca
R14: 0x0000000000000000  R15: 0x0000000000000000
RAX: 0x0000000000000005  RBX: 0x00005582bc3c75c8
RCX: 0x00007ffe9273c530  RDX: 0x0000000000000000
RBP: 0x00007ffe9273c820  RSP: 0x00007ffe9273c690
RSI: 0x0000000000000008  RDI: 0x00000000000000ca
RIP: 0x00005582acd1e08a  EFL: 0x0000000000010246


[2]

Core was generated by `./build/app/test/dpdk-test --no-huge -c 0x2'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  rte_eal_init (argc=4, argv=0x7ffe9273cec8) at ../lib/eal/linux/eal.c:1342
1342            *(volatile int *)0x05 = 0;
[Current thread is 1 (Thread 0x7f0faca83c00 (LWP 2666512))]
(gdb) bt
#0  rte_eal_init (argc=4, argv=0x7ffe9273cec8) at ../lib/eal/linux/eal.c:1342
#1  0x00005582ac086f4e in main (argc=4, argv=0x7ffe9273cec8) at
../app/test/test.c:146




>
  
Stephen Hemminger Aug. 18, 2021, 4:46 p.m. UTC | #7
On Wed, 18 Aug 2021 15:07:25 +0530
Jerin Jacob <jerinjacobk@gmail.com> wrote:

> On Tue, Aug 17, 2021 at 9:22 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Tue, 17 Aug 2021 20:57:50 +0530
> > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> >  
> > > On Tue, Aug 17, 2021 at 8:39 PM Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:  
> > > >
> > > > On Tue, 17 Aug 2021 13:08:46 +0530
> > > > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > > >  
> > > > > On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> > > > > <stephen@networkplumber.org> wrote:  
> > > > > >
> > > > > > On Tue, 17 Aug 2021 08:57:18 +0530
> > > > > > <jerinj@marvell.com> wrote:
> > > > > >  
> > > > > > > From: Jerin Jacob <jerinj@marvell.com>
> > > > > > >
> > > > > > > Introducing oops handling API with following specification
> > > > > > > and enable stub implementation for Linux and FreeBSD.
> > > > > > >
> > > > > > > On rte_eal_init() invocation, the EAL library installs the
> > > > > > > oops handler for the essential signals.
> > > > > > > The rte_oops_signals_enabled() API provides the list
> > > > > > > of signals the library installed by the EAL.  
> > > > > >
> > > > > > This is a big change, and many applications already handle these
> > > > > > signals themselves. Therefore adding this needs to be opt-in
> > > > > > and not enabled by default.  
> > > > >
> > > > > In order to avoid every application explicitly register this
> > > > > sighandler and to cater to the
> > > > > co-existing application-specific signal-hander usage.
> > > > > The following design has been chosen. (It is mentioned in the commit log,
> > > > > I will describe here for more clarity)
> > > > >
> > > > > Case 1:
> > > > > a) The application installs the signal handler prior to rte_eal_init().
> > > > > b) Implementation stores the application-specific signal and replace a
> > > > > signal handler as oops eal handler
> > > > > c) when application/DPDK get the segfault, the default EAL oops
> > > > > handler gets invoked
> > > > > d) Then it dumps the EAL specific message, it calls the
> > > > > application-specific signal handler
> > > > > installed in step 1 by application. This avoids breaking any contract
> > > > > with the application.
> > > > > i.e Behavior is the same current EAL now.
> > > > > That is the reason for not using SA_RESETHAND(which call SIG_DFL after
> > > > > eal oops handler instead
> > > > > application-specific handler)
> > > > >
> > > > > Case 2:
> > > > > a) The application install the signal handler after rte_eal_init(),
> > > > > b) EAL hander get replaced with application handle then the application can call
> > > > > rte_oops_decode() to decode.
> > > > >
> > > > > In order to cater the above use case, rte_oops_signals_enabled() and
> > > > > rte_oops_decode()
> > > > > provided.
> > > > >
> > > > > Here we are not breaking any contract with the application.
> > > > > Do you have concerns about this design?  
> > > >
> > > > In our application as a service it is important not to do any backtrace
> > > > in production. We rely on other infrastructure to process coredumps.  
> > >
> > > Other infrastructure will work. For example, If we are using standard coredump
> > > using linux infra. In Current implementation,
> > > - EAL handler dump the DPDK OOPS like kernel on stderr
> > > - Implementation calls SIG_DFL in eal oops handler
> > > - The above step creates the coredump or re-directs any other
> > > infrastructure you are using for coredump.
> > >  
> > > >
> > > > This should be controlled enabled by a command line argument.  
> > >
> > > If we allow other infrastructure coredump to work as-is, why
> > > enable/disable required from eal?  
> >
> > The addition of DPDK OOPS adds additional steps which make all
> > faults be identified as the oops code.  
> 
> Since we are using SA_ONSTACK it is not losing the original segfault
> info.
> 
> I verified like this, Please find below the steps.
> 
> 0) Enable coredump infra in Linux using coredumpctl or so
> 1) Apply this series
> 2) Apply for the following patch to create a segfault from the library.
> This will test, segfault caught by eal and forward to default Linux singal
> handler.
> 
> [main]dell[dpdk.org] $ git diff
> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> index 3438a96b75..b935c32c98 100644
> --- a/lib/eal/linux/eal.c
> +++ b/lib/eal/linux/eal.c
> @@ -1338,6 +1338,8 @@ rte_eal_init(int argc, char **argv)
> 
>         eal_mcfg_complete();
> 
> +       /* Generate a segfault */
> +       *(volatile int *)0x05 = 0;
>         return fctret;
> 
>  }
> 3)Build
> meson --buildtype debug build
> ninja -C build
> 
> 4) Run
> $ ./build/app/test/dpdk-test --no-huge  -c 0x2
> 
> Please find oops dump[1] and gdb core dump backtrace[2].
> Gdb core dump trace preserves the original segfault cause and trace.
> 
> Any other concerns?

Your new oops handling duplicates existing code in our application
(and I know others that do this as well). The problem is that an
application may do this before calling rte_eal_init and your new
code will break that.

Therefore my recommendation is that the new oops handling needs
to be not a built in feature of EAL.
  
Jerin Jacob Aug. 18, 2021, 6:04 p.m. UTC | #8
On Wed, Aug 18, 2021 at 10:16 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 18 Aug 2021 15:07:25 +0530
> Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> > On Tue, Aug 17, 2021 at 9:22 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Tue, 17 Aug 2021 20:57:50 +0530
> > > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > >
> > > > On Tue, Aug 17, 2021 at 8:39 PM Stephen Hemminger
> > > > <stephen@networkplumber.org> wrote:
> > > > >
> > > > > On Tue, 17 Aug 2021 13:08:46 +0530
> > > > > Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > > > >
> > > > > > On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> > > > > > <stephen@networkplumber.org> wrote:
> > > > > > >
> > > > > > > On Tue, 17 Aug 2021 08:57:18 +0530
> > > > > > > <jerinj@marvell.com> wrote:
> > > > > > >
> > > > > > > > From: Jerin Jacob <jerinj@marvell.com>
> > > > > > > >
> > > > > > > > Introducing oops handling API with following specification
> > > > > > > > and enable stub implementation for Linux and FreeBSD.
> > > > > > > >
> > > > > > > > On rte_eal_init() invocation, the EAL library installs the
> > > > > > > > oops handler for the essential signals.
> > > > > > > > The rte_oops_signals_enabled() API provides the list
> > > > > > > > of signals the library installed by the EAL.
> > > > > > >
> > > > > > > This is a big change, and many applications already handle these
> > > > > > > signals themselves. Therefore adding this needs to be opt-in
> > > > > > > and not enabled by default.
> > > > > >
> > > > > > In order to avoid every application explicitly register this
> > > > > > sighandler and to cater to the
> > > > > > co-existing application-specific signal-hander usage.
> > > > > > The following design has been chosen. (It is mentioned in the commit log,
> > > > > > I will describe here for more clarity)
> > > > > >
> > > > > > Case 1:
> > > > > > a) The application installs the signal handler prior to rte_eal_init().
> > > > > > b) Implementation stores the application-specific signal and replace a
> > > > > > signal handler as oops eal handler
> > > > > > c) when application/DPDK get the segfault, the default EAL oops
> > > > > > handler gets invoked
> > > > > > d) Then it dumps the EAL specific message, it calls the
> > > > > > application-specific signal handler
> > > > > > installed in step 1 by application. This avoids breaking any contract
> > > > > > with the application.
> > > > > > i.e Behavior is the same current EAL now.
> > > > > > That is the reason for not using SA_RESETHAND(which call SIG_DFL after
> > > > > > eal oops handler instead
> > > > > > application-specific handler)
> > > > > >
> > > > > > Case 2:
> > > > > > a) The application install the signal handler after rte_eal_init(),
> > > > > > b) EAL hander get replaced with application handle then the application can call
> > > > > > rte_oops_decode() to decode.
> > > > > >
> > > > > > In order to cater the above use case, rte_oops_signals_enabled() and
> > > > > > rte_oops_decode()
> > > > > > provided.
> > > > > >
> > > > > > Here we are not breaking any contract with the application.
> > > > > > Do you have concerns about this design?
> > > > >
> > > > > In our application as a service it is important not to do any backtrace
> > > > > in production. We rely on other infrastructure to process coredumps.
> > > >
> > > > Other infrastructure will work. For example, If we are using standard coredump
> > > > using linux infra. In Current implementation,
> > > > - EAL handler dump the DPDK OOPS like kernel on stderr
> > > > - Implementation calls SIG_DFL in eal oops handler
> > > > - The above step creates the coredump or re-directs any other
> > > > infrastructure you are using for coredump.
> > > >
> > > > >
> > > > > This should be controlled enabled by a command line argument.
> > > >
> > > > If we allow other infrastructure coredump to work as-is, why
> > > > enable/disable required from eal?
> > >
> > > The addition of DPDK OOPS adds additional steps which make all
> > > faults be identified as the oops code.
> >
> > Since we are using SA_ONSTACK it is not losing the original segfault
> > info.
> >
> > I verified like this, Please find below the steps.
> >
> > 0) Enable coredump infra in Linux using coredumpctl or so
> > 1) Apply this series
> > 2) Apply for the following patch to create a segfault from the library.
> > This will test, segfault caught by eal and forward to default Linux singal
> > handler.
> >
> > [main]dell[dpdk.org] $ git diff
> > diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> > index 3438a96b75..b935c32c98 100644
> > --- a/lib/eal/linux/eal.c
> > +++ b/lib/eal/linux/eal.c
> > @@ -1338,6 +1338,8 @@ rte_eal_init(int argc, char **argv)
> >
> >         eal_mcfg_complete();
> >
> > +       /* Generate a segfault */
> > +       *(volatile int *)0x05 = 0;
> >         return fctret;
> >
> >  }
> > 3)Build
> > meson --buildtype debug build
> > ninja -C build
> >
> > 4) Run
> > $ ./build/app/test/dpdk-test --no-huge  -c 0x2
> >
> > Please find oops dump[1] and gdb core dump backtrace[2].
> > Gdb core dump trace preserves the original segfault cause and trace.
> >
> > Any other concerns?
>
> Your new oops handling duplicates existing code in our application
> (and I know others that do this as well). The problem is that an
> application may do this before calling rte_eal_init and your new
> code will break that.

Not sure what it breaks, Could you elaborate on this? Your app signal
handler will be called with the original signal
the info it is registered before rte_eal_init().

We can have an additional API to disable the oops prints if you
insist. (Though I don't the know use case
where someone needs this other than someone don't want to see/log this
print). If that is rational,
I can add API to disable oops print it. I prefer to install it by
default as it won't break anything and it helps
to not add oops API in existing apps i.e without calling any
additional features in all existing applications.

>
> Therefore my recommendation is that the new oops handling needs
> to be not a built in feature of EAL.
>
>
>
  

Patch

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 1992107a03..0d0da35205 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -215,7 +215,8 @@  The public API headers are grouped by topics:
   [log]                (@ref rte_log.h),
   [errno]              (@ref rte_errno.h),
   [trace]              (@ref rte_trace.h),
-  [trace_point]        (@ref rte_trace_point.h)
+  [trace_point]        (@ref rte_trace_point.h),
+  [oops]               (@ref rte_oops.h)
 
 - **misc**:
   [EAL config]         (@ref rte_eal.h),
diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
index 64cf4e81c8..c3a490d803 100644
--- a/lib/eal/common/eal_private.h
+++ b/lib/eal/common/eal_private.h
@@ -716,6 +716,9 @@  void __rte_thread_init(unsigned int lcore_id, rte_cpuset_t *cpuset);
  */
 void __rte_thread_uninit(void);
 
+int eal_oops_init(void);
+void eal_oops_fini(void);
+
 /**
  * asprintf(3) replacement for Windows.
  */
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 6cee5ae369..3c098708c6 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -692,6 +692,11 @@  rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	if (eal_oops_init()) {
+		rte_eal_init_alert("oops init failed.");
+		rte_errno = ENOENT;
+	}
+
 	thread_id = pthread_self();
 
 	eal_reset_internal_config(internal_conf);
@@ -974,6 +979,7 @@  rte_eal_cleanup(void)
 	rte_trace_save();
 	eal_trace_fini();
 	eal_cleanup_config(internal_conf);
+	eal_oops_fini();
 	return 0;
 }
 
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index 88a9eba12f..6c74bdb7b5 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -30,6 +30,7 @@  headers += files(
         'rte_malloc.h',
         'rte_memory.h',
         'rte_memzone.h',
+        'rte_oops.h',
         'rte_pci_dev_feature_defs.h',
         'rte_pci_dev_features.h',
         'rte_per_lcore.h',
diff --git a/lib/eal/include/rte_oops.h b/lib/eal/include/rte_oops.h
new file mode 100644
index 0000000000..ff82c409ec
--- /dev/null
+++ b/lib/eal/include/rte_oops.h
@@ -0,0 +1,100 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2020 Marvell.
+ */
+
+#ifndef _RTE_OOPS_H_
+#define _RTE_OOPS_H_
+
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_config.h>
+
+/**
+ * @file
+ *
+ * RTE oops API
+ *
+ * This file provides the oops handling APIs to RTE applications.
+ *
+ * On rte_eal_init() invocation, the EAL library installs the oops handler for
+ * the essential signals. The rte_oops_signals_enabled() API provides the list
+ * of signals the library installed by the EAL.
+ *
+ * The default EAL oops handler decodes the oops message using rte_oops_decode()
+ * and then calls the signal handler installed by the application before
+ * invoking the rte_eal_init(). This scheme will also enable the use of
+ * the default coredump handler(for gdb etc.) provided by OS if the application
+ * does not install any specific signal handler.
+ *
+ * The second case where the application installs the signal handler after
+ * the rte_eal_init() invocation, rte_oops_decode() provides the means of
+ * decoding the oops message in the application's fault handler.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Maximum number of oops signals enabled in EAL.
+ * @see rte_oops_signals_enabled()
+ */
+#define RTE_OOPS_SIGNALS_MAX 32
+
+/**
+ * Get the list of enabled oops signals installed by EAL.
+ *
+ * @param [out] signals
+ *   A pointer to store the enabled signals.
+ *   Value NULL is allowed. if not NULL, then the size of this array must be
+ *   at least RTE_OOPS_SIGNALS_MAX.
+ *
+ * @return
+ *   Number of enabled oops signals.
+ */
+__rte_experimental
+int rte_oops_signals_enabled(int *signals);
+
+#if defined(RTE_EXEC_ENV_LINUX) || defined(RTE_EXEC_ENV_FREEBSD)
+#include <signal.h>
+#include <ucontext.h>
+
+/**
+ * Decode an oops
+ *
+ * This prototype is same as sa_sigaction defined in signal.h.
+ * Application must register signal handler using sigaction() with
+ * sa_flag as SA_SIGINFO flag to get this information from unix OS.
+ *
+ * @param sig
+ *   Signal number
+ * @param info
+ *   Signal info provided by sa_sigaction. Value NULL is allowed.
+ * @param uc
+ *   ucontext_t provided when signal installed with SA_SIGINFO flag.
+ *   Value NULL is allowed.
+ *
+ */
+__rte_experimental
+void rte_oops_decode(int sig, siginfo_t *info, ucontext_t *uc);
+#else
+
+/**
+ * Decode an oops
+ *
+ * @param sig
+ *   Signal number
+ */
+__rte_experimental
+void rte_oops_decode(int sig);
+
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_OOPS_H_ */
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 3577eaeaa4..3438a96b75 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -991,6 +991,11 @@  rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	if (eal_oops_init()) {
+		rte_eal_init_alert("oops init failed.");
+		rte_errno = ENOENT;
+	}
+
 	p = strrchr(argv[0], '/');
 	strlcpy(logid, p ? p + 1 : argv[0], sizeof(logid));
 	thread_id = pthread_self();
@@ -1371,6 +1376,7 @@  rte_eal_cleanup(void)
 	rte_trace_save();
 	eal_trace_fini();
 	eal_cleanup_config(internal_conf);
+	eal_oops_fini();
 	return 0;
 }
 
diff --git a/lib/eal/unix/eal_oops.c b/lib/eal/unix/eal_oops.c
new file mode 100644
index 0000000000..53b580f733
--- /dev/null
+++ b/lib/eal/unix/eal_oops.c
@@ -0,0 +1,36 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2021 Marvell.
+ */
+
+
+#include <rte_oops.h>
+
+#include "eal_private.h"
+
+void
+rte_oops_decode(int sig, siginfo_t *info, ucontext_t *uc)
+{
+	RTE_SET_USED(sig);
+	RTE_SET_USED(info);
+	RTE_SET_USED(uc);
+
+}
+
+int
+rte_oops_signals_enabled(int *signals)
+{
+	RTE_SET_USED(signals);
+
+	return 0;
+}
+
+int
+eal_oops_init(void)
+{
+	return 0;
+}
+
+void
+eal_oops_fini(void)
+{
+}
diff --git a/lib/eal/unix/meson.build b/lib/eal/unix/meson.build
index e3ecd3e956..cdd3320669 100644
--- a/lib/eal/unix/meson.build
+++ b/lib/eal/unix/meson.build
@@ -6,5 +6,6 @@  sources += files(
         'eal_unix_memory.c',
         'eal_unix_timer.c',
         'eal_firmware.c',
+        'eal_oops.c',
         'rte_thread.c',
 )
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 887012d02a..f2841d09fd 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -426,6 +426,10 @@  EXPERIMENTAL {
 
 	# added in 21.08
 	rte_power_monitor_multi; # WINDOWS_NO_EXPORT
+
+	# added in 21.11
+	rte_oops_signals_enabled; # WINDOWS_NO_EXPORT
+	rte_oops_decode; # WINDOWS_NO_EXPORT
 };
 
 INTERNAL {