[v4,2/5] examples/multi_process: cleanup bus objects while terminating app

Message ID 20201008153048.19369-2-rohit.raj@nxp.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [v4,1/5] eal: add API for bus close |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Rohit Raj Oct. 8, 2020, 3:30 p.m. UTC
  From: Rohit Raj <rohit.raj@nxp.com>

Certain bus objects may need to be closed and re-acquired
while terminating and rerunning the client application.
Hence a signal handler is required to catch the termination
of the App and hence closing the bus objects.

This patch adds the missing signal handler in the client
app and closes the Bus objects in both client and server
applications when the signal Handler is called.

Signed-off-by: Rohit Raj <rohit.raj@nxp.com>
---
 .../multi_process/client_server_mp/mp_client/client.c | 11 +++++++++++
 .../multi_process/client_server_mp/mp_server/main.c   |  4 +++-
 2 files changed, 14 insertions(+), 1 deletion(-)
  

Comments

David Marchand Oct. 18, 2020, 9:25 a.m. UTC | #1
On Thu, Oct 8, 2020 at 5:31 PM <rohit.raj@nxp.com> wrote:
>
> From: Rohit Raj <rohit.raj@nxp.com>
>
> Certain bus objects may need to be closed and re-acquired
> while terminating and rerunning the client application.
> Hence a signal handler is required to catch the termination
> of the App and hence closing the bus objects.
>
> This patch adds the missing signal handler in the client
> app and closes the Bus objects in both client and server
> applications when the signal Handler is called.
>
> Signed-off-by: Rohit Raj <rohit.raj@nxp.com>
> ---
>  .../multi_process/client_server_mp/mp_client/client.c | 11 +++++++++++
>  .../multi_process/client_server_mp/mp_server/main.c   |  4 +++-
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
> index 361d90b54..c37516b4c 100644
> --- a/examples/multi_process/client_server_mp/mp_client/client.c
> +++ b/examples/multi_process/client_server_mp/mp_client/client.c
> @@ -11,6 +11,7 @@
>  #include <stdlib.h>
>  #include <getopt.h>
>  #include <string.h>
> +#include <signal.h>
>
>  #include <rte_common.h>
>  #include <rte_malloc.h>
> @@ -196,6 +197,14 @@ handle_packet(struct rte_mbuf *buf)
>
>  }
>
> +static void
> +signal_handler(int signal)
> +{
> +       if (signal == SIGINT)
> +               rte_eal_cleanup();
> +       exit(0);
> +}

Calling rte_eal_cleanup from a signal handler is a bad idea.
In most cases, you are racing with other threads still using DPDK resources.
https://git.dpdk.org/dpdk/commit?id=2c434431f4
https://git.dpdk.org/dpdk/commit?id=613ce6691c

This might not be a problem in this multi_process example, but let's
keep a consistent way across all examples.
Thanks.
  
Lijun Ou Jan. 25, 2021, 11:07 a.m. UTC | #2
在 2020/10/18 17:25, David Marchand 写道:
> On Thu, Oct 8, 2020 at 5:31 PM <rohit.raj@nxp.com> wrote:
>>
>> From: Rohit Raj <rohit.raj@nxp.com>
>>
>> Certain bus objects may need to be closed and re-acquired
>> while terminating and rerunning the client application.
>> Hence a signal handler is required to catch the termination
>> of the App and hence closing the bus objects.
>>
>> This patch adds the missing signal handler in the client
>> app and closes the Bus objects in both client and server
>> applications when the signal Handler is called.
>>
>> Signed-off-by: Rohit Raj <rohit.raj@nxp.com>
>> ---
>>   .../multi_process/client_server_mp/mp_client/client.c | 11 +++++++++++
>>   .../multi_process/client_server_mp/mp_server/main.c   |  4 +++-
>>   2 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
>> index 361d90b54..c37516b4c 100644
>> --- a/examples/multi_process/client_server_mp/mp_client/client.c
>> +++ b/examples/multi_process/client_server_mp/mp_client/client.c
>> @@ -11,6 +11,7 @@
>>   #include <stdlib.h>
>>   #include <getopt.h>
>>   #include <string.h>
>> +#include <signal.h>
>>
>>   #include <rte_common.h>
>>   #include <rte_malloc.h>
>> @@ -196,6 +197,14 @@ handle_packet(struct rte_mbuf *buf)
>>
>>   }
>>
>> +static void
>> +signal_handler(int signal)
>> +{
>> +       if (signal == SIGINT)
>> +               rte_eal_cleanup();
>> +       exit(0);
>> +}
> 
> Calling rte_eal_cleanup from a signal handler is a bad idea.
> In most cases, you are racing with other threads still using DPDK resources.
> https://git.dpdk.org/dpdk/commit?id=2c434431f4
> https://git.dpdk.org/dpdk/commit?id=613ce6691c
> 
> This might not be a problem in this multi_process example, but let's
> keep a consistent way across all examples.
> Thanks.
Hi,
   I want to know why you don't think he's a problem. recently, we use 
the patch
https://patchwork.dpdk.org/patch/86988/
startup with multiprocess. Use the tester to send traffic and kill the 
slave process repeatedly.
The coredump exception occurs. According to my analysis, the cause is 
that resources are not released after the slave process is killed.

Thanks
Lijun Ou
>
  
David Marchand Feb. 4, 2021, 12:53 p.m. UTC | #3
On Mon, Jan 25, 2021 at 12:07 PM oulijun <oulijun@huawei.com> wrote:
> >> +static void
> >> +signal_handler(int signal)
> >> +{
> >> +       if (signal == SIGINT)
> >> +               rte_eal_cleanup();
> >> +       exit(0);
> >> +}
> >
> > Calling rte_eal_cleanup from a signal handler is a bad idea.
> > In most cases, you are racing with other threads still using DPDK resources.
> > https://git.dpdk.org/dpdk/commit?id=2c434431f4
> > https://git.dpdk.org/dpdk/commit?id=613ce6691c
> >
> > This might not be a problem in this multi_process example, but let's
> > keep a consistent way across all examples.
> > Thanks.
> Hi,
>    I want to know why you don't think he's a problem. recently, we use
> the patch
> https://patchwork.dpdk.org/patch/86988/
> startup with multiprocess. Use the tester to send traffic and kill the
> slave process repeatedly.
> The coredump exception occurs. According to my analysis, the cause is
> that resources are not released after the slave process is killed.
>

Not sure I get your question and I don't see the relation with the
testpmd patch.
I did not say we must not release resources.

Just to rephrase my previous mail:

Generally speaking, calling rte_eal_cleanup() from a signal handler is
wrong since it creates races with other threads.
I recommend putting in place a synchronisation mechanism so that all
worker threads break out of their processing loop and the main thread
synchronizes/joins with them before calling rte_eal_cleanup() in its
exit path.

Now, for this patch, in
examples/multi_process/client_server_mp/mp_client/client.c, this
secondary process code seems to only have one thread (but I might be
wrong).
If this is true, then no race in theory => my comment "This might not
be a problem in this multi_process example".
  
Stephen Hemminger July 5, 2023, 11:35 p.m. UTC | #4
On Thu,  8 Oct 2020 21:00:44 +0530
rohit.raj@nxp.com wrote:

> +static void
> +signal_handler(int signal)
> +{
> +	if (signal == SIGINT)
> +		rte_eal_cleanup();

NAK
Call rte_eal_cleanup in signal handler is not safe.

Need to set a flag and handle it in main code.
  

Patch

diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
index 361d90b54..c37516b4c 100644
--- a/examples/multi_process/client_server_mp/mp_client/client.c
+++ b/examples/multi_process/client_server_mp/mp_client/client.c
@@ -11,6 +11,7 @@ 
 #include <stdlib.h>
 #include <getopt.h>
 #include <string.h>
+#include <signal.h>
 
 #include <rte_common.h>
 #include <rte_malloc.h>
@@ -196,6 +197,14 @@  handle_packet(struct rte_mbuf *buf)
 
 }
 
+static void
+signal_handler(int signal)
+{
+	if (signal == SIGINT)
+		rte_eal_cleanup();
+	exit(0);
+}
+
 /*
  * Application main function - loops through
  * receiving and processing packets. Never returns
@@ -217,6 +226,8 @@  main(int argc, char *argv[])
 	argc -= retval;
 	argv += retval;
 
+	signal(SIGINT, signal_handler);
+
 	if (parse_app_args(argc, argv) < 0)
 		rte_exit(EXIT_FAILURE, "Invalid command-line arguments\n");
 
diff --git a/examples/multi_process/client_server_mp/mp_server/main.c b/examples/multi_process/client_server_mp/mp_server/main.c
index 280dab867..b0241cc20 100644
--- a/examples/multi_process/client_server_mp/mp_server/main.c
+++ b/examples/multi_process/client_server_mp/mp_server/main.c
@@ -275,11 +275,13 @@  signal_handler(int signal)
 {
 	uint16_t port_id;
 
-	if (signal == SIGINT)
+	if (signal == SIGINT) {
 		RTE_ETH_FOREACH_DEV(port_id) {
 			rte_eth_dev_stop(port_id);
 			rte_eth_dev_close(port_id);
 		}
+		rte_eal_cleanup();
+	}
 	exit(0);
 }