[dpdk-dev,2/2] examples/kni: stop lcores while doing kni ops

Message ID 1508154348-10988-3-git-send-email-tdu@semihalf.com (mailing list archive)
State Changes Requested, archived
Delegated to: Ferruh Yigit
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Tomasz Duszynski Oct. 16, 2017, 11:45 a.m. UTC
  Since the transmit and receive functions should not be invoked when
the device is stopped, stop lcores during kni ops and restart them
after device is started once again.

Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
---
 examples/kni/main.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

--
2.7.4
  

Comments

Ferruh Yigit Oct. 21, 2017, 12:42 a.m. UTC | #1
On 10/16/2017 4:45 AM, Tomasz Duszynski wrote:
> Since the transmit and receive functions should not be invoked when
> the device is stopped, stop lcores during kni ops and restart them
> after device is started once again.

Hi Tomasz,

Are you observing any error or unexpected behavior because of rx/tx functions? I
am not sure about the patch, please check below logs, and trying to understand
scope of the patch, can you please give more details what happens if this patch
is missing?

> 
> Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
> ---
>  examples/kni/main.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/examples/kni/main.c b/examples/kni/main.c
> index cb48fb5..5c50448 100644
> --- a/examples/kni/main.c
> +++ b/examples/kni/main.c
> @@ -166,6 +166,23 @@ static int kni_change_mtu(uint16_t port_id, unsigned int new_mtu);
>  static int kni_config_network_interface(uint16_t port_id, uint8_t if_up);
> 
>  static rte_atomic32_t kni_stop = RTE_ATOMIC32_INIT(0);
> +static rte_atomic32_t kni_restart = RTE_ATOMIC32_INIT(0);
> +
> +static void
> +kni_stop_lcores(void)
> +{
> +	unsigned int i;
> +
> +	rte_atomic32_inc(&kni_restart);
> +	rte_atomic32_inc(&kni_stop);
> +
> +	RTE_LCORE_FOREACH(i) {
> +		if (i == rte_lcore_id())
> +			continue;

This function called by port Rx core [1], and since the thread can't wait itself
to finish, specially if nb_kni > 1, the Rx core still can do some work even
after exit from this function.

> +
> +		rte_eal_wait_lcore(i);

The API documentation says: "To be executed on the MASTER lcore only."
Not sure what happens when called from slave core, as we did here.

> +	}
> +}
> 
>  /* Print out statistics on packets handled */
>  static void
> @@ -712,6 +729,7 @@ kni_change_mtu(uint16_t port_id, unsigned int new_mtu)
> 
>  	RTE_LOG(INFO, APP, "Change MTU of port %d to %u\n", port_id, new_mtu);
> 
> +	kni_stop_lcores();
>  	/* Stop specific port */
>  	rte_eth_dev_stop(port_id);
> 
> @@ -755,6 +773,8 @@ kni_config_network_interface(uint16_t port_id, uint8_t if_up)
>  	RTE_LOG(INFO, APP, "Configure network interface of %d %s\n",
>  					port_id, if_up ? "up" : "down");
> 
> +	kni_stop_lcores();
> +
>  	if (if_up != 0) { /* Configure network interface up */
>  		rte_eth_dev_stop(port_id);
>  		ret = rte_eth_dev_start(port_id);
> @@ -911,6 +931,7 @@ main(int argc, char** argv)
>  	}
>  	check_all_ports_link_status(nb_sys_ports, ports_mask);
> 
> +restart:
>  	/* Launch per-lcore function on every lcore */
>  	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
>  	RTE_LCORE_FOREACH_SLAVE(i) {
> @@ -918,6 +939,13 @@ main(int argc, char** argv)
>  			return -1;
>  	}
> 
> +	if (rte_atomic32_read(&kni_restart)) {
> +		rte_atomic32_dec(&kni_stop);
> +		rte_atomic32_dec(&kni_restart);

kni_stop_lcores() called per port, so it is possible that kni_stop and
kni_restart increased parallel, many times. But this decrement is per
application, so they will be decremented sequentially, casing app stop - start
unnecessarily.

> +
> +		goto restart;

This will cause assigning tasks to cores again, and will produce all related
logs again, if you enable debug logs you will see it [2]. I believe confusing to
have those logs every time mtu updated etc...

> +	}
> +
>  	/* Release resources */
>  	for (port = 0; port < nb_sys_ports; port++) {
>  		if (!(ports_mask & (1 << port)))
> --
> 2.7.4
> 

[1]
main_loop
  kni_ingress
    rte_kni_handle_request
      kni_change_mtu
      ||
      kni_config_network_interface
        kni_stop_lcores


[2]
APP: Change MTU of port 0 to 1402
PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size
no less than 4 (port=0).
PMD: ixgbe_dev_link_status_print():  Port 0: Link Down
PMD: ixgbe_dev_link_status_print(): PCI Address: 0000:08:00.1
APP: Lcore 1 is reading from port 0
APP: Lcore 2 is writing to port 0
APP: Lcore 3 is reading from port 1
APP: Lcore 4 is writing to port 1
APP: Lcore 5 has nothing to do
APP: Lcore 6 has nothing to do
APP: Lcore 7 has nothing to do
APP: Lcore 8 has nothing to do
APP: Lcore 9 has nothing to do
APP: Lcore 10 has nothing to do
APP: Lcore 11 has nothing to do
APP: Lcore 12 has nothing to do
APP: Lcore 13 has nothing to do
APP: Lcore 14 has nothing to do
APP: Lcore 15 has nothing to do
APP: Lcore 16 has nothing to do
APP: Lcore 17 has nothing to do
APP: Lcore 18 has nothing to do
APP: Lcore 19 has nothing to do
APP: Lcore 20 has nothing to do
APP: Lcore 21 has nothing to do
APP: Lcore 22 has nothing to do
APP: Lcore 23 has nothing to do
APP: Lcore 24 has nothing to do
APP: Lcore 25 has nothing to do
APP: Lcore 26 has nothing to do
APP: Lcore 27 has nothing to do
APP: Lcore 28 has nothing to do
APP: Lcore 29 has nothing to do
APP: Lcore 30 has nothing to do
APP: Lcore 31 has nothing to do
APP: Lcore 0 has nothing to do
  
Tomasz Duszynski Oct. 25, 2017, 9:28 a.m. UTC | #2
Hi Ferruh,

On Fri, Oct 20, 2017 at 05:42:13PM -0700, Ferruh Yigit wrote:
> On 10/16/2017 4:45 AM, Tomasz Duszynski wrote:
> > Since the transmit and receive functions should not be invoked when
> > the device is stopped, stop lcores during kni ops and restart them
> > after device is started once again.
>
> Hi Tomasz,
>
> Are you observing any error or unexpected behavior because of rx/tx functions? I
> am not sure about the patch, please check below logs, and trying to understand
> scope of the patch, can you please give more details what happens if this patch
> is missing?

Right, calling rx/tx functions after device was stopped will break
things.

For instace, putting interface up will call rte_eth_dev_stop() which
frees port resources. If that happens (and usually happens) during rx/tx
driver would break, as resources used in library functions calls
are bogus at that point.

So we end up with SIGSEGV.

According to dpdk documentation rx/tx functions cannot be called before
dev_start(). Thus I think cores that do rx/tx should be stopped just
before rte_eth_dev_stop() is called.

That could be fixed inside pmd but would require locks in rx/tx path which
is rather a no-go.

>
> >
> > Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
> > ---
> >  examples/kni/main.c | 28 ++++++++++++++++++++++++++++
> >  1 file changed, 28 insertions(+)
> >
> > diff --git a/examples/kni/main.c b/examples/kni/main.c
> > index cb48fb5..5c50448 100644
> > --- a/examples/kni/main.c
> > +++ b/examples/kni/main.c
> > @@ -166,6 +166,23 @@ static int kni_change_mtu(uint16_t port_id, unsigned int new_mtu);
> >  static int kni_config_network_interface(uint16_t port_id, uint8_t if_up);
> >
> >  static rte_atomic32_t kni_stop = RTE_ATOMIC32_INIT(0);
> > +static rte_atomic32_t kni_restart = RTE_ATOMIC32_INIT(0);
> > +
> > +static void
> > +kni_stop_lcores(void)
> > +{
> > +	unsigned int i;
> > +
> > +	rte_atomic32_inc(&kni_restart);
> > +	rte_atomic32_inc(&kni_stop);
> > +
> > +	RTE_LCORE_FOREACH(i) {
> > +		if (i == rte_lcore_id())
> > +			continue;
>
> This function called by port Rx core [1], and since the thread can't wait itself
> to finish, specially if nb_kni > 1, the Rx core still can do some work even
> after exit from this function.

Ack.

>
> > +
> > +		rte_eal_wait_lcore(i);
>
> The API documentation says: "To be executed on the MASTER lcore only."
> Not sure what happens when called from slave core, as we did here.

OK.

>
> > +	}
> > +}
> >
> >  /* Print out statistics on packets handled */
> >  static void
> > @@ -712,6 +729,7 @@ kni_change_mtu(uint16_t port_id, unsigned int new_mtu)
> >
> >  	RTE_LOG(INFO, APP, "Change MTU of port %d to %u\n", port_id, new_mtu);
> >
> > +	kni_stop_lcores();
> >  	/* Stop specific port */
> >  	rte_eth_dev_stop(port_id);
> >
> > @@ -755,6 +773,8 @@ kni_config_network_interface(uint16_t port_id, uint8_t if_up)
> >  	RTE_LOG(INFO, APP, "Configure network interface of %d %s\n",
> >  					port_id, if_up ? "up" : "down");
> >
> > +	kni_stop_lcores();
> > +
> >  	if (if_up != 0) { /* Configure network interface up */
> >  		rte_eth_dev_stop(port_id);
> >  		ret = rte_eth_dev_start(port_id);
> > @@ -911,6 +931,7 @@ main(int argc, char** argv)
> >  	}
> >  	check_all_ports_link_status(nb_sys_ports, ports_mask);
> >
> > +restart:
> >  	/* Launch per-lcore function on every lcore */
> >  	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
> >  	RTE_LCORE_FOREACH_SLAVE(i) {
> > @@ -918,6 +939,13 @@ main(int argc, char** argv)
> >  			return -1;
> >  	}
> >
> > +	if (rte_atomic32_read(&kni_restart)) {
> > +		rte_atomic32_dec(&kni_stop);
> > +		rte_atomic32_dec(&kni_restart);
>
> kni_stop_lcores() called per port, so it is possible that kni_stop and
> kni_restart increased parallel, many times. But this decrement is per
> application, so they will be decremented sequentially, casing app stop - start
> unnecessarily.

Right.

>
> > +
> > +		goto restart;
>
> This will cause assigning tasks to cores again, and will produce all related
> logs again, if you enable debug logs you will see it [2]. I believe confusing to
> have those logs every time mtu updated etc...

Right.

>
> > +	}
> > +
> >  	/* Release resources */
> >  	for (port = 0; port < nb_sys_ports; port++) {
> >  		if (!(ports_mask & (1 << port)))
> > --
> > 2.7.4
> >
>
> [1]
> main_loop
>   kni_ingress
>     rte_kni_handle_request
>       kni_change_mtu
>       ||
>       kni_config_network_interface
>         kni_stop_lcores
>
>
> [2]
> APP: Change MTU of port 0 to 1402
> PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size
> no less than 4 (port=0).
> PMD: ixgbe_dev_link_status_print():  Port 0: Link Down
> PMD: ixgbe_dev_link_status_print(): PCI Address: 0000:08:00.1
> APP: Lcore 1 is reading from port 0
> APP: Lcore 2 is writing to port 0
> APP: Lcore 3 is reading from port 1
> APP: Lcore 4 is writing to port 1
> APP: Lcore 5 has nothing to do
> APP: Lcore 6 has nothing to do
> APP: Lcore 7 has nothing to do
> APP: Lcore 8 has nothing to do
> APP: Lcore 9 has nothing to do
> APP: Lcore 10 has nothing to do
> APP: Lcore 11 has nothing to do
> APP: Lcore 12 has nothing to do
> APP: Lcore 13 has nothing to do
> APP: Lcore 14 has nothing to do
> APP: Lcore 15 has nothing to do
> APP: Lcore 16 has nothing to do
> APP: Lcore 17 has nothing to do
> APP: Lcore 18 has nothing to do
> APP: Lcore 19 has nothing to do
> APP: Lcore 20 has nothing to do
> APP: Lcore 21 has nothing to do
> APP: Lcore 22 has nothing to do
> APP: Lcore 23 has nothing to do
> APP: Lcore 24 has nothing to do
> APP: Lcore 25 has nothing to do
> APP: Lcore 26 has nothing to do
> APP: Lcore 27 has nothing to do
> APP: Lcore 28 has nothing to do
> APP: Lcore 29 has nothing to do
> APP: Lcore 30 has nothing to do
> APP: Lcore 31 has nothing to do
> APP: Lcore 0 has nothing to do

--
- Tomasz Duszyński
  

Patch

diff --git a/examples/kni/main.c b/examples/kni/main.c
index cb48fb5..5c50448 100644
--- a/examples/kni/main.c
+++ b/examples/kni/main.c
@@ -166,6 +166,23 @@  static int kni_change_mtu(uint16_t port_id, unsigned int new_mtu);
 static int kni_config_network_interface(uint16_t port_id, uint8_t if_up);

 static rte_atomic32_t kni_stop = RTE_ATOMIC32_INIT(0);
+static rte_atomic32_t kni_restart = RTE_ATOMIC32_INIT(0);
+
+static void
+kni_stop_lcores(void)
+{
+	unsigned int i;
+
+	rte_atomic32_inc(&kni_restart);
+	rte_atomic32_inc(&kni_stop);
+
+	RTE_LCORE_FOREACH(i) {
+		if (i == rte_lcore_id())
+			continue;
+
+		rte_eal_wait_lcore(i);
+	}
+}

 /* Print out statistics on packets handled */
 static void
@@ -712,6 +729,7 @@  kni_change_mtu(uint16_t port_id, unsigned int new_mtu)

 	RTE_LOG(INFO, APP, "Change MTU of port %d to %u\n", port_id, new_mtu);

+	kni_stop_lcores();
 	/* Stop specific port */
 	rte_eth_dev_stop(port_id);

@@ -755,6 +773,8 @@  kni_config_network_interface(uint16_t port_id, uint8_t if_up)
 	RTE_LOG(INFO, APP, "Configure network interface of %d %s\n",
 					port_id, if_up ? "up" : "down");

+	kni_stop_lcores();
+
 	if (if_up != 0) { /* Configure network interface up */
 		rte_eth_dev_stop(port_id);
 		ret = rte_eth_dev_start(port_id);
@@ -911,6 +931,7 @@  main(int argc, char** argv)
 	}
 	check_all_ports_link_status(nb_sys_ports, ports_mask);

+restart:
 	/* Launch per-lcore function on every lcore */
 	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
 	RTE_LCORE_FOREACH_SLAVE(i) {
@@ -918,6 +939,13 @@  main(int argc, char** argv)
 			return -1;
 	}

+	if (rte_atomic32_read(&kni_restart)) {
+		rte_atomic32_dec(&kni_stop);
+		rte_atomic32_dec(&kni_restart);
+
+		goto restart;
+	}
+
 	/* Release resources */
 	for (port = 0; port < nb_sys_ports; port++) {
 		if (!(ports_mask & (1 << port)))