examples/l3fwd: fix Rx over not ready port

Message ID: 20240301163931.107036-1-konstantin.v.ananyev@yandex.ru (mailing list archive)
State: Accepted, archived
Delegated to: Thomas Monjalon
Series: examples/l3fwd: fix Rx over not ready port

Checks

Context                        Check    Description
ci/checkpatch                  success  coding style OK
ci/Intel-compilation           success  Compilation OK
ci/intel-Testing               success  Testing PASS
ci/github-robot: build         success  github build: passed
ci/intel-Functional            success  Functional PASS
ci/iol-mellanox-Performance    success  Performance Testing PASS
ci/iol-intel-Performance       success  Performance Testing PASS
ci/iol-abi-testing             success  Testing PASS
ci/iol-unit-arm64-testing      success  Testing PASS
ci/iol-compile-amd64-testing   success  Testing PASS
ci/iol-compile-arm64-testing   success  Testing PASS
ci/loongarch-compilation       success  Compilation OK
ci/loongarch-unit-testing      success  Unit Testing PASS
ci/iol-unit-amd64-testing      success  Testing PASS
ci/iol-intel-Functional        success  Functional Testing PASS
ci/iol-broadcom-Performance    success  Performance Testing PASS
ci/iol-sample-apps-testing     success  Testing PASS
ci/iol-broadcom-Functional     success  Functional Testing PASS

Commit Message

Konstantin Ananyev March 1, 2024, 4:39 p.m. UTC
  From: Konstantin Ananyev <konstantin.ananyev@huawei.com>

When running l3fwd in event mode with SW eventdev, service cores
can start Rx before the main thread has finished PMD initialization.
To reproduce:
./dpdk-l3fwd --lcores=49,51 -n 6 -a ca:00.0 -s 0x8000000000000 \
--vdev event_sw0 -- \
-L -P -p 1  --mode eventdev --eventq-sched=ordered \
--rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
--no-numa

At the init stage, the user will most likely see an error message like this:
ETHDEV: lcore 51 called rx_pkt_burst for not ready port 0
0: ./dpdk-l3fwd (rte_dump_stack+0x1f) [15de723]
...
9: ./dpdk-l3fwd (eal_thread_loop+0x5a2) [15c1324]
...

What happens next depends on how lucky or unlucky you are.
If there are actual packets in the HW Rx queue, the app will most
likely crash; otherwise it might survive.
As the error message suggests, the problem is that services are started
before the main thread has finished NIC setup and initialization.
The suggested fix moves services startup after the NIC setup phase.

Bugzilla ID: 1390
Fixes: 8bd537e9c6cf ("examples/l3fwd: add service core setup based on caps")
Cc: stable@dpdk.org

Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 examples/l3fwd/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
  

Comments

Pavan Nikhilesh Bhagavatula March 1, 2024, 4:49 p.m. UTC | #1
> -----Original Message-----
> From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> Sent: Friday, March 1, 2024 10:10 PM
> To: dev@dpdk.org
> Cc: Jerin Jacob <jerinj@marvell.com>; Pavan Nikhilesh Bhagavatula
> <pbhagavatula@marvell.com>; Konstantin Ananyev
> <konstantin.ananyev@huawei.com>; stable@dpdk.org
> Subject: [EXTERNAL] [PATCH] examples/l3fwd: fix Rx over not ready port
> 
> From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 
> Running l3fwd in event mode with SW eventdev, service cores
> can start RX before main thread is finished with PMD installation.
> to reproduce:
> ./dpdk-l3fwd --lcores=49,51 -n 6 -a ca:00.0 -s 0x8000000000000 \
> --vdev event_sw0 -- \
> -L -P -p 1  --mode eventdev --eventq-sched=ordered \
> --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
> --no-numa
> 
> At init stage user will most likely see the error message like that:
> ETHDEV: lcore 51 called rx_pkt_burst for not ready port 0
> 0: ./dpdk-l3fwd (rte_dump_stack+0x1f) [15de723]
> ...
> 9: ./dpdk-l3fwd (eal_thread_loop+0x5a2) [15c1324]
> ...
> 
> And then all depends how luck/unlucky you are.
> If there are some actual packet in HW RX queue, then the app will most
> likely crash, otherwise it might survive.
> As error message suggests, the problem is that services are started
> before main thread finished with NIC setup and initialization.
> The suggested fix moves services startup after NIC setup phase.
> 
> Bugzilla ID: 1390
> Fixes: 8bd537e9c6cf ("examples/l3fwd: add service core setup based on caps")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> ---
>  examples/l3fwd/main.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> index 3bf28aec0c..d4fb5d1971 100644
> --- a/examples/l3fwd/main.c
> +++ b/examples/l3fwd/main.c
> @@ -1577,7 +1577,6 @@ main(int argc, char **argv)
>  			l3fwd_lkp.main_loop = evt_rsrc->ops.fib_event_loop;
>  		else
>  			l3fwd_lkp.main_loop = evt_rsrc->ops.lpm_event_loop;
> -		l3fwd_event_service_setup();
>  	} else
>  #endif
>  		l3fwd_poll_resource_setup();
> @@ -1609,6 +1608,11 @@ main(int argc, char **argv)
>  		}
>  	}
> 
> +#ifdef RTE_LIB_EVENTDEV

Is the ifdef required?

> +	if (evt_rsrc->enabled)
> +		l3fwd_event_service_setup();
> +#endif
> +
>  	printf("\n");
> 
>  	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> --
> 2.35.3
  
Konstantin Ananyev March 1, 2024, 5:12 p.m. UTC | #2
> > From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > Sent: Friday, March 1, 2024 10:10 PM
> > To: dev@dpdk.org
> > Cc: Jerin Jacob <jerinj@marvell.com>; Pavan Nikhilesh Bhagavatula
> > <pbhagavatula@marvell.com>; Konstantin Ananyev
> > <konstantin.ananyev@huawei.com>; stable@dpdk.org
> > Subject: [EXTERNAL] [PATCH] examples/l3fwd: fix Rx over not ready port
> >
> > From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> >
> > Running l3fwd in event mode with SW eventdev, service cores
> > can start RX before main thread is finished with PMD installation.
> > to reproduce:
> > ./dpdk-l3fwd --lcores=49,51 -n 6 -a ca:00.0 -s 0x8000000000000 \
> > --vdev event_sw0 -- \
> > -L -P -p 1  --mode eventdev --eventq-sched=ordered \
> > --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
> > --no-numa
> >
> > At init stage user will most likely see the error message like that:
> > ETHDEV: lcore 51 called rx_pkt_burst for not ready port 0
> > 0: ./dpdk-l3fwd (rte_dump_stack+0x1f) [15de723]
> > ...
> > 9: ./dpdk-l3fwd (eal_thread_loop+0x5a2) [15c1324]
> > ...
> >
> > And then all depends how luck/unlucky you are.
> > If there are some actual packet in HW RX queue, then the app will most
> > likely crash, otherwise it might survive.
> > As error message suggests, the problem is that services are started
> > before main thread finished with NIC setup and initialization.
> > The suggested fix moves services startup after NIC setup phase.
> >
> > Bugzilla ID: 1390
> > Fixes: 8bd537e9c6cf ("examples/l3fwd: add service core setup based on caps")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > ---
> >  examples/l3fwd/main.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> > index 3bf28aec0c..d4fb5d1971 100644
> > --- a/examples/l3fwd/main.c
> > +++ b/examples/l3fwd/main.c
> > @@ -1577,7 +1577,6 @@ main(int argc, char **argv)
> >  			l3fwd_lkp.main_loop = evt_rsrc->ops.fib_event_loop;
> >  		else
> >  			l3fwd_lkp.main_loop = evt_rsrc->ops.lpm_event_loop;
> > -		l3fwd_event_service_setup();
> >  	} else
> >  #endif
> >  		l3fwd_poll_resource_setup();
> > @@ -1609,6 +1608,11 @@ main(int argc, char **argv)
> >  		}
> >  	}
> >
> > +#ifdef RTE_LIB_EVENTDEV
> 
> Is the ifdef required?

Well, right now l3fwd_event_service_setup() is defined only when
RTE_LIB_EVENTDEV is defined, see examples/l3fwd/main.c.
So, I suppose, yes.

> 
> > +	if (evt_rsrc->enabled)
> > +		l3fwd_event_service_setup();
> > +#endif
> > +
> >  	printf("\n");
> >
> >  	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> > --
> > 2.35.3
  
Pavan Nikhilesh Bhagavatula March 1, 2024, 5:17 p.m. UTC | #3
> -----Original Message-----
> From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> Sent: Friday, March 1, 2024 10:43 PM
> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Konstantin
> Ananyev <konstantin.v.ananyev@yandex.ru>; dev@dpdk.org
> Cc: Jerin Jacob <jerinj@marvell.com>; stable@dpdk.org
> Subject: RE: [EXTERNAL] [PATCH] examples/l3fwd: fix Rx over not ready port
> 
> 
> 
> > > From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > > Sent: Friday, March 1, 2024 10:10 PM
> > > To: dev@dpdk.org
> > > Cc: Jerin Jacob <jerinj@marvell.com>; Pavan Nikhilesh Bhagavatula
> > > <pbhagavatula@marvell.com>; Konstantin Ananyev
> > > <konstantin.ananyev@huawei.com>; stable@dpdk.org
> > > Subject: [EXTERNAL] [PATCH] examples/l3fwd: fix Rx over not ready port
> > >
> > > From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > >
> > > Running l3fwd in event mode with SW eventdev, service cores
> > > can start RX before main thread is finished with PMD installation.
> > > to reproduce:
> > > ./dpdk-l3fwd --lcores=49,51 -n 6 -a ca:00.0 -s 0x8000000000000 \
> > > --vdev event_sw0 -- \
> > > -L -P -p 1  --mode eventdev --eventq-sched=ordered \
> > > --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
> > > --no-numa
> > >
> > > At init stage user will most likely see the error message like that:
> > > ETHDEV: lcore 51 called rx_pkt_burst for not ready port 0
> > > 0: ./dpdk-l3fwd (rte_dump_stack+0x1f) [15de723]
> > > ...
> > > 9: ./dpdk-l3fwd (eal_thread_loop+0x5a2) [15c1324]
> > > ...
> > >
> > > And then all depends how luck/unlucky you are.
> > > If there are some actual packet in HW RX queue, then the app will most
> > > likely crash, otherwise it might survive.
> > > As error message suggests, the problem is that services are started
> > > before main thread finished with NIC setup and initialization.
> > > The suggested fix moves services startup after NIC setup phase.
> > >
> > > Bugzilla ID: 1390
> > > Fixes: 8bd537e9c6cf ("examples/l3fwd: add service core setup based on caps")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>

Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

> > > ---
> > >  examples/l3fwd/main.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> > > index 3bf28aec0c..d4fb5d1971 100644
> > > --- a/examples/l3fwd/main.c
> > > +++ b/examples/l3fwd/main.c
> > > @@ -1577,7 +1577,6 @@ main(int argc, char **argv)
> > >  			l3fwd_lkp.main_loop = evt_rsrc->ops.fib_event_loop;
> > >  		else
> > >  			l3fwd_lkp.main_loop = evt_rsrc->ops.lpm_event_loop;
> > > -		l3fwd_event_service_setup();
> > >  	} else
> > >  #endif
> > >  		l3fwd_poll_resource_setup();
> > > @@ -1609,6 +1608,11 @@ main(int argc, char **argv)
> > >  		}
> > >  	}
> > >
> > > +#ifdef RTE_LIB_EVENTDEV
> >
> > Is the ifdef required?
> 
> Well, right now l3fwd_event_service_setup() is defined only when
> RTE_LIB_EVENTDEV is defined, see examples/l3fwd/main.c.
> So, I suppose, yes.
> 

My bad, I was looking at the wrong DPDK version (22.11).

> >
> > > +	if (evt_rsrc->enabled)
> > > +		l3fwd_event_service_setup();
> > > +#endif
> > > +
> > >  	printf("\n");
> > >
> > >  	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> > > --
> > > 2.35.3
  
Konstantin Ananyev March 4, 2024, 10:13 a.m. UTC | #4
> > > > From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > > > Sent: Friday, March 1, 2024 10:10 PM
> > > > To: dev@dpdk.org
> > > > Cc: Jerin Jacob <jerinj@marvell.com>; Pavan Nikhilesh Bhagavatula
> > > > <pbhagavatula@marvell.com>; Konstantin Ananyev
> > > > <konstantin.ananyev@huawei.com>; stable@dpdk.org
> > > > Subject: [EXTERNAL] [PATCH] examples/l3fwd: fix Rx over not ready port
> > > >
> > > > From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > > >
> > > > Running l3fwd in event mode with SW eventdev, service cores
> > > > can start RX before main thread is finished with PMD installation.
> > > > to reproduce:
> > > > ./dpdk-l3fwd --lcores=49,51 -n 6 -a ca:00.0 -s 0x8000000000000 \
> > > > --vdev event_sw0 -- \
> > > > -L -P -p 1  --mode eventdev --eventq-sched=ordered \
> > > > --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
> > > > --no-numa
> > > >
> > > > At init stage user will most likely see the error message like that:
> > > > ETHDEV: lcore 51 called rx_pkt_burst for not ready port 0
> > > > 0: ./dpdk-l3fwd (rte_dump_stack+0x1f) [15de723]
> > > > ...
> > > > 9: ./dpdk-l3fwd (eal_thread_loop+0x5a2) [15c1324]
> > > > ...
> > > >
> > > > And then all depends how luck/unlucky you are.
> > > > If there are some actual packet in HW RX queue, then the app will most
> > > > likely crash, otherwise it might survive.
> > > > As error message suggests, the problem is that services are started
> > > > before main thread finished with NIC setup and initialization.
> > > > The suggested fix moves services startup after NIC setup phase.
> > > >
> > > > Bugzilla ID: 1390
> > > > Fixes: 8bd537e9c6cf ("examples/l3fwd: add service core setup based on caps")
> > > > Cc: stable@dpdk.org
> > > >
> > > > Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > > > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 
> Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> > > > ---
> > > >  examples/l3fwd/main.c | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> > > > index 3bf28aec0c..d4fb5d1971 100644
> > > > --- a/examples/l3fwd/main.c
> > > > +++ b/examples/l3fwd/main.c
> > > > @@ -1577,7 +1577,6 @@ main(int argc, char **argv)
> > > >  			l3fwd_lkp.main_loop = evt_rsrc->ops.fib_event_loop;
> > > >  		else
> > > >  			l3fwd_lkp.main_loop = evt_rsrc->ops.lpm_event_loop;
> > > > -		l3fwd_event_service_setup();
> > > >  	} else
> > > >  #endif
> > > >  		l3fwd_poll_resource_setup();
> > > > @@ -1609,6 +1608,11 @@ main(int argc, char **argv)
> > > >  		}
> > > >  	}
> > > >
> > > > +#ifdef RTE_LIB_EVENTDEV
> > >
> > > Is the ifdef required?
> >
> > Well, right now l3fwd_event_service_setup() is defined only when
> > RTE_LIB_EVENTDEV is defined, see examples/l3fwd/main.c.
> > So, I suppose, yes.
> >
> 
> My bad I was looking at wrong DPDK version (22.11).

NP, thank you for taking a look.
As an FYI, I filed 2 more bugs on a similar subject (l3fwd event mode):
https://bugs.dpdk.org/show_bug.cgi?id=1393
https://bugs.dpdk.org/show_bug.cgi?id=1391
If you have bandwidth to have a look and provide some feedback,
that would be great.

> 
> > >
> > > > +	if (evt_rsrc->enabled)
> > > > +		l3fwd_event_service_setup();
> > > > +#endif
> > > > +
> > > >  	printf("\n");
> > > >
> > > >  	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> > > > --
> > > > 2.35.3
  
Thomas Monjalon March 7, 2024, 8:43 a.m. UTC | #5
> > > > Running l3fwd in event mode with SW eventdev, service cores
> > > > can start RX before main thread is finished with PMD installation.
> > > > to reproduce:
> > > > ./dpdk-l3fwd --lcores=49,51 -n 6 -a ca:00.0 -s 0x8000000000000 \
> > > > --vdev event_sw0 -- \
> > > > -L -P -p 1  --mode eventdev --eventq-sched=ordered \
> > > > --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
> > > > --no-numa
> > > >
> > > > At init stage user will most likely see the error message like that:
> > > > ETHDEV: lcore 51 called rx_pkt_burst for not ready port 0
> > > > 0: ./dpdk-l3fwd (rte_dump_stack+0x1f) [15de723]
> > > > ...
> > > > 9: ./dpdk-l3fwd (eal_thread_loop+0x5a2) [15c1324]
> > > > ...
> > > >
> > > > And then all depends how luck/unlucky you are.
> > > > If there are some actual packet in HW RX queue, then the app will most
> > > > likely crash, otherwise it might survive.
> > > > As error message suggests, the problem is that services are started
> > > > before main thread finished with NIC setup and initialization.
> > > > The suggested fix moves services startup after NIC setup phase.
> > > >
> > > > Bugzilla ID: 1390
> > > > Fixes: 8bd537e9c6cf ("examples/l3fwd: add service core setup based on caps")
> > > > Cc: stable@dpdk.org
> > > >
> > > > Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > > > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 
> Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>

Applied, thanks.
  

Patch

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 3bf28aec0c..d4fb5d1971 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -1577,7 +1577,6 @@  main(int argc, char **argv)
 			l3fwd_lkp.main_loop = evt_rsrc->ops.fib_event_loop;
 		else
 			l3fwd_lkp.main_loop = evt_rsrc->ops.lpm_event_loop;
-		l3fwd_event_service_setup();
 	} else
 #endif
 		l3fwd_poll_resource_setup();
@@ -1609,6 +1608,11 @@  main(int argc, char **argv)
 		}
 	}
 
+#ifdef RTE_LIB_EVENTDEV
+	if (evt_rsrc->enabled)
+		l3fwd_event_service_setup();
+#endif
+
 	printf("\n");
 
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
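
The ordering issue that the patch addresses can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: every name below (sim_port, sim_service_poll, sim_port_start, sim_init) is hypothetical and none of them is a DPDK API. The model assumes that a running service core may poll the port at any point during init, and it counts the polls that hit a port that has not been started yet.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an ethdev port; not a DPDK structure. */
struct sim_port {
	bool started;	/* set once "NIC setup" has completed */
};

/* Number of polls that reached a port that was not ready. */
static int rx_errors;

/* Models one iteration of a service core polling the port for Rx. */
static void
sim_service_poll(const struct sim_port *port)
{
	if (!port->started)
		rx_errors++;	/* analogous to "rx_pkt_burst for not ready port" */
}

/* Models the main thread finishing NIC setup and starting the port. */
static void
sim_port_start(struct sim_port *port)
{
	port->started = true;
}

/*
 * Run the init sequence and return the number of not-ready polls.
 * services_first == true  models services enabled before NIC setup.
 * services_first == false models services enabled after NIC setup.
 */
static int
sim_init(bool services_first)
{
	struct sim_port port = { .started = false };
	bool service_running = false;

	rx_errors = 0;

	if (services_first)
		service_running = true;

	/* Window in which the main thread is still busy with NIC setup. */
	if (service_running)
		sim_service_poll(&port);

	sim_port_start(&port);

	if (!services_first)
		service_running = true;

	/* Steady state: the port is started, so polling is safe. */
	if (service_running)
		sim_service_poll(&port);

	return rx_errors;
}
```

In this model sim_init(true) returns 1, because one poll lands inside the setup window, while sim_init(false) returns 0. A real service core runs concurrently and may poll many times in that window, so the model only shows why the order of the two steps matters, not how often the error would appear.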