graph: fix head move when graph walk in mcore dispatch

Message ID 20240319141454.3275543-1-jingjing.wu@intel.com (mailing list archive)
State Superseded
Delegated to: Thomas Monjalon
Headers
Series graph: fix head move when graph walk in mcore dispatch |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing fail Unit Testing FAIL
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/github-robot: build fail github build: failed
ci/iol-sample-apps-testing success Testing PASS
ci/iol-unit-amd64-testing fail Testing issues
ci/iol-unit-arm64-testing fail Testing issues
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS

Commit Message

Jingjing Wu March 19, 2024, 2:14 p.m. UTC
  Head move should happen after the core id check, otherwise
source node will be missed.

Fixes: 35dfd9b9fd85 ("graph: introduce graph walk by cross-core dispatch")
Cc: stable@dpdk.org

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
---
 lib/graph/rte_graph_model_mcore_dispatch.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

Yan, Zhirun March 20, 2024, 3:33 a.m. UTC | #1
> -----Original Message-----
> From: Wu, Jingjing <jingjing.wu@intel.com>
> Sent: Tuesday, March 19, 2024 10:15 PM
> To: dev@dpdk.org
> Cc: Wu, Jingjing <jingjing.wu@intel.com>; jerinj@marvell.com;
> pbhagavatula@marvell.com; Yan, Zhirun <zhirun.yan@intel.com>;
> stable@dpdk.org
> Subject: [PATCH] graph: fix head move when graph walk in mcore dispatch
> 
> Head move should happen after the core id check, otherwise source node will be
> missed.
> 
> Fixes: 35dfd9b9fd85 ("graph: introduce graph walk by cross-core dispatch")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
> ---
>  lib/graph/rte_graph_model_mcore_dispatch.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/graph/rte_graph_model_mcore_dispatch.h
> b/lib/graph/rte_graph_model_mcore_dispatch.h
> index 75ec388cad..b96469296e 100644
> --- a/lib/graph/rte_graph_model_mcore_dispatch.h
> +++ b/lib/graph/rte_graph_model_mcore_dispatch.h
> @@ -97,12 +97,12 @@ rte_graph_walk_mcore_dispatch(struct rte_graph
> *graph)
>  		__rte_graph_mcore_dispatch_sched_wq_process(graph);
> 
>  	while (likely(head != graph->tail)) {
> -		node = (struct rte_node *)RTE_PTR_ADD(graph,
> cir_start[(int32_t)head++]);
> +		node = (struct rte_node *)RTE_PTR_ADD(graph,
> +cir_start[(int32_t)head]);
> 
>  		/* skip the src nodes which not bind with current worker */
>  		if ((int32_t)head < 0 && node->dispatch.lcore_id != graph-
> >dispatch.lcore_id)
>  			continue;
> -
> +		head++;
If current src node not bind with current core, It will go into infinite loop.
This line would have no chance to run.

>  		/* Schedule the node until all task/objs are done */
>  		if (node->dispatch.lcore_id != RTE_MAX_LCORE &&
>  		    graph->dispatch.lcore_id != node->dispatch.lcore_id &&
> --
> 2.34.1
  
Jingjing Wu March 20, 2024, 6:25 a.m. UTC | #2
> >  		/* skip the src nodes which not bind with current worker */
> >  		if ((int32_t)head < 0 && node->dispatch.lcore_id != graph-
> > >dispatch.lcore_id)
> >  			continue;
> > -
> > +		head++;
> If current src node not bind with current core, It will go into infinite loop.
> This line would have no chance to run.

Seems reasonable, it might be OK to change "head<0" to "head <1" the condition check?
  
Yan, Zhirun March 20, 2024, 8:42 a.m. UTC | #3
> -----Original Message-----
> From: Wu, Jingjing <jingjing.wu@intel.com>
> Sent: Wednesday, March 20, 2024 2:25 PM
> To: Yan, Zhirun <zhirun.yan@intel.com>; dev@dpdk.org
> Cc: jerinj@marvell.com; pbhagavatula@marvell.com; stable@dpdk.org
> Subject: RE: [PATCH] graph: fix head move when graph walk in mcore dispatch
> 
> 
> > >  		/* skip the src nodes which not bind with current worker */
> > >  		if ((int32_t)head < 0 && node->dispatch.lcore_id != graph-
> > > >dispatch.lcore_id)
> > >  			continue;
> > > -
> > > +		head++;
> > If current src node not bind with current core, It will go into infinite loop.
> > This line would have no chance to run.
> 
> Seems reasonable, it might be OK to change "head<0" to "head <1" the condition
> check?

No. "head<0" means it is src node. 
All src node would put before head = 0.  "Head<1" is confused.
You could find the details of graph reel under rte_graph_walk_rtc() in lib/graph/rte_graph_model_rtc.h

I guess if there are some src node missed, it may be caused by wrong config, for example, the missed src node not pin to a lcore.
Use rte_graph_model_mcore_dispatch_node_lcore_affinity_set() to pin the src node first.
  
Jingjing Wu March 21, 2024, 3:39 a.m. UTC | #4
> -----Original Message-----
> From: Yan, Zhirun <zhirun.yan@intel.com>
> Sent: Wednesday, March 20, 2024 4:43 PM
> To: Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Cc: jerinj@marvell.com; pbhagavatula@marvell.com; stable@dpdk.org
> Subject: RE: [PATCH] graph: fix head move when graph walk in mcore dispatch
> 
> 
> 
> > -----Original Message-----
> > From: Wu, Jingjing <jingjing.wu@intel.com>
> > Sent: Wednesday, March 20, 2024 2:25 PM
> > To: Yan, Zhirun <zhirun.yan@intel.com>; dev@dpdk.org
> > Cc: jerinj@marvell.com; pbhagavatula@marvell.com; stable@dpdk.org
> > Subject: RE: [PATCH] graph: fix head move when graph walk in mcore
> dispatch
> >
> >
> > > >  		/* skip the src nodes which not bind with current worker */
> > > >  		if ((int32_t)head < 0 && node->dispatch.lcore_id != graph-
> > > > >dispatch.lcore_id)
> > > >  			continue;
> > > > -
> > > > +		head++;
> > > If current src node not bind with current core, It will go into infinite loop.
> > > This line would have no chance to run.
> >
> > Seems reasonable, it might be OK to change "head<0" to "head <1" the
> condition
> > check?
> 
> No. "head<0" means it is src node.
> All src node would put before head = 0.  "Head<1" is confused.
> You could find the details of graph reel under rte_graph_walk_rtc() in
> lib/graph/rte_graph_model_rtc.h
> 
> I guess if there are some src node missed, it may be caused by wrong config,
> for example, the missed src node not pin to a lcore.
> Use rte_graph_model_mcore_dispatch_node_lcore_affinity_set() to pin the
> src node first.

I don't think it is confusing because head++ happens before head < 0 check.
Yes, it happens when lcore affinity is not set.
For example, we have two source nodes, both of them have no lcore affinity setting.
By current code, the second node will also be executed which is not as expected.

Thanks
Jingjing
  
Yan, Zhirun March 21, 2024, 5:34 a.m. UTC | #5
> -----Original Message-----
> From: Wu, Jingjing <jingjing.wu@intel.com>
> Sent: Thursday, March 21, 2024 11:39 AM
> To: Yan, Zhirun <zhirun.yan@intel.com>; dev@dpdk.org
> Cc: jerinj@marvell.com; pbhagavatula@marvell.com; stable@dpdk.org
> Subject: RE: [PATCH] graph: fix head move when graph walk in mcore dispatch
> 
> 
> 
> > -----Original Message-----
> > From: Yan, Zhirun <zhirun.yan@intel.com>
> > Sent: Wednesday, March 20, 2024 4:43 PM
> > To: Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> > Cc: jerinj@marvell.com; pbhagavatula@marvell.com; stable@dpdk.org
> > Subject: RE: [PATCH] graph: fix head move when graph walk in mcore
> > dispatch
> >
> >
> >
> > > -----Original Message-----
> > > From: Wu, Jingjing <jingjing.wu@intel.com>
> > > Sent: Wednesday, March 20, 2024 2:25 PM
> > > To: Yan, Zhirun <zhirun.yan@intel.com>; dev@dpdk.org
> > > Cc: jerinj@marvell.com; pbhagavatula@marvell.com; stable@dpdk.org
> > > Subject: RE: [PATCH] graph: fix head move when graph walk in mcore
> > dispatch
> > >
> > >
> > > > >  		/* skip the src nodes which not bind with current worker */
> > > > >  		if ((int32_t)head < 0 && node->dispatch.lcore_id != graph-
> > > > > >dispatch.lcore_id)
> > > > >  			continue;
> > > > > -
> > > > > +		head++;
> > > > If current src node not bind with current core, It will go into infinite loop.
> > > > This line would have no chance to run.
> > >
> > > Seems reasonable, it might be OK to change "head<0" to "head <1" the
> > condition
> > > check?
> >
> > No. "head<0" means it is src node.
> > All src node would put before head = 0.  "Head<1" is confused.
> > You could find the details of graph reel under rte_graph_walk_rtc() in
> > lib/graph/rte_graph_model_rtc.h
> >
> > I guess if there are some src node missed, it may be caused by wrong
> > config, for example, the missed src node not pin to a lcore.
> > Use rte_graph_model_mcore_dispatch_node_lcore_affinity_set() to pin
> > the src node first.
> 
> I don't think it is confusing because head++ happens before head < 0 check.
I agree to change it to "head < 1"

> Yes, it happens when lcore affinity is not set.
> For example, we have two source nodes, both of them have no lcore affinity
> setting.
> By current code, the second node will also be executed which is not as expected.
> 
> Thanks
> Jingjing
  

Patch

diff --git a/lib/graph/rte_graph_model_mcore_dispatch.h b/lib/graph/rte_graph_model_mcore_dispatch.h
index 75ec388cad..b96469296e 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.h
+++ b/lib/graph/rte_graph_model_mcore_dispatch.h
@@ -97,12 +97,12 @@  rte_graph_walk_mcore_dispatch(struct rte_graph *graph)
 		__rte_graph_mcore_dispatch_sched_wq_process(graph);
 
 	while (likely(head != graph->tail)) {
-		node = (struct rte_node *)RTE_PTR_ADD(graph, cir_start[(int32_t)head++]);
+		node = (struct rte_node *)RTE_PTR_ADD(graph, cir_start[(int32_t)head]);
 
 		/* skip the src nodes which not bind with current worker */
 		if ((int32_t)head < 0 && node->dispatch.lcore_id != graph->dispatch.lcore_id)
 			continue;
-
+		head++;
 		/* Schedule the node until all task/objs are done */
 		if (node->dispatch.lcore_id != RTE_MAX_LCORE &&
 		    graph->dispatch.lcore_id != node->dispatch.lcore_id &&