[v2,6/6] test/ring: use relaxed barriers for ring stress test

Message ID 20210909231312.2572006-7-honnappa.nagarahalli@arm.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Series Use correct memory ordering in eal functions

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS

Commit Message

Honnappa Nagarahalli Sept. 9, 2021, 11:13 p.m. UTC
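The wrk_cmd variable is used to signal the worker thread to start
or stop the stress test loop. Relaxed atomic loads in the worker and
release stores in the main thread replace the rte_smp_rmb() and
rte_smp_wmb() barriers to achieve the same.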

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Feifei Wang <feifei.wang@arm.com>
---
 app/test/test_ring_stress_impl.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)
  

Comments

Ananyev, Konstantin Oct. 7, 2021, 11:55 a.m. UTC | #1
> wrk_cmd variable is used to signal the worker thread to start
> or stop the stress test loop. Relaxed barriers are used
> to achieve the same.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> Reviewed-by: Feifei Wang <feifei.wang@arm.com>
> ---
>  app/test/test_ring_stress_impl.h | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/app/test/test_ring_stress_impl.h b/app/test/test_ring_stress_impl.h
> index f9ca63b908..ee8293bb04 100644
> --- a/app/test/test_ring_stress_impl.h
> +++ b/app/test/test_ring_stress_impl.h
> @@ -22,7 +22,7 @@ enum {
>  	WRK_CMD_RUN,
>  };
> 
> -static volatile uint32_t wrk_cmd __rte_cache_aligned;
> +static volatile uint32_t wrk_cmd __rte_cache_aligned = WRK_CMD_STOP;

If we switch to using atomic load/store for 'wrk_cmd',
can we then remove the 'volatile' qualifier in the 'wrk_cmd' definition above?

> 
>  /* test run-time in seconds */
>  static const uint32_t run_time = 60;
> @@ -197,10 +197,12 @@ test_worker(void *arg, const char *fname, int32_t prcs)
>  	fill_ring_elm(&def_elm, UINT32_MAX);
>  	fill_ring_elm(&loc_elm, lc);
> 
> -	while (wrk_cmd != WRK_CMD_RUN) {
> -		rte_smp_rmb();
> +	/* Acquire ordering is not required as the main is not
> +	 * really releasing any data through 'wrk_cmd' to
> +	 * the worker.
> +	 */
> +	while (__atomic_load_n(&wrk_cmd, __ATOMIC_RELAXED) != WRK_CMD_RUN)
>  		rte_pause();
> -	}
> 
>  	cl = rte_rdtsc_precise();
> 
> @@ -242,7 +244,7 @@ test_worker(void *arg, const char *fname, int32_t prcs)
> 
>  		lcore_stat_update(&la->stats, 1, num, tm0 + tm1, prcs);
> 
> -	} while (wrk_cmd == WRK_CMD_RUN);
> +	} while (__atomic_load_n(&wrk_cmd, __ATOMIC_RELAXED) == WRK_CMD_RUN);
> 
>  	cl = rte_rdtsc_precise() - cl;
>  	if (prcs == 0)
> @@ -356,14 +358,12 @@ test_mt1(int (*test)(void *))
>  	}
> 
>  	/* signal worker to start test */
> -	wrk_cmd = WRK_CMD_RUN;
> -	rte_smp_wmb();
> +	__atomic_store_n(&wrk_cmd, WRK_CMD_RUN, __ATOMIC_RELEASE);
> 
>  	usleep(run_time * US_PER_S);
> 
>  	/* signal worker to start test */
> -	wrk_cmd = WRK_CMD_STOP;
> -	rte_smp_wmb();
> +	__atomic_store_n(&wrk_cmd, WRK_CMD_STOP, __ATOMIC_RELEASE);
> 
>  	/* wait for workers and collect stats. */
>  	mc = rte_lcore_id();
> --
> 2.25.1
  
Honnappa Nagarahalli Oct. 7, 2021, 11:40 p.m. UTC | #2
<snip>
> 
> 
> > wrk_cmd variable is used to signal the worker thread to start or stop
> > the stress test loop. Relaxed barriers are used to achieve the same.
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
> > Reviewed-by: Feifei Wang <feifei.wang@arm.com>
> > ---
> >  app/test/test_ring_stress_impl.h | 18 +++++++++---------
> >  1 file changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/app/test/test_ring_stress_impl.h
> > b/app/test/test_ring_stress_impl.h
> > index f9ca63b908..ee8293bb04 100644
> > --- a/app/test/test_ring_stress_impl.h
> > +++ b/app/test/test_ring_stress_impl.h
> > @@ -22,7 +22,7 @@ enum {
> >  	WRK_CMD_RUN,
> >  };
> >
> > -static volatile uint32_t wrk_cmd __rte_cache_aligned;
> > +static volatile uint32_t wrk_cmd __rte_cache_aligned = WRK_CMD_STOP;
> 
> If we switch to using atomic load/store for 'wrk_cmd', can we then remove the
> 'volatile' qualifier in the 'wrk_cmd' definition above?
Agreed, will remove.

> 
> >
> >  /* test run-time in seconds */
> >  static const uint32_t run_time = 60;
<snip>
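
For illustration, the outcome of this exchange (a start/stop flag that is not 'volatile' because every access goes through the __atomic builtins) can be sketched outside DPDK. This is only a sketch under stated assumptions: pthreads and sched_yield() stand in for the DPDK lcore API and rte_pause(), and the names worker, wrk_seen and run_stress_sketch are hypothetical and do not appear in the patch. The wrk_seen handshake is added solely so that a very short run cannot store WRK_CMD_STOP before the worker has observed WRK_CMD_RUN.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <unistd.h>

enum { WRK_CMD_STOP, WRK_CMD_RUN };

/* No 'volatile': all accesses use the __atomic builtins. */
static uint32_t wrk_cmd = WRK_CMD_STOP;
/* Hypothetical helper flag: set once the worker has observed RUN. */
static uint32_t wrk_seen;

static void *
worker(void *arg)
{
	uint64_t *loops = arg;

	/* Relaxed is enough here: only the flag value itself is communicated. */
	while (__atomic_load_n(&wrk_cmd, __ATOMIC_RELAXED) != WRK_CMD_RUN)
		sched_yield();	/* stand-in for rte_pause() */

	__atomic_store_n(&wrk_seen, 1, __ATOMIC_RELAXED);

	do {
		(*loops)++;	/* stand-in for the ring enqueue/dequeue work */
	} while (__atomic_load_n(&wrk_cmd, __ATOMIC_RELAXED) == WRK_CMD_RUN);

	return NULL;
}

/* Hypothetical driver: runs one worker for roughly run_time_us
 * microseconds and returns the number of loop iterations it completed.
 */
static uint64_t
run_stress_sketch(unsigned int run_time_us)
{
	pthread_t tid;
	uint64_t loops = 0;
	int rc;

	__atomic_store_n(&wrk_seen, 0, __ATOMIC_RELAXED);
	rc = pthread_create(&tid, NULL, worker, &loops);
	assert(rc == 0);

	/* signal worker to start test */
	__atomic_store_n(&wrk_cmd, WRK_CMD_RUN, __ATOMIC_RELEASE);

	/* make sure the worker saw RUN before the run time elapses */
	while (__atomic_load_n(&wrk_seen, __ATOMIC_RELAXED) == 0)
		sched_yield();
	usleep(run_time_us);

	/* signal worker to stop test */
	__atomic_store_n(&wrk_cmd, WRK_CMD_STOP, __ATOMIC_RELEASE);

	/* pthread_join() orders the worker's writes to 'loops' before the read */
	rc = pthread_join(tid, NULL);
	assert(rc == 0);
	return loops;
}
```

Calling run_stress_sketch(1000) should return at least 1, since the do/while body runs once before the flag is re-checked. Building it may require -pthread, depending on the toolchain.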
  

Patch

diff --git a/app/test/test_ring_stress_impl.h b/app/test/test_ring_stress_impl.h
index f9ca63b908..ee8293bb04 100644
--- a/app/test/test_ring_stress_impl.h
+++ b/app/test/test_ring_stress_impl.h
@@ -22,7 +22,7 @@  enum {
 	WRK_CMD_RUN,
 };
 
-static volatile uint32_t wrk_cmd __rte_cache_aligned;
+static volatile uint32_t wrk_cmd __rte_cache_aligned = WRK_CMD_STOP;
 
 /* test run-time in seconds */
 static const uint32_t run_time = 60;
@@ -197,10 +197,12 @@  test_worker(void *arg, const char *fname, int32_t prcs)
 	fill_ring_elm(&def_elm, UINT32_MAX);
 	fill_ring_elm(&loc_elm, lc);
 
-	while (wrk_cmd != WRK_CMD_RUN) {
-		rte_smp_rmb();
+	/* Acquire ordering is not required as the main is not
+	 * really releasing any data through 'wrk_cmd' to
+	 * the worker.
+	 */
+	while (__atomic_load_n(&wrk_cmd, __ATOMIC_RELAXED) != WRK_CMD_RUN)
 		rte_pause();
-	}
 
 	cl = rte_rdtsc_precise();
 
@@ -242,7 +244,7 @@  test_worker(void *arg, const char *fname, int32_t prcs)
 
 		lcore_stat_update(&la->stats, 1, num, tm0 + tm1, prcs);
 
-	} while (wrk_cmd == WRK_CMD_RUN);
+	} while (__atomic_load_n(&wrk_cmd, __ATOMIC_RELAXED) == WRK_CMD_RUN);
 
 	cl = rte_rdtsc_precise() - cl;
 	if (prcs == 0)
@@ -356,14 +358,12 @@  test_mt1(int (*test)(void *))
 	}
 
 	/* signal worker to start test */
-	wrk_cmd = WRK_CMD_RUN;
-	rte_smp_wmb();
+	__atomic_store_n(&wrk_cmd, WRK_CMD_RUN, __ATOMIC_RELEASE);
 
 	usleep(run_time * US_PER_S);
 
 	/* signal worker to start test */
-	wrk_cmd = WRK_CMD_STOP;
-	rte_smp_wmb();
+	__atomic_store_n(&wrk_cmd, WRK_CMD_STOP, __ATOMIC_RELEASE);
 
 	/* wait for workers and collect stats. */
 	mc = rte_lcore_id();
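
The in-code comment in the patch argues that relaxed loads suffice because the main thread does not hand any data to the worker through 'wrk_cmd'. As a contrast, here is a minimal sketch of the case where acquire ordering would be needed: a thread writes a payload and then publishes it through a flag. It uses pthreads and the GCC/Clang __atomic builtins; the names payload, ready, consumer and publish_and_consume are hypothetical and are not part of the patch or of DPDK.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

static uint32_t payload;	/* plain data handed over through the flag */
static uint32_t ready;		/* 0 = not published, 1 = published */

static void *
consumer(void *arg)
{
	uint32_t *out = arg;

	/* Acquire pairs with the release store in the producer, so the
	 * write to 'payload' is visible once 'ready' is observed as 1.
	 * A relaxed load would not give that guarantee.
	 */
	while (__atomic_load_n(&ready, __ATOMIC_ACQUIRE) == 0)
		sched_yield();

	*out = payload;
	return NULL;
}

/* Hypothetical driver: publishes 'value' and returns what the consumer read. */
static uint32_t
publish_and_consume(uint32_t value)
{
	pthread_t tid;
	uint32_t seen = 0;
	int rc;

	__atomic_store_n(&ready, 0, __ATOMIC_RELAXED);
	rc = pthread_create(&tid, NULL, consumer, &seen);
	assert(rc == 0);

	payload = value;	/* data write ... */
	__atomic_store_n(&ready, 1, __ATOMIC_RELEASE);	/* ... then publish */

	rc = pthread_join(tid, NULL);
	assert(rc == 0);
	return seen;
}
```

In the ring stress test no such payload travels with the flag, which is why the patch can use __ATOMIC_RELAXED on the load side.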