test/service: fix race condition on stopping lcore
Commit Message
There is a potential race condition in 'service_attr_get' that can cause
test failures because the service core thread may still be running while
the values are being retrieved/reset.
This patch fixes the race condition by waiting for the service core thread
to stop before continuing with the unit test checks.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
---
app/test/test_service_cores.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
Comments
> -----Original Message-----
> From: Laatz, Kevin <kevin.laatz@intel.com>
> Sent: Friday, October 16, 2020 10:08 AM
> To: dev@dpdk.org
> Cc: Van Haaren, Harry <harry.van.haaren@intel.com>;
> david.marchand@redhat.com; l.wojciechow@partner.samsung.com;
> Honnappa.Nagarahalli@arm.com; phil.yang@arm.com; aconole@redhat.com;
> Laatz, Kevin <kevin.laatz@intel.com>
> Subject: [PATCH] test/service: fix race condition on stopping lcore
>
> There is a potential race condition in 'service_attr_get' which will cause
> test failures since the service core thread is still running while the
> values are being retrieved/reset.
>
> This patch fixes the race condition by waiting for the service core thread
> to stop before continuing with the unit test checks.
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Thanks Kevin for handling; can't reproduce race-cond here, but by code review
this is the correct fix, thanks also for refactoring the wait into its own function.
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
On Fri, Oct 16, 2020 at 11:13 AM Kevin Laatz <kevin.laatz@intel.com> wrote:
>
> There is a potential race condition in 'service_attr_get' which will cause
> test failures since the service core thread is still running while the
> values are being retrieved/reset.
>
> This patch fixes the race condition by waiting for the service core thread
> to stop before continuing with the unit test checks.
We won't backport it, since we need a new API, but I would flag it for info as:
Fixes: 4d55194d76a4 ("service: add attribute get function")
Ok for you?
--
David Marchand
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, October 16, 2020 12:50 PM
> To: Laatz, Kevin <kevin.laatz@intel.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>
> Cc: dev <dev@dpdk.org>; Lukasz Wojciechowski
> <l.wojciechow@partner.samsung.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <phil.yang@arm.com>; Aaron
> Conole <aconole@redhat.com>
> Subject: Re: [PATCH] test/service: fix race condition on stopping lcore
>
> On Fri, Oct 16, 2020 at 11:13 AM Kevin Laatz <kevin.laatz@intel.com> wrote:
> >
> > There is a potential race condition in 'service_attr_get' which will cause
> > test failures since the service core thread is still running while the
> > values are being retrieved/reset.
> >
> > This patch fixes the race condition by waiting for the service core thread
> > to stop before continuing with the unit test checks.
>
> We won't backport it, since we need a new API, but I would flag it for info as:
> Fixes: 4d55194d76a4 ("service: add attribute get function")
>
> Ok for you?
Yes - thanks.
> David Marchand
On Fri, Oct 16, 2020 at 11:18 AM Van Haaren, Harry
<harry.van.haaren@intel.com> wrote:
> > Subject: [PATCH] test/service: fix race condition on stopping lcore
> >
> > There is a potential race condition in 'service_attr_get' which will cause
> > test failures since the service core thread is still running while the
> > values are being retrieved/reset.
> >
> > This patch fixes the race condition by waiting for the service core thread
> > to stop before continuing with the unit test checks.
Fixes: 4d55194d76a4 ("service: add attribute get function")
> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Thanks Kevin, Harry, applied.
@@ -119,6 +119,17 @@ unregister_all(void)
 	return TEST_SUCCESS;
 }
+/* Wait until service lcore not active, or for 100x SERVICE_DELAY */
+static void
+wait_slcore_inactive(uint32_t slcore_id)
+{
+	int i;
+
+	for (i = 0; rte_service_lcore_may_be_active(slcore_id) == 1 &&
+			i < 100; i++)
+		rte_delay_ms(SERVICE_DELAY);
+}
+
 /* register a single dummy service */
 static int
 dummy_register(void)
@@ -305,6 +316,8 @@ service_attr_get(void)
 	rte_service_lcore_stop(slcore_id);
+	wait_slcore_inactive(slcore_id);
+
 	TEST_ASSERT_EQUAL(0, rte_service_attr_get(id, attr_calls, &attr_value),
 			"Valid attr_get() call didn't return success");
 	TEST_ASSERT_EQUAL(1, (attr_value > 0),
@@ -394,11 +407,7 @@ service_lcore_attr_get(void)
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_stop(slcore_id),
 			"Failed to stop service lcore");
-	/* Wait until service lcore not active, or for 100x SERVICE_DELAY */
-	int i;
-	for (i = 0; rte_service_lcore_may_be_active(slcore_id) == 1 &&
-			i < 100; i++)
-		rte_delay_ms(SERVICE_DELAY);
+	wait_slcore_inactive(slcore_id);
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_may_be_active(slcore_id),
 			"Service lcore not stopped after waiting.");
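
For readers without a DPDK build at hand, the bounded wait added by this patch can be illustrated with a minimal standalone sketch. It is not DPDK code: the worker thread, the flags 'stop_requested' and 'worker_active', the counter 'call_count' and the helpers 'sleep_ms', 'wait_worker_inactive' and 'run_demo' are all hypothetical stand-ins, and plain pthreads plus C11 atomics are assumed in place of the service core API. The point it shows is that a stop request returns immediately, so a caller must poll (with an upper bound) until the worker reports inactive before reading values the worker updates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for service core state; these are not DPDK APIs. */
static atomic_bool stop_requested; /* set by the controlling thread */
static atomic_bool worker_active;  /* cleared by the worker when it exits */
static atomic_ulong call_count;    /* statistic the worker keeps updating */

static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};
	nanosleep(&ts, NULL);
}

/* Worker loop: updates the counter until a stop is requested. */
static void *worker_main(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop_requested)) {
		atomic_fetch_add(&call_count, 1);
		sleep_ms(1);
	}
	/* Only after the last update does the worker report inactive. */
	atomic_store(&worker_active, false);
	return NULL;
}

/* Poll until the worker is inactive, or for max_polls x delay_ms.
 * Returns true if the worker stopped within the bound. */
static bool wait_worker_inactive(int max_polls, long delay_ms)
{
	assert(max_polls > 0);
	for (int i = 0; atomic_load(&worker_active) && i < max_polls; i++)
		sleep_ms(delay_ms);
	return !atomic_load(&worker_active);
}

/* Returns 0 if the counter is stable once the worker is inactive. */
int run_demo(void)
{
	pthread_t tid;
	int ret = 0;

	atomic_store(&stop_requested, false);
	atomic_store(&worker_active, true);
	atomic_store(&call_count, 0);
	if (pthread_create(&tid, NULL, worker_main, NULL) != 0)
		return -1;

	sleep_ms(20);
	/* The stop request returns at once; the worker may still be running. */
	atomic_store(&stop_requested, true);

	if (!wait_worker_inactive(100, 1)) {
		ret = -1; /* worker did not stop within the bound */
	} else {
		/* Safe to read now: no further updates can happen. */
		unsigned long first = atomic_load(&call_count);
		sleep_ms(5);
		if (atomic_load(&call_count) != first || first == 0)
			ret = -1;
	}

	pthread_join(tid, NULL);
	return ret;
}
```

Calling 'run_demo()' from a test harness should return 0. Removing the 'wait_worker_inactive()' call and reading 'call_count' straight after the stop request would reintroduce the kind of window described in the commit message, although in this sketch it would rarely be visible.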