[v3,2/2] test/service: fix race condition on stopping lcore
Commit Message
This commit fixes a potential race condition in the tests
where the lcore running a service could increment a counter
after the test-suite thread had already reset it. The resulting
stale increment could cause CI failures, as indicated by
DPDK's CI.
This patch fixes the race-condition by making use of the
added rte_service_lcore_active() API, which indicates when
a service-core is no longer in the service-core polling loop.
The unit test makes use of the above function to detect when
all statistics increments are done in the service-core thread,
and then the unit test continues finalizing and checking state.
Fixes: f28f3594ded2 ("service: add attribute API")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
v3:
- Refactor while() to for() to simplify (Harry)
- Use SERVICE_DELAY instead of magic const 1 (Phil)
- Add Phil's reviewed by tag from ML
v2:
Thanks for the discussion on v1. This v2 is a fixup for the CI
that includes previous feedback from the ML.
---
app/test/test_service_cores.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
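
The stop, wait, then reset ordering described in the commit message can be sketched outside of DPDK as below. This is only an illustration under stated assumptions: it uses POSIX threads and C11 atomics in place of service cores, and every name in it (run_requested, loop_active, calls, POLL_DELAY_MS, POLL_MAX_TRIES, stop_wait_and_reset) is hypothetical, not a DPDK symbol. loop_active plays the role of the value returned by rte_service_lcore_active(), and POLL_DELAY_MS stands in for SERVICE_DELAY.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#endif
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define POLL_DELAY_MS  1   /* hypothetical stand-in for SERVICE_DELAY */
#define POLL_MAX_TRIES 100 /* give up after this many polls */

/* Hypothetical stand-ins for service-core state; not DPDK symbols. */
static atomic_int run_requested; /* cleared by the test thread to request a stop */
static atomic_int loop_active;   /* cleared by the worker once it has left its loop */
static atomic_ullong calls;      /* statistic incremented by the worker */

static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};
	nanosleep(&ts, NULL);
}

/* Worker: mimics a polling loop that updates a statistic on each pass. */
static void *worker_main(void *arg)
{
	(void)arg;
	while (atomic_load(&run_requested))
		atomic_fetch_add(&calls, 1);
	/* Cleared only after the final increment has been made. */
	atomic_store(&loop_active, 0);
	return NULL;
}

/*
 * Start the worker, request a stop, wait (bounded) until the worker
 * reports inactive, and only then reset the counter.
 * Returns the counter value seen after the worker has fully exited,
 * or -1 on thread-creation failure or timeout.
 */
static long long stop_wait_and_reset(void)
{
	pthread_t tid;
	int i;
	int stopped;

	atomic_store(&calls, 0);
	atomic_store(&run_requested, 1);
	atomic_store(&loop_active, 1);
	if (pthread_create(&tid, NULL, worker_main, NULL) != 0)
		return -1;

	sleep_ms(10);                    /* let the worker run for a while */
	atomic_store(&run_requested, 0); /* request stop; does not wait */

	/* Wait until the worker is not active, or for 100x POLL_DELAY_MS. */
	for (i = 0; i < POLL_MAX_TRIES && atomic_load(&loop_active) == 1; i++)
		sleep_ms(POLL_DELAY_MS);

	stopped = (atomic_load(&loop_active) == 0);
	if (stopped)
		atomic_store(&calls, 0); /* no increment can follow this reset */

	pthread_join(tid, NULL);
	return stopped ? (long long)atomic_load(&calls) : -1;
}
```

Calling stop_wait_and_reset() should return 0 when the worker stops within the bound. Resetting the counter immediately after the stop request, without the polling loop, would leave a window in which the worker could still add to it.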
Comments
<snip>
> Subject: [PATCH v3 2/2] test/service: fix race condition on stopping lcore
>
> This commit fixes a potential race condition in the tests where the lcore
> running a service would increment a counter that was already reset by the
> test-suite thread. The resulting race-condition incremented value could cause
> CI failures, as indicated by DPDK's CI.
>
> This patch fixes the race-condition by making use of the added
> rte_service_lcore_active() API, which indicates when a service-core is no
> longer in the service-core polling loop.
>
> The unit test makes use of the above function to detect when all statistics
> increments are done in the service-core thread, and then the unit test
> continues finalizing and checking state.
>
> Fixes: f28f3594ded2 ("service: add attribute API")
>
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>
> ---
>
> v3:
> - Refactor while() to for() to simplify (Harry)
> - Use SERVICE_DELAY instead of magic const 1 (Phil)
> - Add Phil's reviewed by tag from ML
>
> v2:
> Thanks for discussion on v1, this v2 fixup for the CI including previous
> feedback on ML.
> ---
> app/test/test_service_cores.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
> index ef1d8fcb9..a6bc4487e 100644
> --- a/app/test/test_service_cores.c
> +++ b/app/test/test_service_cores.c
> @@ -362,6 +362,9 @@ service_lcore_attr_get(void)
>  			"Service core add did not return zero");
>  	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 1),
>  			"Enabling valid service and core failed");
> +	/* Ensure service is not active before starting */
> +	TEST_ASSERT_EQUAL(0, rte_service_lcore_active(slcore_id),
> +			"Not-active service core reported as active");
>  	TEST_ASSERT_EQUAL(0, rte_service_lcore_start(slcore_id),
>  			"Starting service core failed");
>
> @@ -382,7 +385,22 @@ service_lcore_attr_get(void)
>  			lcore_attr_id, &lcore_attr_value),
>  			"Invalid lcore attr didn't return -EINVAL");
>
> -	rte_service_lcore_stop(slcore_id);
> +	/* Ensure service is active */
> +	TEST_ASSERT_EQUAL(1, rte_service_lcore_active(slcore_id),
> +			"Active service core reported as not-active");
> +
> +	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 0),
> +			"Disabling valid service and core failed");
> +	TEST_ASSERT_EQUAL(0, rte_service_lcore_stop(slcore_id),
> +			"Failed to stop service lcore");
> +
> +	/* Wait until service lcore not active, or for 100x SERVICE_DELAY */
> +	for (int i = 0; i < 100 && rte_service_lcore_active(slcore_id) == 1;
> +			i++)
> +		rte_delay_ms(SERVICE_DELAY);
> +
> +	TEST_ASSERT_EQUAL(0, rte_service_lcore_active(slcore_id),
> +			"Service lcore not stopped after waiting.");
>
>  	TEST_ASSERT_EQUAL(0, rte_service_lcore_attr_reset_all(slcore_id),
>  			"Valid lcore_attr_reset_all() didn't return success");
> --
> 2.17.1
diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index ef1d8fcb9..a6bc4487e 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -362,6 +362,9 @@ service_lcore_attr_get(void)
 			"Service core add did not return zero");
 	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 1),
 			"Enabling valid service and core failed");
+	/* Ensure service is not active before starting */
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_active(slcore_id),
+			"Not-active service core reported as active");
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_start(slcore_id),
 			"Starting service core failed");
 
@@ -382,7 +385,22 @@ service_lcore_attr_get(void)
 			lcore_attr_id, &lcore_attr_value),
 			"Invalid lcore attr didn't return -EINVAL");
 
-	rte_service_lcore_stop(slcore_id);
+	/* Ensure service is active */
+	TEST_ASSERT_EQUAL(1, rte_service_lcore_active(slcore_id),
+			"Active service core reported as not-active");
+
+	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 0),
+			"Disabling valid service and core failed");
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_stop(slcore_id),
+			"Failed to stop service lcore");
+
+	/* Wait until service lcore not active, or for 100x SERVICE_DELAY */
+	for (int i = 0; i < 100 && rte_service_lcore_active(slcore_id) == 1;
+			i++)
+		rte_delay_ms(SERVICE_DELAY);
+
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_active(slcore_id),
+			"Service lcore not stopped after waiting.");
 
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_attr_reset_all(slcore_id),
 			"Valid lcore_attr_reset_all() didn't return success");