From patchwork Mon Sep 14 14:31:17 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Van Haaren, Harry"
X-Patchwork-Id: 77615
X-Patchwork-Delegate: david.marchand@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id A33F4A04C7;
 Mon, 14 Sep 2020 16:30:12 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 36CA51C0CF;
 Mon, 14 Sep 2020 16:30:06 +0200 (CEST)
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by dpdk.org (Postfix) with ESMTP id F34131C0C0
 for ; Mon, 14 Sep 2020 16:30:02 +0200 (CEST)
IronPort-SDR: b+D4pbmJOIo4C+YSW3FtFXvlg12BnPj3v1WTcq9VXlm7HnFx2PWpEzc7mNbevgIwp1xlRks+VI
 9Y02TvJG8wng==
X-IronPort-AV: E=McAfee;i="6000,8403,9744"; a="160017181"
X-IronPort-AV: E=Sophos;i="5.76,426,1592895600"; d="scan'208";a="160017181"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga003.jf.intel.com ([10.7.209.27])
 by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 14 Sep 2020 07:30:00 -0700
IronPort-SDR: 6q0tA+2rE49Du91UMJ8bFA2tUxKv+RDu349VU/+lGZBQ9Zv+2gJ6BgzIM8w6e2LbIw9fezOS8V
 RT+Y5E+E9FWw==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.76,426,1592895600"; d="scan'208";a="301768510"
Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.209])
 by orsmga003.jf.intel.com with ESMTP; 14 Sep 2020 07:29:58 -0700
From: Harry van Haaren
To: dev@dpdk.org
Cc: david.marchand@redhat.com, Harry van Haaren
Date: Mon, 14 Sep 2020 15:31:17 +0100
Message-Id: <20200914143118.84791-1-harry.van.haaren@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200724134506.11959-1-harry.van.haaren@intel.com>
References: <20200724134506.11959-1-harry.van.haaren@intel.com>
Subject: [dpdk-dev] [PATCH v6 1/2]
 service: add API to retrieve service core active
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This commit adds a new experimental API which allows the user
to retrieve the active state of an lcore. Knowing when the service
lcore has completed its polling loop can be useful to applications
to avoid race conditions when e.g. finalizing statistics.

The service thread itself now has a variable to indicate if its
thread is active. When zero, the service thread has completed its
service, and has returned from the service_runner_func() function.

Suggested-by: Lukasz Wojciechowski
Signed-off-by: Harry van Haaren
Reviewed-by: Phil Yang
Reviewed-by: Honnappa Nagarahalli

---

v5:
- Fix typos (robot)

v4:
- Use _may_be_ style API for lcore_active (Honnappa)
- Fix missing tab indent (Honnappa)
- Add 'lcore' to doxygen retval description (Honnappa)

@Honnappa: Please note I did not update the doxygen title of the
lcore_may_be_active() function, as the current description is more
accurate than making it more consistent with other functions.

v3:
- Change service lcore stores to SEQ_CST (Honnappa, David)
- Change control thread load to ACQ (Honnappa, David)
- Comment reasons for SEQ_CST/ACQ (Honnappa, David)
- Add comments to Doxygen for _stop() and _lcore_active() (Honnappa, David)
- Add Phil's review tag from ML
---
 lib/librte_eal/common/rte_service.c  | 21 +++++++++++++++++++++
 lib/librte_eal/include/rte_service.h | 22 +++++++++++++++++++++-
 lib/librte_eal/rte_eal_version.map   |  1 +
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 6a0e0ff65d..98565bbef3 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -65,6 +65,7 @@ struct core_state {
 	/* map of services IDs are run on this core */
 	uint64_t service_mask;
 	uint8_t runstate; /* running or stopped */
+	uint8_t thread_active; /* indicates when thread is in service_run() */
 	uint8_t is_service_core; /* set if core is currently a service core */
 	uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
 	uint64_t loops;
@@ -457,6 +458,8 @@ service_runner_func(void *arg)
 	const int lcore = rte_lcore_id();
 	struct core_state *cs = &lcore_states[lcore];
 
+	__atomic_store_n(&cs->thread_active, 1, __ATOMIC_SEQ_CST);
+
 	/* runstate act as the guard variable. Use load-acquire
 	 * memory order here to synchronize with store-release
 	 * in runstate update functions.
@@ -475,9 +478,27 @@ service_runner_func(void *arg)
 		cs->loops++;
 	}
 
+	/* Use SEQ CST memory ordering to avoid any re-ordering around
+	 * this store, ensuring that once this store is visible, the service
+	 * lcore thread really is done in service cores code.
+	 */
+	__atomic_store_n(&cs->thread_active, 0, __ATOMIC_SEQ_CST);
 	return 0;
 }
 
+int32_t
+rte_service_lcore_may_be_active(uint32_t lcore)
+{
+	if (lcore >= RTE_MAX_LCORE || !lcore_states[lcore].is_service_core)
+		return -EINVAL;
+
+	/* Load thread_active using ACQUIRE to avoid instructions dependent on
+	 * the result being re-ordered before this load completes.
+	 */
+	return __atomic_load_n(&lcore_states[lcore].thread_active,
+			       __ATOMIC_ACQUIRE);
+}
+
 int32_t
 rte_service_lcore_count(void)
 {
diff --git a/lib/librte_eal/include/rte_service.h b/lib/librte_eal/include/rte_service.h
index e2d0a6dd32..ca9950d091 100644
--- a/lib/librte_eal/include/rte_service.h
+++ b/lib/librte_eal/include/rte_service.h
@@ -249,7 +249,11 @@ int32_t rte_service_lcore_start(uint32_t lcore_id);
  * Stop a service core.
  *
  * Stopping a core makes the core become idle, but remains assigned as a
- * service core.
+ * service core. Note that the service lcore thread may not have returned from
+ * the service it is running when this API returns.
+ *
+ * The *rte_service_lcore_may_be_active* API can be used to check if the
+ * service lcore is still active.
  *
  * @retval 0 Success
  * @retval -EINVAL Invalid *lcore_id* provided
@@ -261,6 +265,22 @@ int32_t rte_service_lcore_start(uint32_t lcore_id);
  */
 int32_t rte_service_lcore_stop(uint32_t lcore_id);
 
+/**
+ * Reports if a service lcore is currently running.
+ *
+ * This function returns if the core has finished service cores code, and has
+ * returned to EAL control. If *rte_service_lcore_stop* has been called but
+ * the lcore has not returned to EAL yet, it might be required to wait and call
+ * this function again. The amount of time to wait before the core returns
+ * depends on the duration of the services being run.
+ *
+ * @retval 0 Service thread is not active, and lcore has been returned to EAL.
+ * @retval 1 Service thread is in the service core polling loop.
+ * @retval -EINVAL Invalid *lcore_id* provided.
+ */
+__rte_experimental
+int32_t rte_service_lcore_may_be_active(uint32_t lcore_id);
+
 /**
  * Adds lcore to the list of service cores.
  *
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 0b18e2ef85..55f611d071 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -395,6 +395,7 @@ EXPERIMENTAL {
 	rte_lcore_dump;
 	rte_lcore_iterate;
 	rte_mp_disable;
+	rte_service_lcore_may_be_active;
 	rte_thread_register;
 	rte_thread_unregister;
 };

From patchwork Mon Sep 14 14:31:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Van Haaren, Harry"
X-Patchwork-Id: 77614
X-Patchwork-Delegate: david.marchand@redhat.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 39377A04C7;
 Mon, 14 Sep 2020 16:30:04 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id EA24B1C0C2;
 Mon, 14 Sep 2020 16:30:03 +0200 (CEST)
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by dpdk.org (Postfix) with ESMTP id 398EA1C0C0
 for ; Mon, 14 Sep 2020 16:30:02 +0200 (CEST)
IronPort-SDR: zrYEH7dp+rEL1w3ABkTeln6SM4Mt4QhqNYD0fJ8nvgjb2m5yc1/nyM8Ro4e6Cx/phD9/7B+JOQ
 J6ekjyLefa1A==
X-IronPort-AV: E=McAfee;i="6000,8403,9744"; a="160017184"
X-IronPort-AV: E=Sophos;i="5.76,426,1592895600"; d="scan'208";a="160017184"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga003.jf.intel.com ([10.7.209.27])
 by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 14 Sep 2020 07:30:01 -0700
IronPort-SDR: syKrQqSS9stbCGTxMhnCuaCOhzRMdhPggHx1aMU6j1eRaYtLRgHBxwKQDvPvbJkEhmQl6kM8d8
 FNT1pz/TMGOQ==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.76,426,1592895600"; d="scan'208";a="301768517"
Received: from silpixa00399779.ir.intel.com (HELO
 silpixa00399779.ger.corp.intel.com) ([10.237.222.209])
 by orsmga003.jf.intel.com with ESMTP; 14 Sep 2020 07:30:00 -0700
From: Harry van Haaren
To: dev@dpdk.org
Cc: david.marchand@redhat.com, Harry van Haaren
Date: Mon, 14 Sep 2020 15:31:18 +0100
Message-Id: <20200914143118.84791-2-harry.van.haaren@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200914143118.84791-1-harry.van.haaren@intel.com>
References: <20200724134506.11959-1-harry.van.haaren@intel.com>
 <20200914143118.84791-1-harry.van.haaren@intel.com>
Subject: [dpdk-dev] [PATCH v6 2/2] test/service: fix race condition on
 stopping lcore
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This commit fixes a potential race condition in the tests
where the lcore running a service would increment a counter
that was already reset by the test-suite thread. The counter
value left behind by this race could cause failures, as
observed in DPDK's CI.

This patch fixes the race condition by making use of the
added rte_service_lcore_may_be_active() API, which indicates
when a service core is no longer in the service-core polling
loop.

The unit test uses this function to detect when all statistics
increments are done in the service-core thread, and only then
continues finalizing and checking state.

Fixes: f28f3594ded2 ("service: add attribute API")

Reported-by: David Marchand
Signed-off-by: Harry van Haaren
Reviewed-by: Phil Yang
Reviewed-by: Honnappa Nagarahalli

---

v6:
- Fix CI issue on C99 style loop initializer (David)

v4:
- Update test to new _may_be_ style API (Honnappa)
- Add reviewed by from ML

v3:
- Refactor while() to for() to simplify (Harry)
- Use SERVICE_DELAY instead of magic const 1 (Phil)
- Add Phil's reviewed by tag from ML

v2:
Thanks for discussion on v1, this v2 fixup for the CI
including previous feedback on ML.
---
 app/test/test_service_cores.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index ef1d8fcb9b..5d92bea8af 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -362,6 +362,9 @@ service_lcore_attr_get(void)
 			"Service core add did not return zero");
 	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 1),
 			"Enabling valid service and core failed");
+	/* Ensure service is not active before starting */
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_may_be_active(slcore_id),
+			"Not-active service core reported as active");
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_start(slcore_id),
 			"Starting service core failed");
 
@@ -382,7 +385,23 @@ service_lcore_attr_get(void)
 			lcore_attr_id, &lcore_attr_value),
 			"Invalid lcore attr didn't return -EINVAL");
 
-	rte_service_lcore_stop(slcore_id);
+	/* Ensure service is active */
+	TEST_ASSERT_EQUAL(1, rte_service_lcore_may_be_active(slcore_id),
+			"Active service core reported as not-active");
+
+	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 0),
+			"Disabling valid service and core failed");
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_stop(slcore_id),
+			"Failed to stop service lcore");
+
+	/* Wait until service lcore not active, or for 100x SERVICE_DELAY */
+	int i;
+	for (i = 0; rte_service_lcore_may_be_active(slcore_id) == 1 &&
+			i < 100; i++)
+		rte_delay_ms(SERVICE_DELAY);
+
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_may_be_active(slcore_id),
+		"Service lcore not stopped after waiting.");
 
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_attr_reset_all(slcore_id),
 			  "Valid lcore_attr_reset_all() didn't return success");
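
To look at the stop/poll handshake from this series in isolation, the
sketch below models it with C11 atomics and pthreads instead of DPDK.
It is an illustration under assumptions, not the library code:
struct demo_core and every demo_* function are hypothetical stand-ins
for core_state, service_runner_func(), rte_service_lcore_stop() and
rte_service_lcore_may_be_active(), and runstate handling is reduced to
a single flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the per-lcore state kept in rte_service.c. */
struct demo_core {
	atomic_uint_fast8_t runstate;      /* 1 = keep polling, 0 = stop requested */
	atomic_uint_fast8_t thread_active; /* 1 while the worker is in its loop */
	uint64_t loops;                    /* statistic written only by the worker */
	pthread_t tid;
};

/* Worker thread: same shape as service_runner_func() in the patch. */
static void *
demo_runner(void *arg)
{
	struct demo_core *c = arg;

	atomic_store_explicit(&c->thread_active, 1, memory_order_seq_cst);
	while (atomic_load_explicit(&c->runstate, memory_order_acquire) == 1)
		c->loops++;
	/* Once this store is visible, the worker no longer touches loops. */
	atomic_store_explicit(&c->thread_active, 0, memory_order_seq_cst);
	return NULL;
}

/* Returns 0 on success, as pthread_create() does. */
static int
demo_start(struct demo_core *c)
{
	atomic_store_explicit(&c->runstate, 1, memory_order_release);
	return pthread_create(&c->tid, NULL, demo_runner, c);
}

/* Only requests the stop; the worker may still be mid-iteration. */
static void
demo_stop(struct demo_core *c)
{
	atomic_store_explicit(&c->runstate, 0, memory_order_release);
}

/* Acquire load pairs with the worker's final store of thread_active. */
static int
demo_may_be_active(struct demo_core *c)
{
	return atomic_load_explicit(&c->thread_active, memory_order_acquire);
}

/* Poll at most max_polls times, sleeping delay_ms between polls.
 * Returns 0 once the worker is inactive, -1 if it is still active.
 */
static int
demo_wait_inactive(struct demo_core *c, int max_polls, long delay_ms)
{
	struct timespec ts = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int i;

	assert(max_polls >= 0 && delay_ms >= 0);
	for (i = 0; demo_may_be_active(c) == 1 && i < max_polls; i++)
		nanosleep(&ts, NULL);
	return demo_may_be_active(c) == 0 ? 0 : -1;
}
```

A caller would follow the same order as the updated unit test: call
demo_start(), later demo_stop(), then demo_wait_inactive(&core, 100, 1),
and read or reset core.loops only after it returns 0. The worker's last
store carries release semantics (SEQ_CST includes them), so the acquire
load in demo_may_be_active() is what makes that later access to loops
safe.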