From patchwork Wed Jun 5 15:58:46 2019
From: Phil Yang
To: dev@dpdk.org
Cc: thomas@monjalon.net, jerinj@marvell.com, hemant.agrawal@nxp.com,
 Honnappa.Nagarahalli@arm.com, gavin.hu@arm.com, phil.yang@arm.com, nd@arm.com
Date: Wed, 5 Jun 2019 23:58:46 +0800
Message-Id: <1559750328-22377-2-git-send-email-phil.yang@arm.com>
In-Reply-To: <1559750328-22377-1-git-send-email-phil.yang@arm.com>
References: <1559750328-22377-1-git-send-email-phil.yang@arm.com>
Subject: [dpdk-dev] [PATCH v1 1/3] eal/mcslock: add mcs queued lock
 implementation

When multiple threads are contending, they all attempt to take the
spinlock at the same time once it is released. This results in a large
amount of processor bus traffic, which is a significant performance
killer. Thus, if we somehow order the lock-takers so that they know who
is next in line for the resource, we can vastly reduce the amount of
bus traffic.

This patch adds an MCS lock library. It provides scalability by
spinning on a CPU/thread-local variable, which avoids expensive cache
bouncing. It provides fairness by maintaining a list of acquirers and
passing the lock to each CPU/thread in the order in which they
requested it.
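A minimal usage sketch (not part of this patch; the lock variable and
function names are illustrative, but the calls match the API added
below — each acquiring thread must supply its own queue node):

	#include <rte_mcslock.h>

	rte_mcslock_t *my_lock; /* shared lock pointer, must start out NULL */

	static void
	critical_work(void)
	{
		rte_mcslock_t my_node; /* per-thread queue node */

		rte_mcslock_lock(&my_lock, &my_node);
		/* ... critical section ... */
		rte_mcslock_unlock(&my_lock, &my_node);
	}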
Signed-off-by: Phil Yang
Reviewed-by: Steve Capper
Reviewed-by: Honnappa Nagarahalli
---
 MAINTAINERS                                        |   4 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/guides/rel_notes/release_19_08.rst             |   6 +
 lib/librte_eal/common/Makefile                     |   2 +-
 .../common/include/generic/rte_mcslock.h           | 169 +++++++++++++++++++++
 lib/librte_eal/common/meson.build                  |   1 +
 6 files changed, 182 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/include/generic/rte_mcslock.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829..1390238 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -222,6 +222,10 @@ M: Joyce Kong
 F: lib/librte_eal/common/include/generic/rte_ticketlock.h
 F: app/test/test_ticketlock.c
 
+MCSlock - EXPERIMENTAL
+M: Phil Yang
+F: lib/librte_eal/common/include/generic/rte_mcslock.h
+
 ARM v7
 M: Jan Viktorin
 M: Gavin Hu
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 715248d..d0e32b1 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -63,6 +63,7 @@ The public API headers are grouped by topics:
 - **locks**:
   [atomic]             (@ref rte_atomic.h),
+  [mcslock]            (@ref rte_mcslock.h),
   [rwlock]             (@ref rte_rwlock.h),
   [spinlock]           (@ref rte_spinlock.h),
   [ticketlock]         (@ref rte_ticketlock.h),
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index a17e7de..ebd5105 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -54,6 +54,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added MCS lock library.**
+
+  Added an MCS lock library. It provides scalability by spinning on a
+  CPU/thread-local variable, which avoids expensive cache bouncing.
+  It provides fairness by maintaining a list of acquirers and passing
+  the lock to each CPU/thread in the order they requested it.
 
 Removed Items
 -------------
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 1647af7..a00d4fc 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -21,7 +21,7 @@ INC += rte_reciprocal.h rte_fbarray.h rte_uuid.h
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_memcpy.h rte_cpuflags.h
-GENERIC_INC += rte_spinlock.h rte_rwlock.h rte_ticketlock.h
+GENERIC_INC += rte_mcslock.h rte_spinlock.h rte_rwlock.h rte_ticketlock.h
 GENERIC_INC += rte_vect.h rte_pause.h rte_io.h
 # defined in mk/arch/$(RTE_ARCH)/rte.vars.mk
diff --git a/lib/librte_eal/common/include/generic/rte_mcslock.h b/lib/librte_eal/common/include/generic/rte_mcslock.h
new file mode 100644
index 0000000..20e9bb8
--- /dev/null
+++ b/lib/librte_eal/common/include/generic/rte_mcslock.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Arm Limited
+ */
+
+#ifndef _RTE_MCSLOCK_H_
+#define _RTE_MCSLOCK_H_
+
+/**
+ * @file
+ *
+ * RTE MCS lock
+ *
+ * This file defines the main data structure and APIs for the MCS queued lock.
+ *
+ * The MCS lock (proposed by John M. Mellor-Crummey and Michael L. Scott)
+ * provides scalability by spinning on a CPU/thread-local variable which
+ * avoids expensive cache bouncing. It provides fairness by maintaining
+ * a list of acquirers and passing the lock to each CPU/thread in the
+ * order they acquired the lock.
+ */
+
+#include <rte_lcore.h>
+#include <rte_common.h>
+#include <rte_pause.h>
+
+/**
+ * The rte_mcslock_t type.
+ */
+typedef struct rte_mcslock {
+	struct rte_mcslock *next;
+	int locked; /* 1 if the queue is locked, 0 otherwise */
+} rte_mcslock_t;
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change without prior notice
+ *
+ * Take the MCS lock.
+ *
+ * @param msl
+ *   A pointer to the pointer of an MCS lock.
+ *   When the lock is initialized or declared, the msl pointer should be
+ *   set to NULL.
+ * @param me
+ *   A pointer to a new node of the MCS lock. Each CPU/thread acquiring the
+ *   lock should use its own node.
+ */
+static inline __rte_experimental void
+rte_mcslock_lock(rte_mcslock_t **msl, rte_mcslock_t *me)
+{
+	rte_mcslock_t *prev;
+
+	/* Init me node */
+	__atomic_store_n(&me->next, NULL, __ATOMIC_RELAXED);
+
+	/* If the queue is empty, the exchange operation is enough to acquire
+	 * the lock. Hence, the exchange operation requires acquire semantics.
+	 * The store to me->next above should complete before the node is
+	 * visible to other CPUs/threads. Hence, the exchange operation
+	 * requires release semantics as well.
+	 */
+	prev = __atomic_exchange_n(msl, me, __ATOMIC_ACQ_REL);
+	if (likely(prev == NULL)) {
+		/* Queue was empty, no further action required,
+		 * proceed with the lock taken.
+		 */
+		return;
+	}
+	__atomic_store_n(&me->locked, 1, __ATOMIC_RELAXED);
+	__atomic_store_n(&prev->next, me, __ATOMIC_RELAXED);
+
+	/* The while-load of me->locked should not move above the previous
+	 * store to prev->next. Otherwise it will cause a deadlock. Need a
+	 * store-load barrier.
+	 */
+	__atomic_thread_fence(__ATOMIC_ACQ_REL);
+	/* If the lock has already been acquired, the node is first atomically
+	 * placed at the end of the queue and then spins on me->locked until
+	 * the previous lock holder resets me->locked in rte_mcslock_unlock().
+	 */
+	while (__atomic_load_n(&me->locked, __ATOMIC_ACQUIRE))
+		rte_pause();
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change without prior notice
+ *
+ * Release the MCS lock.
+ *
+ * @param msl
+ *   A pointer to the pointer of an MCS lock.
+ * @param me
+ *   A pointer to the node of the MCS lock passed in rte_mcslock_lock().
+ */
+static inline __rte_experimental void
+rte_mcslock_unlock(rte_mcslock_t **msl, rte_mcslock_t *me)
+{
+	/* Check if there are more nodes in the queue. */
+	if (likely(__atomic_load_n(&me->next, __ATOMIC_RELAXED) == NULL)) {
+		/* No, this is the last member in the queue. */
+		rte_mcslock_t *save_me = __atomic_load_n(&me, __ATOMIC_RELAXED);
+
+		/* Release the lock by setting it to NULL. */
+		if (likely(__atomic_compare_exchange_n(msl, &save_me, NULL, 0,
+				__ATOMIC_RELEASE, __ATOMIC_RELAXED)))
+			return;
+		/* More nodes were added to the queue by other CPUs.
+		 * Wait until the next pointer is set.
+		 */
+		while (__atomic_load_n(&me->next, __ATOMIC_RELAXED) == NULL)
+			rte_pause();
+	}
+
+	/* Pass the lock to the next waiter. */
+	__atomic_store_n(&me->next->locked, 0, __ATOMIC_RELEASE);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change without prior notice
+ *
+ * Try to take the lock.
+ *
+ * @param msl
+ *   A pointer to the pointer of an MCS lock.
+ * @param me
+ *   A pointer to a new node of the MCS lock.
+ * @return
+ *   1 if the lock is successfully taken; 0 otherwise.
+ */
+static inline __rte_experimental int
+rte_mcslock_trylock(rte_mcslock_t **msl, rte_mcslock_t *me)
+{
+	/* Init me node */
+	__atomic_store_n(&me->next, NULL, __ATOMIC_RELAXED);
+
+	/* Try to lock */
+	rte_mcslock_t *expected = NULL;
+
+	/* The lock can be taken only when the queue is empty. Hence,
+	 * the compare-exchange operation requires acquire semantics.
+	 * The store to me->next above should complete before the node
+	 * is visible to other CPUs/threads. Hence, the compare-exchange
+	 * operation requires release semantics as well.
+	 */
+	return __atomic_compare_exchange_n(msl, &expected, me, 0,
+			__ATOMIC_ACQ_REL, __ATOMIC_RELAXED);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change without prior notice
+ *
+ * Test if the lock is taken.
+ *
+ * @param msl
+ *   A pointer to an MCS lock (the current queue tail).
+ * @return
+ *   1 if the lock is currently taken; 0 otherwise.
+ */
+static inline __rte_experimental int
+rte_mcslock_is_locked(rte_mcslock_t *msl)
+{
+	return (__atomic_load_n(&msl, __ATOMIC_RELAXED) != NULL);
+}
+
+#endif /* _RTE_MCSLOCK_H_ */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 0670e41..f74db29 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -94,6 +94,7 @@ generic_headers = files(
 	'include/generic/rte_cpuflags.h',
 	'include/generic/rte_cycles.h',
 	'include/generic/rte_io.h',
+	'include/generic/rte_mcslock.h',
 	'include/generic/rte_memcpy.h',
 	'include/generic/rte_pause.h',
 	'include/generic/rte_prefetch.h',

From patchwork Wed Jun 5 15:58:47 2019
From: Phil Yang
To: dev@dpdk.org
Cc: thomas@monjalon.net, jerinj@marvell.com, hemant.agrawal@nxp.com,
 Honnappa.Nagarahalli@arm.com, gavin.hu@arm.com, phil.yang@arm.com, nd@arm.com
Date: Wed, 5 Jun 2019 23:58:47 +0800
Message-Id: <1559750328-22377-3-git-send-email-phil.yang@arm.com>
In-Reply-To: <1559750328-22377-1-git-send-email-phil.yang@arm.com>
References: <1559750328-22377-1-git-send-email-phil.yang@arm.com>
Subject: [dpdk-dev] [PATCH v1 2/3] eal/mcslock: use generic mcs queued lock
 on all arch

Let all architectures use the generic MCS queued lock implementation.
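The per-architecture headers below all share one shape; as a sketch of
that pattern (the guard macro _RTE_MCSLOCK_MYARCH_H_ is a placeholder,
each architecture header defines its own):

	/* Sketch: every arch header re-exports the generic implementation. */
	#ifndef _RTE_MCSLOCK_MYARCH_H_
	#define _RTE_MCSLOCK_MYARCH_H_

	#ifdef __cplusplus
	extern "C" {
	#endif

	#include "generic/rte_mcslock.h"

	#ifdef __cplusplus
	}
	#endif

	#endif /* _RTE_MCSLOCK_MYARCH_H_ */

The Arm header additionally errors out unless RTE_FORCE_INTRINSICS is
set, since the generic code relies on the __atomic builtins.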
Signed-off-by: Phil Yang
Reviewed-by: Gavin Hu
Reviewed-by: Honnappa Nagarahalli
---
 .../common/include/arch/arm/rte_mcslock.h          | 23 ++++++++++++++++++++++
 .../common/include/arch/ppc_64/rte_mcslock.h       | 19 ++++++++++++++++++
 .../common/include/arch/x86/rte_mcslock.h          | 19 ++++++++++++++++++
 3 files changed, 61 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_mcslock.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_mcslock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_mcslock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_mcslock.h b/lib/librte_eal/common/include/arch/arm/rte_mcslock.h
new file mode 100644
index 0000000..5e41e32
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_mcslock.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Arm Limited
+ */
+
+#ifndef _RTE_MCSLOCK_ARM_H_
+#define _RTE_MCSLOCK_ARM_H_
+
+#ifndef RTE_FORCE_INTRINSICS
+#  error Platform must be built with CONFIG_RTE_FORCE_INTRINSICS
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_mcslock.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MCSLOCK_ARM_H_ */
diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_mcslock.h b/lib/librte_eal/common/include/arch/ppc_64/rte_mcslock.h
new file mode 100644
index 0000000..951b6dd
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_mcslock.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Arm Limited
+ */
+
+#ifndef _RTE_MCSLOCK_PPC_64_H_
+#define _RTE_MCSLOCK_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_mcslock.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MCSLOCK_PPC_64_H_ */
diff --git a/lib/librte_eal/common/include/arch/x86/rte_mcslock.h b/lib/librte_eal/common/include/arch/x86/rte_mcslock.h
new file mode 100644
index 0000000..573b700
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/x86/rte_mcslock.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Arm Limited
+ */
+
+#ifndef _RTE_MCSLOCK_X86_64_H_
+#define _RTE_MCSLOCK_X86_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_mcslock.h"
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MCSLOCK_X86_64_H_ */

From patchwork Wed Jun 5 15:58:48 2019
From: Phil Yang
To: dev@dpdk.org
Cc: thomas@monjalon.net, jerinj@marvell.com, hemant.agrawal@nxp.com,
 Honnappa.Nagarahalli@arm.com, gavin.hu@arm.com, phil.yang@arm.com, nd@arm.com
Date: Wed, 5 Jun 2019 23:58:48 +0800
Message-Id: <1559750328-22377-4-git-send-email-phil.yang@arm.com>
In-Reply-To: <1559750328-22377-1-git-send-email-phil.yang@arm.com>
References: <1559750328-22377-1-git-send-email-phil.yang@arm.com>
Subject: [dpdk-dev] [PATCH v1 3/3] test/mcslock: add mcs queued lock unit test

Unit test and perf test for the MCS queued lock.

Signed-off-by: Phil Yang
Reviewed-by: Gavin Hu
Reviewed-by: Honnappa Nagarahalli
---
 MAINTAINERS                     |   1 +
 app/test/Makefile               |   1 +
 app/test/autotest_data.py       |   6 +
 app/test/autotest_test_funcs.py |  32 ++++++
 app/test/meson.build            |   2 +
 app/test/test_mcslock.c         | 248 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 290 insertions(+)
 create mode 100644 app/test/test_mcslock.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1390238..33fdc8f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -225,6 +225,7 @@ F: app/test/test_ticketlock.c
 MCSlock - EXPERIMENTAL
 M: Phil Yang
 F: lib/librte_eal/common/include/generic/rte_mcslock.h
+F: app/test/test_mcslock.c
 
 ARM v7
 M: Jan Viktorin
diff --git a/app/test/Makefile b/app/test/Makefile
index 68d6b4f..be405cd 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -64,6 +64,7 @@ SRCS-y += test_atomic.c
 SRCS-y += test_barrier.c
 SRCS-y += test_malloc.c
 SRCS-y += test_cycles.c
+SRCS-y += test_mcslock.c
 SRCS-y += test_spinlock.c
 SRCS-y += test_ticketlock.c
 SRCS-y += test_memory.c
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 0f2c9a7..68ca23d 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -177,6 +177,12 @@
         "Report":  None,
     },
     {
+        "Name":    "MCSlock autotest",
+        "Command": "mcslock_autotest",
+        "Func":    mcslock_autotest,
+        "Report":  None,
+    },
+    {
         "Name":    "Byte order autotest",
         "Command": "byteorder_autotest",
         "Func":    default_autotest,
diff --git a/app/test/autotest_test_funcs.py b/app/test/autotest_test_funcs.py
index 31cc0f5..26688b7 100644
--- a/app/test/autotest_test_funcs.py
+++ b/app/test/autotest_test_funcs.py
@@ -164,6 +164,38 @@ def ticketlock_autotest(child, test_name):
 
     return 0, "Success"
 
+def mcslock_autotest(child, test_name):
+    i = 0
+    ir = 0
+    child.sendline(test_name)
+    while True:
+        index = child.expect(["Test OK",
+                              "Test Failed",
+                              "lcore ([0-9]*) state: ([0-1])",
+                              "MCS lock taken on core ([0-9]*)",
+                              "MCS lock released on core ([0-9]*)",
+                              pexpect.TIMEOUT], timeout=5)
+        # ok
+        if index == 0:
+            break
+
+        # message, check ordering
+        elif index == 3:
+            if int(child.match.groups()[0]) < i:
+                return -1, "Fail [Bad order]"
+            i = int(child.match.groups()[0])
+        elif index == 4:
+            if int(child.match.groups()[0]) < ir:
+                return -1, "Fail [Bad order]"
+            ir = int(child.match.groups()[0])
+
+        # fail
+        elif index == 5:
+            return -1, "Fail [Timeout]"
+        elif index == 1:
+            return -1, "Fail"
+
+    return 0, "Success"
+
 def logs_autotest(child, test_name):
     child.sendline(test_name)
diff --git a/app/test/meson.build b/app/test/meson.build
index 83391ce..3f5f17a 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -75,6 +75,7 @@ test_sources = files('commands.c',
 	'test_memzone.c',
 	'test_meter.c',
 	'test_metrics.c',
+	'test_mcslock.c',
 	'test_mp_secondary.c',
 	'test_pdump.c',
 	'test_per_lcore.c',
@@ -167,6 +168,7 @@ fast_parallel_test_names = [
         'lpm6_autotest',
         'malloc_autotest',
        'mbuf_autotest',
+        'mcslock_autotest',
        'memcpy_autotest',
        'memory_autotest',
        'mempool_autotest',
diff --git a/app/test/test_mcslock.c b/app/test/test_mcslock.c
new file mode 100644
index 0000000..a2274e5
--- /dev/null
+++ b/app/test/test_mcslock.c
@@ -0,0 +1,248 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Arm Limited
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_memory.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_eal.h>
+#include <rte_lcore.h>
+#include <rte_cycles.h>
+#include <rte_pause.h>
+
+#include "test.h"
+
+/*
+ * RTE MCS lock test
+ * =================
+ *
+ * These tests are derived from the spinlock test cases.
+ *
+ * - The functional test takes all of these locks and launches the
+ *   ``test_mcslock_per_core()`` function on each core (except the master).
+ *
+ *   - The function takes the global lock, displays something, then releases
+ *     the global lock on each core.
+ *
+ * - A load test is carried out, with all cores attempting to lock a single
+ *   lock multiple times.
+ */
+#include <rte_mcslock.h>
+
+RTE_DEFINE_PER_LCORE(rte_mcslock_t, _ml_me);
+RTE_DEFINE_PER_LCORE(rte_mcslock_t, _ml_try_me);
+RTE_DEFINE_PER_LCORE(rte_mcslock_t, _ml_perf_me);
+
+rte_mcslock_t *p_ml;
+rte_mcslock_t *p_ml_try;
+rte_mcslock_t *p_ml_perf;
+
+static unsigned int count;
+
+static rte_atomic32_t synchro;
+
+static int
+test_mcslock_per_core(__attribute__((unused)) void *arg)
+{
+	/* Per core me node. */
+	rte_mcslock_t ml_me = RTE_PER_LCORE(_ml_me);
+
+	rte_mcslock_lock(&p_ml, &ml_me);
+	printf("MCS lock taken on core %u\n", rte_lcore_id());
+	rte_mcslock_unlock(&p_ml, &ml_me);
+	printf("MCS lock released on core %u\n", rte_lcore_id());
+
+	return 0;
+}
+
+static uint64_t time_count[RTE_MAX_LCORE] = {0};
+
+#define MAX_LOOP 10000
+
+static int
+load_loop_fn(void *func_param)
+{
+	uint64_t time_diff = 0, begin;
+	uint64_t hz = rte_get_timer_hz();
+	volatile uint64_t lcount = 0;
+	const int use_lock = *(int *)func_param;
+	const unsigned int lcore = rte_lcore_id();
+
+	/* Per core me node. */
+	rte_mcslock_t ml_perf_me = RTE_PER_LCORE(_ml_perf_me);
+
+	/* Wait for synchro. */
+	while (rte_atomic32_read(&synchro) == 0)
+		;
+
+	begin = rte_get_timer_cycles();
+	while (lcount < MAX_LOOP) {
+		if (use_lock)
+			rte_mcslock_lock(&p_ml_perf, &ml_perf_me);
+
+		lcount++;
+		if (use_lock)
+			rte_mcslock_unlock(&p_ml_perf, &ml_perf_me);
+	}
+	time_diff = rte_get_timer_cycles() - begin;
+	time_count[lcore] = time_diff * 1000000 / hz;
+	return 0;
+}
+
+static int
+test_mcslock_perf(void)
+{
+	unsigned int i;
+	uint64_t total = 0;
+	int lock = 0;
+	const unsigned int lcore = rte_lcore_id();
+
+	printf("\nTest with no lock on single core...\n");
+	rte_atomic32_set(&synchro, 1);
+	load_loop_fn(&lock);
+	printf("Core [%u] Cost Time = %"PRIu64" us\n",
+	       lcore, time_count[lcore]);
+	memset(time_count, 0, sizeof(time_count));
+
+	printf("\nTest with lock on single core...\n");
+	lock = 1;
+	rte_atomic32_set(&synchro, 1);
+	load_loop_fn(&lock);
+	printf("Core [%u] Cost Time = %"PRIu64" us\n",
+	       lcore, time_count[lcore]);
+	memset(time_count, 0, sizeof(time_count));
+
+	printf("\nTest with lock on %u cores...\n", (rte_lcore_count() - 1));
+
+	rte_atomic32_set(&synchro, 0);
+	rte_eal_mp_remote_launch(load_loop_fn, &lock, SKIP_MASTER);
+	rte_atomic32_set(&synchro, 1);
+
+	rte_eal_mp_wait_lcore();
+
+	RTE_LCORE_FOREACH_SLAVE(i) {
+		printf("Core [%u] Cost Time = %"PRIu64" us\n",
+		       i, time_count[i]);
+		total += time_count[i];
+	}
+
+	printf("Total Cost Time = %"PRIu64" us\n", total);
+
+	return 0;
+}
+
+/*
+ * Use rte_mcslock_trylock() to try to take an MCS lock object.
+ * If the object cannot be locked, return immediately.
+ */
+static int
+test_mcslock_try(__attribute__((unused)) void *arg)
+{
+	/* Per core me node. */
+	rte_mcslock_t ml_me = RTE_PER_LCORE(_ml_me);
+	rte_mcslock_t ml_try_me = RTE_PER_LCORE(_ml_try_me);
+
+	/* The master lcore has locked ml_try, so trylock should fail
+	 * on the slave lcores.
+	 */
+	if (rte_mcslock_trylock(&p_ml_try, &ml_try_me) == 0) {
+		rte_mcslock_lock(&p_ml, &ml_me);
+		count++;
+		rte_mcslock_unlock(&p_ml, &ml_me);
+	}
+
+	return 0;
+}
+
+
+/*
+ * Test rte_eal_get_lcore_state() in addition to MCS locks,
+ * as we have "waiting" then "running" lcores.
+ */
+static int
+test_mcslock(void)
+{
+	int ret = 0;
+	int i;
+
+	/* Define per core me node. */
+	rte_mcslock_t ml_me = RTE_PER_LCORE(_ml_me);
+	rte_mcslock_t ml_try_me = RTE_PER_LCORE(_ml_try_me);
+
+	/*
+	 * Test MCS lock & unlock on each core.
+	 */
+
+	/* Slave cores should be waiting: print it. */
+	RTE_LCORE_FOREACH_SLAVE(i) {
+		printf("lcore %d state: %d\n", i,
+		       (int) rte_eal_get_lcore_state(i));
+	}
+
+	rte_mcslock_lock(&p_ml, &ml_me);
+
+	RTE_LCORE_FOREACH_SLAVE(i) {
+		rte_eal_remote_launch(test_mcslock_per_core, NULL, i);
+	}
+
+	/* Slave cores should be busy: print it. */
+	RTE_LCORE_FOREACH_SLAVE(i) {
+		printf("lcore %d state: %d\n", i,
+		       (int) rte_eal_get_lcore_state(i));
+	}
+
+	rte_mcslock_unlock(&p_ml, &ml_me);
+
+	rte_eal_mp_wait_lcore();
+
+	/*
+	 * Test whether trylock returns immediately on a locked object.
+	 * Here the master locks the MCS lock object first, then launches
+	 * all the slave lcores to trylock the same object.
+	 * All the slave lcores should give up try-locking the locked object
+	 * and return immediately, each incrementing the "count" (initialized
+	 * to zero) by one.
+	 * We can then check whether "count" equals the number of slave
+	 * lcores to verify the behavior of try-locking a locked
+	 * mcslock object.
+	 */
+	if (rte_mcslock_trylock(&p_ml_try, &ml_try_me) == 0)
+		return -1;
+
+	count = 0;
+	RTE_LCORE_FOREACH_SLAVE(i) {
+		rte_eal_remote_launch(test_mcslock_try, NULL, i);
+	}
+	rte_eal_mp_wait_lcore();
+	rte_mcslock_unlock(&p_ml_try, &ml_try_me);
+
+	/* Test the is_locked API. */
+	if (rte_mcslock_is_locked(p_ml)) {
+		printf("mcslock is locked but it should not be\n");
+		return -1;
+	}
+
+	/* Check how many times the slave cores took the trylock path. */
+	rte_mcslock_lock(&p_ml, &ml_me);
+	if (count != (rte_lcore_count() - 1))
+		ret = -1;
+	rte_mcslock_unlock(&p_ml, &ml_me);
+
+	/* MCS lock perf test. */
+	if (test_mcslock_perf() < 0)
+		return -1;
+
+	return ret;
+}
+
+REGISTER_TEST_COMMAND(mcslock_autotest, test_mcslock);
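For completeness, a hedged single-threaded sketch of the
trylock/is_locked pair exercised above (the lock variable and function
names are illustrative; the rte_mcslock_* calls match this series):

	#include <rte_mcslock.h>

	rte_mcslock_t *try_lock; /* shared lock pointer, starts out NULL */

	static int
	try_critical_work(void)
	{
		rte_mcslock_t my_node; /* caller-supplied queue node */

		if (rte_mcslock_trylock(&try_lock, &my_node) == 0)
			return -1; /* lock was already held; gave up immediately */

		/* rte_mcslock_is_locked(try_lock) returns 1 here. */
		rte_mcslock_unlock(&try_lock, &my_node);
		return 0;
	}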