
eal: fix memory initialization deadlock

Message ID	20230830103303.2428995-1-artemyko@nvidia.com (mailing list archive)
State	Superseded, archived
Delegated to:	Thomas Monjalon
Headers	Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.117.161 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.117.161; helo=mail.nvidia.com; pr=C
From: Artemy Kovalyov <artemyko@nvidia.com>
To: <dev@dpdk.org>
CC: Thomas Monjalon <thomas@monjalon.net>, Ophir Munk <ophirmu@nvidia.com>, <stable@dpdk.org>, Anatoly Burakov <anatoly.burakov@intel.com>, "Stephen Hemminger" <stephen@networkplumber.org>, Morten Brørup <mb@smartsharesystems.com>
Subject: [PATCH] eal: fix memory initialization deadlock
Date: Wed, 30 Aug 2023 13:33:03 +0300
Message-ID: <20230830103303.2428995-1-artemyko@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
Precedence: list
Errors-To: dev-bounces@dpdk.org
Series	eal: fix memory initialization deadlock

Checks

Context	Check	Description
ci/checkpatch	success	coding style OK
ci/loongarch-compilation	success	Compilation OK
ci/loongarch-unit-testing	success	Unit Testing PASS
ci/iol-mellanox-Performance	success	Performance Testing PASS
ci/iol-compile-amd64-testing	success	Testing PASS
ci/github-robot: build	success	github build: passed
ci/iol-unit-arm64-testing	success	Testing PASS
ci/iol-sample-apps-testing	success	Testing PASS
ci/iol-unit-amd64-testing	success	Testing PASS
ci/iol-compile-arm64-testing	success	Testing PASS
ci/iol-intel-Functional	success	Functional Testing PASS
ci/iol-intel-Performance	success	Performance Testing PASS
ci/iol-broadcom-Performance	success	Performance Testing PASS
ci/iol-broadcom-Functional	success	Functional Testing PASS
ci/Intel-compilation	success	Compilation OK
ci/intel-Testing	success	Testing PASS
ci/intel-Functional	success	Functional PASS

Commit Message

Artemy Kovalyov Aug. 30, 2023, 10:33 a.m. UTC

The deadlock arose from a change in the DPDK read-write lock
implementation. That change added a new flag, RTE_RWLOCK_WAIT, designed
to prevent new read locks while a write lock is in the queue. As a
result, a recursive read lock, where the same execution thread acquires
a lock twice, can now trigger a sequence of events that ends in a
deadlock:

1. Process 1 takes the first read lock.
2. Process 2 attempts to take a write lock, triggering RTE_RWLOCK_WAIT
   due to the presence of a read lock. This makes process 2 enter a wait
   loop until the read lock is released.
3. Process 1 tries to take a second read lock. However, since a write
   lock is waiting (due to RTE_RWLOCK_WAIT), it also enters a wait loop
   until the write lock is acquired and then released.

Both processes end up in a blocked state, unable to proceed, resulting
in a deadlock scenario.
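
The sequence above can be replayed with a small single-threaded model. This
is only a sketch under stated assumptions: the toy_* names and flag values
are hypothetical, the model uses plain integers instead of atomics, and each
"trylock" stands for one iteration of the wait loop a real lock would spin
in. It is not the rte_rwlock implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout for this toy model (not the DPDK definitions). */
#define TOY_WAIT  0x1u                  /* a writer is queued */
#define TOY_WRITE 0x2u                  /* a writer holds the lock */
#define TOY_MASK  (TOY_WAIT | TOY_WRITE)
#define TOY_READ  0x4u                  /* readers are counted in steps of 4 */

struct toy_rwlock {
	uint32_t cnt;
};

/* One attempt at a read lock: refused while a writer holds or waits. */
bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->cnt & TOY_MASK)
		return false;
	l->cnt += TOY_READ;
	return true;
}

void toy_read_unlock(struct toy_rwlock *l)
{
	l->cnt -= TOY_READ;
}

/* One attempt at a write lock: if the lock is busy, mark a writer as waiting. */
bool toy_write_trylock(struct toy_rwlock *l)
{
	if ((l->cnt & ~TOY_WAIT) == 0) {    /* no readers and no writer */
		l->cnt = TOY_WRITE;
		return true;
	}
	l->cnt |= TOY_WAIT;
	return false;
}

/* Replays the three steps; returns true when both sides are stuck. */
bool toy_recursive_read_deadlocks(void)
{
	struct toy_rwlock l = { 0 };

	bool first_read = toy_read_trylock(&l);   /* step 1: succeeds */
	bool write = toy_write_trylock(&l);       /* step 2: refused, sets WAIT */
	bool second_read = toy_read_trylock(&l);  /* step 3: refused, WAIT is set */

	return first_read && !write && !second_read;
}

/* Without the nested read, the reader releases and the writer gets in. */
bool toy_single_read_lets_writer_in(void)
{
	struct toy_rwlock l = { 0 };

	bool got_read = toy_read_trylock(&l);
	bool blocked = !toy_write_trylock(&l);    /* writer queues behind reader */
	toy_read_unlock(&l);

	return got_read && blocked && toy_write_trylock(&l);
}
```

In the model, the second read attempt is refused for as long as the WAIT bit
stays set, and the WAIT bit stays set for as long as the first read is held,
which is the circular wait described above.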

Following this change, the RW-lock no longer supports recursion: a
thread must not take a read lock it already holds. The problem arises
during initialization: the rte_eal_init() function acquires the
memory_hotplug_lock, and later on, the sequence of calls
rte_eal_memory_init() -> eal_memalloc_init() -> rte_memseg_list_walk()
acquires it again without releasing it. This creates a risk of deadlock
when a write lock is requested concurrently on the same
memory_hotplug_lock. To address this, replace rte_memseg_list_walk()
with rte_memseg_list_walk_thread_unsafe() in eal_memalloc_init().
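
The locked/unlocked split behind this fix can be illustrated with a toy
model. This is a sketch only: every demo_* name is hypothetical, the "lock"
is a plain counter that records nested acquisitions instead of blocking, and
the walk semantics are simplified compared to the real memseg list walk.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a read lock; it counts nested acquisitions. */
struct demo_lock {
	int read_holds;
	int nested_acquires;
};

void demo_read_lock(struct demo_lock *l)
{
	if (l->read_holds > 0)
		l->nested_acquires++;   /* recursive read: deadlock risk */
	l->read_holds++;
}

void demo_read_unlock(struct demo_lock *l)
{
	l->read_holds--;
}

typedef int (*demo_walk_fn)(int item, void *arg);

/* Unlocked walk: the caller must already hold the lock. */
int demo_walk_thread_unsafe(const int *items, int n, demo_walk_fn fn, void *arg)
{
	for (int i = 0; i < n; i++) {
		int ret = fn(items[i], arg);
		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Locked walk: takes the read lock, then delegates to the unlocked walk. */
int demo_walk(struct demo_lock *l, const int *items, int n,
	      demo_walk_fn fn, void *arg)
{
	demo_read_lock(l);
	int ret = demo_walk_thread_unsafe(items, n, fn, arg);
	demo_read_unlock(l);
	return ret;
}

/* Example callback: accumulates the visited items into *arg. */
int demo_sum_cb(int item, void *arg)
{
	*(int *)arg += item;
	return 0;
}

/*
 * Init path that holds the lock for its whole duration. Returns how many
 * times the lock was taken recursively while walking.
 */
int demo_init(bool use_unsafe_walk, int *sum)
{
	struct demo_lock l = { 0, 0 };
	const int items[] = { 1, 2, 3 };

	demo_read_lock(&l);
	if (use_unsafe_walk)
		demo_walk_thread_unsafe(items, 3, demo_sum_cb, sum);
	else
		demo_walk(&l, items, 3, demo_sum_cb, sum);
	demo_read_unlock(&l);

	return l.nested_acquires;
}
```

Both variants visit the same items; only the locked walk re-acquires a lock
that the init path already holds. The unlocked variant is safe here solely
because the caller holds the lock for the whole walk.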

Bugzilla ID: 1277
Fixes: 832cecc03d77 ("rwlock: prevent readers from starving writers")
Cc: stable@dpdk.org

Signed-off-by: Artemy Kovalyov <artemyko@nvidia.com>
---
 lib/eal/include/generic/rte_rwlock.h | 4 ++++
 lib/eal/linux/eal_memalloc.c         | 7 +++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

Comments

Dmitry Kozlyuk Aug. 30, 2023, 7:13 p.m. UTC | #1

2023-08-30 13:33 (UTC+0300), Artemy Kovalyov:
> Following these changes, the RW-lock no longer supports
> recursion, implying that a single thread shouldn't obtain a read lock if
> it already possesses one. The problem arises during initialization: the
> rte_eal_init() function acquires the memory_hotplug_lock, and later on,
> the sequence of calls rte_eal_memory_init() -> eal_memalloc_init() ->
> rte_memseg_list_walk() acquires it again without releasing it. This
> scenario introduces the risk of a potential deadlock when concurrent
> write locks are applied to the same memory_hotplug_lock. To address this
> we resolved the issue by replacing rte_memseg_list_walk() with
> rte_memseg_list_walk_thread_unsafe().

There is another call to rte_memseg_list_walk() during initialization:
from eal_dynmem_hugepage_init(), please address it too.


Patch

diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h
index 9e083bbc61..c98fc7d083 100644
--- a/lib/eal/include/generic/rte_rwlock.h
+++ b/lib/eal/include/generic/rte_rwlock.h
@@ -80,6 +80,10 @@  rte_rwlock_init(rte_rwlock_t *rwl)
 /**
  * Take a read lock. Loop until the lock is held.
  *
+ * @note The RW lock isn't recursive, so calling this function on the same
+ * lock twice without releasing it could potentially result in a deadlock
+ * scenario when a write lock is involved.
+ *
  * @param rwl
  *   A pointer to a rwlock structure.
  */
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index f8b1588cae..3705b41f5f 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -1740,7 +1740,10 @@  eal_memalloc_init(void)
 		eal_get_internal_configuration();
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		if (rte_memseg_list_walk(secondary_msl_create_walk, NULL) < 0)
+		/*  memory_hotplug_lock is taken in rte_eal_init(), so it's
+		 *  safe to call thread-unsafe version.
+		 */
+		if (rte_memseg_list_walk_thread_unsafe(secondary_msl_create_walk, NULL) < 0)
 			return -1;
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
 			internal_conf->in_memory) {
@@ -1778,7 +1781,7 @@  eal_memalloc_init(void)
 	}
 
 	/* initialize all of the fd lists */
-	if (rte_memseg_list_walk(fd_list_create_walk, NULL))
+	if (rte_memseg_list_walk_thread_unsafe(fd_list_create_walk, NULL))
 		return -1;
 	return 0;
 }