[2/2] common/mlx5: fix multi-process mempool registration

Message ID 20220808094236.3395516-3-dkozlyuk@nvidia.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Series common/mlx5: fix multi-process mempool registration

Checks

Context                           Check    Description
ci/intel-Testing                  success  Testing PASS
ci/github-robot: build            success  github build: passed
ci/iol-x86_64-unit-testing        fail     Testing issues
ci/iol-aarch64-compile-testing    success  Testing PASS
ci/iol-intel-Functional           fail     Functional Testing issues
ci/iol-intel-Performance          success  Performance Testing PASS
ci/iol-aarch64-unit-testing       success  Testing PASS
ci/iol-mellanox-Performance       success  Performance Testing PASS
ci/Intel-compilation              success  Compilation OK
ci/checkpatch                     warning  coding style issues

Commit Message

Dmitry Kozlyuk Aug. 8, 2022, 9:42 a.m. UTC
The `mp_cb_registered` flag, shared between all processes,
was used to ensure that for any IB device (MLX5 common device)
the mempool event callback was registered only once
and that mempools existing before the device start
were traversed only once to register them.
Since mempool callback registrations have become process-private,
every process must register the callback itself,
so a shared flag can no longer reflect the state of any single process.
Replace it with a registration counter that tracks
when no callbacks remain registered for the device in any process.
It is sufficient to register pre-existing mempools
only in the primary process because it is the one that starts the device.

Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
---
 drivers/common/mlx5/mlx5_common.c    | 15 +++++++++------
 drivers/common/mlx5/mlx5_common_mr.c |  2 +-
 drivers/common/mlx5/mlx5_common_mr.h |  2 +-
 3 files changed, 11 insertions(+), 8 deletions(-)
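
The counting scheme described in the commit message can be sketched in
isolation as follows. This is a simplified illustration, not the driver code:
`struct shared_cache`, `struct proc_ctx`, `subscribe()`, `unsubscribe()`, and
the `walks` field are hypothetical names, `is_primary` stands in for the
`rte_eal_process_type() == RTE_PROC_PRIMARY` check, and the real functions
additionally take `mprwlock` and call the mempool library. The sketch assumes
a compiler that provides the GCC/Clang `__atomic` builtins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state shared between all processes. */
struct shared_cache {
	uint32_t cb_reg_n; /* Number of processes that registered a callback. */
};

/* Hypothetical per-process view of one device. */
struct proc_ctx {
	struct shared_cache *shared;
	bool is_primary;    /* Assumed result of the process type check. */
	unsigned int walks; /* Times pre-existing mempools were traversed. */
};

/* Every process subscribes; only the primary walks existing mempools. */
static void
subscribe(struct proc_ctx *p)
{
	__atomic_add_fetch(&p->shared->cb_reg_n, 1, __ATOMIC_ACQUIRE);
	if (p->is_primary)
		p->walks++; /* Placeholder for the one-time mempool walk. */
}

/* Return true if the caller was the last registrant and must clean up. */
static bool
unsubscribe(struct proc_ctx *p)
{
	uint32_t left;

	/* An unsubscribe without a matching subscribe would underflow. */
	assert(__atomic_load_n(&p->shared->cb_reg_n, __ATOMIC_RELAXED) > 0);
	left = __atomic_sub_fetch(&p->shared->cb_reg_n, 1, __ATOMIC_RELEASE);
	return left == 0;
}
```

With one primary and one secondary process subscribed, the counter reads 2 and
only the primary has walked the mempools. Whichever process unsubscribes last
sees the counter reach zero and performs the teardown, which a single shared
flag could not express.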

Comments

Slava Ovsiienko Aug. 28, 2022, 6:34 p.m. UTC | #1
> -----Original Message-----
> From: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Sent: Monday, August 8, 2022 12:43
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; stable@dpdk.org
> Subject: [PATCH 2/2] common/mlx5: fix multi-process mempool registration
> 
> The `mp_cb_registered` flag shared between all processes was used to ensure
> that for any IB device (MLX5 common device) mempool event callback was
> registered only once and mempools that had been existing before the device
> start were traversed only once to register them.
> Since mempool callback registrations have become process-private, callback
> registration must be done by every process.
> The flag can no longer reflect the state for any single process.
> Replace it with a registration counter to track when no more callbacks are
> registered for the device in any process.
> It is sufficient to only register pre-existing mempools in the primary
> process because it is the one that starts the device.
> 
> Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Patch

diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c
index 89fef2b535..4dcc8cc49c 100644
--- a/drivers/common/mlx5/mlx5_common.c
+++ b/drivers/common/mlx5/mlx5_common.c
@@ -583,18 +583,17 @@  mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
 	if (!cdev->config.mr_mempool_reg_en)
 		return 0;
 	rte_rwlock_write_lock(&cdev->mr_scache.mprwlock);
-	if (cdev->mr_scache.mp_cb_registered)
-		goto exit;
 	/* Callback for this device may be already registered. */
 	ret = rte_mempool_event_callback_register(mlx5_dev_mempool_event_cb,
 						  cdev);
 	if (ret != 0 && rte_errno != EEXIST)
 		goto exit;
+	__atomic_add_fetch(&cdev->mr_scache.mempool_cb_reg_n, 1,
+			   __ATOMIC_ACQUIRE);
 	/* Register mempools only once for this device. */
-	if (ret == 0)
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		rte_mempool_walk(mlx5_dev_mempool_register_cb, cdev);
 	ret = 0;
-	cdev->mr_scache.mp_cb_registered = 1;
 exit:
 	rte_rwlock_write_unlock(&cdev->mr_scache.mprwlock);
 	return ret;
@@ -603,10 +602,14 @@  mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
 static void
 mlx5_dev_mempool_unsubscribe(struct mlx5_common_device *cdev)
 {
+	uint32_t mempool_cb_reg_n;
 	int ret;
 
-	if (!cdev->mr_scache.mp_cb_registered ||
-	    !cdev->config.mr_mempool_reg_en)
+	if (!cdev->config.mr_mempool_reg_en)
+		return;
+	mempool_cb_reg_n = __atomic_sub_fetch(&cdev->mr_scache.mempool_cb_reg_n,
+					      1, __ATOMIC_RELEASE);
+	if (mempool_cb_reg_n > 0)
 		return;
 	/* Stop watching for mempool events and unregister all mempools. */
 	ret = rte_mempool_event_callback_unregister(mlx5_dev_mempool_event_cb,
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 8d8bec99a9..1d54102b54 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1138,7 +1138,7 @@  mlx5_mr_create_cache(struct mlx5_mr_share_cache *share_cache, int socket)
 			      &share_cache->dereg_mr_cb);
 	rte_rwlock_init(&share_cache->rwlock);
 	rte_rwlock_init(&share_cache->mprwlock);
-	share_cache->mp_cb_registered = 0;
+	share_cache->mempool_cb_reg_n = 0;
 	/* Initialize B-tree and allocate memory for global MR cache table. */
 	return mlx5_mr_btree_init(&share_cache->cache,
 				  MLX5_MR_BTREE_CACHE_N * 2, socket);
diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h
index 213f5427cb..a5f2d4fd35 100644
--- a/drivers/common/mlx5/mlx5_common_mr.h
+++ b/drivers/common/mlx5/mlx5_common_mr.h
@@ -81,7 +81,7 @@  struct mlx5_mr_share_cache {
 	uint32_t dev_gen; /* Generation number to flush local caches. */
 	rte_rwlock_t rwlock; /* MR cache Lock. */
 	rte_rwlock_t mprwlock; /* Mempool Registration Lock. */
-	uint8_t mp_cb_registered; /* Mempool are Registered. */
+	uint32_t mempool_cb_reg_n; /* Mempool event callback registrants. */
 	struct mlx5_mr_btree cache; /* Global MR cache table. */
 	struct mlx5_mr_list mr_list; /* Registered MR list. */
 	struct mlx5_mr_list mr_free_list; /* Freed MR list. */