[v4] net/mlx5: fix refcount on detached indirect action

Message ID 20211123153805.192398-1-dsosnowski@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Raslan Darawsheh

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS

Commit Message

Dariusz Sosnowski Nov. 23, 2021, 3:38 p.m. UTC
This patch fixes a segfault which was triggered when a port with
indirect actions created was closed. The segfault occurred only when
RTE_LIBRTE_MLX5_DEBUG was defined. It was caused by a redundant
decrement of the RX queues refcount:

- the refcount was decremented when the port was stopped and indirect
  actions were detached from RX queues (port stop),
- the refcount was decremented again when indirect action objects were
  destroyed (port close or destroying of an indirect action).

This patch fixes that behavior. RX queues are dereferenced if and only
if the indirect action is explicitly destroyed by the user or detached
on port stop. Whether RX queues are dereferenced on the action destroy
operation depends on a new argument to the wrapper of the indirect
action destroy operation, introduced in this patch.

Fixes: ec4e11d41d12 ("net/mlx5: preserve indirect actions on restart")
Cc: dkozlyuk@nvidia.com
Cc: stable@dpdk.org

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
v4:
* Simplify dev_started checking.
* Remove redundant passes of deref_rxqs argument.

v3:
* Fix handling action destroy in between port start and stop.
* Revert moving contents of mlx5_action_handle_destroy

v2:
* Introduce wrapper over action destroy operation.
* Fix typos in commit message.

 drivers/net/mlx5/mlx5_flow_dv.c |  5 +++--
 drivers/net/mlx5/mlx5_rx.h      |  3 ++-
 drivers/net/mlx5/mlx5_rxq.c     | 21 ++++++++++++++-------
 3 files changed, 19 insertions(+), 10 deletions(-)
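
The double decrement described in the commit message can be reproduced with a small standalone model. This is only an illustrative sketch, not driver code: every toy_* name is hypothetical, and the assumption that each queue starts with one reference held by the port itself is made up for the model. Only the deref_rxqs flag mirrors the argument added by the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_QUEUES_N 2

/* Hypothetical stand-in for an RX queue; only the refcount is modeled. */
struct toy_rxq {
	int refcnt;
};

/* Hypothetical stand-in for the indirection table of an indirect action. */
struct toy_ind_tbl {
	struct toy_rxq *queues[TOY_QUEUES_N];
};

/* Drop one reference on every queue of the table. */
static void
toy_deref_all(struct toy_ind_tbl *tbl)
{
	for (int i = 0; i != TOY_QUEUES_N; ++i) {
		assert(tbl->queues[i]->refcnt > 0);
		tbl->queues[i]->refcnt--;
	}
}

/* Port start: the table takes one reference on each RX queue. */
static void
toy_attach(struct toy_ind_tbl *tbl)
{
	for (int i = 0; i != TOY_QUEUES_N; ++i)
		tbl->queues[i]->refcnt++;
}

/* Port stop: the table is detached and its queue references are dropped. */
static void
toy_detach(struct toy_ind_tbl *tbl)
{
	toy_deref_all(tbl);
}

/* Destroy: dereference the queues only when the caller asks for it. */
static void
toy_release(struct toy_ind_tbl *tbl, bool deref_rxqs)
{
	if (deref_rxqs)
		toy_deref_all(tbl);
}

/* start -> stop -> destroy; returns the refcount left on queue 0. */
int
toy_stop_then_destroy(bool deref_rxqs)
{
	/* Assumption of this model: the port itself holds one reference. */
	struct toy_rxq rxqs[TOY_QUEUES_N] = { { .refcnt = 1 }, { .refcnt = 1 } };
	struct toy_ind_tbl tbl = { .queues = { &rxqs[0], &rxqs[1] } };

	toy_attach(&tbl);               /* 1 -> 2 */
	toy_detach(&tbl);               /* 2 -> 1 */
	toy_release(&tbl, deref_rxqs);  /* 1 -> 0 only when deref_rxqs is set */
	return rxqs[0].refcnt;
}

/* start -> destroy while the port is still started. */
int
toy_destroy_while_started(bool deref_rxqs)
{
	struct toy_rxq rxqs[TOY_QUEUES_N] = { { .refcnt = 1 }, { .refcnt = 1 } };
	struct toy_ind_tbl tbl = { .queues = { &rxqs[0], &rxqs[1] } };

	toy_attach(&tbl);               /* 1 -> 2 */
	toy_release(&tbl, deref_rxqs);  /* 2 -> 1 only when deref_rxqs is set */
	return rxqs[0].refcnt;
}
```

In this model, toy_stop_then_destroy(true) corresponds to the old unconditional dereference and takes away the port's own reference, while toy_stop_then_destroy(false) leaves it intact. Conversely, toy_destroy_while_started(false) would leave a stale reference behind, which is presumably why the caller in the patch derives the flag from dev_started instead of passing a constant.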
  

Comments

Raslan Darawsheh Nov. 23, 2021, 8:35 p.m. UTC | #1
Hi,

> -----Original Message-----
> From: Dariusz Sosnowski <dsosnowski@nvidia.com>
> Sent: Tuesday, November 23, 2021 5:38 PM
> To: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>;
> stable@dpdk.org
> Subject: [PATCH v4] net/mlx5: fix refcount on detached indirect action

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
  

Patch

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 2f03e59f9c..1d46fa48b0 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -14732,7 +14732,7 @@  __flow_dv_action_rss_setup(struct rte_eth_dev *dev,
 error_hrxq_new:
 	err = rte_errno;
 	__flow_dv_action_rss_hrxqs_release(dev, shared_rss);
-	if (!mlx5_ind_table_obj_release(dev, shared_rss->ind_tbl, true))
+	if (!mlx5_ind_table_obj_release(dev, shared_rss->ind_tbl, true, true))
 		shared_rss->ind_tbl = NULL;
 	rte_errno = err;
 	return -rte_errno;
@@ -14875,7 +14875,8 @@  __flow_dv_action_rss_release(struct rte_eth_dev *dev, uint32_t idx,
 					  NULL,
 					  "shared rss hrxq has references");
 	queue = shared_rss->ind_tbl->queues;
-	remaining = mlx5_ind_table_obj_release(dev, shared_rss->ind_tbl, true);
+	remaining = mlx5_ind_table_obj_release(dev, shared_rss->ind_tbl, true,
+					       !!dev->data->dev_started);
 	if (remaining)
 		return rte_flow_error_set(error, EBUSY,
 					  RTE_FLOW_ERROR_TYPE_ACTION,
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 9cc1a2703b..b19464bb37 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -225,7 +225,8 @@  struct mlx5_ind_table_obj *mlx5_ind_table_obj_get(struct rte_eth_dev *dev,
 						  uint32_t queues_n);
 int mlx5_ind_table_obj_release(struct rte_eth_dev *dev,
 			       struct mlx5_ind_table_obj *ind_tbl,
-			       bool standalone);
+			       bool standalone,
+			       bool deref_rxqs);
 int mlx5_ind_table_obj_setup(struct rte_eth_dev *dev,
 			     struct mlx5_ind_table_obj *ind_tbl);
 int mlx5_ind_table_obj_modify(struct rte_eth_dev *dev,
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 480f4f9f07..1f6ddbab8b 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -2195,6 +2195,9 @@  mlx5_ind_table_obj_get(struct rte_eth_dev *dev, const uint16_t *queues,
  *   Indirection table to release.
  * @param standalone
  *   Indirection table for Standalone queue.
+ * @param deref_rxqs
+ *   If true, then dereference RX queues related to indirection table.
+ *   Otherwise, no additional action will be taken.
  *
  * @return
  *   1 while a reference on it exists, 0 when freed.
@@ -2202,7 +2205,8 @@  mlx5_ind_table_obj_get(struct rte_eth_dev *dev, const uint16_t *queues,
 int
 mlx5_ind_table_obj_release(struct rte_eth_dev *dev,
 			   struct mlx5_ind_table_obj *ind_tbl,
-			   bool standalone)
+			   bool standalone,
+			   bool deref_rxqs)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	unsigned int i, ret;
@@ -2215,8 +2219,10 @@  mlx5_ind_table_obj_release(struct rte_eth_dev *dev,
 	if (ret)
 		return 1;
 	priv->obj_ops.ind_table_destroy(ind_tbl);
-	for (i = 0; i != ind_tbl->queues_n; ++i)
-		claim_nonzero(mlx5_rxq_deref(dev, ind_tbl->queues[i]));
+	if (deref_rxqs) {
+		for (i = 0; i != ind_tbl->queues_n; ++i)
+			claim_nonzero(mlx5_rxq_deref(dev, ind_tbl->queues[i]));
+	}
 	mlx5_free(ind_tbl);
 	return 0;
 }
@@ -2573,7 +2579,7 @@  mlx5_hrxq_modify(struct rte_eth_dev *dev, uint32_t hrxq_idx,
 	if (ind_tbl != hrxq->ind_table) {
 		MLX5_ASSERT(!hrxq->standalone);
 		mlx5_ind_table_obj_release(dev, hrxq->ind_table,
-					   hrxq->standalone);
+					   hrxq->standalone, true);
 		hrxq->ind_table = ind_tbl;
 	}
 	hrxq->hash_fields = hash_fields;
@@ -2583,7 +2589,8 @@  mlx5_hrxq_modify(struct rte_eth_dev *dev, uint32_t hrxq_idx,
 	err = rte_errno;
 	if (ind_tbl != hrxq->ind_table) {
 		MLX5_ASSERT(!hrxq->standalone);
-		mlx5_ind_table_obj_release(dev, ind_tbl, hrxq->standalone);
+		mlx5_ind_table_obj_release(dev, ind_tbl, hrxq->standalone,
+					   true);
 	}
 	rte_errno = err;
 	return -rte_errno;
@@ -2600,7 +2607,7 @@  __mlx5_hrxq_remove(struct rte_eth_dev *dev, struct mlx5_hrxq *hrxq)
 	priv->obj_ops.hrxq_destroy(hrxq);
 	if (!hrxq->standalone) {
 		mlx5_ind_table_obj_release(dev, hrxq->ind_table,
-					   hrxq->standalone);
+					   hrxq->standalone, true);
 	}
 	mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_HRXQ], hrxq->idx);
 }
@@ -2666,7 +2673,7 @@  __mlx5_hrxq_create(struct rte_eth_dev *dev,
 	return hrxq;
 error:
 	if (!rss_desc->ind_tbl)
-		mlx5_ind_table_obj_release(dev, ind_tbl, standalone);
+		mlx5_ind_table_obj_release(dev, ind_tbl, standalone, true);
 	if (hrxq)
 		mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_HRXQ], hrxq_idx);
 	return NULL;