net/mlx5: fix find sibling devices

Message ID 20210803150658.303-1-getelson@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Headers
Series net/mlx5: fix find sibling devices |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot success github build: passed
ci/Intel-compilation success Compilation OK
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing fail Testing issues

Commit Message

Gregory Etelson Aug. 3, 2021, 3:06 p.m. UTC
  The routine mlx5_eth_find_next() and related iterating macro
MLX5_ETH_FOREACH_DEV is used to iterate through sibling devices (all
representors share the same configuration and switching domain) on top
of specified root device.

The root device parameter was specified as NULL, and it caused
the missing siblings in iteration during representor device probing,
causing:

1. allocating the new domain_id for the device being probed.
2. discrepancy in representor configurations and potential overall
   driver malfunctions.

Fixes: 56bb3c84e982 ("net/mlx5: reduce PCI dependency")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c   | 2 +-
 drivers/net/mlx5/mlx5.c            | 5 +++--
 drivers/net/mlx5/mlx5.h            | 3 ++-
 drivers/net/mlx5/windows/mlx5_os.c | 2 +-
 4 files changed, 7 insertions(+), 5 deletions(-)
  

Comments

Thomas Monjalon Aug. 4, 2021, 9:29 a.m. UTC | #1
03/08/2021 17:06, Gregory Etelson:
> The routine mlx5_eth_find_next() and related iterating macro
> MLX5_ETH_FOREACH_DEV is used to iterate through sibling devices (all
> representors share the same configuration and switching domain) on top
> of specified root device.
> 
> The root device parameter was specified as NULL, and it caused
> the missing siblings in iteration during representor device probing,
> causing:
> 
> 1. allocating the new domain_id for the device being probed.
> 2. discrepancy in representor configurations and potential overall
>    driver malfunctions.
> 
> Fixes: 56bb3c84e982 ("net/mlx5: reduce PCI dependency")
> 
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Applied, thanks.
  

Patch

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index eeeca27ac2..5f8766aa48 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1314,7 +1314,7 @@  mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	}
 	/* Override some values set by hardware configuration. */
 	mlx5_args(config, dpdk_dev->devargs);
-	err = mlx5_dev_check_sibling_config(priv, config);
+	err = mlx5_dev_check_sibling_config(priv, config, dpdk_dev);
 	if (err)
 		goto error;
 	config->hw_csum = !!(sh->device_attr.device_cap_flags_ex &
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 90990ffdc2..f84e061fe7 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2297,7 +2297,8 @@  rte_pmd_mlx5_get_dyn_flag_names(char *names[], unsigned int n)
  */
 int
 mlx5_dev_check_sibling_config(struct mlx5_priv *priv,
-			      struct mlx5_dev_config *config)
+			      struct mlx5_dev_config *config,
+			      struct rte_device *dpdk_dev)
 {
 	struct mlx5_dev_ctx_shared *sh = priv->sh;
 	struct mlx5_dev_config *sh_conf = NULL;
@@ -2308,7 +2309,7 @@  mlx5_dev_check_sibling_config(struct mlx5_priv *priv,
 	if (sh->refcnt == 1)
 		return 0;
 	/* Find the device with shared context. */
-	MLX5_ETH_FOREACH_DEV(port_id, NULL) {
+	MLX5_ETH_FOREACH_DEV(port_id, dpdk_dev) {
 		struct mlx5_priv *opriv =
 			rte_eth_devices[port_id].data->dev_private;
 
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 34d66e93ad..e02714e231 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1503,7 +1503,8 @@  void mlx5_set_min_inline(struct mlx5_dev_spawn_data *spawn,
 			 struct mlx5_dev_config *config);
 void mlx5_set_metadata_mask(struct rte_eth_dev *dev);
 int mlx5_dev_check_sibling_config(struct mlx5_priv *priv,
-				  struct mlx5_dev_config *config);
+				  struct mlx5_dev_config *config,
+				  struct rte_device *dpdk_dev);
 int mlx5_dev_configure(struct rte_eth_dev *dev);
 int mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info);
 int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t fw_size);
diff --git a/drivers/net/mlx5/windows/mlx5_os.c b/drivers/net/mlx5/windows/mlx5_os.c
index 5a18f538bc..5518bc3e76 100644
--- a/drivers/net/mlx5/windows/mlx5_os.c
+++ b/drivers/net/mlx5/windows/mlx5_os.c
@@ -430,7 +430,7 @@  mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	 * Look for sibling devices in order to reuse their switch domain
 	 * if any, otherwise allocate one.
 	 */
-	MLX5_ETH_FOREACH_DEV(port_id, NULL) {
+	MLX5_ETH_FOREACH_DEV(port_id, dpdk_dev) {
 		const struct mlx5_priv *opriv =
 			rte_eth_devices[port_id].data->dev_private;