[dpdk-dev,v2] net/mlx5: fix link state on device start

Message ID: 20180125160428.146069-1-shahafs@mellanox.com (mailing list archive)
State: Accepted, archived
Delegated to: Ferruh Yigit
Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  fail     Compilation issues

Commit Message

Shahaf Shuler Jan. 25, 2018, 4:04 p.m. UTC
Following commit c7bf62255edf ("net/mlx5: fix handling link status event"),
the link state must be up in order for the burst functions to be set on
the device.

As the link may take time to move from the down state to the up state,
the rte_eth_dev_start call may return with the wrong burst function
(either null or the empty burst function).

Fix it by forcing the link to be up before returning from device start:
the link status is queried once per second, and if the link is still not
up after 5 seconds, the function fails.
In addition, initialize the burst functions on device probe to prevent
crashes before the link is up.

Fixes: c7bf62255edf ("net/mlx5: fix handling link status event")
Cc: yskoh@mellanox.com

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5.c         |  5 +++++
 drivers/net/mlx5/mlx5.h         |  1 +
 drivers/net/mlx5/mlx5_defs.h    |  3 +++
 drivers/net/mlx5/mlx5_ethdev.c  | 27 +++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_trigger.c |  8 +++++++-
 5 files changed, 43 insertions(+), 1 deletion(-)
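
The retry scheme described in the commit message (query the link, stop as
soon as it reports the desired state, give up after a fixed number of
attempts) can be sketched outside the driver as follows. This is only an
illustration under stated assumptions, not the driver code: wait_for_link,
link_query_fn, delay_fn, struct fake_link and fake_link_query are
hypothetical names, and the query and delay are passed in as callbacks so
the loop can be exercised without hardware or real sleeps.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true when the link reports "up". */
typedef bool (*link_query_fn)(void *ctx);
/* Hypothetical delay hook; a real caller might pass a sleep(1) wrapper. */
typedef void (*delay_fn)(unsigned int seconds);

/* Mirrors the role of MLX5_MAX_LINK_QUERY_ATTEMPTS in the patch. */
#define MAX_LINK_QUERY_ATTEMPTS 5

/*
 * Poll the link until it reaches the wanted state.
 * Returns 0 on success, -EAGAIN once all attempts are used up.
 */
static int
wait_for_link(link_query_fn query, void *ctx, bool wanted, delay_fn delay)
{
	int attempt;

	for (attempt = 0; attempt < MAX_LINK_QUERY_ATTEMPTS; attempt++) {
		if (query(ctx) == wanted)
			return 0;
		/* Wait one second before the next attempt, if a hook is given. */
		if (delay != NULL)
			delay(1);
	}
	return -EAGAIN;
}

/* Fake link (illustration only) that comes up after `up_after` queries. */
struct fake_link {
	int queries;
	int up_after;
};

static bool
fake_link_query(void *ctx)
{
	struct fake_link *link = ctx;

	link->queries++;
	return link->queries > link->up_after;
}
```

With one-second delays the worst case is bounded at roughly
MAX_LINK_QUERY_ATTEMPTS seconds, which corresponds to the 5 seconds
mentioned in the commit message.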
  

Comments

Shahaf Shuler Jan. 25, 2018, 4:20 p.m. UTC | #1
Thursday, January 25, 2018 6:04 PM, Shahaf Shuler:
> Following commit c7bf62255edf ("net/mlx5: fix handling link status event")
> the link state must be up in order for the burst function to be set on the
> device ops.
> 
> As the link may take time to move between down and up state it is possible
> the rte_eth_dev_start call will return with wrong burst function (either null
> or the empty burst function).
> 
> Fixing it by forcing the link to be up before returning from device start. In
> case the link is still not up after 5 seconds fail the function.
> In addition initialize the burst function on device probe to prevent crashes
> before the link is up.
> 
> Fixes: c7bf62255edf ("net/mlx5: fix handling link status event")
> Cc: yskoh@mellanox.com
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> ---

Applied to next-net-mlx, thanks
  

Patch

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index bc4b6bad0..62fa52434 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -862,6 +862,11 @@  mlx5_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		eth_dev->device = &pci_dev->device;
 		rte_eth_copy_pci_info(eth_dev, pci_dev);
 		eth_dev->device->driver = &mlx5_driver.driver;
+		/*
+		 * Initialize burst functions to prevent crashes before link-up.
+		 */
+		eth_dev->rx_pkt_burst = removed_rx_burst;
+		eth_dev->tx_pkt_burst = removed_tx_burst;
 		priv->dev = eth_dev;
 		eth_dev->dev_ops = &mlx5_dev_ops;
 		/* Register MAC address. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index a7ec607c3..30b737f76 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -246,6 +246,7 @@  int mlx5_dev_configure(struct rte_eth_dev *);
 void mlx5_dev_infos_get(struct rte_eth_dev *, struct rte_eth_dev_info *);
 const uint32_t *mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev);
 int priv_link_update(struct priv *, int);
+int priv_force_link_status_change(struct priv *, int);
 int mlx5_link_update(struct rte_eth_dev *, int);
 int mlx5_dev_set_mtu(struct rte_eth_dev *, uint16_t);
 int mlx5_dev_get_flow_ctrl(struct rte_eth_dev *, struct rte_eth_fc_conf *);
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index a71db281d..57f295c58 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -110,4 +110,7 @@ 
 /* Supported RSS */
 #define MLX5_RSS_HF_MASK (~(ETH_RSS_IP | ETH_RSS_UDP | ETH_RSS_TCP))
 
+/* Maximum number of attempts to query link status before giving up. */
+#define MLX5_MAX_LINK_QUERY_ATTEMPTS 5
+
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6624888c9..5694136c9 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -966,6 +966,33 @@  priv_link_update(struct priv *priv, int wait_to_complete)
 }
 
 /**
+ * Querying the link status till it changes to the desired state.
+ * Number of query attempts is bounded by MLX5_MAX_LINK_QUERY_ATTEMPTS.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param status
+ *   Link desired status.
+ *
+ * @return
+ *   0 on success, negative errno value on failure.
+ */
+int
+priv_force_link_status_change(struct priv *priv, int status)
+{
+	int try = 0;
+
+	while (try < MLX5_MAX_LINK_QUERY_ATTEMPTS) {
+		priv_link_update(priv, 0);
+		if (priv->dev->data->dev_link.link_status == status)
+			return 0;
+		try++;
+		sleep(1);
+	}
+	return -EAGAIN;
+}
+
+/**
  * DPDK callback to retrieve physical link information.
  *
  * @param dev
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 827db2e7e..c5429e182 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -166,7 +166,13 @@  mlx5_dev_start(struct rte_eth_dev *dev)
 	priv_xstats_init(priv);
 	/* Update link status and Tx/Rx callbacks for the first time. */
 	memset(&dev->data->dev_link, 0, sizeof(struct rte_eth_link));
-	priv_link_update(priv, 1);
+	INFO("Forcing port %u link to be up", dev->data->port_id);
+	err = priv_force_link_status_change(priv, ETH_LINK_UP);
+	if (err) {
+		DEBUG("Failed to set port %u link to be up",
+		      dev->data->port_id);
+		goto error;
+	}
 	priv_dev_interrupt_handler_install(priv, dev);
 	priv_unlock(priv);
 	return 0;
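
The probe-time part of the fix (installing placeholder burst functions so
that a call made before link-up does not dereference a null pointer) can be
illustrated with a stand-alone sketch. All names here (struct port,
burst_fn, stub_burst, port_init) are hypothetical; in the patch the
placeholders are removed_rx_burst and removed_tx_burst, and this sketch
assumes such a placeholder simply reports zero packets processed.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt; /* Opaque packet type, for illustration only. */

/* Hypothetical burst callback signature shared by Rx and Tx. */
typedef uint16_t (*burst_fn)(void *queue, struct pkt **pkts, uint16_t n);

/* Hypothetical per-port structure holding the datapath callbacks. */
struct port {
	burst_fn rx_burst;
	burst_fn tx_burst;
};

/* Placeholder burst function: touches nothing and handles 0 packets. */
static uint16_t
stub_burst(void *queue, struct pkt **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	return 0;
}

/*
 * Install the placeholders at probe time so the callbacks are never null;
 * the real burst functions would replace them once the link is up.
 */
static void
port_init(struct port *p)
{
	p->rx_burst = stub_burst;
	p->tx_burst = stub_burst;
}
```

An application that polls the port too early then sees empty bursts
instead of crashing on a null function pointer.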