[fix,probabilistic,failure,of,i40evf,initialization] net/i40e: fix probabilistic failure of i40evf initialization

Message ID 20210309154113.2896-1-chenqiming2018@163.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang
Series [fix,probabilistic,failure,of,i40evf,initialization] net/i40e: fix probabilistic failure of i40evf initialization

Checks

Context                      Check    Description
ci/checkpatch                warning  coding style issues
ci/Intel-compilation         success  Compilation OK
ci/iol-intel-Performance     success  Performance Testing PASS
ci/travis-robot              success  travis build: passed
ci/github-robot              fail     github build: failed
ci/iol-testing               success  Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS
ci/intel-Testing             success  Testing PASS

Commit Message

chenqiming2018@163.com March 9, 2021, 3:41 p.m. UTC
  From: Qiming Chen <chenqiming2018@163.com>

    The d2146nt chip integrates the x722 controller. The host runs i40e.ko version 2.9.21 with Intel's customized firmware version 4.3; this combination was discussed with Steven at Intel, and the versions are compatible. Each PF is virtualized into 16 VFs, two processes each take over several VF ports, and no VF kernel driver is loaded on the host. In this embedded environment, repeatedly restarting the board occasionally causes VF initialization to fail.
    The log confirms that the i40evf_check_vf_reset_done function returns an error. A comparison with the iavf.ko code shows that the iavf kernel driver retries up to 20 times and polls the VF status for up to 5 seconds on each attempt, which makes VF initialization more reliable.
    This patch aligns the PMD with iavf.ko. With the change applied, repeated testing no longer reproduces the problem. Although the failure is rare, its consequences are serious, so the change is recommended.

Signed-off-by: Qiming Chen <chenqiming2018@163.com>
---
 drivers/net/i40e/i40e_ethdev_vf.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)
  

Comments

Qi Zhang March 24, 2021, 12:38 p.m. UTC | #1
Hi,
Please follow the contribution guidelines:
https://doc.dpdk.org/guides/contributing/patches.html

> -----Original Message-----
> From: chenqiming2018@163.com <chenqiming2018@163.com>
> Sent: Tuesday, March 9, 2021 11:41 PM
> To: dev@dpdk.org
> Cc: chenqiming2018@163.com; Xing, Beilei <beilei.xing@intel.com>; Zhang,
> Qi Z <qi.z.zhang@intel.com>
> Subject: [fix probabilistic failure of i40evf initialization] net/i40e: fix
> probabilistic failure of i40evf initialization
> 
> From: Qiming Chen <chenqiming2018@163.com>
> 
>     The d2146nt chip integrates the x722 controller. The host runs i40e.ko
> version 2.9.21 with Intel's customized firmware version 4.3; this combination
> was discussed with Steven at Intel, and the versions are compatible. Each PF
> is virtualized into 16 VFs, two processes each take over several VF ports,
> and no VF kernel driver is loaded on the host. In this embedded environment,
> repeatedly restarting the board occasionally causes VF initialization to fail.
>     The log confirms that the i40evf_check_vf_reset_done function returns an
> error. A comparison with the iavf.ko code shows that the iavf kernel driver
> retries up to 20 times and polls the VF status for up to 5 seconds on each
> attempt, which makes VF initialization more reliable.
>     This patch aligns the PMD with iavf.ko. With the change applied, repeated
> testing no longer reproduces the problem. Although the failure is rare, its
> consequences are serious, so the change is recommended.
> 
> Signed-off-by: Qiming Chen <chenqiming2018@163.com>
> ---
>  drivers/net/i40e/i40e_ethdev_vf.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c
> b/drivers/net/i40e/i40e_ethdev_vf.c
> index 0c9bd8d2c..8bfbb1153 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -42,7 +42,8 @@
>  /* busy wait delay in msec */
>  #define I40EVF_BUSY_WAIT_DELAY 10
>  #define I40EVF_BUSY_WAIT_COUNT 50
> -#define MAX_RESET_WAIT_CNT     20
> +#define I40EVF_AQ_MAX_ERR      20
> +#define MAX_RESET_WAIT_CNT     500
> 
>  #define I40EVF_ALARM_INTERVAL 50000 /* us */
> 
> @@ -1217,7 +1218,7 @@ i40evf_check_vf_reset_done(struct rte_eth_dev
> *dev)
>  		if (reset == VIRTCHNL_VFR_VFACTIVE ||
>  		    reset == VIRTCHNL_VFR_COMPLETED)
>  			break;
> -		rte_delay_ms(50);
> +		rte_delay_ms(10);
>  	}
> 
>  	if (i >= MAX_RESET_WAIT_CNT)
> @@ -1276,9 +1277,20 @@ i40evf_init_vf(struct rte_eth_dev *dev)
>  		goto err;
>  	}
> 
> -	err = i40evf_check_vf_reset_done(dev);
> -	if (err)
> +	for (i = 0; i < I40EVF_AQ_MAX_ERR; i++) {
> +		err = i40evf_check_vf_reset_done(dev);
> +		if (err) {
> +			PMD_INIT_LOG(WARNING, "Device is still reset: %d %d", err, i);
> +			continue;
> +		} else {
> +			break;
> +		}
> +	}
> +
> +	if (i == I40EVF_AQ_MAX_ERR) {
> +		PMD_INIT_LOG(ERR, "Device check vf reset status failed");
>  		goto err;
> +	}
> 
>  	i40e_init_adminq_parameter(hw);
>  	err = i40e_init_adminq(hw);
> --
> 2.30.1.windows.1
  

Patch

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 0c9bd8d2c..8bfbb1153 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -42,7 +42,8 @@ 
 /* busy wait delay in msec */
 #define I40EVF_BUSY_WAIT_DELAY 10
 #define I40EVF_BUSY_WAIT_COUNT 50
-#define MAX_RESET_WAIT_CNT     20
+#define I40EVF_AQ_MAX_ERR      20
+#define MAX_RESET_WAIT_CNT     500
 
 #define I40EVF_ALARM_INTERVAL 50000 /* us */
 
@@ -1217,7 +1218,7 @@  i40evf_check_vf_reset_done(struct rte_eth_dev *dev)
 		if (reset == VIRTCHNL_VFR_VFACTIVE ||
 		    reset == VIRTCHNL_VFR_COMPLETED)
 			break;
-		rte_delay_ms(50);
+		rte_delay_ms(10);
 	}
 
 	if (i >= MAX_RESET_WAIT_CNT)
@@ -1276,9 +1277,20 @@  i40evf_init_vf(struct rte_eth_dev *dev)
 		goto err;
 	}
 
-	err = i40evf_check_vf_reset_done(dev);
-	if (err)
+	for (i = 0; i < I40EVF_AQ_MAX_ERR; i++) {
+		err = i40evf_check_vf_reset_done(dev);
+		if (err) {
+			PMD_INIT_LOG(WARNING, "Device is still reset: %d %d", err, i);
+			continue;
+		} else {
+			break;
+		}
+	}
+
+	if (i == I40EVF_AQ_MAX_ERR) {
+		PMD_INIT_LOG(ERR, "Device check vf reset status failed");
 		goto err;
+	}
 
 	i40e_init_adminq_parameter(hw);
 	err = i40e_init_adminq(hw);