[v2] net/ice: fix DCF state checking mechanism

Message ID 20220511154930.509436-1-peng1x.zhang@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang

Checks

Context                          Check    Description
ci/checkpatch                    warning  coding style issues
ci/Intel-compilation             success  Compilation OK
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/intel-Testing                 success  Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-abi-testing               success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS

Commit Message

Zhang, Peng1X May 11, 2022, 3:49 p.m. UTC
  From: Peng Zhang <peng1x.zhang@intel.com>

The previous DCF state checking mechanism cannot fully detect whether the
DCF state is on, so the PMD reports an incorrect error code in some cases
and misleads the user. Fix the DCF state checking mechanism so that the
user is told the resource is temporarily unavailable (EAGAIN) when the DCF
state is not on.

Fixes: 285f63fc6bb7 ("net/ice: track DCF state of PF")
Cc: stable@dpdk.org

Signed-off-by: Peng Zhang <peng1x.zhang@intel.com>
---
 drivers/net/ice/ice_dcf_parent.c    |  3 ---
 drivers/net/ice/ice_switch_filter.c | 20 ++++++--------------
 2 files changed, 6 insertions(+), 17 deletions(-)
  

Comments

Connolly, Padraig J May 13, 2022, 9:56 a.m. UTC | #1
Tested-by: Padraig Connolly <padraig.j.connolly@intel.com>
  
Qi Zhang May 17, 2022, 7:35 a.m. UTC | #2
> DCF state previous checking mechanism can not fully detect DCF state whether
> is on or not,so PMD will report uncorrect error code in some cases and mislead
> user.Fix DCF state checking mechanism which will mention user resource
> temporarily unavailable when DCF state is not on.


Please describe in which situation which error code is incorrect and which one is expected.
  
Zhang, Peng1X May 18, 2022, 6:36 a.m. UTC | #3
OK. The error shows up while the VF is being reset again and again.
The following sequence is possible:
step 1. The DCF state has been set to on after a VF reset.
step 2. Another VF reset happens; the kernel sends an event to the DCF and sets the state to pause.
step 3. Before the DCF receives the event, a rule creation may already be in progress.
In the virtual channel queue the rule request is then ahead of the "re-connect", so it is rejected.
step 4. The DCF state has not yet been set to pause, so with the previous logic the error code is EINVAL
instead of EAGAIN.

In conclusion, in the situation above the error code should not be EINVAL; EAGAIN is expected.
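
The sequence above can be modelled with a small stand-alone sketch. It is not the driver code: struct model_adapter, model_need_retry() and old_switch_create() are hypothetical stand-ins, and the sketch assumes that ice_dcf_adminq_need_retry() returns true only when DCF is enabled and the DCF state is not on.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two ice_adapter fields involved. */
struct model_adapter {
	bool dcf_enabled;  /* mirrors ad->hw.dcf_enabled */
	bool dcf_state_on; /* mirrors ad->dcf_state_on */
};

/* Assumed behaviour of ice_dcf_adminq_need_retry(). */
static bool
model_need_retry(struct model_adapter *ad)
{
	return ad->dcf_enabled &&
	       !__atomic_load_n(&ad->dcf_state_on, __ATOMIC_RELAXED);
}

/*
 * Model of the error-code selection before the patch.
 * hw_ret is 0 if the rule was accepted, nonzero if it was rejected.
 */
static int
old_switch_create(struct model_adapter *ad, int hw_ret)
{
	/* Up-front check that the patch removes. */
	if (model_need_retry(ad))
		return -EAGAIN;

	if (hw_ret == 0)
		return 0;

	/*
	 * Step 4: the pause event has not been processed yet, so
	 * dcf_state_on is still true and the failure maps to EINVAL.
	 */
	return model_need_retry(ad) ? -EAGAIN : -EINVAL;
}
```

With dcf_enabled and dcf_state_on both true, a rejected request (nonzero hw_ret) yields -EINVAL in this model, which is the misleading code from step 4.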

  
Qi Zhang May 18, 2022, 6:45 a.m. UTC | #4
> Ok, because error phenomena happens during the period VF reset again and
> again following situation will possible happen as following steps describe:
> step 1. DCF state has been set to on after VF has reset.
> step 2. A VF reset happen, kernel send an event to DCF and set STATE to pause.
> step 3. Before DCF receive the event, it is possible a rule creation is ongoing,
> then in virtual channel queue, the rule request is in front of the "re-connect",
> then it will be rejected.
> step 4.But the DCF state is not set to pause, according to previous logic error
> code will be EINVAL, while not EAGAIN.
> 
> In conclusion, in upper situation error code which should not be EINVAL and
> EAGAIN is expected.

OK, please send a new version.
  
Zhang, Peng1X May 19, 2022, 6:05 a.m. UTC | #5
This patch aims to fix the situation described above.
When rule creation fails, the error code is EINVAL or EAGAIN, depending on whether DCF is enabled and whether the DCF state is on.
With this patch, the DPDK DCF state is determined by whether a rule is created or destroyed successfully.
Before the patch is applied, the error occurs as follows:
step 1. The DPDK DCF state has been set to on after a VF reset, and multiple rules are being created.
step 2. A VF reset happens immediately; the kernel sends an event to the DPDK DCF and sets the state to pause.
step 3. Before the DPDK DCF receives the event, a rule creation may already be in progress.
In the virtual channel queue the rule request is then ahead of the "re-connect", so it is rejected.
step 4. The DPDK DCF state has not been set to pause, so the error code is set to EINVAL, not EAGAIN.
After the patch is applied, the error above should be fixed for this situation.
In step 3 the rule request is rejected, so rule creation fails.
The DPDK DCF state is still pause, and DPDK DCF is enabled.
According to the error-code logic that runs after rule creation fails, the error code is EAGAIN.
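
A minimal sketch of the logic after the patch, under the same assumptions as the thread describes: the state is switched on only after a rule operation succeeds, and a failure maps to EAGAIN while DCF is enabled and the state is not on. The names model_adapter, model_need_retry() and new_switch_create() are hypothetical; only the two field names and the atomic store follow the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two ice_adapter fields involved. */
struct model_adapter {
	bool dcf_enabled;  /* mirrors ad->hw.dcf_enabled */
	bool dcf_state_on; /* mirrors ad->dcf_state_on */
};

/* Assumed behaviour of ice_dcf_adminq_need_retry(). */
static bool
model_need_retry(struct model_adapter *ad)
{
	return ad->dcf_enabled &&
	       !__atomic_load_n(&ad->dcf_state_on, __ATOMIC_RELAXED);
}

/*
 * Model of the create path after the patch.
 * hw_ret is 0 if the rule was accepted, nonzero if it was rejected.
 */
static int
new_switch_create(struct model_adapter *ad, int hw_ret)
{
	if (hw_ret == 0) {
		/* A successful rule operation proves that DCF is on. */
		if (ad->dcf_enabled)
			__atomic_store_n(&ad->dcf_state_on, true,
					 __ATOMIC_RELAXED);
		return 0;
	}

	/* Failure: ask the caller to retry while the state is not on. */
	return model_need_retry(ad) ? -EAGAIN : -EINVAL;
}
```

In this model a rejected request while the state is not on returns -EAGAIN, a later accepted request turns the state on, and a rejection after that returns -EINVAL. With dcf_enabled false the state is never touched and failures always map to -EINVAL.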

  

Patch

diff --git a/drivers/net/ice/ice_dcf_parent.c b/drivers/net/ice/ice_dcf_parent.c
index 2f96dedcce..5b02e0197f 100644
--- a/drivers/net/ice/ice_dcf_parent.c
+++ b/drivers/net/ice/ice_dcf_parent.c
@@ -121,7 +121,6 @@  ice_dcf_vsi_update_service_handler(void *param)
 	struct ice_dcf_hw *hw = reset_param->dcf_hw;
 	struct ice_dcf_adapter *adapter =
 		container_of(hw, struct ice_dcf_adapter, real_hw);
-	struct ice_adapter *parent_adapter = &adapter->parent;
 
 	pthread_detach(pthread_self());
 
@@ -130,8 +129,6 @@  ice_dcf_vsi_update_service_handler(void *param)
 	rte_spinlock_lock(&vsi_update_lock);
 
 	if (!ice_dcf_handle_vsi_update_event(hw)) {
-		__atomic_store_n(&parent_adapter->dcf_state_on, true,
-				 __ATOMIC_RELAXED);
 		ice_dcf_update_vf_vsi_map(&adapter->parent.hw,
 					  hw->num_vfs, hw->vf_vsi_map);
 	}
diff --git a/drivers/net/ice/ice_switch_filter.c b/drivers/net/ice/ice_switch_filter.c
index 36c9bffb73..3d36c63e97 100644
--- a/drivers/net/ice/ice_switch_filter.c
+++ b/drivers/net/ice/ice_switch_filter.c
@@ -403,13 +403,6 @@  ice_switch_create(struct ice_adapter *ad,
 		goto error;
 	}
 
-	if (ice_dcf_adminq_need_retry(ad)) {
-		rte_flow_error_set(error, EAGAIN,
-			RTE_FLOW_ERROR_TYPE_ITEM, NULL,
-			"DCF is not on");
-		goto error;
-	}
-
 	ret = ice_add_adv_rule(hw, list, lkups_cnt, rule_info, &rule_added);
 	if (!ret) {
 		filter_conf_ptr = rte_zmalloc("ice_switch_filter",
@@ -432,6 +425,9 @@  ice_switch_create(struct ice_adapter *ad,
 		filter_conf_ptr->fltr_status = ICE_SW_FLTR_ADDED;
 
 		flow->rule = filter_conf_ptr;
+
+		if (ad->hw.dcf_enabled)
+			__atomic_store_n(&ad->dcf_state_on, true, __ATOMIC_RELAXED);
 	} else {
 		if (ice_dcf_adminq_need_retry(ad))
 			ret = -EAGAIN;
@@ -490,13 +486,6 @@  ice_switch_destroy(struct ice_adapter *ad,
 		return -rte_errno;
 	}
 
-	if (ice_dcf_adminq_need_retry(ad)) {
-		rte_flow_error_set(error, EAGAIN,
-			RTE_FLOW_ERROR_TYPE_ITEM, NULL,
-			"DCF is not on");
-		return -rte_errno;
-	}
-
 	ret = ice_rem_adv_rule_by_id(hw, &filter_conf_ptr->sw_query_data);
 	if (ret) {
 		if (ice_dcf_adminq_need_retry(ad))
@@ -508,6 +497,9 @@  ice_switch_destroy(struct ice_adapter *ad,
 			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
 			"fail to destroy switch filter rule");
 		return -rte_errno;
+	} else {
+		if (ad->hw.dcf_enabled)
+			__atomic_store_n(&ad->dcf_state_on, true, __ATOMIC_RELAXED);
 	}
 
 	ice_switch_filter_rule_free(flow);