[v2] net/ice: fix crash on closing representor ports

Message ID 20231030084459.3470038-1-mingjinx.ye@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang
Headers
Series [v2] net/ice: fix crash on closing representor ports |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/github-robot: build success github build: passed
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS

Commit Message

Mingjin Ye Oct. 30, 2023, 8:44 a.m. UTC
  Since the representor port needs to access the resources of the
associated DCF when it is closed. Therefore, the correct close
port operation is to close all the representor ports first, and
then close the associated DCF port.

If the DCF port is closed before the representor port on pmd exit.
This will result in accessing freed resources and eventually a
core dump will occur.

This patch fixes this issue by notifying all presentor ports
that DCF is not accessible when the DCF port is closed.
And when the presentor port is closed, it determines if the DCF
resources are accessible. If it can't be accessed, it will
report an error and return.

Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v2: Reformat code to remove unneeded fixlines.
---
 drivers/net/ice/ice_dcf_ethdev.h         |  1 +
 drivers/net/ice/ice_dcf_vf_representor.c | 11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)
  

Comments

Qi Zhang Nov. 1, 2023, 1:11 a.m. UTC | #1
> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Monday, October 30, 2023 4:45 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH v2] net/ice: fix crash on closing representor ports
> 
> Since the representor port needs to access the resources of the associated
> DCF when it is closed. Therefore, the correct close port operation is to close
> all the representor ports first, and then close the associated DCF port.
> 
> If the DCF port is closed before the representor port on pmd exit.
> This will result in accessing freed resources and eventually a core dump will
> occur.
> 
> This patch fixes this issue by notifying all presentor ports that DCF is not
> accessible when the DCF port is closed.
> And when the presentor port is closed, it determines if the DCF resources are
> accessible. If it can't be accessed, it will report an error and return.
> 
> Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> v2: Reformat code to remove unneeded fixlines.
> ---
>  drivers/net/ice/ice_dcf_ethdev.h         |  1 +
>  drivers/net/ice/ice_dcf_vf_representor.c | 11 ++++++++---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_dcf_ethdev.h
> b/drivers/net/ice/ice_dcf_ethdev.h
> index 4baaec4b8b..d94ef10244 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.h
> +++ b/drivers/net/ice/ice_dcf_ethdev.h
> @@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
>  	struct rte_ether_addr mac_addr;
>  	uint16_t switch_domain_id;
>  	uint16_t vf_id;
> +	bool dcf_valid;
> 
>  	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN
> */  }; diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> b/drivers/net/ice/ice_dcf_vf_representor.c
> index b9fcfc80ad..eb49eae4e4 100644
> --- a/drivers/net/ice/ice_dcf_vf_representor.c
> +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> @@ -45,6 +45,9 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
> static int  ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)  {
> +	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
> +
> +	repr->dcf_valid = false;

The fix assume the ice_dcf_vf_repr_dev_stop will be invoked during ice_dcf_dev_stop.
But what if DCF port is not closed, while only the port representor port is stopped, should we still reset the flag?
Based your description of the issue, is it better to check on a per DCF port flag, but not a per representer flag?
  

Patch

diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..d94ef10244 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@  struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..eb49eae4e4 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -45,6 +45,9 @@  ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
 static int
 ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+
+	repr->dcf_valid = false;
 	dev->data->dev_link.link_status = RTE_ETH_LINK_DOWN;
 
 	return 0;
@@ -111,14 +114,15 @@  ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +418,7 @@  ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;