[RFC,V1] ethdev: fix the issue that dev uninit may be called twice

Message ID 1627908397-51565-1-git-send-email-lihuisong@huawei.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series [RFC,V1] ethdev: fix the issue that dev uninit may be called twice

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK
ci/intel-Testing      success  Testing PASS

Commit Message

lihuisong (C) Aug. 2, 2021, 12:46 p.m. UTC
  Ethernet devices in DPDK can be released by rte_eth_dev_close() and
rte_dev_remove(). However, these two APIs do not have explicit invocation
restrictions. In other words, at the ethdev layer, calling
rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
is allowed. In such a scenario, the primary process may be fine, but
dev_uninit() may be called twice in the secondary process,
and other serious problems may follow. This patch fixes that.

Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")

Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
 lib/ethdev/ethdev_pci.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)
  

Comments

Singh, Aman Deep Aug. 18, 2021, 9:47 a.m. UTC | #1
Hi Huisong,

On 8/2/2021 6:16 PM, Huisong Li wrote:
> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
> rte_dev_remove(). However, these two APIs do not have explicit invocation
> restrictions. In other words, at the ethdev layer, calling
> rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
> is allowed. In such a scenario, the primary process may be fine, but
> dev_uninit() may be called twice in the secondary process,

Shouldn't dev_uninit() in the secondary process simply return without taking any action?

> and other serious problems may follow. This patch fixes that.
>
> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> ---
>   lib/ethdev/ethdev_pci.h | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
> index 8edca82..14a0e01 100644
> --- a/lib/ethdev/ethdev_pci.h
> +++ b/lib/ethdev/ethdev_pci.h
> @@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
>   	if (!eth_dev)
>   		return 0;
>   
> +	/*
> +	 * The eth_dev->data->name doesn't be cleared by the secondary precess,
Can we rephrase "doesn't be cleared" in the sentence above?
> +	 * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
> +	 * Namely, whether "eth_dev" is NULL cannot be used to determine whether
> +	 * an ethdev port has been released.
> +	 * For both primary precess and secondary precess, eth_dev->state is
s/ precess / process
> +	 * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
> +	 */
> +	if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
> +		RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
> +		return 0;
> +	}
> +
>   	if (dev_uninit) {
>   		ret = dev_uninit(eth_dev);
>   		if (ret)
  
lihuisong (C) Aug. 24, 2021, 2:10 a.m. UTC | #2
Hi, Singh, Aman Deep

Sorry, I missed your review. Thank you for your review.😁

On 2021/8/18 17:47, Singh, Aman Deep wrote:
>
> Hi Huisong,
>
> On 8/2/2021 6:16 PM, Huisong Li wrote:
>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>> rte_dev_remove(). However, these two APIs do not have explicit invocation
>> restrictions. In other words, at the ethdev layer, calling
>> rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
>> is allowed. In such a scenario, the primary process may be fine, but
>> dev_uninit() may be called twice in the secondary process,
> Shouldn't dev_uninit() in the secondary process simply return without taking any action?

That only works if the secondary process has no resources that need to
be released.

However, a secondary process may also hold resources that need to be
released. For example, an mp action registered by
rte_mp_action_register() is used for multi-process communication. It
should be unregistered once all eth devices driven by one PMD in a
process have been removed. To achieve this, the secondary process may
keep a count of its devices to decide when to unregister the action.

Of course, this is just one case. In short, a secondary process may have
its own private data or resources to release.

This is mentioned in RFC v2. Please see the RFC v2 discussion thread.
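
The counting scheme described above can be sketched as a small standalone
model. This is not DPDK code: struct pmd_proc_state, pmd_dev_init() and
pmd_dev_uninit() are hypothetical names, and the boolean flag merely stands
in for an action registered with rte_mp_action_register(). It only shows
why a duplicated uninit call is harmful when a process keeps such a count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-process bookkeeping of one PMD (not a DPDK structure). */
struct pmd_proc_state {
	unsigned int nb_devs;      /* devices of this PMD probed in this process */
	bool mp_action_registered; /* stands in for a registered mp action */
};

/* Called once per probed device: the first device registers the action. */
static void pmd_dev_init(struct pmd_proc_state *st)
{
	assert(st);
	if (st->nb_devs == 0)
		st->mp_action_registered = true;
	st->nb_devs++;
}

/*
 * Called once per removed device: the last device unregisters the action.
 * Returns -1 if there is nothing left to release.
 */
static int pmd_dev_uninit(struct pmd_proc_state *st)
{
	assert(st);
	if (st->nb_devs == 0)
		return -1;
	st->nb_devs--;
	if (st->nb_devs == 0)
		st->mp_action_registered = false;
	return 0;
}
```

With two devices probed, calling pmd_dev_uninit() twice for the same device
drops the count to zero and tears down the action while the second device
is still in use. That is the kind of damage a duplicated dev_uninit() can
do in a secondary process.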

>> and other serious problems may follow. This patch fixes that.
>>
>> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> ---
>>   lib/ethdev/ethdev_pci.h | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
>> index 8edca82..14a0e01 100644
>> --- a/lib/ethdev/ethdev_pci.h
>> +++ b/lib/ethdev/ethdev_pci.h
>> @@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
>>   	if (!eth_dev)
>>   		return 0;
>>   
>> +	/*
>> +	 * The eth_dev->data->name doesn't be cleared by the secondary precess,
> Can we rephrase "doesn't be cleared" in the sentence above?
ok
>> +	 * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>> +	 * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>> +	 * an ethdev port has been released.
>> +	 * For both primary precess and secondary precess, eth_dev->state is
> s/ precess / process
Thanks. RFC v2 has corrected it.
>> +	 * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>> +	 */
>> +	if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>> +		RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>> +		return 0;
>> +	}
>> +
>>   	if (dev_uninit) {
>>   		ret = dev_uninit(eth_dev);
>>   		if (ret)
  

Patch

diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82..14a0e01 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,19 @@  rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
 	if (!eth_dev)
 		return 0;
 
+	/*
+	 * The eth_dev->data->name doesn't be cleared by the secondary precess,
+	 * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
+	 * Namely, whether "eth_dev" is NULL cannot be used to determine whether
+	 * an ethdev port has been released.
+	 * For both primary precess and secondary precess, eth_dev->state is
+	 * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
+	 */
+	if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
+		RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
+		return 0;
+	}
+
 	if (dev_uninit) {
 		ret = dev_uninit(eth_dev);
 		if (ret)
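
To illustrate the effect of the state check in the hunk above, here is a
minimal standalone model. It is a sketch, not the ethdev implementation:
every name prefixed with model_ or MODEL_ is hypothetical, and the model
assumes that closing a port runs the driver teardown once and then marks
the port unused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the port state and device structure. */
enum model_dev_state { MODEL_DEV_UNUSED = 0, MODEL_DEV_ATTACHED };

struct model_eth_dev {
	enum model_dev_state state;
	int uninit_calls; /* how many times driver teardown ran */
};

/* Driver teardown; it must run at most once per device. */
static int model_dev_uninit(struct model_eth_dev *dev)
{
	dev->uninit_calls++;
	return 0;
}

/* Assumption: close releases driver resources and marks the port unused. */
static void model_dev_close(struct model_eth_dev *dev)
{
	model_dev_uninit(dev);
	dev->state = MODEL_DEV_UNUSED;
}

/* Generic remove with the guard: skip teardown for a released port. */
static int model_generic_remove(struct model_eth_dev *dev,
				int (*dev_uninit)(struct model_eth_dev *))
{
	int ret;

	if (dev == NULL)
		return 0;
	if (dev->state == MODEL_DEV_UNUSED)
		return 0; /* already released, nothing to do */
	if (dev_uninit != NULL) {
		ret = dev_uninit(dev);
		if (ret)
			return ret;
	}
	dev->state = MODEL_DEV_UNUSED;
	return 0;
}
```

In this model, calling model_dev_close() followed by
model_generic_remove() runs the teardown only once, while a remove without
a prior close still tears the device down as before.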