
[RFC,V1] ethdev: fix the issue that dev uninit may be called twice

Message ID 1627908397-51565-1-git-send-email-lihuisong@huawei.com (mailing list archive)
State Superseded
Delegated to: Ferruh Yigit
Series [RFC,V1] ethdev: fix the issue that dev uninit may be called twice

Checks

Context               Check    Description
ci/intel-Testing      success  Testing PASS
ci/Intel-compilation  success  Compilation OK
ci/checkpatch         success  coding style OK

Commit Message

Huisong Li Aug. 2, 2021, 12:46 p.m. UTC
Ethernet devices in DPDK can be released by rte_eth_dev_close() and
rte_dev_remove(). However, these two APIs do not have explicit invocation
restrictions. In other words, at the ethdev layer, calling
rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
is allowed. In such a bad scenario, the primary process may be fine, but
dev_uninit() in the secondary process may be called twice, which can lead
to even more serious problems. This patch fixes that.

Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")

Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
 lib/ethdev/ethdev_pci.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Singh, Aman Deep Aug. 18, 2021, 9:47 a.m. UTC | #1
Hi Huisong,

On 8/2/2021 6:16 PM, Huisong Li wrote:
> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
> rte_dev_remove(). However, these two APIs do not have explicit invocation
> restrictions. In other words, at the ethdev layer, calling
> rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
> is allowed. In such a bad scenario, the primary process may be fine, but it
> may cause that dev_uninit() in the secondary process will be called twice,

Shouldn't dev_uninit() for the secondary process simply return with no action?

> and even other serious problems. So this patch fixes it.
>
> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> ---
>   lib/ethdev/ethdev_pci.h | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
> index 8edca82..14a0e01 100644
> --- a/lib/ethdev/ethdev_pci.h
> +++ b/lib/ethdev/ethdev_pci.h
> @@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
>   	if (!eth_dev)
>   		return 0;
>   
> +	/*
> +	 * The eth_dev->data->name doesn't be cleared by the secondary precess,
Can we rephrase the above sentence, "doesn't be cleared"?
> +	 * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
> +	 * Namely, whether "eth_dev" is NULL cannot be used to determine whether
> +	 * an ethdev port has been released.
> +	 * For both primary precess and secondary precess, eth_dev->state is
s/ precess / process
> +	 * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
> +	 */
> +	if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
> +		RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
> +		return 0;
> +	}
> +
>   	if (dev_uninit) {
>   		ret = dev_uninit(eth_dev);
>   		if (ret)
Huisong Li Aug. 24, 2021, 2:10 a.m. UTC | #2
Hi, Singh, Aman Deep

Sorry, I missed your review. Thank you for your review.😁

On 2021/8/18 17:47, Singh, Aman Deep wrote:
>
> Hi Huisong,
>
> On 8/2/2021 6:16 PM, Huisong Li wrote:
>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>> rte_dev_remove(). However, these two APIs do not have explicit invocation
>> restrictions. In other words, at the ethdev layer, calling
>> rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
>> is allowed. In such a bad scenario, the primary process may be fine, but it
>> may cause that dev_uninit() in the secondary process will be called twice,
> Shouldn't dev_uninit() for the secondary process simply return with no action?

That holds only if the secondary process has no resources of its own that
need to be released.

However, a secondary process may also hold resources that need to be
released. For example, an mp action registered by rte_mp_action_register()
is used for multi-process communication. It should be unregistered when all
eth devices driven by one PMD in a process have been removed. To achieve
this, the secondary process may keep data recording the number of devices
so that it can decide when to unregister the action.

Of course, this is just one case. In short, a secondary process may have
its own private data or resources to be released.

This is covered in RFC v2. Please continue the discussion in the RFC v2
thread.
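
The device-counting case described above can be sketched roughly as follows.
This is only an illustrative model, not code from any real PMD: the names
pmd_proc_ctx, pmd_dev_init() and pmd_dev_uninit() are hypothetical, and a
boolean flag stands in for the mp action that a real driver would register
with rte_mp_action_register().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-process PMD bookkeeping; not taken from any real driver. */
struct pmd_proc_ctx {
	int nb_devs;            /* devices of this PMD alive in this process */
	bool action_registered; /* stands in for a registered mp action */
};

/* Called from a device's init path in this process. */
static void pmd_dev_init(struct pmd_proc_ctx *ctx)
{
	if (ctx->nb_devs == 0)
		ctx->action_registered = true; /* first device: register */
	ctx->nb_devs++;
}

/* Called from a device's uninit path in this process. */
static void pmd_dev_uninit(struct pmd_proc_ctx *ctx)
{
	ctx->nb_devs--;
	if (ctx->nb_devs == 0)
		ctx->action_registered = false; /* last device: unregister */
}
```

With two devices initialized, running pmd_dev_uninit() twice for the same
device drives the count to zero and drops the action while the other device
is still in use. That is the kind of damage a repeated dev_uninit() call
could do to private data in a secondary process.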

>> and even other serious problems. So this patch fixes it.
>>
>> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
>>
>> Signed-off-by: Huisong Li<lihuisong@huawei.com>
>> ---
>>   lib/ethdev/ethdev_pci.h | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
>> index 8edca82..14a0e01 100644
>> --- a/lib/ethdev/ethdev_pci.h
>> +++ b/lib/ethdev/ethdev_pci.h
>> @@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
>>   	if (!eth_dev)
>>   		return 0;
>>   
>> +	/*
>> +	 * The eth_dev->data->name doesn't be cleared by the secondary precess,
> Can we rephrase the above sentence, "doesn't be cleared"?
ok
>> +	 * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>> +	 * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>> +	 * an ethdev port has been released.
>> +	 * For both primary precess and secondary precess, eth_dev->state is
> s/ precess / process
Thanks. RFC v2 has corrected it.
>> +	 * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>> +	 */
>> +	if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>> +		RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>> +		return 0;
>> +	}
>> +
>>   	if (dev_uninit) {
>>   		ret = dev_uninit(eth_dev);
>>   		if (ret)

Patch

diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82..14a0e01 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,19 @@  rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
 	if (!eth_dev)
 		return 0;
 
+	/*
+	 * The secondary process does not clear eth_dev->data->name, so the
+	 * "eth_dev" above is not NULL after rte_eth_dev_close() is called.
+	 * Therefore, whether "eth_dev" is NULL cannot be used to determine
+	 * whether an ethdev port has been released.
+	 * In both the primary and the secondary process, an eth_dev->state of
+	 * RTE_ETH_DEV_UNUSED means that the ethdev port has been released.
+	 */
+	if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
+		RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.\n");
+		return 0;
+	}
+
 	if (dev_uninit) {
 		ret = dev_uninit(eth_dev);
 		if (ret)
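
The effect of the added state check can be illustrated outside DPDK with a
small standalone model. This is a sketch under stated assumptions, not the
ethdev implementation: every mock_* name is hypothetical, the model assumes
that close runs the driver teardown once and marks the port unused, and the
use_guard parameter exists only to compare behavior with and without the
check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the port states; not the real DPDK enum. */
enum mock_dev_state { MOCK_DEV_UNUSED = 0, MOCK_DEV_ATTACHED };

struct mock_eth_dev {
	enum mock_dev_state state;
	int uninit_calls; /* how many times the teardown hook has run */
};

/* Driver-private teardown hook; here it only counts its invocations. */
static int mock_dev_uninit(struct mock_eth_dev *dev)
{
	dev->uninit_calls++;
	return 0;
}

/* Assumed close path: run the teardown once, then mark the port unused. */
static int mock_close(struct mock_eth_dev *dev)
{
	int ret = mock_dev_uninit(dev);

	if (ret)
		return ret;
	dev->state = MOCK_DEV_UNUSED;
	return 0;
}

/* Modeled remove path; use_guard enables the state check from the patch. */
static int mock_generic_remove(struct mock_eth_dev *dev, int use_guard)
{
	int ret;

	if (dev == NULL)
		return 0;
	/* An unused port was already released, so skip the teardown. */
	if (use_guard && dev->state == MOCK_DEV_UNUSED)
		return 0;
	ret = mock_dev_uninit(dev);
	if (ret)
		return ret;
	dev->state = MOCK_DEV_UNUSED;
	return 0;
}
```

In this model, calling mock_close() and then mock_generic_remove() without
the guard runs the teardown hook twice. With the guard enabled, the second
call returns early and the hook runs only once.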