[2/3] bus/vdev: revert fix devargs after multi-process bus scan

Message ID 20240314093630.1066948-3-mingjinx.ye@intel.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Headers
Series fix insert dev core dump |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Mingjin Ye March 14, 2024, 9:36 a.m. UTC
The asan tool detected a memory leak in the vdev driver alloc_devargs.
The previous commit does not insert device arguments into devargs_list
when attaching a device during a bus scan of a secondary process.
This resulted in an existing memory leak when removing a vdev device,
since rte_devargs_remove actually does nothing.

Therefore the following commit was reverted accordingly.

Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus scan")

Restoring this commit will fix the memory leak. There was an issue with
device parameters using free devargs when inserting a vdev device when
devargs_list already existed, resulting in a core dump. A new patch
will fix this issue.

Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus scan")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
 drivers/bus/vdev/vdev.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
  

Comments

Thomas Monjalon June 19, 2024, 8:15 p.m. UTC | #1
14/03/2024 10:36, Mingjin Ye:
> The asan tool detected a memory leak in the vdev driver alloc_devargs.
> The previous commit does not insert device arguments into devargs_list

What is the previous commit?
Where is devargs_list in this function?

> when attaching a device during a bus scan of a secondary process.
> This resulted in an existing memory leak when removing a vdev device,
> since rte_devargs_remove actually does nothing.
> 
> Therefore the following commit was reverted accordingly.
> 
> Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus scan")
> 
> Restoring this commit will fix the memory leak. There was an issue with
> device parameters using free devargs when inserting a vdev device when
> devargs_list already existed, resulting in a core dump. A new patch
> will fix this issue.
> 
> Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus scan")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>

I'm not comfortable with reverting a so old commit.
Your previous attempt in this bus driver was not successful.
Please prove the memory leak cannot be simply fixed.
  
Mingjin Ye June 20, 2024, 6:10 a.m. UTC | #2
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, June 20, 2024 4:15 AM
> To: Ye, MingjinX <mingjinx.ye@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org; stephen@networkplumber.org;
> Richardson, Bruce <bruce.richardson@intel.com>; Marchand, David
> <david.marchand@redhat.com>
> Subject: Re: [PATCH 2/3] bus/vdev: revert fix devargs after multi-process bus
> scan
> 
> 14/03/2024 10:36, Mingjin Ye:
> > The asan tool detected a memory leak in the vdev driver alloc_devargs.
> > The previous commit does not insert device arguments into devargs_list
> 
> What is the previous commit?
commit: f5b2eff0847d49a66301f0046502c6232cd5da3f


> Where is devargs_list in this function?
In the rte_devargs_insert function.
> 
> > when attaching a device during a bus scan of a secondary process.
> > This resulted in an existing memory leak when removing a vdev device,
> > since rte_devargs_remove actually does nothing.
> >
> > Therefore the following commit was reverted accordingly.
> >
> > Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus
> > scan")
> >
> > Restoring this commit will fix the memory leak. There was an issue
> > with device parameters using free devargs when inserting a vdev device
> > when devargs_list already existed, resulting in a core dump. A new
> > patch will fix this issue.
> >
> > Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus
> > scan")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> 
> I'm not comfortable with reverting a so old commit.
> Your previous attempt in this bus driver was not successful.
> Please prove the memory leak cannot be simply fixed.

dpdk/drivers/bus/vdev/vdev.c:471
ret = insert_vdev(in->name, NULL, NULL, false);

Since the last argument is always fale, this results in the objects of alloc_devargs in insert_vdev not being added to the devargs_list via rte_devargs_insert.

rte_vdev_uninit->rte_devargs_remove will release the devargs objects constructed through the program startup parameters. 
however the devargs bound to the device (the devargs objects constructed in insert_vdev) are not released. causing a memory leak.

Therefore, after undoing the patch. insert_vdev->rte_devargs_insert can update the devargs object without causing a memory leak.
>
  
Anatoly Burakov July 11, 2024, 3:56 p.m. UTC | #3
On 3/14/2024 10:36 AM, Mingjin Ye wrote:
> The asan tool detected a memory leak in the vdev driver alloc_devargs.
> The previous commit does not insert device arguments into devargs_list
> when attaching a device during a bus scan of a secondary process.
> This resulted in an existing memory leak when removing a vdev device,
> since rte_devargs_remove actually does nothing.
> 
> Therefore the following commit was reverted accordingly.
> 
> Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus scan")
> 
> Restoring this commit will fix the memory leak. There was an issue with
> device parameters using free devargs when inserting a vdev device when
> devargs_list already existed, resulting in a core dump. A new patch
> will fix this issue.
> 
> Fixes: f5b2eff0847d ("bus/vdev: fix devargs after multi-process bus scan")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---

Hi Mingjin,

I've been trying to understand what this patchset does.

>   drivers/bus/vdev/vdev.c | 11 ++++-------
>   1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
> index 38d05a9fe9..1a6cc7d12d 100644
> --- a/drivers/bus/vdev/vdev.c
> +++ b/drivers/bus/vdev/vdev.c
> @@ -264,9 +264,7 @@ alloc_devargs(const char *name, const char *args)
>   }
>   
>   static int
> -insert_vdev(const char *name, const char *args,
> -		struct rte_vdev_device **p_dev,
> -		bool init)
> +insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
>   {
>   	struct rte_vdev_device *dev;
>   	struct rte_devargs *devargs;
> @@ -300,8 +298,7 @@ insert_vdev(const char *name, const char *args,
>   		goto fail;
>   	}
>   
> -	if (init)
> -		rte_devargs_insert(&devargs);
> +	rte_devargs_insert(&devargs);

The original commit message states that "it is not necessary" to add 
devargs to the list in secondary processes, but does not state why it is 
unnecessary and what issue was that fix for. With that understanding, I 
think the commit message for this patch can be reworded like follows:

---

This patch reverts commit f5b2eff0847d ("bus/vdev: fix devargs after 
multi-process bus scan")

With current code, we do not add devargs to devargs list when we add a 
vdev in secondary process (because `init` flag is set to `false`).

Because of this, when we do vdev_uninit, we call rte_devargs_remove on 
the &devargs pointer (the one we didn't add to devargs list but did save 
inside rte_vdev_device struct), but in secondary process, because 
devargs were not added to the list in the first place, devargs_remove 
does not find the associated devargs in its list and therefore does not 
free associated resources. As a result, we get a memory leak, because we 
free the rte_vdev_device but not its associated devargs.

Revert this patch to avoid leaking devargs on vdev uninit.

---

I think this would be clearer summation of what is going on in this patch.

Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>


>   	dev->device.devargs = devargs;
>   	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>   
> @@ -323,7 +320,7 @@ rte_vdev_init(const char *name, const char *args)
>   	int ret;
>   
>   	rte_spinlock_recursive_lock(&vdev_device_list_lock);
> -	ret = insert_vdev(name, args, &dev, true);
> +	ret = insert_vdev(name, args, &dev);
>   	if (ret == 0) {
>   		ret = vdev_probe_all_drivers(dev);
>   		if (ret) {
> @@ -449,7 +446,7 @@ vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
>   		break;
>   	case VDEV_SCAN_ONE:
>   		VDEV_LOG(INFO, "receive vdev, %s", in->name);
> -		ret = insert_vdev(in->name, NULL, NULL, false);
> +		ret = insert_vdev(in->name, NULL, NULL);
>   		if (ret == -EEXIST)
>   			VDEV_LOG(DEBUG, "device already exist, %s", in->name);
>   		else if (ret < 0)
  

Patch

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 38d05a9fe9..1a6cc7d12d 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -264,9 +264,7 @@  alloc_devargs(const char *name, const char *args)
 }
 
 static int
-insert_vdev(const char *name, const char *args,
-		struct rte_vdev_device **p_dev,
-		bool init)
+insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
 {
 	struct rte_vdev_device *dev;
 	struct rte_devargs *devargs;
@@ -300,8 +298,7 @@  insert_vdev(const char *name, const char *args,
 		goto fail;
 	}
 
-	if (init)
-		rte_devargs_insert(&devargs);
+	rte_devargs_insert(&devargs);
 	dev->device.devargs = devargs;
 	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
 
@@ -323,7 +320,7 @@  rte_vdev_init(const char *name, const char *args)
 	int ret;
 
 	rte_spinlock_recursive_lock(&vdev_device_list_lock);
-	ret = insert_vdev(name, args, &dev, true);
+	ret = insert_vdev(name, args, &dev);
 	if (ret == 0) {
 		ret = vdev_probe_all_drivers(dev);
 		if (ret) {
@@ -449,7 +446,7 @@  vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
 		break;
 	case VDEV_SCAN_ONE:
 		VDEV_LOG(INFO, "receive vdev, %s", in->name);
-		ret = insert_vdev(in->name, NULL, NULL, false);
+		ret = insert_vdev(in->name, NULL, NULL);
 		if (ret == -EEXIST)
 			VDEV_LOG(DEBUG, "device already exist, %s", in->name);
 		else if (ret < 0)