[v2] kni: fix use-after-free when kni release

Message ID 20220209073525.22900-1-humin29@huawei.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Headers
Series [v2] kni: fix use-after-free when kni release |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS

Commit Message

humin (Q) Feb. 9, 2022, 7:35 a.m. UTC
  From: Huisong Li <lihuisong@huawei.com>

The "kni_dev" is the private data of the "net_device" in kni, and allocated
with the "net_device" by calling "alloc_netdev()". The "net_device" is
freed by calling "free_netdev()" when kni release. The freed memory
includes the "kni_dev". So After "kni_dev" should not be accessed after
"net_device" is released.

Fixes: e77fec694936 ("kni: fix possible mbuf leaks and speed up port release")
Cc: stable@dpdk.org

KASAN trace:

[   85.263717] ==========================================================
[   85.264418] BUG: KASAN: use-after-free in kni_net_release_fifo_phy+
		0x30/0x84 [rte_kni]
[   85.265139] Read of size 8 at addr ffff000260668d60 by task kni/341
[   85.265703]
[   85.265857] CPU: 0 PID: 341 Comm: kni Tainted: G     U     O
		5.15.0-rc4+ #1
[   85.266525] Hardware name: linux,dummy-virt (DT)
[   85.266968] Call trace:
[   85.267220]  dump_backtrace+0x0/0x2d0
[   85.267591]  show_stack+0x24/0x30
[   85.267924]  dump_stack_lvl+0x8c/0xb8
[   85.268294]  print_address_description.constprop.0+0x74/0x2b8
[   85.268855]  kasan_report+0x1e4/0x200
[   85.269224]  __asan_load8+0x98/0xd4
[   85.269577]  kni_net_release_fifo_phy+0x30/0x84 [rte_kni]
[   85.270116]  kni_dev_remove.isra.0+0x50/0x64 [rte_kni]
[   85.270630]  kni_ioctl_release+0x254/0x320 [rte_kni]
[   85.271136]  kni_ioctl+0x64/0xb0 [rte_kni]
[   85.271553]  __arm64_sys_ioctl+0xdc/0x120
[   85.271955]  invoke_syscall+0x68/0x1a0
[   85.272332]  el0_svc_common.constprop.0+0x90/0x200
[   85.272807]  do_el0_svc+0x94/0xa4
[   85.273144]  el0_svc+0x78/0x240
[   85.273463]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.273895]  el0t_64_sync+0x1a0/0x1a4
[   85.274264]
[   85.274427] Allocated by task 341:
[   85.274767]  kasan_save_stack+0x2c/0x60
[   85.275157]  __kasan_kmalloc+0x90/0xb4
[   85.275533]  __kmalloc_node+0x230/0x594
[   85.275917]  kvmalloc_node+0x8c/0x190
[   85.276286]  alloc_netdev_mqs+0x70/0x6b0
[   85.276678]  kni_ioctl_create+0x224/0xf40 [rte_kni]
[   85.277166]  kni_ioctl+0x9c/0xb0 [rte_kni]
[   85.277581]  __arm64_sys_ioctl+0xdc/0x120
[   85.277980]  invoke_syscall+0x68/0x1a0
[   85.278357]  el0_svc_common.constprop.0+0x90/0x200
[   85.278830]  do_el0_svc+0x94/0xa4
[   85.279172]  el0_svc+0x78/0x240
[   85.279491]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.279925]  el0t_64_sync+0x1a0/0x1a4
[   85.280292]
[   85.280454] Freed by task 341:
[   85.280763]  kasan_save_stack+0x2c/0x60
[   85.281147]  kasan_set_track+0x2c/0x40
[   85.281522]  kasan_set_free_info+0x2c/0x50
[   85.281930]  __kasan_slab_free+0xdc/0x140
[   85.282331]  slab_free_freelist_hook+0x90/0x250
[   85.282782]  kfree+0x128/0x580
[   85.283099]  kvfree+0x48/0x60
[   85.283402]  netdev_freemem+0x34/0x44
[   85.283770]  netdev_release+0x50/0x64
[   85.284138]  device_release+0xa0/0x120
[   85.284516]  kobject_put+0xf8/0x160
[   85.284867]  put_device+0x20/0x30
[   85.285204]  free_netdev+0x22c/0x310
[   85.285562]  kni_dev_remove.isra.0+0x48/0x64 [rte_kni]
[   85.286076]  kni_ioctl_release+0x254/0x320 [rte_kni]
[   85.286573]  kni_ioctl+0x64/0xb0 [rte_kni]
[   85.286992]  __arm64_sys_ioctl+0xdc/0x120
[   85.287392]  invoke_syscall+0x68/0x1a0
[   85.287769]  el0_svc_common.constprop.0+0x90/0x200
[   85.288243]  do_el0_svc+0x94/0xa4
[   85.288579]  el0_svc+0x78/0x240
[   85.288899]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.289332]  el0t_64_sync+0x1a0/0x1a4
[   85.289699]
[   85.289862] The buggy address belongs to the object at ffff000260668000
[   85.289862]  which belongs to the cache kmalloc-cg-8k of size 8192
[   85.291079] The buggy address is located 3424 bytes inside of
[   85.291079]  8192-byte region [ffff000260668000, ffff00026066a000)
[   85.292213] The buggy address belongs to the page:
[   85.292684] page:(____ptrval____) refcount:1 mapcount:0 mapping:
		0000000000000000 index:0x0 pfn:0x2a0668
[   85.293585] head:(____ptrval____) order:3 compound_mapcount:0
		compound_pincount:0
[   85.294305] flags: 0xbfff80000010200(slab|head|node=0|zone=2|
		lastcpupid=0x7fff)
[   85.295020] raw: 0bfff80000010200 0000000000000000 dead000000000122
		ffff0000c000d680
[   85.295767] raw: 0000000000000000 0000000080020002 00000001ffffffff
		0000000000000000
[   85.296512] page dumped because: kasan: bad access detected
[   85.297054]
[   85.297217] Memory state around the buggy address:
[   85.297688]  ffff000260668c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.298384]  ffff000260668c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.299088] >ffff000260668d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.299781]                                                        ^
[   85.300396]  ffff000260668d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.301092]  ffff000260668e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.301787] ===========================================================

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
---
v2:
* change the order of `list_del` and `kni_dev_remove` in `kni_release`
---
 kernel/linux/kni/kni_misc.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
  

Comments

Ferruh Yigit Feb. 14, 2022, 6:41 p.m. UTC | #1
On 2/9/2022 7:35 AM, Min Hu (Connor) wrote:
> From: Huisong Li <lihuisong@huawei.com>
> 
> The "kni_dev" is the private data of the "net_device" in kni, and allocated
> with the "net_device" by calling "alloc_netdev()". The "net_device" is
> freed by calling "free_netdev()" when kni release. The freed memory
> includes the "kni_dev". So After "kni_dev" should not be accessed after
> "net_device" is released.
> 

The problem description looks valid and change looks good to me,

only list_del after remove is like this for years, I wonder how
it is not caught until now, or if we are missing something, I
want to test some before ack, which I will do in next few days.

> Fixes: e77fec694936 ("kni: fix possible mbuf leaks and speed up port release")
> Cc: stable@dpdk.org
> 
> KASAN trace:
> 
> [   85.263717] ==========================================================
> [   85.264418] BUG: KASAN: use-after-free in kni_net_release_fifo_phy+
> 		0x30/0x84 [rte_kni]
> [   85.265139] Read of size 8 at addr ffff000260668d60 by task kni/341
> [   85.265703]
> [   85.265857] CPU: 0 PID: 341 Comm: kni Tainted: G     U     O
> 		5.15.0-rc4+ #1
> [   85.266525] Hardware name: linux,dummy-virt (DT)
> [   85.266968] Call trace:
> [   85.267220]  dump_backtrace+0x0/0x2d0
> [   85.267591]  show_stack+0x24/0x30
> [   85.267924]  dump_stack_lvl+0x8c/0xb8
> [   85.268294]  print_address_description.constprop.0+0x74/0x2b8
> [   85.268855]  kasan_report+0x1e4/0x200
> [   85.269224]  __asan_load8+0x98/0xd4
> [   85.269577]  kni_net_release_fifo_phy+0x30/0x84 [rte_kni]
> [   85.270116]  kni_dev_remove.isra.0+0x50/0x64 [rte_kni]
> [   85.270630]  kni_ioctl_release+0x254/0x320 [rte_kni]
> [   85.271136]  kni_ioctl+0x64/0xb0 [rte_kni]
> [   85.271553]  __arm64_sys_ioctl+0xdc/0x120
> [   85.271955]  invoke_syscall+0x68/0x1a0
> [   85.272332]  el0_svc_common.constprop.0+0x90/0x200
> [   85.272807]  do_el0_svc+0x94/0xa4
> [   85.273144]  el0_svc+0x78/0x240
> [   85.273463]  el0t_64_sync_handler+0x1a8/0x1b0
> [   85.273895]  el0t_64_sync+0x1a0/0x1a4
> [   85.274264]
> [   85.274427] Allocated by task 341:
> [   85.274767]  kasan_save_stack+0x2c/0x60
> [   85.275157]  __kasan_kmalloc+0x90/0xb4
> [   85.275533]  __kmalloc_node+0x230/0x594
> [   85.275917]  kvmalloc_node+0x8c/0x190
> [   85.276286]  alloc_netdev_mqs+0x70/0x6b0
> [   85.276678]  kni_ioctl_create+0x224/0xf40 [rte_kni]
> [   85.277166]  kni_ioctl+0x9c/0xb0 [rte_kni]
> [   85.277581]  __arm64_sys_ioctl+0xdc/0x120
> [   85.277980]  invoke_syscall+0x68/0x1a0
> [   85.278357]  el0_svc_common.constprop.0+0x90/0x200
> [   85.278830]  do_el0_svc+0x94/0xa4
> [   85.279172]  el0_svc+0x78/0x240
> [   85.279491]  el0t_64_sync_handler+0x1a8/0x1b0
> [   85.279925]  el0t_64_sync+0x1a0/0x1a4
> [   85.280292]
> [   85.280454] Freed by task 341:
> [   85.280763]  kasan_save_stack+0x2c/0x60
> [   85.281147]  kasan_set_track+0x2c/0x40
> [   85.281522]  kasan_set_free_info+0x2c/0x50
> [   85.281930]  __kasan_slab_free+0xdc/0x140
> [   85.282331]  slab_free_freelist_hook+0x90/0x250
> [   85.282782]  kfree+0x128/0x580
> [   85.283099]  kvfree+0x48/0x60
> [   85.283402]  netdev_freemem+0x34/0x44
> [   85.283770]  netdev_release+0x50/0x64
> [   85.284138]  device_release+0xa0/0x120
> [   85.284516]  kobject_put+0xf8/0x160
> [   85.284867]  put_device+0x20/0x30
> [   85.285204]  free_netdev+0x22c/0x310
> [   85.285562]  kni_dev_remove.isra.0+0x48/0x64 [rte_kni]
> [   85.286076]  kni_ioctl_release+0x254/0x320 [rte_kni]
> [   85.286573]  kni_ioctl+0x64/0xb0 [rte_kni]
> [   85.286992]  __arm64_sys_ioctl+0xdc/0x120
> [   85.287392]  invoke_syscall+0x68/0x1a0
> [   85.287769]  el0_svc_common.constprop.0+0x90/0x200
> [   85.288243]  do_el0_svc+0x94/0xa4
> [   85.288579]  el0_svc+0x78/0x240
> [   85.288899]  el0t_64_sync_handler+0x1a8/0x1b0
> [   85.289332]  el0t_64_sync+0x1a0/0x1a4
> [   85.289699]
> [   85.289862] The buggy address belongs to the object at ffff000260668000
> [   85.289862]  which belongs to the cache kmalloc-cg-8k of size 8192
> [   85.291079] The buggy address is located 3424 bytes inside of
> [   85.291079]  8192-byte region [ffff000260668000, ffff00026066a000)
> [   85.292213] The buggy address belongs to the page:
> [   85.292684] page:(____ptrval____) refcount:1 mapcount:0 mapping:
> 		0000000000000000 index:0x0 pfn:0x2a0668
> [   85.293585] head:(____ptrval____) order:3 compound_mapcount:0
> 		compound_pincount:0
> [   85.294305] flags: 0xbfff80000010200(slab|head|node=0|zone=2|
> 		lastcpupid=0x7fff)
> [   85.295020] raw: 0bfff80000010200 0000000000000000 dead000000000122
> 		ffff0000c000d680
> [   85.295767] raw: 0000000000000000 0000000080020002 00000001ffffffff
> 		0000000000000000
> [   85.296512] page dumped because: kasan: bad access detected
> [   85.297054]
> [   85.297217] Memory state around the buggy address:
> [   85.297688]  ffff000260668c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 		fb fb
> [   85.298384]  ffff000260668c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 		fb fb
> [   85.299088] >ffff000260668d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 		fb fb
> [   85.299781]                                                        ^
> [   85.300396]  ffff000260668d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 		fb fb
> [   85.301092]  ffff000260668e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 		fb fb
> [   85.301787] ===========================================================
> 
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
> ---
> v2:
> * change the order of `list_del` and `kni_dev_remove` in `kni_release`
> ---
>   kernel/linux/kni/kni_misc.c | 12 ++++++++----
>   1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
> index f10dcd069d..ad1582d911 100644
> --- a/kernel/linux/kni/kni_misc.c
> +++ b/kernel/linux/kni/kni_misc.c
> @@ -184,13 +184,17 @@ kni_dev_remove(struct kni_dev *dev)
>   	if (!dev)
>   		return -ENODEV;
>   
> +	/*
> +	 * The memory of kni device is allocated and released together
> +	 * with net device. Release mbuf before freeing net device.
> +	 */
> +	kni_net_release_fifo_phy(dev);
> +
>   	if (dev->net_dev) {
>   		unregister_netdev(dev->net_dev);
>   		free_netdev(dev->net_dev);
>   	}
>   
> -	kni_net_release_fifo_phy(dev);
> -
>   	return 0;
>   }
>   
> @@ -220,8 +224,8 @@ kni_release(struct inode *inode, struct file *file)
>   			dev->pthread = NULL;
>   		}
>   
> -		kni_dev_remove(dev);
>   		list_del(&dev->list);
> +		kni_dev_remove(dev);
>   	}
>   	up_write(&knet->kni_list_lock);
>   
> @@ -470,8 +474,8 @@ kni_ioctl_release(struct net *net, uint32_t ioctl_num,
>   			dev->pthread = NULL;
>   		}
>   
> -		kni_dev_remove(dev);
>   		list_del(&dev->list);
> +		kni_dev_remove(dev);
>   		ret = 0;
>   		break;
>   	}
  
Ferruh Yigit Feb. 15, 2022, 7:11 p.m. UTC | #2
On 2/14/2022 6:41 PM, Ferruh Yigit wrote:
> On 2/9/2022 7:35 AM, Min Hu (Connor) wrote:
>> From: Huisong Li <lihuisong@huawei.com>
>>
>> The "kni_dev" is the private data of the "net_device" in kni, and allocated
>> with the "net_device" by calling "alloc_netdev()". The "net_device" is
>> freed by calling "free_netdev()" when kni release. The freed memory
>> includes the "kni_dev". So After "kni_dev" should not be accessed after
>> "net_device" is released.
>>
> 
> The problem description looks valid and change looks good to me,
> 
> only list_del after remove is like this for years, I wonder how
> it is not caught until now, or if we are missing something, I
> want to test some before ack, which I will do in next few days.


Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
  
Thomas Monjalon Feb. 27, 2022, 8:11 p.m. UTC | #3
15/02/2022 20:11, Ferruh Yigit:
> On 2/14/2022 6:41 PM, Ferruh Yigit wrote:
> > On 2/9/2022 7:35 AM, Min Hu (Connor) wrote:
> >> From: Huisong Li <lihuisong@huawei.com>
> >>
> >> The "kni_dev" is the private data of the "net_device" in kni, and allocated
> >> with the "net_device" by calling "alloc_netdev()". The "net_device" is
> >> freed by calling "free_netdev()" when kni release. The freed memory
> >> includes the "kni_dev". So After "kni_dev" should not be accessed after
> >> "net_device" is released.
> >>
> > 
> > The problem description looks valid and change looks good to me,
> > 
> > only list_del after remove is like this for years, I wonder how
> > it is not caught until now, or if we are missing something, I
> > want to test some before ack, which I will do in next few days.
> 
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied, thanks.
  

Patch

diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index f10dcd069d..ad1582d911 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -184,13 +184,17 @@  kni_dev_remove(struct kni_dev *dev)
 	if (!dev)
 		return -ENODEV;
 
+	/*
+	 * The memory of kni device is allocated and released together
+	 * with net device. Release mbuf before freeing net device.
+	 */
+	kni_net_release_fifo_phy(dev);
+
 	if (dev->net_dev) {
 		unregister_netdev(dev->net_dev);
 		free_netdev(dev->net_dev);
 	}
 
-	kni_net_release_fifo_phy(dev);
-
 	return 0;
 }
 
@@ -220,8 +224,8 @@  kni_release(struct inode *inode, struct file *file)
 			dev->pthread = NULL;
 		}
 
-		kni_dev_remove(dev);
 		list_del(&dev->list);
+		kni_dev_remove(dev);
 	}
 	up_write(&knet->kni_list_lock);
 
@@ -470,8 +474,8 @@  kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 			dev->pthread = NULL;
 		}
 
-		kni_dev_remove(dev);
 		list_del(&dev->list);
+		kni_dev_remove(dev);
 		ret = 0;
 		break;
 	}