[v2] eal: fix unused memseg length

Message ID 20250102085838.991-1-ming.1.yang@nokia-sbell.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Series [v2] eal: fix unused memseg length

Checks

Context                         Check     Description
ci/checkpatch                   success   coding style OK
ci/loongarch-compilation        success   Compilation OK
ci/loongarch-unit-testing       success   Unit Testing PASS
ci/github-robot: build          success   github build: passed
ci/Intel-compilation            success   Compilation OK
ci/iol-unit-amd64-testing       pending   Testing pending
ci/iol-intel-Performance        success   Performance Testing PASS
ci/iol-sample-apps-testing      success   Testing PASS
ci/iol-unit-arm64-testing       success   Testing PASS
ci/iol-mellanox-Performance     success   Performance Testing PASS
ci/iol-marvell-Functional       success   Functional Testing PASS
ci/iol-broadcom-Performance     success   Performance Testing PASS
ci/iol-intel-Functional         success   Functional Testing PASS
ci/iol-compile-arm64-testing    success   Testing PASS
ci/iol-abi-testing              success   Testing PASS
ci/intel-Testing                success   Testing PASS
ci/intel-Functional             success   Functional PASS

Commit Message

Yang Ming Jan. 2, 2025, 8:58 a.m. UTC
Set the length (len) of unused memseg lists to 0, so that OS memory
is no longer mistakenly freed with `rte_free()`.

When `eal_legacy_hugepage_init()` releases the VA space for
unused memseg lists (MSLs), it does not reset the MSLs' length to 0.
As a result, `mlx5_mem_is_rte()` may incorrectly identify OS
memory as rte memory.
This can lead to `mlx5_free()` calling `rte_free()` on OS memory,
causing an "EAL: Error: Invalid memory" log and failing to free
the OS memory.
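
The dispatch described above can be sketched as follows. This is a minimal illustration, not the mlx5 driver code: `pick_free_path()`, `owner_check_fn`, and the two stub checks are hypothetical names introduced here, and the ownership check is passed in as a callback standing in for `mlx5_mem_is_rte()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which allocator a pointer would be handed back to. */
enum free_path { FREE_PATH_NONE, FREE_PATH_OS, FREE_PATH_RTE };

/* Hypothetical ownership predicate, standing in for mlx5_mem_is_rte(). */
typedef bool (*owner_check_fn)(const void *addr);

/* Choose the free path purely from the ownership check's answer. */
static enum free_path
pick_free_path(const void *addr, owner_check_fn is_rte)
{
	if (addr == NULL)
		return FREE_PATH_NONE;
	return is_rte(addr) ? FREE_PATH_RTE : FREE_PATH_OS;
}

/* Stub check that wrongly claims every address, as a stale MSL length can. */
static bool
claims_everything(const void *addr)
{
	(void)addr;
	return true;
}

/* Stub check that claims no address. */
static bool
claims_nothing(const void *addr)
{
	(void)addr;
	return false;
}
```

Because the free path depends only on the check's answer, a false positive from the check sends malloc'ed memory down the rte path.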

This issue is intermittent: it occurs when the DPDK program’s
memory map places the heap in the address range between 0 and
len (32G). In such cases, malloc may return an address less than
len, causing `mlx5_mem_is_rte()` to incorrectly treat it as rte
memory.

Also, consider how the MSL with `base_va == NULL` ends up in
`mlx5_mem_is_rte()`. It comes from `rte_mem_virt2memseg_list()`,
which iterates over MSLs and checks whether an address belongs to
[`base_va`; `base_va+len`) without checking whether
`base_va == NULL`, i.e. whether the MSL is inactive. So this patch
also corrects the behavior of `rte_mem_virt2memseg_list()`.
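
A minimal sketch of the range check described above, assuming a simplified stand-in for the memseg list. `struct msl_sketch`, `find_msl()`, `release_stale()`, and `release_fixed()` are hypothetical names for illustration, not EAL symbols, and `len` is a `uint64_t` here so that 32G is representable on any platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memseg list (not rte_memseg_list). */
struct msl_sketch {
	void *base_va;	/* start of the reserved VA range, NULL once released */
	uint64_t len;	/* length of the VA range in bytes */
};

/* Range check only: [base_va, base_va + len), with no base_va == NULL test. */
static const struct msl_sketch *
find_msl(const struct msl_sketch *msls, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t start = (uint64_t)(uintptr_t)msls[i].base_va;
		uint64_t end = start + msls[i].len;

		if (addr >= start && addr < end)
			return &msls[i];
	}
	return NULL;
}

/* Release as before the patch: base_va is cleared, len is left stale. */
static void
release_stale(struct msl_sketch *msl)
{
	msl->base_va = NULL;
}

/* Release as in the patch: len is reset as well. */
static void
release_fixed(struct msl_sketch *msl)
{
	msl->base_va = NULL;
	msl->len = 0;
}
```

With a stale 32G length, a released list covers [0; 32G), so a low heap address such as 0x2000000 matches it. Once `len` is 0 the range is empty and the lookup returns NULL.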

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.burakov@intel.com
Cc: stable@dpdk.org

Signed-off-by: Yang Ming <ming.1.yang@nokia-sbell.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
---
 lib/eal/linux/eal_memory.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Yang Ming Jan. 22, 2025, 3:14 a.m. UTC | #1
Hi experts, is there any chance to review and accept this patch?

  

Patch

diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
index 45879ca743..9dda60c0e1 100644
--- a/lib/eal/linux/eal_memory.c
+++ b/lib/eal/linux/eal_memory.c
@@ -1472,6 +1472,7 @@  eal_legacy_hugepage_init(void)
 		mem_sz = msl->len;
 		munmap(msl->base_va, mem_sz);
 		msl->base_va = NULL;
+		msl->len = 0;
 		msl->heap = 0;
 
 		/* destroy backing fbarray */