[v4] eal: fix eal init may failed when too much continuous memsegs under legacy mode

Message ID 20230529112130.11198-1-changfengnan@bytedance.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Series [v4] eal: fix eal init may failed when too much continuous memsegs under legacy mode

Checks

Context                           Check    Description
ci/checkpatch                     warning  coding style issues
ci/loongarch-compilation          success  Compilation OK
ci/loongarch-unit-testing         success  Unit Testing PASS
ci/Intel-compilation              success  Compilation OK
ci/github-robot: build            success  github build: passed
ci/iol-mellanox-Performance       success  Performance Testing PASS
ci/intel-Functional               success  Functional PASS
ci/iol-aarch64-unit-testing       success  Testing PASS
ci/iol-abi-testing                success  Testing PASS
ci/iol-x86_64-compile-testing     success  Testing PASS
ci/iol-unit-testing               success  Testing PASS
ci/iol-testing                    success  Testing PASS
ci/iol-x86_64-unit-testing        success  Testing PASS
ci/iol-intel-Performance          success  Performance Testing PASS
ci/iol-broadcom-Performance       success  Performance Testing PASS
ci/iol-intel-Functional           success  Functional Testing PASS
ci/iol-broadcom-Functional        success  Functional Testing PASS
ci/iol-aarch64-compile-testing    success  Testing PASS
ci/intel-Testing                  success  Testing PASS

Commit Message

Fengnan Chang May 29, 2023, 11:21 a.m. UTC
Under legacy mode, if the number of contiguous memsegs is greater
than RTE_MAX_MEMSEG_PER_LIST, EAL init fails even though another
memseg list is empty, because remap_needed_hugepages only checks a
single memseg list for the whole run.
Fix this by making remap_segment return how many segments it mapped.
remap_segment maps as many contiguous segments as it can; if the
request exceeds its capacity, remap_needed_hugepages continues
mapping the remaining pages.

For example:
hugepage configuration:
cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
10241
10239

startup log:
EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
EAL: Requesting 13370 pages of size 2MB from socket 0
EAL: Requesting 7110 pages of size 2MB from socket 1
EAL: Attempting to map 14220M on socket 1
EAL: Allocated 14220M on socket 1
EAL: Attempting to map 26740M on socket 0
EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
configuration.
EAL: Couldn't remap hugepage files into memseg lists
EAL: FATAL: Cannot init memory
EAL: Cannot init memory
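
The failure above can be reproduced with a toy model. The sketch below is
not DPDK code: toy_list, toy_remap_segment and toy_remap_all are
hypothetical names, each list is reduced to a used-slot counter, and the
socket and page-size matching done by the real remap_segment is ignored.
It only illustrates how returning the mapped count lets the caller spill
the 13370 pages requested on socket 0 across two 8192-entry lists.

```c
#define TOY_MAX_LISTS 4        /* "Creating 4 segment lists" in the log */
#define TOY_SEGS_PER_LIST 8192 /* n_segs:8192 in the log */

/* Hypothetical stand-in for a memseg list: slots fill from index 0. */
struct toy_list {
	int used;
};

/* Map up to n_pages into the first list with usable room.
 * Returns the number of pages mapped, or -1 if no list has room.
 * A non-empty list gives up one slot as a hole, so that unrelated
 * segments do not end up VA-contiguous.
 */
static int
toy_remap_segment(struct toy_list *lists, int n_pages)
{
	for (int i = 0; i < TOY_MAX_LISTS; i++) {
		int free_len = TOY_SEGS_PER_LIST - lists[i].used;
		int hole = lists[i].used == 0 ? 0 : 1;

		/* need room for the hole plus at least one page */
		if (free_len - hole < 1)
			continue;
		free_len -= hole;

		/* we might not get all of the space we wanted */
		int n = n_pages < free_len ? n_pages : free_len;

		lists[i].used += hole + n;
		return n;
	}
	return -1;
}

/* Caller loop: keep remapping until the whole run is placed. */
static int
toy_remap_all(struct toy_list *lists, int n_needed)
{
	int n_remapped = 0;

	while (n_remapped < n_needed) {
		int ret = toy_remap_segment(lists, n_needed - n_remapped);

		if (ret < 0)
			return -1;
		n_remapped += ret;
	}
	return n_remapped;
}
```

In this model, calling toy_remap_all() with 13370 pages on four empty lists
places 8192 pages in the first list and the remaining 5178 in the second.
A caller that required the whole run to fit in one list would fail instead,
since 13370 is larger than 8192.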

Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Signed-off-by: Lin Li <lilintjpu@bytedance.com>
Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/linux/eal_memory.c | 51 +++++++++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 15 deletions(-)
  

Comments

David Marchand June 7, 2023, 8:32 p.m. UTC | #1
On Mon, May 29, 2023 at 1:23 PM Fengnan Chang
<changfengnan@bytedance.com> wrote:
>
> Under legacy mode, if the number of contiguous memsegs is greater
> than RTE_MAX_MEMSEG_PER_LIST, EAL init fails even though another
> memseg list is empty, because remap_needed_hugepages only checks a
> single memseg list for the whole run.
> Fix this by making remap_segment return how many segments it mapped.
> remap_segment maps as many contiguous segments as it can; if the
> request exceeds its capacity, remap_needed_hugepages continues
> mapping the remaining pages.
>
> For example:
> hugepage configuration:
> cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
> 10241
> 10239
>
> startup log:
> EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> EAL: Requesting 13370 pages of size 2MB from socket 0
> EAL: Requesting 7110 pages of size 2MB from socket 1
> EAL: Attempting to map 14220M on socket 1
> EAL: Allocated 14220M on socket 1
> EAL: Attempting to map 26740M on socket 0
> EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> configuration.
> EAL: Couldn't remap hugepage files into memseg lists
> EAL: FATAL: Cannot init memory
> EAL: Cannot init memory

We are missing a Fixes: tag and this is backport material, right?


>
> Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> Signed-off-by: Lin Li <lilintjpu@bytedance.com>

Can I update Lin Li's existing entry in .mailmap? Or is this a different person?


> Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
Anatoly Burakov*

> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>

Strange to have both SoB and Review tag from Anatoly.


> ---
>  lib/eal/linux/eal_memory.c | 51 +++++++++++++++++++++++++++-----------

Is this issue affecting only Linux?
  
Fengnan Chang June 9, 2023, 8:35 a.m. UTC | #2
David Marchand <david.marchand@redhat.com> wrote on Thu, Jun 8, 2023 at 04:33:
>
> On Mon, May 29, 2023 at 1:23 PM Fengnan Chang
> <changfengnan@bytedance.com> wrote:
> >
> > Under legacy mode, if the number of contiguous memsegs is greater
> > than RTE_MAX_MEMSEG_PER_LIST, EAL init fails even though another
> > memseg list is empty, because remap_needed_hugepages only checks a
> > single memseg list for the whole run.
> > Fix this by making remap_segment return how many segments it mapped.
> > remap_segment maps as many contiguous segments as it can; if the
> > request exceeds its capacity, remap_needed_hugepages continues
> > mapping the remaining pages.
> >
> > For example:
> > hugepage configuration:
> > cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
> > 10241
> > 10239
> >
> > startup log:
> > EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> > EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> > EAL: Requesting 13370 pages of size 2MB from socket 0
> > EAL: Requesting 7110 pages of size 2MB from socket 1
> > EAL: Attempting to map 14220M on socket 1
> > EAL: Allocated 14220M on socket 1
> > EAL: Attempting to map 26740M on socket 0
> > EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> > configuration.
> > EAL: Couldn't remap hugepage files into memseg lists
> > EAL: FATAL: Cannot init memory
> > EAL: Cannot init memory
>
> We are missing a Fixes: tag and this is backport material, right?
Yes, this patch needs Cc: stable@dpdk.org.
>
>
> >
> > Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> > Signed-off-by: Lin Li <lilintjpu@bytedance.com>
>
> Can I update Lin Li's existing entry in .mailmap? Or is this a different person?
Yes, it is the same person. Please update .mailmap, thanks.
>
>
> > Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
> Anatoly Burakov*
>
> > Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Strange to have both SoB and Review tag from Anatoly.
Maybe just leave the SoB? Cc: Anatoly
>
>
> > ---
> >  lib/eal/linux/eal_memory.c | 51 +++++++++++++++++++++++++++-----------
>
> Is this issue affecting only Linux?
Yes, Windows and FreeBSD are fine.
>
>
>
> --
> David Marchand
>
  
David Marchand June 9, 2023, 12:57 p.m. UTC | #3
On Mon, May 29, 2023 at 1:23 PM Fengnan Chang
<changfengnan@bytedance.com> wrote:
>
> Under legacy mode, if the number of contiguous memsegs is greater
> than RTE_MAX_MEMSEG_PER_LIST, EAL init fails even though another
> memseg list is empty, because remap_needed_hugepages only checks a
> single memseg list for the whole run.
> Fix this by making remap_segment return how many segments it mapped.
> remap_segment maps as many contiguous segments as it can; if the
> request exceeds its capacity, remap_needed_hugepages continues
> mapping the remaining pages.
>
> For example:
> hugepage configuration:
> cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
> 10241
> 10239
>
> startup log:
> EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> EAL: Requesting 13370 pages of size 2MB from socket 0
> EAL: Requesting 7110 pages of size 2MB from socket 1
> EAL: Attempting to map 14220M on socket 1
> EAL: Allocated 14220M on socket 1
> EAL: Attempting to map 26740M on socket 0
> EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> configuration.
> EAL: Couldn't remap hugepage files into memseg lists
> EAL: FATAL: Cannot init memory
> EAL: Cannot init memory

Best culprit seems to be:
Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org

> Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> Signed-off-by: Lin Li <lilintjpu@bytedance.com>
> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>

Applied, thanks.
  
Burakov, Anatoly June 12, 2023, 9:48 a.m. UTC | #4
On 6/9/2023 9:35 AM, Fengnan Chang wrote:
> David Marchand <david.marchand@redhat.com> wrote on Thu, Jun 8, 2023 at 04:33:
>>
>> On Mon, May 29, 2023 at 1:23 PM Fengnan Chang
>> <changfengnan@bytedance.com> wrote:
>>>
>>> Under legacy mode, if the number of contiguous memsegs is greater
>>> than RTE_MAX_MEMSEG_PER_LIST, EAL init fails even though another
>>> memseg list is empty, because remap_needed_hugepages only checks a
>>> single memseg list for the whole run.
>>> Fix this by making remap_segment return how many segments it mapped.
>>> remap_segment maps as many contiguous segments as it can; if the
>>> request exceeds its capacity, remap_needed_hugepages continues
>>> mapping the remaining pages.
>>>
>>> For example:
>>> hugepage configuration:
>>> cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
>>> 10241
>>> 10239
>>>
>>> startup log:
>>> EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
>>> EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
>>> EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
>>> EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
>>> EAL: Requesting 13370 pages of size 2MB from socket 0
>>> EAL: Requesting 7110 pages of size 2MB from socket 1
>>> EAL: Attempting to map 14220M on socket 1
>>> EAL: Allocated 14220M on socket 1
>>> EAL: Attempting to map 26740M on socket 0
>>> EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
>>> configuration.
>>> EAL: Couldn't remap hugepage files into memseg lists
>>> EAL: FATAL: Cannot init memory
>>> EAL: Cannot init memory
>>
>> We are missing a Fixes: tag and this is backport material, right?
> Yes, this patch needs Cc: stable@dpdk.org.
>>
>>
>>>
>>> Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
>>> Signed-off-by: Lin Li <lilintjpu@bytedance.com>
>>
>> Can I update Lin Li's existing entry in .mailmap? Or is this a different person?
> Yes, it is the same person. Please update .mailmap, thanks.
>>
>>
>>> Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
>> Anatoly Burakov*
>>
>>> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>
>> Strange to have both SoB and Review tag from Anatoly.
> Maybe just leave the SoB? Cc: Anatoly

The signoff is there because I suggested an alternative implementation 
in comments. I'm OK with just leaving Review :)

>>
>>
>>> ---
>>>   lib/eal/linux/eal_memory.c | 51 +++++++++++++++++++++++++++-----------
>>
>> Is this issue affecting only Linux?
> Yes, Windows and FreeBSD are fine.
>>
>>
>>
>> --
>> David Marchand
>>
  
David Marchand June 12, 2023, 9:55 a.m. UTC | #5
On Mon, Jun 12, 2023 at 11:48 AM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
>
> On 6/9/2023 9:35 AM, Fengnan Chang wrote:
> > David Marchand <david.marchand@redhat.com> wrote on Thu, Jun 8, 2023 at 04:33:
> >>
> >> On Mon, May 29, 2023 at 1:23 PM Fengnan Chang
> >> <changfengnan@bytedance.com> wrote:
> >>>
> >>> Under legacy mode, if the number of contiguous memsegs is greater
> >>> than RTE_MAX_MEMSEG_PER_LIST, EAL init fails even though another
> >>> memseg list is empty, because remap_needed_hugepages only checks a
> >>> single memseg list for the whole run.
> >>> Fix this by making remap_segment return how many segments it mapped.
> >>> remap_segment maps as many contiguous segments as it can; if the
> >>> request exceeds its capacity, remap_needed_hugepages continues
> >>> mapping the remaining pages.
> >>>
> >>> For example:
> >>> hugepage configuration:
> >>> cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
> >>> 10241
> >>> 10239
> >>>
> >>> startup log:
> >>> EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> >>> EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> >>> EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> >>> EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> >>> EAL: Requesting 13370 pages of size 2MB from socket 0
> >>> EAL: Requesting 7110 pages of size 2MB from socket 1
> >>> EAL: Attempting to map 14220M on socket 1
> >>> EAL: Allocated 14220M on socket 1
> >>> EAL: Attempting to map 26740M on socket 0
> >>> EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> >>> configuration.
> >>> EAL: Couldn't remap hugepage files into memseg lists
> >>> EAL: FATAL: Cannot init memory
> >>> EAL: Cannot init memory
> >>
> >> We are missing a Fixes: tag and this is backport material, right?
> > Yes, this patch needs Cc: stable@dpdk.org.
> >>
> >>
> >>>
> >>> Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> >>> Signed-off-by: Lin Li <lilintjpu@bytedance.com>
> >>
> >> Can I update Lin Li's existing entry in .mailmap? Or is this a different person?
> > Yes, it is the same person. Please update .mailmap, thanks.
> >>
> >>
> >>> Signed-off-by: Burakov Anatoly <anatoly.burakov@intel.com>
> >> Anatoly Burakov*
> >>
> >>> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>
> >> Strange to have both SoB and Review tag from Anatoly.
> > Maybe just leave the SoB? Cc: Anatoly
>
> The signoff is there because I suggested an alternative implementation
> in comments. I'm OK with just leaving Review :)

Good, that was what I had understood.
I updated accordingly when applying.
Thanks.
  

Patch

diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
index 60fc8cc6ca..0876974631 100644
--- a/lib/eal/linux/eal_memory.c
+++ b/lib/eal/linux/eal_memory.c
@@ -681,6 +681,7 @@  remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 
 	/* find free space in memseg lists */
 	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		int free_len;
 		bool empty;
 		msl = &mcfg->memsegs[msl_idx];
 		arr = &msl->memseg_arr;
@@ -692,18 +693,26 @@  remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 
 		/* leave space for a hole if array is not empty */
 		empty = arr->count == 0;
-		ms_idx = rte_fbarray_find_next_n_free(arr, 0,
-				seg_len + (empty ? 0 : 1));
-
-		/* memseg list is full? */
+		/* find start of the biggest contiguous block and its size */
+		ms_idx = rte_fbarray_find_biggest_free(arr, 0);
 		if (ms_idx < 0)
 			continue;
-
+		/* hole is 1 segment long, so at least two segments long. */
+		free_len = rte_fbarray_find_contig_free(arr, ms_idx);
+		if (free_len < 2)
+			continue;
 		/* leave some space between memsegs, they are not IOVA
 		 * contiguous, so they shouldn't be VA contiguous either.
 		 */
-		if (!empty)
+		if (!empty) {
 			ms_idx++;
+			free_len--;
+		}
+
+		/* we might not get all of the space we wanted */
+		free_len = RTE_MIN(seg_len, free_len);
+		seg_end = seg_start + free_len;
+		seg_len = seg_end - seg_start;
 		break;
 	}
 	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
@@ -787,7 +796,7 @@  remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 	}
 	RTE_LOG(DEBUG, EAL, "Allocated %" PRIu64 "M on socket %i\n",
 			(seg_len * page_sz) >> 20, socket_id);
-	return 0;
+	return seg_len;
 }
 
 static uint64_t
@@ -1022,10 +1031,16 @@  remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
 		if (new_memseg) {
 			/* if this isn't the first time, remap segment */
 			if (cur_page != 0) {
-				ret = remap_segment(hugepages, seg_start_page,
-						cur_page);
-				if (ret != 0)
-					return -1;
+				int n_remapped = 0;
+				int n_needed = cur_page - seg_start_page;
+				while (n_remapped < n_needed) {
+					ret = remap_segment(hugepages, seg_start_page,
+							cur_page);
+					if (ret < 0)
+						return -1;
+					n_remapped += ret;
+					seg_start_page += ret;
+				}
 			}
 			/* remember where we started */
 			seg_start_page = cur_page;
@@ -1034,10 +1049,16 @@  remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
 	}
 	/* we were stopped, but we didn't remap the last segment, do it now */
 	if (cur_page != 0) {
-		ret = remap_segment(hugepages, seg_start_page,
-				cur_page);
-		if (ret != 0)
-			return -1;
+		int n_remapped = 0;
+		int n_needed = cur_page - seg_start_page;
+		while (n_remapped < n_needed) {
+			ret = remap_segment(hugepages, seg_start_page,
+					cur_page);
+			if (ret < 0)
+				return -1;
+			n_remapped += ret;
+			seg_start_page += ret;
+		}
 	}
 	return 0;
 }
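
The slot selection added to remap_segment in the first hunk can be
illustrated in isolation. The sketch below is a simplified model, not the
rte_fbarray implementation: a list is a plain bool array, and
toy_contig_free, toy_biggest_free and toy_pick_slot are hypothetical
helpers that only mimic what the patch relies on from
rte_fbarray_find_contig_free and rte_fbarray_find_biggest_free.

```c
#include <stdbool.h>

/* Length of the free run starting at idx (0 if idx is in use). */
static int
toy_contig_free(const bool *used, int len, int idx)
{
	int n = 0;

	while (idx + n < len && !used[idx + n])
		n++;
	return n;
}

/* Start index of the longest free run, or -1 if nothing is free. */
static int
toy_biggest_free(const bool *used, int len)
{
	int best_idx = -1, best_len = 0;

	for (int i = 0; i < len; ) {
		int run = toy_contig_free(used, len, i);

		if (run > best_len) {
			best_len = run;
			best_idx = i;
		}
		/* skip past the run, or past one used slot */
		i += run > 0 ? run : 1;
	}
	return best_idx;
}

/* Mirror the patched selection: find the biggest free run, reserve
 * one slot as a hole if the list is not empty, then clamp the request.
 * Returns how many slots can be taken (0 if the list is unusable) and
 * stores the first usable index in *start.
 */
static int
toy_pick_slot(const bool *used, int len, bool empty, int seg_len,
		int *start)
{
	int ms_idx = toy_biggest_free(used, len);
	int free_len;

	if (ms_idx < 0)
		return 0;
	free_len = toy_contig_free(used, len, ms_idx);
	if (free_len < 2)
		return 0;
	if (!empty) {
		ms_idx++;
		free_len--;
	}
	*start = ms_idx;
	return seg_len < free_len ? seg_len : free_len;
}
```

With slots {used, used, free, free, free, used, free, free, free, free}, the
biggest free run starts at index 6 and is 4 slots long. For a non-empty list
and a request of 10 segments, toy_pick_slot() reserves index 6 as the hole
and offers 3 slots starting at index 7; the caller is then expected to place
the remainder elsewhere, as the loops in remap_needed_hugepages do.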