[17.11] mem: fix memory initialization time

Message ID 20181112111819.25087-1-alejandro.lucero@netronome.com (mailing list archive)
State Not Applicable, archived
Series [17.11] mem: fix memory initialization time

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Alejandro Lucero Nov. 12, 2018, 11:18 a.m. UTC
When using a large amount of hugepage-based memory, doing all the
hugepage mappings can take quite significant time.

The problem is that hugepages are initially mmapped to virtual addresses
which will be tried later for the final hugepage mmapping. This causes
the final mapping to require calling mmap with another hint address,
which can happen several times depending on the amount of memory to mmap,
with each mmap taking more than a second.

This patch changes the hint for the initial hugepage mmapping to a
starting address which will not collide with the final mmapping.

Fixes: 293c0c4b957f ("mem: use address hint for mapping hugepages")

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
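
To illustrate the failure mode, the following minimal standalone program
(not DPDK code; the 0x100000000 value mirrors the baseaddr hint discussed
below) shows that when the hinted range is already occupied, mmap() returns
a different address, so a caller that needs a specific placement has to drop
that mapping and retry with another hint; repeated over many hugepages, this
is what makes initialization slow.

#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS with strict -std flags */
#include <stdio.h>
#include <sys/mman.h>

#define HINT ((void *)0x100000000UL)  /* base address the final mapping asks for */
#define SZ   (1UL << 30)              /* stand-in for a chunk of hugepage memory */

int main(void)
{
	/* Simulate the initial per-page mappings landing on the hinted area
	 * (in a fresh 64-bit Linux process this range is normally free, so
	 * the kernel honours the hint). */
	void *first = mmap(HINT, SZ, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (first == MAP_FAILED)
		return 1;

	/* The "final" mapping asks for the same hint; with the range occupied,
	 * the kernel places it somewhere else, so the caller would have to
	 * munmap it and retry with a different hint. */
	void *final = mmap(HINT, SZ, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	printf("hint=%p first=%p final=%p (hint honoured: %s)\n",
	       HINT, first, final, final == HINT ? "yes" : "no");

	if (final != MAP_FAILED)
		munmap(final, SZ);
	munmap(first, SZ);
	return 0;
}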
  

Comments

Eelco Chaudron Nov. 12, 2018, 11:26 a.m. UTC | #1
On 12 Nov 2018, at 12:18, Alejandro Lucero wrote:

> When using large amount of hugepage based memory, doing all the
> hugepages mapping can take quite significant time.
>
> The problem is hugepages being initially mmaped to virtual addresses
> which will be tried later for the final hugepage mmaping. This causes
> the final mapping requiring calling mmap with another hint address 
> which
> can happen several times, depending on the amount of memory to mmap, 
> and
> which each mmmap taking more than a second.
>
> This patch changes the hint for the initial hugepage mmaping using
> a starting address which will not collide with the final mmaping.
>
> Fixes: 293c0c4b957f ("mem: use address hint for mapping hugepages")
>
> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>

Thanks Alejandro for sending the patch. This issue was found in an 
OVS-DPDK environment.
I verified/tested the patch.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index bac969a12..0675809b7 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -421,6 +421,21 @@ map_all_hugepages(struct hugepage_file 
> *hugepg_tbl, struct hugepage_info *hpi,
>  	}
>  #endif
>
> +#ifdef RTE_ARCH_64
> +	/*
> +	 * Hugepages are first mmaped individually and then re-mmapped to
> +	 * another region for having contiguous physical pages in contiguous
> +	 * virtual addresses. Setting here vma_addr for the first hugepage
> +	 * mapped to a virtual address which will not collide with the 
> second
> +	 * mmaping later. The next hugepages will use increments of this
> +	 * initial address.
> +	 *
> +	 * The final virtual address will be based on baseaddr which is
> +	 * 0x100000000. We use a hint here starting at 0x200000000, leaving
> +	 * another 4GB just in case, plus the total available hugepages 
> memory.
> +	 */
> +	vma_addr = (char *)0x200000000 + (hpi->hugepage_sz * 
> hpi->num_pages[0]);
> +#endif
>  	for (i = 0; i < hpi->num_pages[0]; i++) {
>  		uint64_t hugepage_sz = hpi->hugepage_sz;
>
> -- 
> 2.17.1
  
Eelco Chaudron Nov. 14, 2018, 12:45 p.m. UTC | #2
On 12 Nov 2018, at 12:26, Eelco Chaudron wrote:

> On 12 Nov 2018, at 12:18, Alejandro Lucero wrote:
>
>> When using large amount of hugepage based memory, doing all the
>> hugepages mapping can take quite significant time.
>>
>> The problem is hugepages being initially mmaped to virtual addresses
>> which will be tried later for the final hugepage mmaping. This causes
>> the final mapping requiring calling mmap with another hint address 
>> which
>> can happen several times, depending on the amount of memory to mmap, 
>> and
>> which each mmmap taking more than a second.
>>
>> This patch changes the hint for the initial hugepage mmaping using
>> a starting address which will not collide with the final mmaping.
>>
>> Fixes: 293c0c4b957f ("mem: use address hint for mapping hugepages")
>>
>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>
> Thanks Alejandro for sending the patch. This issue was found in an 
> OVS-DPDK environment.
> I verified/tested the patch.
>
> Acked-by: Eelco Chaudron <echaudro@redhat.com>
> Tested-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal_memory.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
>> b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> index bac969a12..0675809b7 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> @@ -421,6 +421,21 @@ map_all_hugepages(struct hugepage_file 
>> *hugepg_tbl, struct hugepage_info *hpi,
>>  	}
>>  #endif
>>
>> +#ifdef RTE_ARCH_64
>> +	/*
>> +	 * Hugepages are first mmaped individually and then re-mmapped to
>> +	 * another region for having contiguous physical pages in 
>> contiguous
>> +	 * virtual addresses. Setting here vma_addr for the first hugepage
>> +	 * mapped to a virtual address which will not collide with the 
>> second
>> +	 * mmaping later. The next hugepages will use increments of this
>> +	 * initial address.
>> +	 *
>> +	 * The final virtual address will be based on baseaddr which is
>> +	 * 0x100000000. We use a hint here starting at 0x200000000, leaving
>> +	 * another 4GB just in case, plus the total available hugepages 
>> memory.
>> +	 */
>> +	vma_addr = (char *)0x200000000 + (hpi->hugepage_sz * 
>> hpi->num_pages[0]);
>> +#endif
>>  	for (i = 0; i < hpi->num_pages[0]; i++) {
>>  		uint64_t hugepage_sz = hpi->hugepage_sz;
>>
>> -- 
>> 2.17.1

Adding OVS dev to this thread, as this issue was introduced in DPDK 
17.11.4.
  
Anatoly Burakov Nov. 15, 2018, 1:16 p.m. UTC | #3
On 12-Nov-18 11:18 AM, Alejandro Lucero wrote:
> When using large amount of hugepage based memory, doing all the
> hugepages mapping can take quite significant time.
> 
> The problem is hugepages being initially mmaped to virtual addresses
> which will be tried later for the final hugepage mmaping. This causes
> the final mapping requiring calling mmap with another hint address which
> can happen several times, depending on the amount of memory to mmap, and
> which each mmmap taking more than a second.
> 
> This patch changes the hint for the initial hugepage mmaping using
> a starting address which will not collide with the final mmaping.
> 
> Fixes: 293c0c4b957f ("mem: use address hint for mapping hugepages")
> 
> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> ---

Hi Alejandro,

I'm not sure I understand the purpose. When the final mapping is performed,
we reserve a new memory area and map pages into it. (I don't quite
understand why we unmap the area before mapping pages, but that's how it's
always been and I didn't change it in the legacy code.)

Which addresses are causing the collision?
  
Alejandro Lucero Nov. 16, 2018, 12:49 p.m. UTC | #4
On Thu, Nov 15, 2018 at 1:16 PM Burakov, Anatoly <anatoly.burakov@intel.com>
wrote:

> On 12-Nov-18 11:18 AM, Alejandro Lucero wrote:
> > When using large amount of hugepage based memory, doing all the
> > hugepages mapping can take quite significant time.
> >
> > The problem is hugepages being initially mmaped to virtual addresses
> > which will be tried later for the final hugepage mmaping. This causes
> > the final mapping requiring calling mmap with another hint address which
> > can happen several times, depending on the amount of memory to mmap, and
> > which each mmmap taking more than a second.
> >
> > This patch changes the hint for the initial hugepage mmaping using
> > a starting address which will not collide with the final mmaping.
> >
> > Fixes: 293c0c4b957f ("mem: use address hint for mapping hugepages")
> >
> > Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> > ---
>
> Hi Alejandro,
>
> I'm not sure i understand the purpose. When final mapping is performed,
> we reserve new memory area, and map pages into it. (i don't quite
> understand why we unmap the area before mapping pages, but it's how it's
> always been and i didn't change it in the legacy code)
>
> Which addresses are causing the collision?
>
>
Because the hint for the final mapping is at the 4GB address, and the
hugepages are initially mapped individually starting at low virtual
addresses, when the memory to map is 4GB or more the hugepages end up
using that hint address and above. The more hugepages there are to mmap,
the more addresses above the hint address are used, and the more mmaps
fail to get the virtual addresses for the final mmap.
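
As a concrete illustration of the ranges involved (the page size, page
count, and low starting address below are hypothetical, chosen only to
mirror this description), initial mappings growing upward from a low
address run past the 0x100000000 final hint once 4GB or more of hugepages
are mapped, while the patched initial hint of 0x200000000 plus the total
hugepage memory stays clear of the final region:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t page_sz    = 1ULL << 30;       /* 1 GB hugepages (example)      */
	uint64_t n_pages    = 6;                /* 6 GB of hugepages (example)   */
	uint64_t initial    = 0x40000000ULL;    /* hypothetical low start        */
	uint64_t final_hint = 0x100000000ULL;   /* baseaddr of the final mapping */

	uint64_t initial_end = initial + page_sz * n_pages;

	printf("initial mappings: 0x%" PRIx64 "-0x%" PRIx64 "\n",
	       initial, initial_end);
	printf("final hint 0x%" PRIx64 " %s that range\n", final_hint,
	       (final_hint >= initial && final_hint < initial_end) ?
	       "falls inside" : "is outside");

	/* With the patch, the initial hint starts above the final region:
	 * 0x200000000 plus the total hugepage memory. */
	uint64_t patched = 0x200000000ULL + page_sz * n_pages;
	printf("patched initial hint: 0x%" PRIx64 "\n", patched);
	return 0;
}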


> --
> Thanks,
> Anatoly
>
  
Anatoly Burakov Nov. 16, 2018, 1:35 p.m. UTC | #5
On 16-Nov-18 12:49 PM, Alejandro Lucero wrote:
> 
> 
> On Thu, Nov 15, 2018 at 1:16 PM Burakov, Anatoly 
> <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>> wrote:
> 
>     On 12-Nov-18 11:18 AM, Alejandro Lucero wrote:
>      > When using large amount of hugepage based memory, doing all the
>      > hugepages mapping can take quite significant time.
>      >
>      > The problem is hugepages being initially mmaped to virtual addresses
>      > which will be tried later for the final hugepage mmaping. This causes
>      > the final mapping requiring calling mmap with another hint
>     address which
>      > can happen several times, depending on the amount of memory to
>     mmap, and
>      > which each mmmap taking more than a second.
>      >
>      > This patch changes the hint for the initial hugepage mmaping using
>      > a starting address which will not collide with the final mmaping.
>      >
>      > Fixes: 293c0c4b957f ("mem: use address hint for mapping hugepages")
>      >
>      > Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com
>     <mailto:alejandro.lucero@netronome.com>>
>      > ---
> 
>     Hi Alejandro,
> 
>     I'm not sure i understand the purpose. When final mapping is performed,
>     we reserve new memory area, and map pages into it. (i don't quite
>     understand why we unmap the area before mapping pages, but it's how
>     it's
>     always been and i didn't change it in the legacy code)
> 
>     Which addresses are causing the collision?
> 
> 
> Because the hint for the final mapping is at 4GB address, and the 
> hugepages are initially individually mapped starting at low virtual 
> addresses, when the memory to map is 4GB or higher, the hugepages will 
> end using that hint address and higher. The more the hugepages to mmap, 
> the more addresses above the hint address are used, and the more mmaps 
> failed for getting the virtual addresses for the final mmap.

Yes, but I still don't understand what the problem is.

Before the final mapping, all of the pages get unmapped. They no longer
occupy any VA space at all. Then we create a VA area the size of the
IOVA-contiguous chunk we have, but then we also unmap *that* (again, no
idea why we actually do that, but that's how it works). So the final
mapping is performed with the knowledge that there are no pages at the
specified addresses, and the mapping at the specified addresses is
performed when the first mapping has already been unmapped.

As far as I understand, at no point do we hold addresses for the initial
and final mappings concurrently. So where does the conflict come in?

> 
>     -- 
>     Thanks,
>     Anatoly
>
  
Alejandro Lucero Nov. 16, 2018, 2:42 p.m. UTC | #6
On Fri, Nov 16, 2018 at 1:35 PM Burakov, Anatoly <anatoly.burakov@intel.com>
wrote:

> On 16-Nov-18 12:49 PM, Alejandro Lucero wrote:
> >
> >
> > On Thu, Nov 15, 2018 at 1:16 PM Burakov, Anatoly
> > <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>> wrote:
> >
> >     On 12-Nov-18 11:18 AM, Alejandro Lucero wrote:
> >      > When using large amount of hugepage based memory, doing all the
> >      > hugepages mapping can take quite significant time.
> >      >
> >      > The problem is hugepages being initially mmaped to virtual
> addresses
> >      > which will be tried later for the final hugepage mmaping. This
> causes
> >      > the final mapping requiring calling mmap with another hint
> >     address which
> >      > can happen several times, depending on the amount of memory to
> >     mmap, and
> >      > which each mmmap taking more than a second.
> >      >
> >      > This patch changes the hint for the initial hugepage mmaping using
> >      > a starting address which will not collide with the final mmaping.
> >      >
> >      > Fixes: 293c0c4b957f ("mem: use address hint for mapping
> hugepages")
> >      >
> >      > Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com
> >     <mailto:alejandro.lucero@netronome.com>>
> >      > ---
> >
> >     Hi Alejandro,
> >
> >     I'm not sure i understand the purpose. When final mapping is
> performed,
> >     we reserve new memory area, and map pages into it. (i don't quite
> >     understand why we unmap the area before mapping pages, but it's how
> >     it's
> >     always been and i didn't change it in the legacy code)
> >
> >     Which addresses are causing the collision?
> >
> >
> > Because the hint for the final mapping is at 4GB address, and the
> > hugepages are initially individually mapped starting at low virtual
> > addresses, when the memory to map is 4GB or higher, the hugepages will
> > end using that hint address and higher. The more the hugepages to mmap,
> > the more addresses above the hint address are used, and the more mmaps
> > failed for getting the virtual addresses for the final mmap.
>
> Yes, but i still don't understand what the problem is.
>
> Before the final mapping, all of the pages get unmapped. They no longer
> occupy any VA space at all. Then, we create a VA-area the size of
> IOVA-contiguous chunk we have, but then we also unmap *that* (again, no
> idea why we actually do that, but that's how it works). So, the final
> mapping is performed with the knowledge that there are no pages at
> specified addresses, and mapping for specified addresses is performed
> when the first mapping has already been unmapped.
>
> As far as i understand, at no point do we hold addresses for initial and
> final mappings concurrently. So, where does the conflict come in?
>
>
Are you sure about this? I can see that the call to unmap_all_hugepage_init
happens after the second call to map_all_hugepages.

Maybe you are looking at the legacy code in a newer version, which is not
doing exactly the same steps.


> >
> >     --
> >     Thanks,
> >     Anatoly
> >
>
>
> --
> Thanks,
> Anatoly
>
  
Anatoly Burakov Nov. 16, 2018, 3:56 p.m. UTC | #7
On 16-Nov-18 2:42 PM, Alejandro Lucero wrote:
> 
> 
> On Fri, Nov 16, 2018 at 1:35 PM Burakov, Anatoly 
> <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>> wrote:
> 
>     On 16-Nov-18 12:49 PM, Alejandro Lucero wrote:
>      >
>      >
>      > On Thu, Nov 15, 2018 at 1:16 PM Burakov, Anatoly
>      > <anatoly.burakov@intel.com <mailto:anatoly.burakov@intel.com>
>     <mailto:anatoly.burakov@intel.com
>     <mailto:anatoly.burakov@intel.com>>> wrote:
>      >
>      >     On 12-Nov-18 11:18 AM, Alejandro Lucero wrote:
>      >      > When using large amount of hugepage based memory, doing
>     all the
>      >      > hugepages mapping can take quite significant time.
>      >      >
>      >      > The problem is hugepages being initially mmaped to virtual
>     addresses
>      >      > which will be tried later for the final hugepage mmaping.
>     This causes
>      >      > the final mapping requiring calling mmap with another hint
>      >     address which
>      >      > can happen several times, depending on the amount of memory to
>      >     mmap, and
>      >      > which each mmmap taking more than a second.
>      >      >
>      >      > This patch changes the hint for the initial hugepage
>     mmaping using
>      >      > a starting address which will not collide with the final
>     mmaping.
>      >      >
>      >      > Fixes: 293c0c4b957f ("mem: use address hint for mapping
>     hugepages")
>      >      >
>      >      > Signed-off-by: Alejandro Lucero
>     <alejandro.lucero@netronome.com <mailto:alejandro.lucero@netronome.com>
>      >     <mailto:alejandro.lucero@netronome.com
>     <mailto:alejandro.lucero@netronome.com>>>
>      >      > ---
>      >
>      >     Hi Alejandro,
>      >
>      >     I'm not sure i understand the purpose. When final mapping is
>     performed,
>      >     we reserve new memory area, and map pages into it. (i don't quite
>      >     understand why we unmap the area before mapping pages, but
>     it's how
>      >     it's
>      >     always been and i didn't change it in the legacy code)
>      >
>      >     Which addresses are causing the collision?
>      >
>      >
>      > Because the hint for the final mapping is at 4GB address, and the
>      > hugepages are initially individually mapped starting at low virtual
>      > addresses, when the memory to map is 4GB or higher, the hugepages
>     will
>      > end using that hint address and higher. The more the hugepages to
>     mmap,
>      > the more addresses above the hint address are used, and the more
>     mmaps
>      > failed for getting the virtual addresses for the final mmap.
> 
>     Yes, but i still don't understand what the problem is.
> 
>     Before the final mapping, all of the pages get unmapped. They no longer
>     occupy any VA space at all. Then, we create a VA-area the size of
>     IOVA-contiguous chunk we have, but then we also unmap *that* (again, no
>     idea why we actually do that, but that's how it works). So, the final
>     mapping is performed with the knowledge that there are no pages at
>     specified addresses, and mapping for specified addresses is performed
>     when the first mapping has already been unmapped.
> 
>     As far as i understand, at no point do we hold addresses for initial
>     and
>     final mappings concurrently. So, where does the conflict come in?
> 
> 
> Are you sure about this? Because I can see calling 
> unmap_all_hugepage_init happens after the second call to map_all_hugepages.
> 
> Maybe you are looking at the legacy code in a newer version which is not 
> exactly doing the same steps.

Ah yes, you're right - we do remap the pages before we unmap the
original mappings. This patch makes perfect sense then. It'd still
collide with mappings when --base-virtaddr is set to the same address,
but it's not going to fail (just be slow again), so it's OK.

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

> 
>      >
>      >     --
>      >     Thanks,
>      >     Anatoly
>      >
> 
> 
>     -- 
>     Thanks,
>     Anatoly
>
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index bac969a12..0675809b7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -421,6 +421,21 @@  map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 	}
 #endif
 
+#ifdef RTE_ARCH_64
+	/*
+	 * Hugepages are first mmaped individually and then re-mmapped to
+	 * another region for having contiguous physical pages in contiguous
+	 * virtual addresses. Setting here vma_addr for the first hugepage
+	 * mapped to a virtual address which will not collide with the second
+	 * mmaping later. The next hugepages will use increments of this
+	 * initial address.
+	 *
+	 * The final virtual address will be based on baseaddr which is
+	 * 0x100000000. We use a hint here starting at 0x200000000, leaving
+	 * another 4GB just in case, plus the total available hugepages memory.
+	 */
+	vma_addr = (char *)0x200000000 + (hpi->hugepage_sz * hpi->num_pages[0]);
+#endif
 	for (i = 0; i < hpi->num_pages[0]; i++) {
 		uint64_t hugepage_sz = hpi->hugepage_sz;