[dpdk-dev,4/4] vhost: avoid populate guest memory

Message ID 1518580892-32656-5-git-send-email-jianfeng.tan@intel.com
State Superseded, archived
Delegated to: Maxime Coquelin

Checks

Context               Check    Description
ci/Intel-compilation  success  Compilation OK
ci/checkpatch         success  coding style OK

Commit Message

Jianfeng Tan Feb. 14, 2018, 4:01 a.m.
It's not necessary to populate guest memory from the vhost side.

Cc: maxime.coquelin@redhat.com
Cc: yliu@fridaylinux.org

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 lib/librte_vhost/vhost_user.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Maxime Coquelin Feb. 19, 2018, 8:44 p.m. | #1
Hi Jianfeng,

On 02/14/2018 05:01 AM, Jianfeng Tan wrote:
> It's not necessary to populate guest memory from vhost side.
> 
> Cc: maxime.coquelin@redhat.com
> Cc: yliu@fridaylinux.org
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> ---
>   lib/librte_vhost/vhost_user.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 90ed211..9bd0391 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -644,6 +644,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
>   	uint64_t mmap_offset;
>   	uint64_t alignment;
>   	uint32_t i;
> +	int populate;
>   	int fd;
>   
>   	if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) {
> @@ -714,8 +715,9 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
>   		}
>   		mmap_size = RTE_ALIGN_CEIL(mmap_size, alignment);
>   
> +		populate = (dev->dequeue_zero_copy) ? MAP_POPULATE : 0;
>   		mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> -				 MAP_SHARED | MAP_POPULATE, fd, 0);
> +				 MAP_SHARED | populate, fd, 0);
>   
>   		if (mmap_addr == MAP_FAILED) {
>   			RTE_LOG(ERR, VHOST_CONFIG,
> 

Wouldn't not populating all the guest memory have a bad impact on 0%
acceptable loss use-cases?

Thanks,
Maxime
Jianfeng Tan Feb. 22, 2018, 2:42 a.m. | #2
Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Tuesday, February 20, 2018 4:45 AM
> To: Tan, Jianfeng; dev@dpdk.org
> Cc: yliu@fridaylinux.org
> Subject: Re: [PATCH 4/4] vhost: avoid populate guest memory
>
> Hi Jianfeng,
>
> On 02/14/2018 05:01 AM, Jianfeng Tan wrote:
> > It's not necessary to populate guest memory from vhost side.
> >
> > Cc: maxime.coquelin@redhat.com
> > Cc: yliu@fridaylinux.org
> >
> > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > ---
> >   lib/librte_vhost/vhost_user.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> > index 90ed211..9bd0391 100644
> > --- a/lib/librte_vhost/vhost_user.c
> > +++ b/lib/librte_vhost/vhost_user.c
> > @@ -644,6 +644,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
> >   	uint64_t mmap_offset;
> >   	uint64_t alignment;
> >   	uint32_t i;
> > +	int populate;
> >   	int fd;
> >
> >   	if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) {
> > @@ -714,8 +715,9 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
> >   		}
> >   		mmap_size = RTE_ALIGN_CEIL(mmap_size, alignment);
> >
> > +		populate = (dev->dequeue_zero_copy) ? MAP_POPULATE : 0;
> >   		mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> > -				 MAP_SHARED | MAP_POPULATE, fd, 0);
> > +				 MAP_SHARED | populate, fd, 0);
> >
> >   		if (mmap_addr == MAP_FAILED) {
> >   			RTE_LOG(ERR, VHOST_CONFIG,
> >
>
> Wouldn't not populating all the guest memory have a bad impact on 0%
> acceptable loss use-cases?

Yes, it could affect such use cases; but we can address that by warming up the system a little bit, can't we?

On the positive side, it could save memory for VMs that do not pre-allocate it.

Thanks,
Jianfeng
Maxime Coquelin Feb. 22, 2018, 8:25 a.m. | #3
On 02/22/2018 03:42 AM, Tan, Jianfeng wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Tuesday, February 20, 2018 4:45 AM
>> To: Tan, Jianfeng; dev@dpdk.org
>> Cc: yliu@fridaylinux.org
>> Subject: Re: [PATCH 4/4] vhost: avoid populate guest memory
>>
>> Hi Jianfeng,
>>
>> On 02/14/2018 05:01 AM, Jianfeng Tan wrote:
>>> It's not necessary to populate guest memory from vhost side.
>>>
>>> Cc: maxime.coquelin@redhat.com
>>> Cc: yliu@fridaylinux.org
>>>
>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>> ---
>>>    lib/librte_vhost/vhost_user.c | 4 +++-
>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>> index 90ed211..9bd0391 100644
>>> --- a/lib/librte_vhost/vhost_user.c
>>> +++ b/lib/librte_vhost/vhost_user.c
>>> @@ -644,6 +644,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
>>>    	uint64_t mmap_offset;
>>>    	uint64_t alignment;
>>>    	uint32_t i;
>>> +	int populate;
>>>    	int fd;
>>>
>>>    	if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) {
>>> @@ -714,8 +715,9 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
>>>    		}
>>>    		mmap_size = RTE_ALIGN_CEIL(mmap_size, alignment);
>>>
>>> +		populate = (dev->dequeue_zero_copy) ? MAP_POPULATE : 0;
>>>    		mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
>>> -				 MAP_SHARED | MAP_POPULATE, fd, 0);
>>> +				 MAP_SHARED | populate, fd, 0);
>>>
>>>    		if (mmap_addr == MAP_FAILED) {
>>>    			RTE_LOG(ERR, VHOST_CONFIG,
>>>
>>
>> Wouldn't not populating all the guest memory have a bad impact on 0%
>> acceptable loss use-cases?
> 
> Yes, it could affect such use case; but we can address that by warming up the system a little bit, can't we?

I'm not sure it is a good idea to ask real users to warm up the
system.

Also, even with benchmarking, the loss happens when the queues are full,
so it is likely to happen with buffers not used before, even if the
system has been warmed up.

>  From a good point of view, it could save the memory for VMs without pre-allocating.

What could be done is maybe to have an EAL API for mmapping, with an
associated EAL parameter to state whether it wants populating or not.
This option would be disabled by default.

Does that sound reasonable?

Cheers,
Maxime

> Thanks,
> Jianfeng
>
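
Note: the helper proposed in #3 (an mmap wrapper whose populate behaviour is controlled by a runtime setting, off by default) could look roughly like the sketch below. Every name here (`mmap_populate_enabled`, `set_mmap_populate`, `shared_map_flags`, `map_shared`) is hypothetical and chosen only for illustration; none of them is an existing DPDK or EAL symbol, and the thread does not define the actual interface.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical process-wide setting; disabled by default, as proposed. */
static bool mmap_populate_enabled = false;

/* Hypothetical setter that an option parser would call at startup. */
void set_mmap_populate(bool on)
{
	mmap_populate_enabled = on;
}

/* Build the mmap() flags: always shared, prefaulted only when enabled. */
int shared_map_flags(void)
{
	return MAP_SHARED | (mmap_populate_enabled ? MAP_POPULATE : 0);
}

/* Hypothetical wrapper: map `size` bytes of `fd` read/write from offset 0. */
void *map_shared(int fd, size_t size)
{
	assert(size > 0);
	return mmap(NULL, size, PROT_READ | PROT_WRITE,
		    shared_map_flags(), fd, 0);
}
```

With such a wrapper the caller would not need to know about MAP_POPULATE at all; the trade-off between memory footprint and first-touch page faults would be decided once per process.
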
Jianfeng Tan Feb. 22, 2018, 8:40 a.m. | #4
> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Thursday, February 22, 2018 4:26 PM
> To: Tan, Jianfeng; dev@dpdk.org
> Cc: yliu@fridaylinux.org
> Subject: Re: [PATCH 4/4] vhost: avoid populate guest memory
>
> On 02/22/2018 03:42 AM, Tan, Jianfeng wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> >> Sent: Tuesday, February 20, 2018 4:45 AM
> >> To: Tan, Jianfeng; dev@dpdk.org
> >> Cc: yliu@fridaylinux.org
> >> Subject: Re: [PATCH 4/4] vhost: avoid populate guest memory
> >>
> >> Hi Jianfeng,
> >>
> >> On 02/14/2018 05:01 AM, Jianfeng Tan wrote:
> >>> It's not necessary to populate guest memory from vhost side.
> >>>
> >>> Cc: maxime.coquelin@redhat.com
> >>> Cc: yliu@fridaylinux.org
> >>>
> >>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> >>> ---
> >>>    lib/librte_vhost/vhost_user.c | 4 +++-
> >>>    1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> >>> index 90ed211..9bd0391 100644
> >>> --- a/lib/librte_vhost/vhost_user.c
> >>> +++ b/lib/librte_vhost/vhost_user.c
> >>> @@ -644,6 +644,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
> >>>    	uint64_t mmap_offset;
> >>>    	uint64_t alignment;
> >>>    	uint32_t i;
> >>> +	int populate;
> >>>    	int fd;
> >>>
> >>>    	if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) {
> >>> @@ -714,8 +715,9 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
> >>>    		}
> >>>    		mmap_size = RTE_ALIGN_CEIL(mmap_size, alignment);
> >>>
> >>> +		populate = (dev->dequeue_zero_copy) ? MAP_POPULATE : 0;
> >>>    		mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> >>> -				 MAP_SHARED | MAP_POPULATE, fd, 0);
> >>> +				 MAP_SHARED | populate, fd, 0);
> >>>
> >>>    		if (mmap_addr == MAP_FAILED) {
> >>>    			RTE_LOG(ERR, VHOST_CONFIG,
> >>>
> >>
> >> Wouldn't not populating all the guest memory have a bad impact on 0%
> >> acceptable loss use-cases?
> >
> > Yes, it could affect such use case; but we can address that by warming up the system a little bit, can't we?
>
> I'm not sure this is a good idea to ask the real user to warm-up the
> system.
>
> Also, even with benchmarking, the loss happens when the queues are full,
> so it is likely that it happens with buffers not used before, even if
> system has been warmed-up.

OK, warm-up is a bad idea here :-)

But if a VM is used for such a use case, I think we'd better pre-allocate the memory on the QEMU side.

>
> >  From a good point of view, it could save the memory for VMs without pre-allocating.
>
> What could be done is maybe to have an EAL API for mmaping, with an
> associated EAL parameter to state whether it want populating or not.
> This option would be disabled by default.
>
> Does that sounds reasonable?


If we look for an application-level configuration, it's not necessary to have such a parameter. Refer to the 3rd patch in this series: if we make all (current/future) memory locked, the mmap() syscall will populate the memory.

Thanks,
Jianfeng
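
Note: the behaviour described in #4 can be checked with a small experiment. The sketch below assumes that "all (current/future) memory locked" refers to `mlockall(MCL_CURRENT | MCL_FUTURE)`; the thread does not show the 3rd patch, so that reading is an assumption. The function name `resident_after_mlockall` is made up for illustration, and anonymous shared memory stands in for the guest memory file descriptor. It creates a mapping without MAP_POPULATE after locking and counts the resident pages with `mincore()`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Lock current and future mappings, then map `npages` pages of shared
 * memory WITHOUT MAP_POPULATE and count how many are resident.
 * Returns the resident page count, or -1 if locking, mapping or the
 * residency query is refused (for example because of RLIMIT_MEMLOCK).
 */
static long resident_after_mlockall(size_t npages)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *vec;
	void *addr;
	size_t len, i;
	long count = -1;

	assert(page > 0 && npages > 0);
	len = npages * (size_t)page;

	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
		return -1;

	/* Stand-in for the guest memory region: no fd, no MAP_POPULATE. */
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	vec = malloc(npages);
	if (addr != MAP_FAILED && vec != NULL &&
	    mincore(addr, len, vec) == 0) {
		count = 0;
		for (i = 0; i < npages; i++)
			count += vec[i] & 1; /* bit 0 set: page is resident */
	}

	free(vec);
	if (addr != MAP_FAILED)
		munmap(addr, len);
	munlockall();
	return count;
}
```

An unprivileged process with a small RLIMIT_MEMLOCK will usually see `mlockall()` fail, in which case the function returns -1 instead of a page count.
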
Maxime Coquelin Feb. 22, 2018, 12:32 p.m. | #5
On 02/22/2018 09:40 AM, Tan, Jianfeng wrote:
> 
> 
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Thursday, February 22, 2018 4:26 PM
>> To: Tan, Jianfeng; dev@dpdk.org
>> Cc: yliu@fridaylinux.org
>> Subject: Re: [PATCH 4/4] vhost: avoid populate guest memory
>>
>>
>>
>> On 02/22/2018 03:42 AM, Tan, Jianfeng wrote:
>>> Hi Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>> Sent: Tuesday, February 20, 2018 4:45 AM
>>>> To: Tan, Jianfeng; dev@dpdk.org
>>>> Cc: yliu@fridaylinux.org
>>>> Subject: Re: [PATCH 4/4] vhost: avoid populate guest memory
>>>>
>>>> Hi Jianfeng,
>>>>
>>>> On 02/14/2018 05:01 AM, Jianfeng Tan wrote:
>>>>> It's not necessary to populate guest memory from vhost side.
>>>>>
>>>>> Cc: maxime.coquelin@redhat.com
>>>>> Cc: yliu@fridaylinux.org
>>>>>
>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>>> ---
>>>>>     lib/librte_vhost/vhost_user.c | 4 +++-
>>>>>     1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>> index 90ed211..9bd0391 100644
>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>> @@ -644,6 +644,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
>>>>>     	uint64_t mmap_offset;
>>>>>     	uint64_t alignment;
>>>>>     	uint32_t i;
>>>>> +	int populate;
>>>>>     	int fd;
>>>>>
>>>>>     	if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) {
>>>>> @@ -714,8 +715,9 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
>>>>>     		}
>>>>>     		mmap_size = RTE_ALIGN_CEIL(mmap_size, alignment);
>>>>>
>>>>> +		populate = (dev->dequeue_zero_copy) ? MAP_POPULATE : 0;
>>>>>     		mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
>>>>> -				 MAP_SHARED | MAP_POPULATE, fd, 0);
>>>>> +				 MAP_SHARED | populate, fd, 0);
>>>>>
>>>>>     		if (mmap_addr == MAP_FAILED) {
>>>>>     			RTE_LOG(ERR, VHOST_CONFIG,
>>>>>
>>>>
>>>> Wouldn't not populating all the guest memory have a bad impact on 0%
>>>> acceptable loss use-cases?
>>>
>>> Yes, it could affect such use case; but we can address that by warming up the system a little bit, can't we?
>>
>> I'm not sure this is a good idea to ask the real user to warm-up the
>> system.
>>
>> Also, even with benchmarking, the loss happens when the queues are full,
>> so it is likely that it happens with buffers not used before, even if
>> system has been warmed-up.
> 
> OK, warm-up is a bad idea here :-)
> 
> But if a VM is used for such use case, I think we'd better pre-allocate the memory at QEMU side.
> 
>>
>>>   From a good point of view, it could save the memory for VMs without pre-allocating.
>>
>> What could be done is maybe to have an EAL API for mmaping, with an
>> associated EAL parameter to state whether it want populating or not.
>> This option would be disabled by default.
>>
>> Does that sounds reasonable?
> 
> If we look for an application-level configuration, it's not necessary to have such a parameter. Refer to the 3rd patch in this series, if we make all (current/future) memory locked, the mmap() syscall will populate the memory.

OK, but in that case it should be documented.
I see OVS also has a parameter to request the memory to be locked, but
it seems not to be the default, so users could face a change in
behavior they didn't expect.

Thanks,
Maxime

> Thanks,
> Jianfeng
>
Jianfeng Tan Feb. 23, 2018, 3:17 a.m. | #6
> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Thursday, February 22, 2018 8:33 PM
> To: Tan, Jianfeng; dev@dpdk.org
> Cc: yliu@fridaylinux.org
> Subject: Re: [PATCH 4/4] vhost: avoid populate guest memory
>
> >> What could be done is maybe to have an EAL API for mmaping, with an
> >> associated EAL parameter to state whether it want populating or not.
> >> This option would be disabled by default.
> >>
> >> Does that sounds reasonable?
> >
> > If we look for an application-level configuration, it's not necessary to have such a parameter. Refer to the 3rd patch in this series, if we make all (current/future) memory locked, the mmap() syscall will populate the memory.
>
> OK, but in that case it should be documented.
> I see OVS has also a parameter to request the memory to be locked, but
> it seems not to be the default, so the user could face a change in the
> behavior it didn't expect.


Makes sense, will send v2 with the note.

Thanks,
Jianfeng

Patch

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 90ed211..9bd0391 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -644,6 +644,7 @@  vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 	uint64_t mmap_offset;
 	uint64_t alignment;
 	uint32_t i;
+	int populate;
 	int fd;
 
 	if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) {
@@ -714,8 +715,9 @@  vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 		}
 		mmap_size = RTE_ALIGN_CEIL(mmap_size, alignment);
 
+		populate = (dev->dequeue_zero_copy) ? MAP_POPULATE : 0;
 		mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
-				 MAP_SHARED | MAP_POPULATE, fd, 0);
+				 MAP_SHARED | populate, fd, 0);
 
 		if (mmap_addr == MAP_FAILED) {
 			RTE_LOG(ERR, VHOST_CONFIG,
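
Note: the effect of making MAP_POPULATE conditional, as the patch above does, can be observed outside DPDK with a standalone sketch like the following. It is not part of the patch. The helper names `resident_pages` and `map_and_count` are made up for illustration, and anonymous shared memory is used as a stand-in for the guest memory file descriptor. The code maps a region with or without MAP_POPULATE and uses `mincore()` to count the pages that are already resident.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count resident pages in [addr, addr + len); returns -1 on error. */
static long resident_pages(void *addr, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *vec;
	size_t n, i;
	long count = 0;

	assert(page > 0);
	n = (len + (size_t)page - 1) / (size_t)page;
	vec = malloc(n);
	if (vec == NULL)
		return -1;
	if (mincore(addr, len, vec) != 0) {
		free(vec);
		return -1;
	}
	for (i = 0; i < n; i++)
		count += vec[i] & 1; /* bit 0 set: page is resident */
	free(vec);
	return count;
}

/*
 * Map `len` bytes of shared memory, prefaulted only when `populate` is
 * non-zero (mirroring the conditional flag in the patch), and return
 * the number of resident pages right after mmap(), or -1 on error.
 */
static long map_and_count(size_t len, int populate)
{
	int flags = MAP_SHARED | MAP_ANONYMOUS |
		    (populate ? MAP_POPULATE : 0);
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
	long count;

	if (addr == MAP_FAILED)
		return -1;
	count = resident_pages(addr, len);
	munmap(addr, len);
	return count;
}
```

Comparing `map_and_count(len, 0)` with `map_and_count(len, 1)` shows the trade-off discussed in the thread: the lazy mapping defers page allocation to first touch, while the populated one pays the cost at mmap() time.
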