[v2,01/17] vhost: fix messages error checks

Message ID 20181002093651.24795-2-maxime.coquelin@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Maxime Coquelin
Headers
Series vhost: add postcopy live-migration support |

Checks

Context Check Description
ci/Intel-compilation success Compilation OK

Commit Message

Maxime Coquelin Oct. 2, 2018, 9:36 a.m. UTC
  Return of message handling has now changed to an enum that can
take non-negative value that is not zero in case a reply is
needed. But the code checking the variable afterwards has not
been updated, leading to success messages handling being
treated as errors.

Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
  

Comments

Ilya Maximets Oct. 2, 2018, 2:15 p.m. UTC | #1
On 02.10.2018 12:36, Maxime Coquelin wrote:
> Return of message handling has now changed to an enum that can
> take non-negative value that is not zero in case a reply is
> needed. But the code checking the variable afterwards has not
> been updated, leading to success messages handling being
> treated as errors.
> 
> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  lib/librte_vhost/vhost_user.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 7ef3fb4a4..060b41893 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>  	}
>  
>  skip_to_post_handle:
> -	if (!ret && dev->extern_ops.post_msg_handle) {
> +	if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>  		uint32_t need_reply;
>  
>  		ret = (*dev->extern_ops.post_msg_handle)(
> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>  		vhost_user_unlock_all_queue_pairs(dev);
>  
>  	if (msg.flags & VHOST_USER_NEED_REPLY) {

Maybe we need to reply here only if we didn't reply
already (not VH_RESULT_REPLY) ? Otherwise, we could
reply twice (with payload and with return code).

> -		msg.payload.u64 = !!ret;
> +		msg.payload.u64 = ret == VH_RESULT_ERR;
>  		msg.size = sizeof(msg.payload.u64);
>  		send_vhost_reply(fd, &msg);
> -	} else if (ret) {
> +	} else if (ret == VH_RESULT_ERR) {
>  		RTE_LOG(ERR, VHOST_CONFIG,
>  			"vhost message handling failed.\n");
>  		return -1;
>
  
Maxime Coquelin Oct. 3, 2018, 7:50 a.m. UTC | #2
On 10/02/2018 04:15 PM, Ilya Maximets wrote:
> On 02.10.2018 12:36, Maxime Coquelin wrote:
>> Return of message handling has now changed to an enum that can
>> take non-negative value that is not zero in case a reply is
>> needed. But the code checking the variable afterwards has not
>> been updated, leading to success messages handling being
>> treated as errors.
>>
>> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>   lib/librte_vhost/vhost_user.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>> index 7ef3fb4a4..060b41893 100644
>> --- a/lib/librte_vhost/vhost_user.c
>> +++ b/lib/librte_vhost/vhost_user.c
>> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>>   	}
>>   
>>   skip_to_post_handle:
>> -	if (!ret && dev->extern_ops.post_msg_handle) {
>> +	if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>>   		uint32_t need_reply;
>>   
>>   		ret = (*dev->extern_ops.post_msg_handle)(
>> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>>   		vhost_user_unlock_all_queue_pairs(dev);
>>   
>>   	if (msg.flags & VHOST_USER_NEED_REPLY) {
> 
> Maybe we need to reply here only if we didn't reply
> already (not VH_RESULT_REPLY) ? Otherwise, we could
> reply twice (with payload and with return code).

Well, if the master sets this bit, it means it is waiting for
a "reply-ack", so not sending it would cause the master to wait
forever.

It is the master responsibility to not set this bit for requests
already expecting a non "reply-ack" reply (as you fixed it for
postcopy's set mem table case).

>> -		msg.payload.u64 = !!ret;
>> +		msg.payload.u64 = ret == VH_RESULT_ERR;
>>   		msg.size = sizeof(msg.payload.u64);
>>   		send_vhost_reply(fd, &msg);
>> -	} else if (ret) {
>> +	} else if (ret == VH_RESULT_ERR) {
>>   		RTE_LOG(ERR, VHOST_CONFIG,
>>   			"vhost message handling failed.\n");
>>   		return -1;
>>
  
Ilya Maximets Oct. 3, 2018, 7:57 a.m. UTC | #3
On 03.10.2018 10:50, Maxime Coquelin wrote:
> 
> 
> On 10/02/2018 04:15 PM, Ilya Maximets wrote:
>> On 02.10.2018 12:36, Maxime Coquelin wrote:
>>> Return of message handling has now changed to an enum that can
>>> take non-negative value that is not zero in case a reply is
>>> needed. But the code checking the variable afterwards has not
>>> been updated, leading to success messages handling being
>>> treated as errors.
>>>
>>> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
>>>
>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> ---
>>>   lib/librte_vhost/vhost_user.c | 6 +++---
>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>> index 7ef3fb4a4..060b41893 100644
>>> --- a/lib/librte_vhost/vhost_user.c
>>> +++ b/lib/librte_vhost/vhost_user.c
>>> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>       }
>>>     skip_to_post_handle:
>>> -    if (!ret && dev->extern_ops.post_msg_handle) {
>>> +    if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>>>           uint32_t need_reply;
>>>             ret = (*dev->extern_ops.post_msg_handle)(
>>> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>>>           vhost_user_unlock_all_queue_pairs(dev);
>>>         if (msg.flags & VHOST_USER_NEED_REPLY) {
>>
>> Maybe we need to reply here only if we didn't reply
>> already (not VH_RESULT_REPLY) ? Otherwise, we could
>> reply twice (with payload and with return code).
> 
> Well, if the master sets this bit, it means it is waiting for
> a "reply-ack", so not sending it would cause the master to wait
> forever.
> 
> It is the master responsibility to not set this bit for requests
> already expecting a non "reply-ack" reply (as you fixed it for
> postcopy's set mem table case).

vhost-user docs in QEMU says:
"
For the message types that already solicit a reply from the client, the
presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
no behavioural change.
"
i.e. even if QEMU sets the need_reply flag, vhost should not reply twice.
Am I missing something?

> 
>>> -        msg.payload.u64 = !!ret;
>>> +        msg.payload.u64 = ret == VH_RESULT_ERR;
>>>           msg.size = sizeof(msg.payload.u64);
>>>           send_vhost_reply(fd, &msg);
>>> -    } else if (ret) {
>>> +    } else if (ret == VH_RESULT_ERR) {
>>>           RTE_LOG(ERR, VHOST_CONFIG,
>>>               "vhost message handling failed.\n");
>>>           return -1;
>>>
> 
>
  
Maxime Coquelin Oct. 3, 2018, 8:02 a.m. UTC | #4
On 10/03/2018 09:57 AM, Ilya Maximets wrote:
> On 03.10.2018 10:50, Maxime Coquelin wrote:
>>
>>
>> On 10/02/2018 04:15 PM, Ilya Maximets wrote:
>>> On 02.10.2018 12:36, Maxime Coquelin wrote:
>>>> Return of message handling has now changed to an enum that can
>>>> take non-negative value that is not zero in case a reply is
>>>> needed. But the code checking the variable afterwards has not
>>>> been updated, leading to success messages handling being
>>>> treated as errors.
>>>>
>>>> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
>>>>
>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> ---
>>>>    lib/librte_vhost/vhost_user.c | 6 +++---
>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>> index 7ef3fb4a4..060b41893 100644
>>>> --- a/lib/librte_vhost/vhost_user.c
>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>>        }
>>>>      skip_to_post_handle:
>>>> -    if (!ret && dev->extern_ops.post_msg_handle) {
>>>> +    if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>>>>            uint32_t need_reply;
>>>>              ret = (*dev->extern_ops.post_msg_handle)(
>>>> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>>>>            vhost_user_unlock_all_queue_pairs(dev);
>>>>          if (msg.flags & VHOST_USER_NEED_REPLY) {
>>>
>>> Maybe we need to reply here only if we didn't reply
>>> already (not VH_RESULT_REPLY) ? Otherwise, we could
>>> reply twice (with payload and with return code).
>>
>> Well, if the master sets this bit, it means it is waiting for
>> a "reply-ack", so not sending it would cause the master to wait
>> forever.
>>
>> It is the master responsibility to not set this bit for requests
>> already expecting a non "reply-ack" reply (as you fixed it for
>> postcopy's set mem table case).
> 
> vhost-user docs in QEMU says:
> "
> For the message types that already solicit a reply from the client, the
> presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
> no behavioural change.
> "
> i.e. even if QEMU sets the need_reply flag, vhost should not reply twice.
> Am I missing something?

Oh, right. Thanks for pointing it out.

So coming back to the DPDK implementation, I just had a look again, and 
it seems that we don't send a reply twice, as send_vhost_reply takes
care of clearing the VHOST_USER_NEED_REPLY flag.
Do you confirm my understanding is correct?

>>
>>>> -        msg.payload.u64 = !!ret;
>>>> +        msg.payload.u64 = ret == VH_RESULT_ERR;
>>>>            msg.size = sizeof(msg.payload.u64);
>>>>            send_vhost_reply(fd, &msg);
>>>> -    } else if (ret) {
>>>> +    } else if (ret == VH_RESULT_ERR) {
>>>>            RTE_LOG(ERR, VHOST_CONFIG,
>>>>                "vhost message handling failed.\n");
>>>>            return -1;
>>>>
>>
>>
  
Ilya Maximets Oct. 3, 2018, 8:32 a.m. UTC | #5
On 03.10.2018 11:02, Maxime Coquelin wrote:
> 
> 
> On 10/03/2018 09:57 AM, Ilya Maximets wrote:
>> On 03.10.2018 10:50, Maxime Coquelin wrote:
>>>
>>>
>>> On 10/02/2018 04:15 PM, Ilya Maximets wrote:
>>>> On 02.10.2018 12:36, Maxime Coquelin wrote:
>>>>> Return of message handling has now changed to an enum that can
>>>>> take non-negative value that is not zero in case a reply is
>>>>> needed. But the code checking the variable afterwards has not
>>>>> been updated, leading to success messages handling being
>>>>> treated as errors.
>>>>>
>>>>> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
>>>>>
>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>> ---
>>>>>    lib/librte_vhost/vhost_user.c | 6 +++---
>>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>> index 7ef3fb4a4..060b41893 100644
>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>        }
>>>>>      skip_to_post_handle:
>>>>> -    if (!ret && dev->extern_ops.post_msg_handle) {
>>>>> +    if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>>>>>            uint32_t need_reply;
>>>>>              ret = (*dev->extern_ops.post_msg_handle)(
>>>>> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>            vhost_user_unlock_all_queue_pairs(dev);
>>>>>          if (msg.flags & VHOST_USER_NEED_REPLY) {
>>>>
>>>> Maybe we need to reply here only if we didn't reply
>>>> already (not VH_RESULT_REPLY) ? Otherwise, we could
>>>> reply twice (with payload and with return code).
>>>
>>> Well, if the master sets this bit, it means it is waiting for
>>> a "reply-ack", so not sending it would cause the master to wait
>>> forever.
>>>
>>> It is the master responsibility to not set this bit for requests
>>> already expecting a non "reply-ack" reply (as you fixed it for
>>> postcopy's set mem table case).
>>
>> vhost-user docs in QEMU says:
>> "
>> For the message types that already solicit a reply from the client, the
>> presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
>> no behavioural change.
>> "
>> i.e. even if QEMU sets the need_reply flag, vhost should not reply twice.
>> Am I missing something?
> 
> Oh, right. Thanks for pointing it out.
> 
> So coming back to the DPDK implementation, I just had a look again, and it seems that we don't send a reply twice, as send_vhost_reply takes
> care of clearing the VHOST_USER_NEED_REPLY flag.
> Do you confirm my understanding is correct?

Hmm. Yes, you're right. send_vhost_reply clears the VHOST_USER_NEED_REPLY
flag and vhost doesn't send replies twice.
Maybe some comment with clarifications needed here, or some more
refactoring to make this aspect more clear.

So, we have a situation where only one of the message processing stages
is able to reply even if it's not the last stage for the message:
1. extern_ops.pre_msg_handle
2. "master"
3. extern_ops.post_msg_handle
4. result i.e. (!!ret)

extern_ops API is a bit confusing from this point of view. It has
serious restrictions for replies which are not described anywhere.
I mean that pre and post processing stages are able to request the
reply, but the post processing reply will be just dropped
(or "master" reply will be dropped).
This is, actually, not an issue until we have only one user of them,
which uses only one of the callbacks. But maybe this API should be
described more in comments or docs.

> 
>>>
>>>>> -        msg.payload.u64 = !!ret;
>>>>> +        msg.payload.u64 = ret == VH_RESULT_ERR;
>>>>>            msg.size = sizeof(msg.payload.u64);
>>>>>            send_vhost_reply(fd, &msg);
>>>>> -    } else if (ret) {
>>>>> +    } else if (ret == VH_RESULT_ERR) {
>>>>>            RTE_LOG(ERR, VHOST_CONFIG,
>>>>>                "vhost message handling failed.\n");
>>>>>            return -1;
>>>>>
>>>
>>>
> 
>
  
Ilya Maximets Oct. 3, 2018, 9:07 a.m. UTC | #6
On 03.10.2018 11:32, Ilya Maximets wrote:
> On 03.10.2018 11:02, Maxime Coquelin wrote:
>>
>>
>> On 10/03/2018 09:57 AM, Ilya Maximets wrote:
>>> On 03.10.2018 10:50, Maxime Coquelin wrote:
>>>>
>>>>
>>>> On 10/02/2018 04:15 PM, Ilya Maximets wrote:
>>>>> On 02.10.2018 12:36, Maxime Coquelin wrote:
>>>>>> Return of message handling has now changed to an enum that can
>>>>>> take non-negative value that is not zero in case a reply is
>>>>>> needed. But the code checking the variable afterwards has not
>>>>>> been updated, leading to success messages handling being
>>>>>> treated as errors.
>>>>>>
>>>>>> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
>>>>>>
>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> ---
>>>>>>    lib/librte_vhost/vhost_user.c | 6 +++---
>>>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>>> index 7ef3fb4a4..060b41893 100644
>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>        }
>>>>>>      skip_to_post_handle:
>>>>>> -    if (!ret && dev->extern_ops.post_msg_handle) {
>>>>>> +    if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>>>>>>            uint32_t need_reply;
>>>>>>              ret = (*dev->extern_ops.post_msg_handle)(
>>>>>> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>            vhost_user_unlock_all_queue_pairs(dev);
>>>>>>          if (msg.flags & VHOST_USER_NEED_REPLY) {
>>>>>
>>>>> Maybe we need to reply here only if we didn't reply
>>>>> already (not VH_RESULT_REPLY) ? Otherwise, we could
>>>>> reply twice (with payload and with return code).
>>>>
>>>> Well, if the master sets this bit, it means it is waiting for
>>>> a "reply-ack", so not sending it would cause the master to wait
>>>> forever.
>>>>
>>>> It is the master responsibility to not set this bit for requests
>>>> already expecting a non "reply-ack" reply (as you fixed it for
>>>> postcopy's set mem table case).
>>>
>>> vhost-user docs in QEMU says:
>>> "
>>> For the message types that already solicit a reply from the client, the
>>> presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
>>> no behavioural change.
>>> "
>>> i.e. even if QEMU sets the need_reply flag, vhost should not reply twice.
>>> Am I missing something?
>>
>> Oh, right. Thanks for pointing it out.
>>
>> So coming back to the DPDK implementation, I just had a look again, and it seems that we don't send a reply twice, as send_vhost_reply takes
>> care of clearing the VHOST_USER_NEED_REPLY flag.
>> Do you confirm my understanding is correct?
> 
> Hmm. Yes, you're right. send_vhost_reply clears the VHOST_USER_NEED_REPLY
> flag and vhost doesn't send replies twice.
> Maybe some comment with clarifications needed here, or some more
> refactoring to make this aspect more clear.
> 

Sorry. Please, ignore the text below.

> So, we have a situation where only one of the message processing stages
> is able to reply even if it's not the last stage for the message:
> 1. extern_ops.pre_msg_handle
> 2. "master"
> 3. extern_ops.post_msg_handle
> 4. result i.e. (!!ret)
> 
> extern_ops API is a bit confusing from this point of view. It has
> serious restrictions for replies which are not described anywhere.
> I mean that pre and post processing stages are able to request the
> reply, but the post processing reply will be just dropped
> (or "master" reply will be dropped).
> This is, actually, not an issue until we have only one user of them,
> which uses only one of the callbacks. But maybe this API should be
> described more in comments or docs.
> 
>>
>>>>
>>>>>> -        msg.payload.u64 = !!ret;
>>>>>> +        msg.payload.u64 = ret == VH_RESULT_ERR;
>>>>>>            msg.size = sizeof(msg.payload.u64);
>>>>>>            send_vhost_reply(fd, &msg);
>>>>>> -    } else if (ret) {
>>>>>> +    } else if (ret == VH_RESULT_ERR) {
>>>>>>            RTE_LOG(ERR, VHOST_CONFIG,
>>>>>>                "vhost message handling failed.\n");
>>>>>>            return -1;
>>>>>>
>>>>
>>>>
>>
>>
> 
>
  
Maxime Coquelin Oct. 3, 2018, 2:39 p.m. UTC | #7
On 10/03/2018 11:07 AM, Ilya Maximets wrote:
> On 03.10.2018 11:32, Ilya Maximets wrote:
>> On 03.10.2018 11:02, Maxime Coquelin wrote:
>>>
>>>
>>> On 10/03/2018 09:57 AM, Ilya Maximets wrote:
>>>> On 03.10.2018 10:50, Maxime Coquelin wrote:
>>>>>
>>>>>
>>>>> On 10/02/2018 04:15 PM, Ilya Maximets wrote:
>>>>>> On 02.10.2018 12:36, Maxime Coquelin wrote:
>>>>>>> Return of message handling has now changed to an enum that can
>>>>>>> take non-negative value that is not zero in case a reply is
>>>>>>> needed. But the code checking the variable afterwards has not
>>>>>>> been updated, leading to success messages handling being
>>>>>>> treated as errors.
>>>>>>>
>>>>>>> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
>>>>>>>
>>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>> ---
>>>>>>>     lib/librte_vhost/vhost_user.c | 6 +++---
>>>>>>>     1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>>>> index 7ef3fb4a4..060b41893 100644
>>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>>> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>         }
>>>>>>>       skip_to_post_handle:
>>>>>>> -    if (!ret && dev->extern_ops.post_msg_handle) {
>>>>>>> +    if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>>>>>>>             uint32_t need_reply;
>>>>>>>               ret = (*dev->extern_ops.post_msg_handle)(
>>>>>>> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>             vhost_user_unlock_all_queue_pairs(dev);
>>>>>>>           if (msg.flags & VHOST_USER_NEED_REPLY) {
>>>>>>
>>>>>> Maybe we need to reply here only if we didn't reply
>>>>>> already (not VH_RESULT_REPLY) ? Otherwise, we could
>>>>>> reply twice (with payload and with return code).
>>>>>
>>>>> Well, if the master sets this bit, it means it is waiting for
>>>>> a "reply-ack", so not sending it would cause the master to wait
>>>>> forever.
>>>>>
>>>>> It is the master responsibility to not set this bit for requests
>>>>> already expecting a non "reply-ack" reply (as you fixed it for
>>>>> postcopy's set mem table case).
>>>>
>>>> vhost-user docs in QEMU says:
>>>> "
>>>> For the message types that already solicit a reply from the client, the
>>>> presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
>>>> no behavioural change.
>>>> "
>>>> i.e. even if QEMU sets the need_reply flag, vhost should not reply twice.
>>>> Am I missing something?
>>>
>>> Oh, right. Thanks for pointing it out.
>>>
>>> So coming back to the DPDK implementation, I just had a look again, and it seems that we don't send a reply twice, as send_vhost_reply takes
>>> care of clearing the VHOST_USER_NEED_REPLY flag.
>>> Do you confirm my understanding is correct?
>>
>> Hmm. Yes, you're right. send_vhost_reply clears the VHOST_USER_NEED_REPLY
>> flag and vhost doesn't send replies twice.
>> Maybe some comment with clarifications needed here, or some more
>> refactoring to make this aspect more clear.
>>

Agree.
I'm adding a comment, I don't think a refactoring is required, and I
would be reluctant to add one more refactoring so close to the
integration deadline.

Does it work for you?

Thanks,
Maxime
  
Ilya Maximets Oct. 4, 2018, 5:42 a.m. UTC | #8
On 03.10.2018 17:39, Maxime Coquelin wrote:
> 
> 
> On 10/03/2018 11:07 AM, Ilya Maximets wrote:
>> On 03.10.2018 11:32, Ilya Maximets wrote:
>>> On 03.10.2018 11:02, Maxime Coquelin wrote:
>>>>
>>>>
>>>> On 10/03/2018 09:57 AM, Ilya Maximets wrote:
>>>>> On 03.10.2018 10:50, Maxime Coquelin wrote:
>>>>>>
>>>>>>
>>>>>> On 10/02/2018 04:15 PM, Ilya Maximets wrote:
>>>>>>> On 02.10.2018 12:36, Maxime Coquelin wrote:
>>>>>>>> Return of message handling has now changed to an enum that can
>>>>>>>> take non-negative value that is not zero in case a reply is
>>>>>>>> needed. But the code checking the variable afterwards has not
>>>>>>>> been updated, leading to success messages handling being
>>>>>>>> treated as errors.
>>>>>>>>
>>>>>>>> Fixes: 4e601952cae6 ("vhost: message handling implemented as a callback array")
>>>>>>>>
>>>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>> ---
>>>>>>>>     lib/librte_vhost/vhost_user.c | 6 +++---
>>>>>>>>     1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>>>>> index 7ef3fb4a4..060b41893 100644
>>>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>>>> @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>>         }
>>>>>>>>       skip_to_post_handle:
>>>>>>>> -    if (!ret && dev->extern_ops.post_msg_handle) {
>>>>>>>> +    if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
>>>>>>>>             uint32_t need_reply;
>>>>>>>>               ret = (*dev->extern_ops.post_msg_handle)(
>>>>>>>> @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>>             vhost_user_unlock_all_queue_pairs(dev);
>>>>>>>>           if (msg.flags & VHOST_USER_NEED_REPLY) {
>>>>>>>
>>>>>>> Maybe we need to reply here only if we didn't reply
>>>>>>> already (not VH_RESULT_REPLY) ? Otherwise, we could
>>>>>>> reply twice (with payload and with return code).
>>>>>>
>>>>>> Well, if the master sets this bit, it means it is waiting for
>>>>>> a "reply-ack", so not sending it would cause the master to wait
>>>>>> forever.
>>>>>>
>>>>>> It is the master responsibility to not set this bit for requests
>>>>>> already expecting a non "reply-ack" reply (as you fixed it for
>>>>>> postcopy's set mem table case).
>>>>>
>>>>> vhost-user docs in QEMU says:
>>>>> "
>>>>> For the message types that already solicit a reply from the client, the
>>>>> presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
>>>>> no behavioural change.
>>>>> "
>>>>> i.e. even if QEMU sets the need_reply flag, vhost should not reply twice.
>>>>> Am I missing something?
>>>>
>>>> Oh, right. Thanks for pointing it out.
>>>>
>>>> So coming back to the DPDK implementation, I just had a look again, and it seems that we don't send a reply twice, as send_vhost_reply takes
>>>> care of clearing the VHOST_USER_NEED_REPLY flag.
>>>> Do you confirm my understanding is correct?
>>>
>>> Hmm. Yes, you're right. send_vhost_reply clears the VHOST_USER_NEED_REPLY
>>> flag and vhost doesn't send replies twice.
>>> Maybe some comment with clarifications needed here, or some more
>>> refactoring to make this aspect more clear.
>>>
> 
> Agree.
> I'm adding a comment, I don't think a refactoring is required, and I
> would be reluctant to add one more refactoring so close to the
> integration deadline.
> 
> Does it work for you?

Sure. Thanks. I agree that it's not the right time for refactoring now.

> 
> Thanks,
> Maxime
> 
>
  

Patch

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 7ef3fb4a4..060b41893 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1783,7 +1783,7 @@  vhost_user_msg_handler(int vid, int fd)
 	}
 
 skip_to_post_handle:
-	if (!ret && dev->extern_ops.post_msg_handle) {
+	if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) {
 		uint32_t need_reply;
 
 		ret = (*dev->extern_ops.post_msg_handle)(
@@ -1800,10 +1800,10 @@  vhost_user_msg_handler(int vid, int fd)
 		vhost_user_unlock_all_queue_pairs(dev);
 
 	if (msg.flags & VHOST_USER_NEED_REPLY) {
-		msg.payload.u64 = !!ret;
+		msg.payload.u64 = ret == VH_RESULT_ERR;
 		msg.size = sizeof(msg.payload.u64);
 		send_vhost_reply(fd, &msg);
-	} else if (ret) {
+	} else if (ret == VH_RESULT_ERR) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"vhost message handling failed.\n");
 		return -1;