[dpdk-dev] eal/ipc: stop async IPC loop on callback request

Message ID: b4fe53189e469a14102934d59a536bebbad2538d.1523354362.git.anatoly.burakov@intel.com (mailing list archive)
State: Superseded, archived

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Anatoly Burakov April 10, 2018, 10:03 a.m. UTC
  EAL did not stop processing further asynchronous requests on
encountering a request that should trigger the callback. This
resulted in erasing valid requests but not triggering them.

Fix this by stopping the loop once we have a request that we
can trigger. Also, remove unnecessary check for trigger
request being NULL.

Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
Cc: anatoly.burakov@intel.com

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_proc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

Jianfeng Tan April 10, 2018, 1:53 p.m. UTC | #1
On 4/10/2018 6:03 PM, Anatoly Burakov wrote:
> EAL did not stop processing further asynchronous requests on
> encountering a request that should trigger the callback. This
> resulted in erasing valid requests but not triggering them.

That means one wakeup could process multiple replies, and following 
process_async_request() will erase some valid requests?

>
> Fix this by stopping the loop once we have a request that we
> can trigger. Also, remove unnecessary check for trigger
> request being NULL.
>
> Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
> Cc: anatoly.burakov@intel.com
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>


> ---
>   lib/librte_eal/common/eal_common_proc.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
> index f98622f..1ea3b58 100644
> --- a/lib/librte_eal/common/eal_common_proc.c
> +++ b/lib/librte_eal/common/eal_common_proc.c
> @@ -510,11 +510,11 @@ async_reply_handle(void *arg __rte_unused)
>   					TAILQ_REMOVE(&pending_requests.requests,
>   							sr, next);
>   					free(sr);
> -				} else if (action == ACTION_TRIGGER &&
> -						trigger == NULL) {
> +				} else if (action == ACTION_TRIGGER) {
>   					TAILQ_REMOVE(&pending_requests.requests,
>   							sr, next);
>   					trigger = sr;
> +					break;

If I understand it correctly above, break here, we will trigger an async 
action, and then go back to sleep with some ready requests not handled? 
Seems that we shall unlock, process, and lock here. Right?

Thanks,
Jianfeng

>   				}
>   			}
>   		}
  
Anatoly Burakov April 10, 2018, 2:17 p.m. UTC | #2
On 10-Apr-18 2:53 PM, Tan, Jianfeng wrote:
> 
> 
> On 4/10/2018 6:03 PM, Anatoly Burakov wrote:
>> EAL did not stop processing further asynchronous requests on
>> encountering a request that should trigger the callback. This
>> resulted in erasing valid requests but not triggering them.
> 
> That means one wakeup could process multiple replies, and following 
> process_async_request() will erase some valid requests?

Yes.

> 
>>
>> Fix this by stopping the loop once we have a request that we
>> can trigger. Also, remove unnecessary check for trigger
>> request being NULL.
>>
>> Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
>> Cc: anatoly.burakov@intel.com
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> 
>> ---
>>   lib/librte_eal/common/eal_common_proc.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
>> index f98622f..1ea3b58 100644
>> --- a/lib/librte_eal/common/eal_common_proc.c
>> +++ b/lib/librte_eal/common/eal_common_proc.c
>> @@ -510,11 +510,11 @@ async_reply_handle(void *arg __rte_unused)
>>                       TAILQ_REMOVE(&pending_requests.requests,
>>                               sr, next);
>>                       free(sr);
>> -                } else if (action == ACTION_TRIGGER &&
>> -                        trigger == NULL) {
>> +                } else if (action == ACTION_TRIGGER) {
>>                       TAILQ_REMOVE(&pending_requests.requests,
>>                               sr, next);
>>                       trigger = sr;
>> +                    break;
> 
> If I understand it correctly above, break here, we will trigger an async 
> action, and then go back to sleep with some ready requests not handled? 
> Seems that we shall unlock, process, and lock here. Right?

Well, we won't go back to sleep - we'll just loop around and come back.

See eal_common_proc.c:472:

	/* sometimes, we don't even wait */
	if (sr->reply_received) {
		nowait = true;
		break;
	}

Followed by line 478:

	if (nowait)
		ret = 0;

Followed by line 495:

	if (ret == 0 || ret == ETIMEDOUT) {

So, having messages with replies already in the queue will cause wait to 
be cancelled. It's not much slower than unlocking, triggering, and 
locking again.

However, if you're OK with a lock-loop-unlock-trigger-lock-loop-unlock-...
sequence until we run out of triggers, then sure, I can add that.

> 
> Thanks,
> Jianfeng
> 
>>                   }
>>               }
>>           }
> 
>
  
Jianfeng Tan April 10, 2018, 3:16 p.m. UTC | #3
On 4/10/2018 10:17 PM, Burakov, Anatoly wrote:
> On 10-Apr-18 2:53 PM, Tan, Jianfeng wrote:
>>
>>
>> On 4/10/2018 6:03 PM, Anatoly Burakov wrote:
>>> EAL did not stop processing further asynchronous requests on
>>> encountering a request that should trigger the callback. This
>>> resulted in erasing valid requests but not triggering them.
>>
>> That means one wakeup could process multiple replies, and following 
>> process_async_request() will erase some valid requests?
>
> Yes.
>
>>
>>>
>>> Fix this by stopping the loop once we have a request that we
>>> can trigger. Also, remove unnecessary check for trigger
>>> request being NULL.
>>>
>>> Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
>>> Cc: anatoly.burakov@intel.com
>>>
>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>

Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>

>>
>>> ---
>>>   lib/librte_eal/common/eal_common_proc.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
>>> index f98622f..1ea3b58 100644
>>> --- a/lib/librte_eal/common/eal_common_proc.c
>>> +++ b/lib/librte_eal/common/eal_common_proc.c
>>> @@ -510,11 +510,11 @@ async_reply_handle(void *arg __rte_unused)
>>>                       TAILQ_REMOVE(&pending_requests.requests,
>>>                               sr, next);
>>>                       free(sr);
>>> -                } else if (action == ACTION_TRIGGER &&
>>> -                        trigger == NULL) {
>>> +                } else if (action == ACTION_TRIGGER) {
>>>                       TAILQ_REMOVE(&pending_requests.requests,
>>>                               sr, next);
>>>                       trigger = sr;
>>> +                    break;
>>
>> If I understand it correctly above, break here, we will trigger an 
>> async action, and then go back to sleep with some ready requests not 
>> handled? Seems that we shall unlock, process, and lock here. Right?
>
> Well, we won't go back to sleep - we'll just loop around and come back.
>
> See eal_common_proc.c:472:
>
>     /* sometimes, we don't even wait */
>     if (sr->reply_received) {
>         nowait = true;
>         break;
>     }
>
> Followed by line 478:
>
>     if (nowait)
>         ret = 0;
>
> Followed by line 495:
>
>     if (ret == 0 || ret == ETIMEDOUT) {
>
> So, having messages with replies already in the queue will cause wait 
> to be cancelled. It's not much slower than unlocking, triggering, and 
> locking again.

Ah, sorry, I overlooked the fact that we re-scan the request list on
every iteration.

>
> However, if you're OK with a
> lock-loop-unlock-trigger-lock-loop-unlock-... sequence until we run
> out of triggers, then sure, I can add that.

Don't see the reason for that.

Thanks,
Jianfeng
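
The exchange above turns on how the handler decides whether to wait at all. A minimal sketch of that decision follows, based only on the snippets quoted in comment #2. The names toy_req, toy_nowait, toy_wait_result and toy_should_process are hypothetical, and the condition-variable wait is replaced by a simulated return value so the logic can be read in isolation; this is not the code in eal_common_proc.c.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending request; only the flag matters here. */
struct toy_req {
	bool reply_received;
};

/* Returns true if some request already has its reply, i.e. the handler
 * should skip the wait for this iteration ("sometimes, we don't even wait").
 */
static bool
toy_nowait(const struct toy_req *reqs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (reqs[i].reply_received)
			return true;
	return false;
}

/* Models "ret" for one iteration: 0 when the wait is skipped, otherwise
 * whatever the (simulated) wait returned.
 */
static int
toy_wait_result(const struct toy_req *reqs, size_t n, int simulated_wait_ret)
{
	return toy_nowait(reqs, n) ? 0 : simulated_wait_ret;
}

/* The processing pass runs when ret is 0 or the wait timed out. */
static bool
toy_should_process(int ret)
{
	return ret == 0 || ret == ETIMEDOUT;
}
```

Under these assumptions, a request whose reply is still queued after a break forces ret to 0 on the next iteration, so the processing pass runs again without sleeping.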
  

Patch

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index f98622f..1ea3b58 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -510,11 +510,11 @@  async_reply_handle(void *arg __rte_unused)
 					TAILQ_REMOVE(&pending_requests.requests,
 							sr, next);
 					free(sr);
-				} else if (action == ACTION_TRIGGER &&
-						trigger == NULL) {
+				} else if (action == ACTION_TRIGGER) {
 					TAILQ_REMOVE(&pending_requests.requests,
 							sr, next);
 					trigger = sr;
+					break;
 				}
 			}
 		}
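
The effect of the two changed lines can be modelled with a small, self-contained toy. It assumes, as the commit message suggests, that processing a ready request has a side effect that cannot be repeated, so a request processed but not picked as the trigger is lost. Every name here (toy_request, toy_process, toy_scan, toy_drain, toy_demo) is hypothetical, and the model leaves out locking, timeouts and ACTION_FREE entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for a pending async request. */
struct toy_request {
	TAILQ_ENTRY(toy_request) next;
	bool reply_received;	/* a reply is ready to be handled */
	bool consumed;		/* set as a side effect of processing */
	bool triggered;		/* the callback has run for this request */
};
TAILQ_HEAD(toy_list, toy_request);

/* Toy "process" step: consumes a ready request once, then reports that
 * it should trigger. A consumed request never reports again.
 */
static bool
toy_process(struct toy_request *r)
{
	if (!r->reply_received || r->consumed)
		return false;
	r->consumed = true;
	return true;
}

/* One pass over the list; returns the request to trigger, or NULL.
 * stop_on_trigger == false models the old "trigger == NULL" check,
 * stop_on_trigger == true models the patched loop with the break.
 */
static struct toy_request *
toy_scan(struct toy_list *list, bool stop_on_trigger)
{
	struct toy_request *r, *tmp, *trigger = NULL;

	for (r = TAILQ_FIRST(list); r != NULL; r = tmp) {
		tmp = TAILQ_NEXT(r, next);
		if (!toy_process(r))
			continue;
		if (stop_on_trigger) {
			TAILQ_REMOVE(list, r, next);
			trigger = r;
			break;
		} else if (trigger == NULL) {
			TAILQ_REMOVE(list, r, next);
			trigger = r;
		}
		/* otherwise: r was consumed but will never be triggered */
	}
	return trigger;
}

/* Repeat passes until nothing is left to trigger; returns callbacks run. */
static int
toy_drain(struct toy_list *list, bool stop_on_trigger)
{
	struct toy_request *t;
	int n = 0;

	while ((t = toy_scan(list, stop_on_trigger)) != NULL) {
		t->triggered = true;
		n++;
	}
	return n;
}

/* Queue two ready requests and count how many callbacks end up running. */
static int
toy_demo(bool stop_on_trigger)
{
	struct toy_request a = { .reply_received = true };
	struct toy_request b = { .reply_received = true };
	struct toy_list list = TAILQ_HEAD_INITIALIZER(list);

	TAILQ_INSERT_TAIL(&list, &a, next);
	TAILQ_INSERT_TAIL(&list, &b, next);
	return toy_drain(&list, stop_on_trigger);
}
```

In this model toy_demo(false) runs one callback for two ready requests, because the second is consumed during the first pass, while toy_demo(true) runs both, one per pass.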