[2/2] eal: fix hang in ctrl thread creation error logic
Commit Message
The affinity of a control thread is set after it has been launched. If
setting the affinity fails, pthread_cancel is called followed by a call
to pthread_join, which can hang forever if the thread's start routine
doesn't call a pthread cancellation point.
This patch modifies the logic so that the control thread exits
gracefully if the affinity cannot be set successfully and removes the
call to pthread_cancel.
Fixes: 6383d26 ("eal: set name when creating a control thread")
Cc: olivier.matz@6wind.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Hi Olivier,
Hi Honnappa,
As discussed, I've split the changes into 2 patches. This second commit
removes the pthread_cancel call which could result in a hang on join, if
the ctrl thread routine didn't call a cancellation point.
lib/librte_eal/common/eal_common_thread.c | 29 +++++++++++++----------
1 file changed, 17 insertions(+), 12 deletions(-)
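
For readers without the DPDK tree at hand, the handshake described above can be sketched in standalone C. This is only an illustration under assumptions, not the EAL code: struct ctrl_params, ctrl_params_put(), ctrl_thread_create(), mark_ran() and the setup_err parameter (which stands in for the return value of pthread_setaffinity_np()) are all hypothetical names. The point it tries to show is that the new thread reads start_routine only after the barrier, so the creator can clear it on failure and then join without depending on a cancellation point.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the EAL parameter block. */
struct ctrl_params {
	void *(*start_routine)(void *);
	void *arg;
	pthread_barrier_t configured;
	atomic_int refcnt; /* one reference each for creator and new thread */
};

/* Drop one reference; the last holder destroys the barrier and frees. */
static void ctrl_params_put(struct ctrl_params *params)
{
	if (atomic_fetch_sub(&params->refcnt, 1) == 1) {
		pthread_barrier_destroy(&params->configured);
		free(params);
	}
}

static void *ctrl_thread_init(void *p)
{
	struct ctrl_params *params = p;
	void *(*start_routine)(void *);
	void *arg = params->arg;

	pthread_barrier_wait(&params->configured);
	/* Read only after the barrier: the creator may have cleared it. */
	start_routine = params->start_routine;
	ctrl_params_put(params);

	if (start_routine == NULL)
		return NULL; /* setup failed: exit without running user code */
	return start_routine(arg);
}

/* setup_err simulates the result of the post-creation setup step. */
static int ctrl_thread_create(pthread_t *thread,
		void *(*routine)(void *), void *arg, int setup_err)
{
	struct ctrl_params *params = malloc(sizeof(*params));
	int ret;

	if (params == NULL)
		return -ENOMEM;
	params->start_routine = routine;
	params->arg = arg;
	atomic_init(&params->refcnt, 2);

	ret = pthread_barrier_init(&params->configured, NULL, 2);
	if (ret != 0) {
		free(params);
		return -ret;
	}
	ret = pthread_create(thread, NULL, ctrl_thread_init, params);
	if (ret != 0) {
		pthread_barrier_destroy(&params->configured);
		free(params);
		return -ret;
	}

	ret = setup_err;
	if (ret != 0)
		params->start_routine = NULL; /* tell the thread to bail out */

	pthread_barrier_wait(&params->configured);
	ctrl_params_put(params);

	if (ret != 0)
		pthread_join(*thread, NULL); /* thread returns right away */
	return -ret;
}

/* Sample routine used to observe whether user code ran. */
static void *mark_ran(void *arg)
{
	*(int *)arg = 1;
	return arg;
}
```

Calling ctrl_thread_create(&t, mark_ran, &flag, EINVAL) should return -EINVAL with flag untouched, and the join inside cannot block on the user routine because that routine is never entered. With setup_err set to 0 the thread runs mark_ran() and the caller joins it as usual.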
Comments
Hi Luc,
On Wed, Apr 07, 2021 at 04:16:06PM -0400, Luc Pelletier wrote:
> The affinity of a control thread is set after it has been launched. If
> setting the affinity fails, pthread_cancel is called followed by a call
> to pthread_join, which can hang forever if the thread's start routine
> doesn't call a pthread cancellation point.
>
> This patch modifies the logic so that the control thread exits
> gracefully if the affinity cannot be set successfully and removes the
> call to pthread_cancel.
>
> Fixes: 6383d26 ("eal: set name when creating a control thread")
> Cc: olivier.matz@6wind.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Thank you for these 2 fixes. Note that the titles of your patches do not
contain the version (should have been v8?). I don't know how critical
it is for committers.
Acked-by: Olivier Matz <olivier.matz@6wind.com>
<snip>
>
> The affinity of a control thread is set after it has been launched. If setting the
> affinity fails, pthread_cancel is called followed by a call to pthread_join, which
> can hang forever if the thread's start routine doesn't call a pthread
> cancellation point.
>
> This patch modifies the logic so that the control thread exits gracefully if the
> affinity cannot be set successfully and removes the call to pthread_cancel.
>
> Fixes: 6383d26 ("eal: set name when creating a control thread")
> Cc: olivier.matz@6wind.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Looks good.
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>
<snip>
> Thank you for these 2 fixes. Note that the titles of your patches do not
> contain the version (should have been v8?). I don't know how critical
> it is for committers.
Thanks Olivier. I'll admit that I wasn't sure if I should version the
patches after splitting the original. I opted not to but it seems like
I should have. If it's a problem, please let me know and I'll repost
them with 'v8'.
On Thu, Apr 8, 2021 at 10:20, Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Hi Luc,
>
> On Wed, Apr 07, 2021 at 04:16:06PM -0400, Luc Pelletier wrote:
> > The affinity of a control thread is set after it has been launched. If
> > setting the affinity fails, pthread_cancel is called followed by a call
> > to pthread_join, which can hang forever if the thread's start routine
> > doesn't call a pthread cancellation point.
> >
> > This patch modifies the logic so that the control thread exits
> > gracefully if the affinity cannot be set successfully and removes the
> > call to pthread_cancel.
> >
> > Fixes: 6383d26 ("eal: set name when creating a control thread")
> > Cc: olivier.matz@6wind.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
>
> Thank you for these 2 fixes. Note that the titles of your patches do not
> contain the version (should have been v8?). I don't know how critical
> it is for committers.
>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
On Thu, Apr 8, 2021 at 8:02 PM Luc Pelletier <lucp.at.work@gmail.com> wrote:
>
> > Thank you for these 2 fixes. Note that the titles of your patches do not
> > contain the version (should have been v8?). I don't know how critical
> > it is for committers.
>
> Thanks Olivier. I'll admit that I wasn't sure if I should version the
> patches after splitting the original. I opted not to but it seems like
> I should have. If it's a problem, please let me know and I'll repost
> them with 'v8'.
I followed this series closely, so not an issue for me.
No need to resend.
I'll look at merging it today.
On Wed, Apr 7, 2021 at 10:29 PM Luc Pelletier <lucp.at.work@gmail.com> wrote:
>
> The affinity of a control thread is set after it has been launched. If
> setting the affinity fails, pthread_cancel is called followed by a call
> to pthread_join, which can hang forever if the thread's start routine
> doesn't call a pthread cancellation point.
>
> This patch modifies the logic so that the control thread exits
> gracefully if the affinity cannot be set successfully and removes the
> call to pthread_cancel.
>
> Fixes: 6383d26 ("eal: set name when creating a control thread")
Fixed the SHA-1s while applying.
We prefer SHA-1s abbreviated to 12 characters, as described in
https://doc.dpdk.org/guides/contributing/patches.html#commit-messages-body.
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Series applied, thanks for the fixes.
@@ -187,14 +187,18 @@ static void *ctrl_thread_init(void *arg)
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
struct rte_thread_ctrl_params *params = arg;
- void *(*start_routine)(void *) = params->start_routine;
+ void *(*start_routine)(void *);
void *routine_arg = params->arg;
__rte_thread_init(rte_lcore_id(), cpuset);
pthread_barrier_wait(&params->configured);
+ start_routine = params->start_routine;
ctrl_params_free(params);
+ if (start_routine == NULL)
+ return NULL;
+
return start_routine(routine_arg);
}
@@ -218,14 +222,12 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->refcnt = 2;
ret = pthread_barrier_init(&params->configured, NULL, 2);
- if (ret != 0) {
- free(params);
- return -ret;
- }
+ if (ret != 0)
+ goto fail_no_barrier;
ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
if (ret != 0)
- goto fail;
+ goto fail_with_barrier;
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
@@ -236,19 +238,22 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
if (ret != 0)
- goto fail_cancel;
+ params->start_routine = NULL;
pthread_barrier_wait(&params->configured);
ctrl_params_free(params);
- return 0;
+ if (ret != 0)
+ /* start_routine has been set to NULL above; */
+ /* ctrl thread will exit immediately */
+ pthread_join(*thread, NULL);
-fail_cancel:
- pthread_cancel(*thread);
- pthread_join(*thread, NULL);
+ return -ret;
-fail:
+fail_with_barrier:
pthread_barrier_destroy(&params->configured);
+
+fail_no_barrier:
free(params);
return -ret;
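
The two error labels in the last hunk follow the usual staged-cleanup pattern: each label undoes one more step than the label below it, so a failure jumps to the label matching what has been set up so far. A minimal sketch of that ordering, assuming a hypothetical staged_setup() helper and a fail_at parameter that simulates failures (neither exists in DPDK), could look like this:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical context holding the only resource that needs teardown. */
struct setup_ctx {
	pthread_barrier_t barrier;
};

/*
 * fail_at: 0 = succeed, 1 = simulate a barrier init failure,
 * 2 = simulate a failure in the step after it (thread creation in the
 * patch). Returns 0 on success or a negative errno value.
 */
static int staged_setup(int fail_at, struct setup_ctx **out)
{
	struct setup_ctx *ctx = malloc(sizeof(*ctx));
	int ret;

	if (ctx == NULL)
		return -ENOMEM;

	ret = (fail_at == 1) ? EAGAIN :
		pthread_barrier_init(&ctx->barrier, NULL, 2);
	if (ret != 0)
		goto fail_no_barrier; /* nothing to destroy yet */

	ret = (fail_at == 2) ? EAGAIN : 0; /* stands in for pthread_create() */
	if (ret != 0)
		goto fail_with_barrier; /* barrier exists, destroy it first */

	*out = ctx;
	return 0;

fail_with_barrier:
	pthread_barrier_destroy(&ctx->barrier);
	/* fall through */
fail_no_barrier:
	free(ctx);
	return -ret;
}

/* Release a context obtained from a successful staged_setup(). */
static void staged_teardown(struct setup_ctx *ctx)
{
	pthread_barrier_destroy(&ctx->barrier);
	free(ctx);
}
```

Both failure paths return -EAGAIN here and leave *out untouched; only the success path hands ownership of the context to the caller.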