[v6,06/10] eal: add thread lifetime management

Message ID 1617413948-10504-7-git-send-email-navasile@linux.microsoft.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Headers
Series eal: Add new API for threading |

Checks

Context Check Description
ci/checkpatch warning coding style issues

Commit Message

Narcisa Ana Maria Vasile April 3, 2021, 1:39 a.m. UTC
  From: Narcisa Vasile <navasile@microsoft.com>

Add function for thread creation, join, canceling.

Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
---
 lib/librte_eal/common/rte_thread.c  | 110 ++++++++++++++++++++++++
 lib/librte_eal/include/rte_thread.h |  53 ++++++++++++
 lib/librte_eal/windows/rte_thread.c | 125 ++++++++++++++++++++++++++++
 3 files changed, 288 insertions(+)
  

Comments

Dmitry Kozlyuk April 29, 2021, 8:44 p.m. UTC | #1
2021-04-02 18:39 (UTC-0700), Narcisa Ana Maria Vasile:
[...]
> diff --git a/lib/librte_eal/windows/rte_thread.c b/lib/librte_eal/windows/rte_thread.c
> index f61103bbc..86bbd7bc2 100644
> --- a/lib/librte_eal/windows/rte_thread.c
> +++ b/lib/librte_eal/windows/rte_thread.c
> @@ -329,6 +329,131 @@ rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr,
>  	return 0;
>  }
>  
> +int
> +rte_thread_create(rte_thread_t *thread_id,
> +		  const rte_thread_attr_t *thread_attr,
> +		  void *(*thread_func)(void *), void *args)
> +{
> +	int ret = 0;
> +	HANDLE thread_handle = NULL;
> +	GROUP_AFFINITY thread_affinity;
> +
> +	thread_handle = CreateThread(NULL, 0,
> +				(LPTHREAD_START_ROUTINE)(ULONG_PTR)thread_func,

thread_func returns void* (8 bytes), LPTHREAD_START_ROUTING
returns DWORD (4 byte), is this cast safe?
It seems to be on x86, can't say for ARM64.

> +				args, 0, thread_id);
> +	if (thread_handle == NULL) {
> +		ret = rte_thread_translate_win32_error();
> +		RTE_LOG_WIN32_ERR("CreateThread()");
> +		goto cleanup;
> +	}

After this point, in case of errors the thread remains running,
while the function reports failure. It's better to create the thread
suspended, configure its attributes (destroying the thread on failure), then
resume (start) the thread.

> +
> +	if (thread_attr != NULL) {
> +		if (CPU_COUNT(&thread_attr->cpuset) > 0) {
> +			ret = rte_convert_cpuset_to_affinity(&thread_attr->cpuset, &thread_affinity);
> +			if (ret != 0) {
> +				RTE_LOG(DEBUG, EAL, "Unable to convert cpuset to thread affinity\n");
> +				goto cleanup;
> +			}
> +
> +			if (!SetThreadGroupAffinity(thread_handle, &thread_affinity, NULL)) {
> +				ret = rte_thread_translate_win32_error();
> +				RTE_LOG_WIN32_ERR("SetThreadGroupAffinity()");
> +				goto cleanup;
> +			}
> +		}
> +		ret = rte_thread_set_priority(*thread_id, thread_attr->priority);
> +		if (ret != 0) {
> +			RTE_LOG(DEBUG, EAL, "Unable to set thread priority\n");
> +			goto cleanup;
> +		}
> +	}
> +
> +	return 0;
> +
> +cleanup:
> +	if (thread_handle != NULL) {
> +		CloseHandle(thread_handle);
> +		thread_handle = NULL;
> +	}
> +	return ret;
> +}

[...]
> +
> +int
> +rte_thread_cancel(rte_thread_t thread_id)
> +{
> +	int ret = 0;
> +	HANDLE thread_handle = NULL;
> +
> +	thread_handle = OpenThread(THREAD_TERMINATE, FALSE, thread_id);
> +	if (thread_handle == NULL) {
> +		ret = rte_thread_translate_win32_error();
> +		RTE_LOG_WIN32_ERR("OpenThread()");
> +		goto cleanup;
> +	}
> +
> +	/*
> +	 * TODO: Behavior is different between POSIX and Windows threads.
> +	 * POSIX threads wait for a cancellation point.
> +	 * Current Windows emulation kills thread at any point.
> +	 */
> +	ret = TerminateThread(thread_handle, 0);

This is not acceptable.
There is a handful of pthread_cancel() usages in DPDK.
The threads they cancel take spinlocks and mutexes,
so interrupt at arbitrary point may cause a deadlock.

User		Cancellation points	Notes
----		-------------------	-----
net/ipn3ke	libfdt functions?
net/kni		usleep			Linux-only
raw/ifpga	open/read/write/close
vdpa/ifc	read
vdpa/mlx5	usleep, nanosleep
lib/eventdev	rte_epoll_wait		no pthread CP

Possible solutions:

1. Make specific DPDK functions cancellation points for this API and ensure
DPDK users of pthread_cancel() call it.  This way is almost non-invasive,
but it's more work in EAL.

2. Divert from pthread style: add cancellation token concept, so that DPDK
users can check themselves if they're cancelled. This is invasive, but very
explicit and flexible.

> +	if (ret != 0) {
> +		ret = rte_thread_translate_win32_error();
> +		RTE_LOG_WIN32_ERR("TerminateThread()");
> +		goto cleanup;
> +	}
> +
> +cleanup:
> +	if (thread_handle != NULL) {
> +		CloseHandle(thread_handle);
> +		thread_handle = NULL;

This line is not needed.

> +	}
> +	return ret;
> +}
> +
>  int
>  rte_thread_key_create(rte_thread_key *key,
>  		__rte_unused void (*destructor)(void *))
  
Dmitry Malloy April 29, 2021, 9:31 p.m. UTC | #2
Thread cancellation is a pain point. Emulating it properly is nearly impossible (without hooking into various OS calls which are supposed to be "cancellation points"). I do like the cancellation token idea, but I'm not sure how existing clients, who rely on existing phtread_cancel() semantic, will react to that - it will require a code re-write. 

How about we defer fixing this to another follow-up change? 

-----Original Message-----
From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> 
Sent: Thursday, April 29, 2021 1:44 PM
To: Narcisa Ana Maria Vasile <navasile@linux.microsoft.com>
Cc: dev@dpdk.org; thomas <thomas@monjalon.net>; Khoa To <khot@microsoft.com>; Narcisa Ana Maria Vasile <Narcisa.Vasile@microsoft.com>; Dmitry Malloy <dmitrym@microsoft.com>; Tyler Retzlaff <roretzla@microsoft.com>; talshn@nvidia.com; Omar Cardona <ocardona@microsoft.com>; bruce.richardson@intel.com; david.marchand@redhat.com; Kadam, Pallavi <pallavi.kadam@intel.com>
Subject: [EXTERNAL] Re: [PATCH v6 06/10] eal: add thread lifetime management

2021-04-02 18:39 (UTC-0700), Narcisa Ana Maria Vasile:
[...]
> diff --git a/lib/librte_eal/windows/rte_thread.c 
> b/lib/librte_eal/windows/rte_thread.c
> index f61103bbc..86bbd7bc2 100644
> --- a/lib/librte_eal/windows/rte_thread.c
> +++ b/lib/librte_eal/windows/rte_thread.c
> @@ -329,6 +329,131 @@ rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr,
>  	return 0;
>  }
>  
> +int
> +rte_thread_create(rte_thread_t *thread_id,
> +		  const rte_thread_attr_t *thread_attr,
> +		  void *(*thread_func)(void *), void *args) {
> +	int ret = 0;
> +	HANDLE thread_handle = NULL;
> +	GROUP_AFFINITY thread_affinity;
> +
> +	thread_handle = CreateThread(NULL, 0,
> +				(LPTHREAD_START_ROUTINE)(ULONG_PTR)thread_func,

thread_func returns void* (8 bytes), LPTHREAD_START_ROUTING returns DWORD (4 byte), is this cast safe?
It seems to be on x86, can't say for ARM64.

> +				args, 0, thread_id);
> +	if (thread_handle == NULL) {
> +		ret = rte_thread_translate_win32_error();
> +		RTE_LOG_WIN32_ERR("CreateThread()");
> +		goto cleanup;
> +	}

After this point, in case of errors the thread remains running, while the function reports failure. It's better to create the thread suspended, configure its attributes (destroying the thread on failure), then resume (start) the thread.

> +
> +	if (thread_attr != NULL) {
> +		if (CPU_COUNT(&thread_attr->cpuset) > 0) {
> +			ret = rte_convert_cpuset_to_affinity(&thread_attr->cpuset, &thread_affinity);
> +			if (ret != 0) {
> +				RTE_LOG(DEBUG, EAL, "Unable to convert cpuset to thread affinity\n");
> +				goto cleanup;
> +			}
> +
> +			if (!SetThreadGroupAffinity(thread_handle, &thread_affinity, NULL)) {
> +				ret = rte_thread_translate_win32_error();
> +				RTE_LOG_WIN32_ERR("SetThreadGroupAffinity()");
> +				goto cleanup;
> +			}
> +		}
> +		ret = rte_thread_set_priority(*thread_id, thread_attr->priority);
> +		if (ret != 0) {
> +			RTE_LOG(DEBUG, EAL, "Unable to set thread priority\n");
> +			goto cleanup;
> +		}
> +	}
> +
> +	return 0;
> +
> +cleanup:
> +	if (thread_handle != NULL) {
> +		CloseHandle(thread_handle);
> +		thread_handle = NULL;
> +	}
> +	return ret;
> +}

[...]
> +
> +int
> +rte_thread_cancel(rte_thread_t thread_id) {
> +	int ret = 0;
> +	HANDLE thread_handle = NULL;
> +
> +	thread_handle = OpenThread(THREAD_TERMINATE, FALSE, thread_id);
> +	if (thread_handle == NULL) {
> +		ret = rte_thread_translate_win32_error();
> +		RTE_LOG_WIN32_ERR("OpenThread()");
> +		goto cleanup;
> +	}
> +
> +	/*
> +	 * TODO: Behavior is different between POSIX and Windows threads.
> +	 * POSIX threads wait for a cancellation point.
> +	 * Current Windows emulation kills thread at any point.
> +	 */
> +	ret = TerminateThread(thread_handle, 0);

This is not acceptable.
There is a handful of pthread_cancel() usages in DPDK.
The threads they cancel take spinlocks and mutexes, so interrupt at arbitrary point may cause a deadlock.

User		Cancellation points	Notes
----		-------------------	-----
net/ipn3ke	libfdt functions?
net/kni		usleep			Linux-only
raw/ifpga	open/read/write/close
vdpa/ifc	read
vdpa/mlx5	usleep, nanosleep
lib/eventdev	rte_epoll_wait		no pthread CP

Possible solutions:

1. Make specific DPDK functions cancellation points for this API and ensure DPDK users of pthread_cancel() call it.  This way is almost non-invasive, but it's more work in EAL.

2. Divert from pthread style: add cancellation token concept, so that DPDK users can check themselves if they're cancelled. This is invasive, but very explicit and flexible.

> +	if (ret != 0) {
> +		ret = rte_thread_translate_win32_error();
> +		RTE_LOG_WIN32_ERR("TerminateThread()");
> +		goto cleanup;
> +	}
> +
> +cleanup:
> +	if (thread_handle != NULL) {
> +		CloseHandle(thread_handle);
> +		thread_handle = NULL;

This line is not needed.

> +	}
> +	return ret;
> +}
> +
>  int
>  rte_thread_key_create(rte_thread_key *key,
>  		__rte_unused void (*destructor)(void *))
  
Dmitry Kozlyuk April 30, 2021, 5:22 p.m. UTC | #3
2021-04-29 21:31 (UTC+0000), Dmitry Malloy:
> Thread cancellation is a pain point. Emulating it properly is nearly
> impossible (without hooking into various OS calls which are supposed to be
> "cancellation points"). I do like the cancellation token idea, but I'm not
> sure how existing clients, who rely on existing phtread_cancel() semantic,
> will react to that - it will require a code re-write. 
>
> How about we defer fixing this to another follow-up change?

I'm strictly against accepting code with known severe bugs.

Alternative:

1. Don't add rte_thread_cancel() in public API.
2. Implement rte_thread_cancel() as internal API with cancellation tokens.
3. Rewrite DPDK code to use rte_thread_cancel().
4. For external users:
4.1. If their app is using pthread, they can use pthread backend for DPDK and
call pthread_cancel() directly.
4.2. If their app is not designed for pthread, they probably don't need
pthread_cancel() semantics and cancel threads another way.
4.3. If it's a brand new app, they will only use public API.

In 4.1 users need a pthread_t. In principle, they can call pthread_self()
from the thread to be cancelled and pass it to the thread that cancels. Or we
could provide API to get backend type and native handle. Since users have to
rewrite their code anyway to move from pthread to rte_thread, I'm for the
former approach. "Escape hatch" API would be very fragile.

P.S. Please avoid top-posting.
  
Dmitry Kozlyuk April 30, 2021, 5:51 p.m. UTC | #4
2021-04-30 20:22 (UTC+0300), Dmitry Kozlyuk:
> 2021-04-29 21:31 (UTC+0000), Dmitry Malloy:
> > [...]
> > How about we defer fixing this to another follow-up change?  
> 
> I'm strictly against accepting code with known severe bugs.
> 
> Alternative:
> 
> 1. Don't add rte_thread_cancel() in public API.
> 2. Implement rte_thread_cancel() as internal API with cancellation tokens.
> 3. Rewrite DPDK code to use rte_thread_cancel().
> [...]

Or just don't export this function on Windows for now.
None of the drivers using it is soon to be enabled for Windows,
and portable apps don't need it anyway.
  

Patch

diff --git a/lib/librte_eal/common/rte_thread.c b/lib/librte_eal/common/rte_thread.c
index 26c5b1f3c..29d38d193 100644
--- a/lib/librte_eal/common/rte_thread.c
+++ b/lib/librte_eal/common/rte_thread.c
@@ -120,6 +120,116 @@  rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr,
 	return 0;
 }
 
+int
+rte_thread_create(rte_thread_t *thread_id,
+		  const rte_thread_attr_t *thread_attr,
+		  void *(*thread_func)(void *), void *args)
+{
+	int ret = 0;
+	pthread_attr_t attr;
+	pthread_attr_t *attrp = NULL;
+	struct sched_param param = {
+		.sched_priority = 0,
+	};
+	int policy = SCHED_OTHER;
+
+	if (thread_attr != NULL) {
+		ret = pthread_attr_init(&attr);
+		if (ret != 0) {
+			RTE_LOG(DEBUG, EAL, "pthread_attr_init failed\n");
+			goto cleanup;
+		}
+
+		attrp = &attr;
+
+		/*
+		 * Set the inherit scheduler parameter to explicit,
+		 * otherwise the priority attribute is ignored.
+		 */
+		ret = pthread_attr_setinheritsched(attrp,
+						   PTHREAD_EXPLICIT_SCHED);
+		if (ret != 0) {
+			RTE_LOG(DEBUG, EAL, "pthread_attr_setinheritsched failed\n");
+			goto cleanup;
+		}
+
+		/*
+		 * In case a realtime scheduling policy is requested,
+		 * the sched_priority parameter is set to the value stored in
+		 * thread_attr. Otherwise, for the default scheduling policy
+		 * (SCHED_OTHER) sched_priority needs to be initialized to 0.
+		 */
+		if (thread_attr->priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
+			policy = SCHED_RR;
+			param.sched_priority = thread_attr->priority;
+		}
+
+		ret = pthread_attr_setschedpolicy(attrp, policy);
+		if (ret != 0) {
+			RTE_LOG(DEBUG, EAL, "pthread_attr_setschedpolicy failed\n");
+			goto cleanup;
+		}
+
+		ret = pthread_attr_setschedparam(attrp, &param);
+		if (ret != 0) {
+			RTE_LOG(DEBUG, EAL, "pthread_attr_setschedparam failed\n");
+			goto cleanup;
+		}
+
+		ret = pthread_attr_setaffinity_np(attrp,
+						  sizeof(thread_attr->cpuset),
+						  &thread_attr->cpuset);
+		if (ret != 0) {
+			RTE_LOG(DEBUG, EAL, "pthread_attr_setaffinity_np failed\n");
+			goto cleanup;
+		}
+	}
+
+	ret = pthread_create(thread_id, attrp, thread_func, args);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "pthread_create failed\n");
+		goto cleanup;
+	}
+
+cleanup:
+	if (attrp != NULL)
+		pthread_attr_destroy(&attr);
+
+	return ret;
+}
+
+int
+rte_thread_join(rte_thread_t thread_id, int *value_ptr)
+{
+	int ret = 0;
+	void *res = NULL;
+	void **pres = NULL;
+
+	if (value_ptr != NULL)
+		pres = &res;
+
+	ret = pthread_join(thread_id, pres);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "pthread_join failed\n");
+		return ret;
+	}
+
+	if (pres != NULL)
+		*value_ptr = *(int *)(*pres);
+
+	return 0;
+}
+
+int rte_thread_cancel(rte_thread_t thread_id)
+{
+	/*
+	 * TODO: Behavior is different between POSIX and Windows threads.
+	 * POSIX threads wait for a cancellation point.
+	 * Current Windows emulation kills thread at any point.
+	 */
+	return pthread_cancel(thread_id);
+}
+
 int
 rte_thread_key_create(rte_thread_key *key, void (*destructor)(void *))
 {
diff --git a/lib/librte_eal/include/rte_thread.h b/lib/librte_eal/include/rte_thread.h
index f95efb319..fa643433a 100644
--- a/lib/librte_eal/include/rte_thread.h
+++ b/lib/librte_eal/include/rte_thread.h
@@ -200,6 +200,59 @@  __rte_experimental
 int rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr,
 				 enum rte_thread_priority priority);
 
+/**
+ * Create a new thread that will invoke the 'thread_func' routine.
+ *
+ * @param thread_id
+ *    A pointer that will store the id of the newly created thread.
+ *
+ * @param thread_attr
+ *    Attributes that are used at the creation of the new thread.
+ *
+ * @param thread_func
+ *    The routine that the new thread will invoke when starting execution.
+ *
+ * @param args
+ *    Arguments to be passed to the 'thread_func' routine.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_create(rte_thread_t *thread_id,
+		      const rte_thread_attr_t *thread_attr,
+		      void *(*thread_func)(void *), void *args);
+
+/**
+ * Waits for the thread identified by 'thread_id' to terminate
+ *
+ * @param thread_id
+ *    The identifier of the thread.
+ *
+ * @param value_ptr
+ *    Stores the exit status of the thread.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_join(rte_thread_t thread_id, int *value_ptr);
+
+/**
+ * Terminates a thread.
+ *
+ * @param thread_id
+ *    The id of the thread to be cancelled.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_cancel(rte_thread_t thread_id);
+
 /**
  * Set core affinity of the current thread.
  * Support both EAL and non-EAL thread and update TLS.
diff --git a/lib/librte_eal/windows/rte_thread.c b/lib/librte_eal/windows/rte_thread.c
index f61103bbc..86bbd7bc2 100644
--- a/lib/librte_eal/windows/rte_thread.c
+++ b/lib/librte_eal/windows/rte_thread.c
@@ -329,6 +329,131 @@  rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr,
 	return 0;
 }
 
+int
+rte_thread_create(rte_thread_t *thread_id,
+		  const rte_thread_attr_t *thread_attr,
+		  void *(*thread_func)(void *), void *args)
+{
+	int ret = 0;
+	HANDLE thread_handle = NULL;
+	GROUP_AFFINITY thread_affinity;
+
+	thread_handle = CreateThread(NULL, 0,
+				(LPTHREAD_START_ROUTINE)(ULONG_PTR)thread_func,
+				args, 0, thread_id);
+	if (thread_handle == NULL) {
+		ret = rte_thread_translate_win32_error();
+		RTE_LOG_WIN32_ERR("CreateThread()");
+		goto cleanup;
+	}
+
+	if (thread_attr != NULL) {
+		if (CPU_COUNT(&thread_attr->cpuset) > 0) {
+			ret = rte_convert_cpuset_to_affinity(&thread_attr->cpuset, &thread_affinity);
+			if (ret != 0) {
+				RTE_LOG(DEBUG, EAL, "Unable to convert cpuset to thread affinity\n");
+				goto cleanup;
+			}
+
+			if (!SetThreadGroupAffinity(thread_handle, &thread_affinity, NULL)) {
+				ret = rte_thread_translate_win32_error();
+				RTE_LOG_WIN32_ERR("SetThreadGroupAffinity()");
+				goto cleanup;
+			}
+		}
+		ret = rte_thread_set_priority(*thread_id, thread_attr->priority);
+		if (ret != 0) {
+			RTE_LOG(DEBUG, EAL, "Unable to set thread priority\n");
+			goto cleanup;
+		}
+	}
+
+	return 0;
+
+cleanup:
+	if (thread_handle != NULL) {
+		CloseHandle(thread_handle);
+		thread_handle = NULL;
+	}
+	return ret;
+}
+
+int
+rte_thread_join(rte_thread_t thread_id, int *value_ptr)
+{
+	HANDLE thread_handle = NULL;
+	DWORD result = 0;
+	DWORD exit_code = 0;
+	BOOL err = 0;
+	int ret = 0;
+
+	thread_handle = OpenThread(SYNCHRONIZE | THREAD_QUERY_INFORMATION,
+				   FALSE, thread_id);
+	if (thread_handle == NULL) {
+		ret = rte_thread_translate_win32_error();
+		RTE_LOG_WIN32_ERR("OpenThread()");
+		goto cleanup;
+	}
+
+	result = WaitForSingleObject(thread_handle, INFINITE);
+	if (result != WAIT_OBJECT_0) {
+		ret = rte_thread_translate_win32_error();
+		RTE_LOG_WIN32_ERR("WaitForSingleObject()");
+		goto cleanup;
+	}
+
+	if (value_ptr != NULL) {
+		err = GetExitCodeThread(thread_handle, &exit_code);
+		if (err == 0) {
+			ret = rte_thread_translate_win32_error();
+			RTE_LOG_WIN32_ERR("GetExitCodeThread()");
+			goto cleanup;
+		}
+		*value_ptr = exit_code;
+	}
+
+cleanup:
+	if (thread_handle != NULL) {
+		CloseHandle(thread_handle);
+		thread_handle = NULL;
+	}
+
+	return ret;
+}
+
+int
+rte_thread_cancel(rte_thread_t thread_id)
+{
+	int ret = 0;
+	HANDLE thread_handle = NULL;
+
+	thread_handle = OpenThread(THREAD_TERMINATE, FALSE, thread_id);
+	if (thread_handle == NULL) {
+		ret = rte_thread_translate_win32_error();
+		RTE_LOG_WIN32_ERR("OpenThread()");
+		goto cleanup;
+	}
+
+	/*
+	 * TODO: Behavior is different between POSIX and Windows threads.
+	 * POSIX threads wait for a cancellation point.
+	 * Current Windows emulation kills thread at any point.
+	 */
+	ret = TerminateThread(thread_handle, 0);
+	if (ret != 0) {
+		ret = rte_thread_translate_win32_error();
+		RTE_LOG_WIN32_ERR("TerminateThread()");
+		goto cleanup;
+	}
+
+cleanup:
+	if (thread_handle != NULL) {
+		CloseHandle(thread_handle);
+		thread_handle = NULL;
+	}
+	return ret;
+}
+
 int
 rte_thread_key_create(rte_thread_key *key,
 		__rte_unused void (*destructor)(void *))