[v6,1/1] eal/unix: allow creating thread with real-time priority

Message ID 20231027081158.1358064-1-thomas@monjalon.net (mailing list archive)
State New
Delegated to: Thomas Monjalon
Headers
Series [v6,1/1] eal/unix: allow creating thread with real-time priority |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/github-robot: build success github build: passed
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-sample-apps-testing success Testing PASS

Commit Message

Thomas Monjalon Oct. 27, 2023, 8:08 a.m. UTC
  When adding an API for creating threads,
the real-time priority has been forbidden on Unix.

There is a known issue with ring behaviour,
but it should not be completely forbidden.

Real-time thread can block some kernel threads on the same core,
making the system unstable.
That's why a sleep is added in the test thread,
and a warning is logged when using real-time priority.

Fixes: ca04c78b6262 ("eal: get/set thread priority per thread identifier")
Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---

v1: no yield at all
v2: more comments, sched_yield() and Sleep(0) on Windows
v3: 2 yield functions with sleep in realtime version
v4: runtime warning, longer sleep on Unix and lighter yield on Windows
v5: fix build and increase Unix sleep to 1 ms
v6: no yield helper, use a big sleep for testing purpose

It is not clear how to implement helpers which would work
on all systems (different configuration, arch and OS).
So the first patch adding yield functions is dropped for now.

---
 app/test/test_threads.c                       | 12 ++--------
 .../prog_guide/env_abstraction_layer.rst      |  4 +++-
 lib/eal/include/rte_thread.h                  |  8 +++++--
 lib/eal/unix/rte_thread.c                     | 22 +++++++++++--------
 4 files changed, 24 insertions(+), 22 deletions(-)
  

Comments

Morten Brørup Oct. 27, 2023, 8:45 a.m. UTC | #1
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Friday, 27 October 2023 10.09
> 
> When adding an API for creating threads,
> the real-time priority has been forbidden on Unix.
> 
> There is a known issue with ring behaviour,
> but it should not be completely forbidden.
> 
> Real-time thread can block some kernel threads on the same core,
> making the system unstable.
> That's why a sleep is added in the test thread,
> and a warning is logged when using real-time priority.
> 
> Fixes: ca04c78b6262 ("eal: get/set thread priority per thread
> identifier")
> Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> ---

[...]

>  enum rte_thread_priority {
> +	/** Normal thread priority, the default. */
>  	RTE_THREAD_PRIORITY_NORMAL            = 0,
> -	/**< normal thread priority, the default */
> +	/**
> +	 * Highest thread priority, use with caution.
> +	 * WARNING: System may be unstable because of a real-time busy
> loop.
> +	 *          @see rte_thread_yield_realtime().

Please remove the reference to the now non-existing function.

Also, I'd prefer to move the warning comments (about real-time threads having priority over kernel threads, and issues with rte_ring) up here, so it goes into the public API documentation.

> +	 */
>  	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
> -	/**< highest thread priority allowed */
>  };
> 
>  /**
> diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
> index 278d8d342d..17ffb86c17 100644
> --- a/lib/eal/unix/rte_thread.c
> +++ b/lib/eal/unix/rte_thread.c
> @@ -33,6 +33,8 @@ static int
>  thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int
> *os_pri,
>  	int *pol)
>  {
> +	static bool warned;
> +
>  	/* Clear the output parameters. */
>  	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
>  	*pol = -1;
> @@ -51,6 +53,17 @@ thread_map_priority_to_os_value(enum
> rte_thread_priority eal_pri, int *os_pri,
>  			sched_get_priority_max(SCHED_OTHER)) / 2;
>  		break;
>  	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
> +		/*
> +		 * WARNING: Real-time busy loop takes priority on kernel
> threads,
> +		 *          making the system unstable.
> +		 *          There is also a known issue when using
> rte_ring.
> +		 */
> +		if (!warned) {
> +			RTE_LOG(NOTICE, EAL,
> +					"Real-time thread is unstable if polling
> without sleep.\n");
> +			warned = true;
> +		}

Is it 100 % certain that the system becomes unstable if not sleeping or using blocking system calls from a real-time thread?
And technically, it's not the thread itself that becomes unstable.

How about:
"System may be unstable unless real-time thread uses blocking system calls or sleeps."
or:
"Real-time thread usually requires the use of blocking system calls or sleeps."
or something else.

My ACK is still valid.
  
Thomas Monjalon Oct. 27, 2023, 9:11 a.m. UTC | #2
27/10/2023 10:45, Morten Brørup:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Friday, 27 October 2023 10.09
> > 
> > When adding an API for creating threads,
> > the real-time priority has been forbidden on Unix.
> > 
> > There is a known issue with ring behaviour,
> > but it should not be completely forbidden.
> > 
> > Real-time thread can block some kernel threads on the same core,
> > making the system unstable.
> > That's why a sleep is added in the test thread,
> > and a warning is logged when using real-time priority.
> > 
> > Fixes: ca04c78b6262 ("eal: get/set thread priority per thread
> > identifier")
> > Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> > Fixes: a7ba40b2b1bf ("drivers: convert to internal control threads")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> 
> [...]
> 
> >  enum rte_thread_priority {
> > +	/** Normal thread priority, the default. */
> >  	RTE_THREAD_PRIORITY_NORMAL            = 0,
> > -	/**< normal thread priority, the default */
> > +	/**
> > +	 * Highest thread priority, use with caution.
> > +	 * WARNING: System may be unstable because of a real-time busy
> > loop.
> > +	 *          @see rte_thread_yield_realtime().
> 
> Please remove the reference to the now non-existing function.
> 
> Also, I'd prefer to move the warning comments (about real-time threads having priority over kernel threads, and issues with rte_ring) up here, so it goes into the public API documentation.

Yes OK, thanks for the careful review.

> 
> > +	 */
> >  	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
> > -	/**< highest thread priority allowed */
> >  };
> > 
> >  /**
> > diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
> > index 278d8d342d..17ffb86c17 100644
> > --- a/lib/eal/unix/rte_thread.c
> > +++ b/lib/eal/unix/rte_thread.c
> > @@ -33,6 +33,8 @@ static int
> >  thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int
> > *os_pri,
> >  	int *pol)
> >  {
> > +	static bool warned;
> > +
> >  	/* Clear the output parameters. */
> >  	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
> >  	*pol = -1;
> > @@ -51,6 +53,17 @@ thread_map_priority_to_os_value(enum
> > rte_thread_priority eal_pri, int *os_pri,
> >  			sched_get_priority_max(SCHED_OTHER)) / 2;
> >  		break;
> >  	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
> > +		/*
> > +		 * WARNING: Real-time busy loop takes priority on kernel
> > threads,
> > +		 *          making the system unstable.
> > +		 *          There is also a known issue when using
> > rte_ring.
> > +		 */
> > +		if (!warned) {
> > +			RTE_LOG(NOTICE, EAL,
> > +					"Real-time thread is unstable if polling
> > without sleep.\n");
> > +			warned = true;
> > +		}
> 
> Is it 100 % certain that the system becomes unstable if not sleeping or using blocking system calls from a real-time thread?
> And technically, it's not the thread itself that becomes unstable.
> 
> How about:
> "System may be unstable unless real-time thread uses blocking system calls or sleeps."
> or:
> "Real-time thread usually requires the use of blocking system calls or sleeps."
> or something else.

Yes something like that looks better.
I will try to find a short sentence.

> 
> My ACK is still valid.
> 
>
  
Stephen Hemminger Oct. 27, 2023, 6:15 p.m. UTC | #3
On Fri, 27 Oct 2023 10:45:03 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:

> Is it 100 % certain that the system becomes unstable if not sleeping or using blocking system calls from a real-time thread?
> And technically, it's not the thread itself that becomes unstable.

My experience is that the goal of real time threads is "do not let kernel have higher priority than this thread".
That means that if/when a kernel worker is needed on this core, instability happens.
It is very hard but possible to ensure that a kernel worker thread never runs on that core.
But if you are able to do configure that, then there is no point in using RT threads.
The best design is to avoid the kernel needing to run, and if it does then let it run.
  

Patch

diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 4ac3f2671a..a863214137 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -5,6 +5,7 @@ 
 #include <string.h>
 
 #include <rte_thread.h>
+#include <rte_cycles.h>
 #include <rte_debug.h>
 
 #include "test.h"
@@ -22,7 +23,7 @@  thread_main(void *arg)
 	__atomic_store_n(&thread_id_ready, 1, __ATOMIC_RELEASE);
 
 	while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
-		;
+		rte_delay_us_sleep(1000); /* required for RT priority */
 
 	return 0;
 }
@@ -97,21 +98,12 @@  test_thread_priority(void)
 		"Priority set mismatches priority get");
 
 	priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
-#ifndef RTE_EXEC_ENV_WINDOWS
-	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == ENOTSUP,
-		"Priority set to critical should fail");
-	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
-		"Failed to get thread priority");
-	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
-		"Failed set to critical should have retained normal");
-#else
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
 		"Priority set to critical should succeed");
 	RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
 		"Failed to get thread priority");
 	RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL,
 		"Priority set mismatches priority get");
-#endif
 
 	priority = RTE_THREAD_PRIORITY_NORMAL;
 	RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 6debf54efb..d1f7cae7cd 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -815,7 +815,9 @@  Known Issues
 
   4. It MAY be used by preemptible multi-producer and/or preemptible multi-consumer pthreads whose scheduling policy are all SCHED_OTHER(cfs), SCHED_IDLE or SCHED_BATCH. User SHOULD be aware of the performance penalty before using it.
 
-  5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
+  5. It MUST not be used by multi-producer/consumer pthreads
+     whose scheduling policies are ``SCHED_FIFO``
+     or ``SCHED_RR`` (``RTE_THREAD_PRIORITY_REALTIME_CRITICAL``).
 
   Alternatively, applications can use the lock-free stack mempool handler. When
   considering this handler, note that:
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index f2581fe152..7ff031e1b2 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -56,10 +56,14 @@  typedef uint32_t (*rte_thread_func) (void *arg);
  * Thread priority values.
  */
 enum rte_thread_priority {
+	/** Normal thread priority, the default. */
 	RTE_THREAD_PRIORITY_NORMAL            = 0,
-	/**< normal thread priority, the default */
+	/**
+	 * Highest thread priority, use with caution.
+	 * WARNING: System may be unstable because of a real-time busy loop.
+	 *          @see rte_thread_yield_realtime().
+	 */
 	RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
-	/**< highest thread priority allowed */
 };
 
 /**
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 278d8d342d..17ffb86c17 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -33,6 +33,8 @@  static int
 thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 	int *pol)
 {
+	static bool warned;
+
 	/* Clear the output parameters. */
 	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
 	*pol = -1;
@@ -51,6 +53,17 @@  thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
 			sched_get_priority_max(SCHED_OTHER)) / 2;
 		break;
 	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		/*
+		 * WARNING: Real-time busy loop takes priority on kernel threads,
+		 *          making the system unstable.
+		 *          There is also a known issue when using rte_ring.
+		 */
+		if (!warned) {
+			RTE_LOG(NOTICE, EAL,
+					"Real-time thread is unstable if polling without sleep.\n");
+			warned = true;
+		}
+
 		*pol = SCHED_RR;
 		*os_pri = sched_get_priority_max(SCHED_RR);
 		break;
@@ -155,11 +168,6 @@  rte_thread_create(rte_thread_t *thread_id,
 			goto cleanup;
 		}
 
-		if (thread_attr->priority ==
-				RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
-			ret = ENOTSUP;
-			goto cleanup;
-		}
 		ret = thread_map_priority_to_os_value(thread_attr->priority,
 				&param.sched_priority, &policy);
 		if (ret != 0)
@@ -291,10 +299,6 @@  rte_thread_set_priority(rte_thread_t thread_id,
 	int policy;
 	int ret;
 
-	/* Realtime priority can cause crashes on non-Windows platforms. */
-	if (priority == RTE_THREAD_PRIORITY_REALTIME_CRITICAL)
-		return ENOTSUP;
-
 	ret = thread_map_priority_to_os_value(priority, &param.sched_priority,
 		&policy);
 	if (ret != 0)