[4/6] service: tweak cycle statistics semantics

Message ID 20220906161352.296110-4-mattias.ronnblom@ericsson.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Series [1/6] service: reduce statistics overhead for parallel services

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

Mattias Rönnblom Sept. 6, 2022, 4:13 p.m. UTC
As part of its service function, a service usually polls some kind
of source (e.g., an RX queue, a ring, an eventdev port, or a timer
wheel) to retrieve one or more items of work.

In low-load situations, the service framework reports a significant
number of cycles spent for all running services, even though they
have performed little or no actual work.

The per-call cycle expenditure for an idle service (i.e., a service
currently without pending jobs) is typically very low. Polling an
empty ring or RX queue is inexpensive. However, since the service
function call frequency on an idle or lightly loaded lcore is very
high, the cycles spent in those calls add up to a significant
amount. The only thing preventing the idle services' cycle counters
from making up 100% of the available CPU cycles is the overhead of
the service framework itself.

If RTE_SERVICE_ATTR_CYCLES or RTE_SERVICE_LCORE_ATTR_CYCLES is
used to estimate service core load, the cores may look very busy even
when the system is doing almost nothing useful.
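
For illustration, a load estimate of this kind might be derived from
two samples of a cycles counter. This is a minimal sketch and not part
of the patch: estimate_load() and its parameters are hypothetical, and
the samples are assumed to come from reading one of the cycles
attributes and the TSC at two points in time.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fraction of the elapsed TSC cycles that the
 * service framework attributed to services between two samples.
 */
static double
estimate_load(uint64_t busy_prev, uint64_t busy_now,
	      uint64_t tsc_prev, uint64_t tsc_now)
{
	uint64_t elapsed = tsc_now - tsc_prev;

	if (elapsed == 0)
		return 0.0;

	return (double)(busy_now - busy_prev) / (double)elapsed;
}
```

If idle polling is counted as busy cycles, the result stays close to
1.0 even on a core that has no work to do.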

This patch allows an idle service to indicate that no actual work
was performed during a particular service function call (by returning
-EAGAIN). In such cases, the RTE_SERVICE_ATTR_CYCLES and
RTE_SERVICE_LCORE_ATTR_CYCLES values are not incremented; the
per-service call count is still incremented.
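
A service function following this convention might look roughly like
the sketch below. It assumes nothing about any real service:
struct work_queue and example_service_run() are hypothetical stand-ins
for a ring or RX queue and its polling function, and only the
int32_t (*)(void *) signature is taken from rte_service_func.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical work source standing in for a ring or RX queue. */
struct work_queue {
	const int *items; /* pending items of work */
	size_t head;      /* index of the next item to process */
	size_t count;     /* total number of items enqueued */
	int64_t sum;      /* result of "processing" the items */
};

/* Same signature as rte_service_func. */
static int32_t
example_service_run(void *args)
{
	struct work_queue *q = args;

	/* Nothing to do: tell the framework this call was idle. */
	if (q->head == q->count)
		return -EAGAIN;

	/* Process everything currently pending. */
	while (q->head < q->count)
		q->sum += q->items[q->head++];

	return 0;
}
```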

The convention of returning -EAGAIN for idle services may in the
future also be used to have the lcore enter a short sleep, or reduce
its operating frequency, in case all services are currently idle.

This change is backward-compatible: a service that never returns
-EAGAIN has its cycles counted exactly as before.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
---
 lib/eal/common/rte_service.c            | 22 ++++++++++++++--------
 lib/eal/include/rte_service_component.h |  5 +++++
 2 files changed, 19 insertions(+), 8 deletions(-)
  

Comments

Morten Brørup Sept. 7, 2022, 8:41 a.m. UTC | #1
This entire series contains a bunch of good improvements.

Returning -EAGAIN is a step in the right direction towards measuring CPU usage, and a great way to make it backwards compatible.

Series-Acked-by: Morten Brørup <mb@smartsharesystems.com>
  
Van Haaren, Harry Oct. 3, 2022, 1:45 p.m. UTC | #2
Agreed, thanks Mattias for authoring & Morten for review/ack;

I've left 2 minor comments on individual patches, but for the remaining 4 patches;
Series-Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
  

Patch

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 4cac866792..123610688c 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -10,6 +10,7 @@ 
 #include <rte_service_component.h>
 
 #include <rte_lcore.h>
+#include <rte_branch_prediction.h>
 #include <rte_common.h>
 #include <rte_cycles.h>
 #include <rte_atomic.h>
@@ -364,24 +365,29 @@  service_runner_do_callback(struct rte_service_spec_impl *s,
 
 	if (service_stats_enabled(s)) {
 		uint64_t start = rte_rdtsc();
-		s->spec.callback(userdata);
-		uint64_t end = rte_rdtsc();
-		uint64_t cycles = end - start;
+		int rc = s->spec.callback(userdata);
 
 		/* The lcore service worker thread is the only writer,
 		 * and thus only a non-atomic load and an atomic store
 		 * is needed, and not the more expensive atomic
 		 * add.
 		 */
-		__atomic_store_n(&cs->cycles, cs->cycles + cycles,
-				 __ATOMIC_RELAXED);
+
+		if (likely(rc != -EAGAIN)) {
+			uint64_t end = rte_rdtsc();
+			uint64_t cycles = end - start;
+
+			__atomic_store_n(&cs->cycles, cs->cycles + cycles,
+					 __ATOMIC_RELAXED);
+			__atomic_store_n(&cs->cycles_per_service[service_idx],
+					 cs->cycles_per_service[service_idx] +
+					 cycles, __ATOMIC_RELAXED);
+		}
+
 		__atomic_store_n(&cs->calls_per_service[service_idx],
 				 cs->calls_per_service[service_idx] + 1,
 				 __ATOMIC_RELAXED);
 
-		__atomic_store_n(&cs->cycles_per_service[service_idx],
-				 cs->cycles_per_service[service_idx] + cycles,
-				 __ATOMIC_RELAXED);
 	} else
 		s->spec.callback(userdata);
 }
diff --git a/lib/eal/include/rte_service_component.h b/lib/eal/include/rte_service_component.h
index 9e66ee7e29..9be49d698a 100644
--- a/lib/eal/include/rte_service_component.h
+++ b/lib/eal/include/rte_service_component.h
@@ -19,6 +19,11 @@  extern "C" {
 
 /**
  * Signature of callback function to run a service.
+ *
+ * A service function call resulting in no actual work being
+ * performed should return -EAGAIN. In that case, the (presumably few)
+ * cycles spent will not be counted toward the lcore or service-level
+ * cycles attributes.
  */
 typedef int32_t (*rte_service_func)(void *args);
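
The accounting rule introduced by the rte_service.c hunk can be
modelled outside DPDK as in the sketch below. It is a simplified
illustration, not the library code: it leaves out the atomic stores and
the per-lcore state, and read_fake_tsc(), struct service_stats and
run_and_account() are hypothetical names, with a fake counter standing
in for rte_rdtsc() so that the behaviour is deterministic.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*service_func)(void *args);

struct service_stats {
	uint64_t calls;  /* incremented on every call */
	uint64_t cycles; /* incremented only for non-idle calls */
};

/* Hypothetical stand-in for rte_rdtsc(): advances 10 "cycles" per read. */
static uint64_t fake_tsc;

static uint64_t
read_fake_tsc(void)
{
	fake_tsc += 10;
	return fake_tsc;
}

static void
run_and_account(service_func fn, void *args, struct service_stats *st)
{
	uint64_t start = read_fake_tsc();
	int32_t rc = fn(args);

	/* An idle call (-EAGAIN) does not contribute to the cycle count. */
	if (rc != -EAGAIN)
		st->cycles += read_fake_tsc() - start;

	st->calls++;
}

/* Two trivial services: one reports work done, one reports idle. */
static int32_t
busy_service(void *args)
{
	(void)args;
	return 0;
}

static int32_t
idle_service(void *args)
{
	(void)args;
	return -EAGAIN;
}
```

Calling run_and_account() with idle_service increments only calls,
while busy_service increments both counters.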