eal: support lcore usage ratio

Message ID 20231023040811.46038-1-fengchengwen@huawei.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series eal: support lcore usage ratio

Checks

Context                          Check    Description
ci/loongarch-compilation         success  Compilation OK
ci/checkpatch                    success  coding style OK
ci/loongarch-unit-testing        success  Unit Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/github-robot: build           success  github build: passed
ci/intel-Testing                 success  Testing PASS
ci/intel-Functional              success  Functional PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-unit-arm64-testing        success  Testing PASS
ci/iol-compile-amd64-testing     success  Testing PASS
ci/iol-unit-amd64-testing        success  Testing PASS
ci/iol-compile-arm64-testing     success  Testing PASS

Commit Message

fengchengwen Oct. 23, 2023, 4:08 a.m. UTC
Currently, the lcore usage output displays only two key fields, busy_cycles
and total_cycles, which makes it inconvenient to read the usage ratio at a
glance. Add an lcore usage ratio field.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 lib/eal/common/eal_common_lcore.c | 34 ++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)
  

Comments

Morten Brørup Oct. 23, 2023, 8:58 a.m. UTC | #1
> From: Chengwen Feng [mailto:fengchengwen@huawei.com]
> Sent: Monday, 23 October 2023 06.08
> 
> Currently, the lcore usage output displays only two key fields, busy_cycles
> and total_cycles, which makes it inconvenient to read the usage ratio at a
> glance. Add an lcore usage ratio field.

Usage ratio in percentage is only useful if it doesn't vary much over time. Which use cases don't have a varying traffic pattern with busy hours and off-peak hours?

> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---

 [...]

> +static float
> +calc_usage_ratio(const struct rte_lcore_usage *usage)
> +{
> +	return (usage->busy_cycles * 100.0) / (usage->total_cycles == 0 ? 1 : usage->total_cycles);
> +}

This correctly prevents division by zero. If total_cycles by some freak accident isn't updated, the result will be a very big number. You might consider this alternative:

return usage->total_cycles != 0 ? (usage->busy_cycles * 100.0) / usage->total_cycles : (float)0;
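
The difference between the two variants can be sketched in isolation. This is only an illustration: `struct lcore_usage` is a hypothetical stand-in for `struct rte_lcore_usage`, assumed to carry just the two `uint64_t` counters used in the patch, and the function names are made up for the comparison.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_lcore_usage (only the fields used here). */
struct lcore_usage {
	uint64_t total_cycles;
	uint64_t busy_cycles;
};

/* Variant from the patch: a zero divisor is replaced by 1. */
static float
ratio_divisor_fallback(const struct lcore_usage *usage)
{
	return (usage->busy_cycles * 100.0) /
		(usage->total_cycles == 0 ? 1 : usage->total_cycles);
}

/* Suggested alternative: report 0 when total_cycles is 0. */
static float
ratio_zero_fallback(const struct lcore_usage *usage)
{
	return usage->total_cycles != 0 ?
		(usage->busy_cycles * 100.0) / usage->total_cycles : (float)0;
}
```

With total_cycles = 0 and busy_cycles = 5000, the first variant returns 500000 while the second returns 0. For 100 busy cycles out of 200 total, both return 50.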

> +
>  static int
>  lcore_dump_cb(unsigned int lcore_id, void *arg)
>  {
> @@ -462,8 +468,9 @@ lcore_dump_cb(unsigned int lcore_id, void *arg)
>  	/* Guard against concurrent modification of lcore_usage_cb. */
>  	usage_cb = lcore_usage_cb;
>  	if (usage_cb != NULL && usage_cb(lcore_id, &usage) == 0) {
> -		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64,
> -				usage.busy_cycles, usage.total_cycles) < 0) {
> +		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64" (ratio %.3f%%)",

Is "%.3f%%" the community preference for human readable CPU usage percentages?

I prefer "%.02f%%", but don't object to the suggested format.

NB: The format also applies to format_usage_ratio() below.
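
For comparison, the two formats can be tried side by side with a small sketch. The helper names are hypothetical and not part of the patch. Note that "%.02f" is parsed as a precision of 2, so it behaves the same as "%.2f".

```c
#include <stddef.h>
#include <stdio.h>

/* Three decimals, as proposed in the patch. */
static int
format_ratio_3(char *buf, size_t size, float ratio)
{
	return snprintf(buf, size, "%.3f%%", ratio);
}

/* Two decimals; the precision "02" is read as 2. */
static int
format_ratio_2(char *buf, size_t size, float ratio)
{
	return snprintf(buf, size, "%.02f%%", ratio);
}
```

For a ratio of 37.25, the first helper writes "37.250%" and the second writes "37.25%".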

> +				usage.busy_cycles, usage.total_cycles,
> +				calc_usage_ratio(&usage)) < 0) {
>  			return -ENOMEM;
>  		}
>  	}
> @@ -511,11 +518,19 @@ struct lcore_telemetry_info {
>  	struct rte_tel_data *d;
>  };
> 
> +static void
> +format_usage_ratio(char *buf, uint16_t size, const struct rte_lcore_usage *usage)
> +{
> +	float ratio = calc_usage_ratio(usage);
> +	snprintf(buf, size, "%.3f%%", ratio);
> +}
  
fengchengwen Oct. 23, 2023, 12:02 p.m. UTC | #2
Hi Morten,

On 2023/10/23 16:58, Morten Brørup wrote:
>> From: Chengwen Feng [mailto:fengchengwen@huawei.com]
>> Sent: Monday, 23 October 2023 06.08
>>
>> Currently, the lcore usage output displays only two key fields, busy_cycles
>> and total_cycles, which makes it inconvenient to read the usage ratio at a
>> glance. Add an lcore usage ratio field.
> 
> Usage ratio in percentage is only useful if it doesn't vary much over time. Which use cases don't have a varying traffic pattern with busy hours and off-peak hours?

Yes, indeed.

There are two ways:
1\ Compare only the increments of busy and total cycles, in which case this ratio has little reference value.
2\ Clear the busy and total counters before taking a measurement, in which case this ratio becomes useful.
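
The first way, comparing increments, can be sketched as a ratio over the interval between two snapshots. This is a minimal illustration, not part of the patch: `struct lcore_usage` is a hypothetical stand-in for `struct rte_lcore_usage`, `interval_usage_ratio` is a made-up helper, and the counters are assumed to only increase between the two snapshots.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_lcore_usage (only the fields used here). */
struct lcore_usage {
	uint64_t total_cycles;
	uint64_t busy_cycles;
};

/*
 * Usage percentage over the interval between two snapshots.
 * Assumes the counters did not decrease between prev and cur.
 * Returns 0 if no cycles elapsed.
 */
static double
interval_usage_ratio(const struct lcore_usage *prev,
		const struct lcore_usage *cur)
{
	uint64_t total = cur->total_cycles - prev->total_cycles;
	uint64_t busy = cur->busy_cycles - prev->busy_cycles;

	return total != 0 ? (busy * 100.0) / total : 0.0;
}
```

For snapshots of 900/1000 and then 1000/2000 busy/total cycles, the interval ratio is 10%, while the ratio computed from the latest cumulative counters alone is 50%.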

> 
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
> 
>  [...]
> 
>> +static float
>> +calc_usage_ratio(const struct rte_lcore_usage *usage)
>> +{
>> +	return (usage->busy_cycles * 100.0) / (usage->total_cycles == 0 ? 1 : usage->total_cycles);
>> +}
> 
> This correctly prevents division by zero. If total_cycles by some freak accident isn't updated, the result will be a very big number. You might consider this alternative:
> 
> return usage->total_cycles != 0 ? (usage->busy_cycles * 100.0) / usage->total_cycles : (float)0;

ok

> 
>> +
>>  static int
>>  lcore_dump_cb(unsigned int lcore_id, void *arg)
>>  {
>> @@ -462,8 +468,9 @@ lcore_dump_cb(unsigned int lcore_id, void *arg)
>>  	/* Guard against concurrent modification of lcore_usage_cb. */
>>  	usage_cb = lcore_usage_cb;
>>  	if (usage_cb != NULL && usage_cb(lcore_id, &usage) == 0) {
>> -		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64,
>> -				usage.busy_cycles, usage.total_cycles) < 0) {
>> +		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64" (ratio %.3f%%)",
> 
> Is "%.3f%%" the community preference for human readable CPU usage percentages?
> 
> I prefer "%.02f%%", but don't object to the suggested format.

Maybe %.02f is more common; I will change it in v2.

> 
> NB: The format also applies to format_usage_ratio() below.
> 
>> +				usage.busy_cycles, usage.total_cycles,
>> +				calc_usage_ratio(&usage)) < 0) {
>>  			return -ENOMEM;
>>  		}
>>  	}
>> @@ -511,11 +518,19 @@ struct lcore_telemetry_info {
>>  	struct rte_tel_data *d;
>>  };
>>
>> +static void
>> +format_usage_ratio(char *buf, uint16_t size, const struct rte_lcore_usage *usage)
>> +{
>> +	float ratio = calc_usage_ratio(usage);
>> +	snprintf(buf, size, "%.3f%%", ratio);
>> +}
> 
  

Patch

diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c
index ceda714ca5..d1d0da2dd0 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -446,6 +446,12 @@  rte_lcore_register_usage_cb(rte_lcore_usage_cb cb)
 	lcore_usage_cb = cb;
 }
 
+static float
+calc_usage_ratio(const struct rte_lcore_usage *usage)
+{
+	return (usage->busy_cycles * 100.0) / (usage->total_cycles == 0 ? 1 : usage->total_cycles);
+}
+
 static int
 lcore_dump_cb(unsigned int lcore_id, void *arg)
 {
@@ -462,8 +468,9 @@  lcore_dump_cb(unsigned int lcore_id, void *arg)
 	/* Guard against concurrent modification of lcore_usage_cb. */
 	usage_cb = lcore_usage_cb;
 	if (usage_cb != NULL && usage_cb(lcore_id, &usage) == 0) {
-		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64,
-				usage.busy_cycles, usage.total_cycles) < 0) {
+		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64" (ratio %.3f%%)",
+				usage.busy_cycles, usage.total_cycles,
+				calc_usage_ratio(&usage)) < 0) {
 			return -ENOMEM;
 		}
 	}
@@ -511,11 +518,19 @@  struct lcore_telemetry_info {
 	struct rte_tel_data *d;
 };
 
+static void
+format_usage_ratio(char *buf, uint16_t size, const struct rte_lcore_usage *usage)
+{
+	float ratio = calc_usage_ratio(usage);
+	snprintf(buf, size, "%.3f%%", ratio);
+}
+
 static int
 lcore_telemetry_info_cb(unsigned int lcore_id, void *arg)
 {
 	struct rte_config *cfg = rte_eal_get_configuration();
 	struct lcore_telemetry_info *info = arg;
+	char ratio_str[RTE_TEL_MAX_STRING_LEN];
 	struct rte_lcore_usage usage;
 	struct rte_tel_data *cpuset;
 	rte_lcore_usage_cb usage_cb;
@@ -544,6 +559,8 @@  lcore_telemetry_info_cb(unsigned int lcore_id, void *arg)
 	if (usage_cb != NULL && usage_cb(lcore_id, &usage) == 0) {
 		rte_tel_data_add_dict_uint(info->d, "total_cycles", usage.total_cycles);
 		rte_tel_data_add_dict_uint(info->d, "busy_cycles", usage.busy_cycles);
+		format_usage_ratio(ratio_str, sizeof(ratio_str), &usage);
+		rte_tel_data_add_dict_string(info->d, "usage_ratio", ratio_str);
 	}
 
 	return 0;
@@ -574,11 +591,13 @@  struct lcore_telemetry_usage {
 	struct rte_tel_data *lcore_ids;
 	struct rte_tel_data *total_cycles;
 	struct rte_tel_data *busy_cycles;
+	struct rte_tel_data *usage_ratio;
 };
 
 static int
 lcore_telemetry_usage_cb(unsigned int lcore_id, void *arg)
 {
+	char ratio_str[RTE_TEL_MAX_STRING_LEN];
 	struct lcore_telemetry_usage *u = arg;
 	struct rte_lcore_usage usage;
 	rte_lcore_usage_cb usage_cb;
@@ -591,6 +610,8 @@  lcore_telemetry_usage_cb(unsigned int lcore_id, void *arg)
 		rte_tel_data_add_array_uint(u->lcore_ids, lcore_id);
 		rte_tel_data_add_array_uint(u->total_cycles, usage.total_cycles);
 		rte_tel_data_add_array_uint(u->busy_cycles, usage.busy_cycles);
+		format_usage_ratio(ratio_str, sizeof(ratio_str), &usage);
+		rte_tel_data_add_array_string(u->usage_ratio, ratio_str);
 	}
 
 	return 0;
@@ -603,15 +624,19 @@  handle_lcore_usage(const char *cmd __rte_unused, const char *params __rte_unused
 	struct lcore_telemetry_usage usage;
 	struct rte_tel_data *total_cycles;
 	struct rte_tel_data *busy_cycles;
+	struct rte_tel_data *usage_ratio;
 	struct rte_tel_data *lcore_ids;
 
 	lcore_ids = rte_tel_data_alloc();
 	total_cycles = rte_tel_data_alloc();
 	busy_cycles = rte_tel_data_alloc();
-	if (lcore_ids == NULL || total_cycles == NULL || busy_cycles == NULL) {
+	usage_ratio = rte_tel_data_alloc();
+	if (lcore_ids == NULL || total_cycles == NULL || busy_cycles == NULL ||
+	    usage_ratio == NULL) {
 		rte_tel_data_free(lcore_ids);
 		rte_tel_data_free(total_cycles);
 		rte_tel_data_free(busy_cycles);
+		rte_tel_data_free(usage_ratio);
 		return -ENOMEM;
 	}
 
@@ -619,12 +644,15 @@  handle_lcore_usage(const char *cmd __rte_unused, const char *params __rte_unused
 	rte_tel_data_start_array(lcore_ids, RTE_TEL_UINT_VAL);
 	rte_tel_data_start_array(total_cycles, RTE_TEL_UINT_VAL);
 	rte_tel_data_start_array(busy_cycles, RTE_TEL_UINT_VAL);
+	rte_tel_data_start_array(usage_ratio, RTE_TEL_STRING_VAL);
 	rte_tel_data_add_dict_container(d, "lcore_ids", lcore_ids, 0);
 	rte_tel_data_add_dict_container(d, "total_cycles", total_cycles, 0);
 	rte_tel_data_add_dict_container(d, "busy_cycles", busy_cycles, 0);
+	rte_tel_data_add_dict_container(d, "usage_ratio", usage_ratio, 0);
 	usage.lcore_ids = lcore_ids;
 	usage.total_cycles = total_cycles;
 	usage.busy_cycles = busy_cycles;
+	usage.usage_ratio = usage_ratio;
 
 	return rte_lcore_iterate(lcore_telemetry_usage_cb, &usage);
 }