[v3,6/8] arg_parser: added common core string and heuristic parsers

Message ID 20231207161818.2590661-7-euan.bourke@intel.com (mailing list archive)
State Superseded
Delegated to: Thomas Monjalon
Series add new command line argument parsing library

Checks

Context        Check    Description
ci/checkpatch  warning  coding style issues

Commit Message

Euan Bourke Dec. 7, 2023, 4:18 p.m. UTC
Add two new functions. The first is a 'heuristic parser', which examines a
string describing a set of cores and uses heuristics to determine
whether it is a coremask or a corelist.

The second is a 'combined parser', which calls the first function and then,
based on the returned value, calls the relevant core string parser.
This function also takes a 'default_type' int, which selects the parser
to use when the string is ambiguous.

Signed-off-by: Euan Bourke <euan.bourke@intel.com>
---
 lib/arg_parser/arg_parser.c     | 68 +++++++++++++++++++++++++++++++++
 lib/arg_parser/rte_arg_parser.h | 60 +++++++++++++++++++++++++++++
 lib/arg_parser/version.map      |  2 +
 3 files changed, 130 insertions(+)
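
The dispatch rule described in the commit message can be illustrated on its
own, without the DPDK parsers. The following is a minimal sketch, not the
code from the patch: the DEMO_* constants and demo_resolve_type() are
hypothetical names, and the sketch assumes that a detection error is
propagated to the caller, whereas the patch as posted falls back to the
default parser in that case.

```c
#include <errno.h>

/* Hypothetical stand-ins for the RTE_ARG_PARSE_TYPE_* values in the patch. */
enum { DEMO_COREMASK = 0, DEMO_CORELIST = 1, DEMO_UNKNOWN = 2 };

/*
 * Decide which parser to use, given the result of the heuristic check
 * ("detected") and the caller's preference for ambiguous strings.
 * Returns DEMO_COREMASK or DEMO_CORELIST, or a negative errno value.
 */
static int
demo_resolve_type(int detected, int default_type)
{
	/* The default must itself name a concrete parser. */
	if (default_type != DEMO_COREMASK && default_type != DEMO_CORELIST)
		return -EINVAL;

	/* Assumption of this sketch: detection errors are passed through. */
	if (detected < 0)
		return detected;

	/* An unambiguous result wins over the default. */
	if (detected == DEMO_COREMASK || detected == DEMO_CORELIST)
		return detected;

	/* Ambiguous (e.g. "4"): fall back to the caller's choice. */
	return default_type;
}
```

With this rule, a string such as "4" is parsed as whichever type the caller
passed as the default, while "1-4" is always parsed as a corelist.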
  

Comments

Bruce Richardson Dec. 7, 2023, 4:58 p.m. UTC | #1
On Thu, Dec 07, 2023 at 04:18:16PM +0000, Euan Bourke wrote:
> Two new functions, the first is a 'heuristic parser' which examines a
> string describing a set of cores and determines based off heuristics
> whether its a coremask or a corelist.
> 
> Second is a 'combined parser' which calls the first function and then
> based off the returned value will call the relevant core string parser.
> This function also takes a 'default_type' int which corresponds to
> which parser should be used in the case of an ambiguous string.
> 
> Signed-off-by: Euan Bourke <euan.bourke@intel.com>
> ---
>  lib/arg_parser/arg_parser.c     | 68 +++++++++++++++++++++++++++++++++
>  lib/arg_parser/rte_arg_parser.h | 60 +++++++++++++++++++++++++++++
>  lib/arg_parser/version.map      |  2 +
>  3 files changed, 130 insertions(+)
> 
> diff --git a/lib/arg_parser/arg_parser.c b/lib/arg_parser/arg_parser.c
> index cebab9e2f8..95cbc50c13 100644
> --- a/lib/arg_parser/arg_parser.c
> +++ b/lib/arg_parser/arg_parser.c
> @@ -7,10 +7,15 @@
>  #include "ctype.h"
>  #include "string.h"
>  #include "stdbool.h"
> +#include "stdio.h"
>  
>  #include <rte_arg_parser.h>
>  #include <rte_common.h>
>  
> +#define RTE_ARG_PARSE_TYPE_COREMASK 0
> +#define RTE_ARG_PARSE_TYPE_CORELIST 1
> +#define RTE_ARG_PARSE_TYPE_UNKNOWN 2
> +

As these are used as return values, they need to be defined in the header
file so that applications can use them.

>  #define BITS_PER_HEX 4
>  #define MAX_COREMASK_SIZE ((UINT16_MAX + 1) / BITS_PER_HEX)
>  
> @@ -22,6 +27,7 @@ struct core_bits {
>  	uint32_t total_bits_set;
>  };
>  
> +

Stray newline added to patch.

>  static inline bool
>  get_core_bit(struct core_bits *mask, uint16_t idx)
>  {
> @@ -159,3 +165,65 @@ rte_arg_parse_coremask(const char *coremask, uint16_t *cores, uint32_t cores_len
>  
>  	return total_count;
>  }
> +
> +int
> +rte_arg_parse_arg_type(const char *core_string)
> +{
> +	/* Remove leading whitespace */
> +	while (isblank(*core_string))
> +		core_string++;
> +
> +	/* Check for 0x prefix */
> +	if (core_string[0] == '0' && tolower(core_string[1]) == 'x') {
> +		if (core_string[2] != '\0')
> +			return RTE_ARG_PARSE_TYPE_COREMASK;
> +		return -1;
> +	}
> +
> +	int i = 0, idx = 0;
> +	/* Check for ',' and '-' and check for A-F */
> +	do {
> +		while (isblank(core_string[idx]))
> +			idx++;
> +
> +		if (core_string[idx] == ',' || core_string[idx] == '-')
> +			return RTE_ARG_PARSE_TYPE_CORELIST;
> +
> +		if (isalpha(core_string[idx])) {
> +			if (isxdigit(core_string[idx]))
> +				return RTE_ARG_PARSE_TYPE_COREMASK;
> +			return -1;
> +		}
> +		idx++;
> +		i++;
> +	} while (core_string[idx] != '\0');
> +
> +	/* Check length of core_string if ambiguous as max length of a uint16_t is 5 digits
> +	 * implying its a coremask.
> +	 */
> +	if (i > 5)
> +		return RTE_ARG_PARSE_TYPE_COREMASK;
> +
> +	return -1;

Rather than returning -1, I think in most/all cases above the function
should return -EINVAL as the error code, since the input passed is invalid.

> +}
> +
> +int
> +rte_arg_parse_core_string(const char *core_string, uint16_t *cores, uint32_t cores_len,
> +		int default_type)
> +{
> +	if (default_type != RTE_ARG_PARSE_TYPE_COREMASK &&
> +			default_type != RTE_ARG_PARSE_TYPE_CORELIST) {
> +		return -1;
> +	}
> +	switch (rte_arg_parse_arg_type(core_string)) {
> +	case RTE_ARG_PARSE_TYPE_COREMASK:
> +		return rte_arg_parse_coremask(core_string, cores, cores_len);
> +	case RTE_ARG_PARSE_TYPE_CORELIST:
> +		return rte_arg_parse_corelist(core_string, cores, cores_len);
> +	default:
> +		return default_type == RTE_ARG_PARSE_TYPE_COREMASK ?
> +			rte_arg_parse_coremask(core_string, cores, cores_len) :
> +			rte_arg_parse_corelist(core_string, cores, cores_len);
> +		return -1;
> +	}
> +}
> diff --git a/lib/arg_parser/rte_arg_parser.h b/lib/arg_parser/rte_arg_parser.h
> index 359d40e305..125ca9524c 100644
> --- a/lib/arg_parser/rte_arg_parser.h
> +++ b/lib/arg_parser/rte_arg_parser.h
> @@ -92,6 +92,66 @@ __rte_experimental
>  int
>  rte_arg_parse_coremask(const char *coremask, uint16_t *cores, uint32_t cores_len);
>  
> +/**
> + * Use heuristics to determine if a string contains a coremask or a corelist.
> + *
> + * This function will check a series of conditions and return an int representing which
> + * core type (mask or list) the string represents or UNKNOWN if the string is ambiguous.

"or report the type as unknown if it is ambiguous"

> + *
> + * @param core_string
> + *   A string describing the intended cores to be parsed
> + * @return
> + *   int representing the core type
> + *   -1: error.

Suggest "negative error code on error". We should also list out the error
codes at the end, though I think right now -EINVAL is the only one we need.

> + *   0: coremask.
> + *   1: corelist.
> + *   2: unknown (ambiguous).

Move the #defines from the C file to the header, and use them here rather
than magic numbers.

> + */
> +__rte_experimental
> +int
> +rte_arg_parse_arg_type(const char *core_string);
> +
> +/**
> + * Convert a string describing either a corelist or coremask into an array of core ids.
> + *
> + * This function will fill the "cores" array up to "cores_len" with the core ids described
> + * in the "core_string". The string can either describe a corelist or a coremask, and
> + * will be parsed accordingly. The number of unique core ids in the string is then returned.
> + * For example:
> + * "1-4" is treated as a corelist and results in an array of [1,2,3,4] with 4 being returned
> + * "0xA1" is treated as a coremask and results in an array of [0,5,7] with 3 being returned
> + *
> + * In the case of an ambiguous string, the function will use the default_type parameter to
> + * decide.
> + *
> + * NOTE: if the length of the input array is insufficient to hold the number of core ids
> + * in "core_string" the input array is filled to capacity but the return value is the
> + * number of elements which would have been written to the array, had enough space been
> + * available. [This is similar to the behaviour of the snprintf function]. Because of
> + * this, the number of core values in the "core_string" may be determined by calling the
> + * function with a NULL array pointer and array length given as 0.
> + *
> + * @param core_string
> + *   A string describing the intended cores to be parsed.
> + * @param cores
> + *   An array where to store the core ids.
> + *   Array can be NULL if "cores_len" is 0.
> + * @param cores_len
> + *   The length of the "cores" array.
> + *   If the size is smaller than that needed to hold all cores from "core_string"
> + * @param default_type
> + *   How to treat ambiguous cases (e.g. '4' could be mask or list).
> + *   0: mask.
> + *   1: list.

Again, use the defines.

> + * @return
> + *   n: the number of unique cores present in "core_string".
> + *   -1 if the string was invalid.
> + *   NOTE: if n > "cores_len", then only "cores_len" elements in the "cores" array are valid.
> + */
> +__rte_experimental
> +int
> +rte_arg_parse_core_string(const char *core_string, uint16_t *cores, uint32_t cores_len,
> +		int default_type);
>  
>  #ifdef __cplusplus
>  }
> diff --git a/lib/arg_parser/version.map b/lib/arg_parser/version.map
> index b44d4b02b7..383b6bd0e9 100644
> --- a/lib/arg_parser/version.map
> +++ b/lib/arg_parser/version.map
> @@ -8,4 +8,6 @@ EXPERIMENTAL {
>  	# added in 24.03
>  	rte_arg_parse_corelist;
>  	rte_arg_parse_coremask;
> +	rte_arg_parse_arg_type;
> +	rte_arg_parse_core_string;


The version.map lists are kept alphabetical, so the new entries need to be
moved up.

>  };
> -- 
> 2.34.1
>
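
For illustration, here is one way the review points above (type constants
visible to callers, -EINVAL instead of -1, and an explicit "unknown" result
for ambiguous input) might look when combined. This is a standalone sketch
under assumptions, not a proposed revision of the patch: the DEMO_TYPE_*
names and demo_arg_type() are hypothetical, and unlike the posted code it
rejects any character that is not a digit, hex letter, blank, ',' or '-'.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical names; per the review these would be defined in the header. */
#define DEMO_TYPE_COREMASK 0
#define DEMO_TYPE_CORELIST 1
#define DEMO_TYPE_UNKNOWN  2

/* Classify a core string; returns a DEMO_TYPE_* value or -EINVAL. */
static int
demo_arg_type(const char *s)
{
	size_t digits = 0;

	if (s == NULL)
		return -EINVAL;

	/* Skip leading blanks; an empty string is invalid. */
	while (isblank((unsigned char)*s))
		s++;
	if (*s == '\0')
		return -EINVAL;

	/* A 0x prefix followed by at least one character means a mask. */
	if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
		return s[2] != '\0' ? DEMO_TYPE_COREMASK : -EINVAL;

	for (; *s != '\0'; s++) {
		unsigned char c = (unsigned char)*s;

		if (isblank(c))
			continue;
		/* List separators and ranges only occur in corelists. */
		if (c == ',' || c == '-')
			return DEMO_TYPE_CORELIST;
		/* Hex letters a-f/A-F only occur in coremasks. */
		if (isxdigit(c) && !isdigit(c))
			return DEMO_TYPE_COREMASK;
		if (!isdigit(c))
			return -EINVAL;
		digits++;
	}

	/* A uint16_t core id has at most 5 decimal digits. */
	if (digits > 5)
		return DEMO_TYPE_COREMASK;

	return DEMO_TYPE_UNKNOWN;
}
```

In this sketch "4" is reported as unknown and left for the caller's default
type to resolve, while "0x" and "zz" are reported as -EINVAL.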
  

Patch

diff --git a/lib/arg_parser/arg_parser.c b/lib/arg_parser/arg_parser.c
index cebab9e2f8..95cbc50c13 100644
--- a/lib/arg_parser/arg_parser.c
+++ b/lib/arg_parser/arg_parser.c
@@ -7,10 +7,15 @@ 
 #include "ctype.h"
 #include "string.h"
 #include "stdbool.h"
+#include "stdio.h"
 
 #include <rte_arg_parser.h>
 #include <rte_common.h>
 
+#define RTE_ARG_PARSE_TYPE_COREMASK 0
+#define RTE_ARG_PARSE_TYPE_CORELIST 1
+#define RTE_ARG_PARSE_TYPE_UNKNOWN 2
+
 #define BITS_PER_HEX 4
 #define MAX_COREMASK_SIZE ((UINT16_MAX + 1) / BITS_PER_HEX)
 
@@ -22,6 +27,7 @@  struct core_bits {
 	uint32_t total_bits_set;
 };
 
+
 static inline bool
 get_core_bit(struct core_bits *mask, uint16_t idx)
 {
@@ -159,3 +165,65 @@  rte_arg_parse_coremask(const char *coremask, uint16_t *cores, uint32_t cores_len
 
 	return total_count;
 }
+
+int
+rte_arg_parse_arg_type(const char *core_string)
+{
+	/* Remove leading whitespace */
+	while (isblank(*core_string))
+		core_string++;
+
+	/* Check for 0x prefix */
+	if (core_string[0] == '0' && tolower(core_string[1]) == 'x') {
+		if (core_string[2] != '\0')
+			return RTE_ARG_PARSE_TYPE_COREMASK;
+		return -1;
+	}
+
+	int i = 0, idx = 0;
+	/* Check for ',' and '-' and check for A-F */
+	do {
+		while (isblank(core_string[idx]))
+			idx++;
+
+		if (core_string[idx] == ',' || core_string[idx] == '-')
+			return RTE_ARG_PARSE_TYPE_CORELIST;
+
+		if (isalpha(core_string[idx])) {
+			if (isxdigit(core_string[idx]))
+				return RTE_ARG_PARSE_TYPE_COREMASK;
+			return -1;
+		}
+		idx++;
+		i++;
+	} while (core_string[idx] != '\0');
+
+	/* Check length of core_string if ambiguous as max length of a uint16_t is 5 digits
+	 * implying its a coremask.
+	 */
+	if (i > 5)
+		return RTE_ARG_PARSE_TYPE_COREMASK;
+
+	return -1;
+}
+
+int
+rte_arg_parse_core_string(const char *core_string, uint16_t *cores, uint32_t cores_len,
+		int default_type)
+{
+	if (default_type != RTE_ARG_PARSE_TYPE_COREMASK &&
+			default_type != RTE_ARG_PARSE_TYPE_CORELIST) {
+		return -1;
+	}
+	switch (rte_arg_parse_arg_type(core_string)) {
+	case RTE_ARG_PARSE_TYPE_COREMASK:
+		return rte_arg_parse_coremask(core_string, cores, cores_len);
+	case RTE_ARG_PARSE_TYPE_CORELIST:
+		return rte_arg_parse_corelist(core_string, cores, cores_len);
+	default:
+		return default_type == RTE_ARG_PARSE_TYPE_COREMASK ?
+			rte_arg_parse_coremask(core_string, cores, cores_len) :
+			rte_arg_parse_corelist(core_string, cores, cores_len);
+		return -1;
+	}
+}
diff --git a/lib/arg_parser/rte_arg_parser.h b/lib/arg_parser/rte_arg_parser.h
index 359d40e305..125ca9524c 100644
--- a/lib/arg_parser/rte_arg_parser.h
+++ b/lib/arg_parser/rte_arg_parser.h
@@ -92,6 +92,66 @@  __rte_experimental
 int
 rte_arg_parse_coremask(const char *coremask, uint16_t *cores, uint32_t cores_len);
 
+/**
+ * Use heuristics to determine if a string contains a coremask or a corelist.
+ *
+ * This function will check a series of conditions and return an int representing which
+ * core type (mask or list) the string represents or UNKNOWN if the string is ambiguous.
+ *
+ * @param core_string
+ *   A string describing the intended cores to be parsed
+ * @return
+ *   int representing the core type
+ *   -1: error.
+ *   0: coremask.
+ *   1: corelist.
+ *   2: unknown (ambiguous).
+ */
+__rte_experimental
+int
+rte_arg_parse_arg_type(const char *core_string);
+
+/**
+ * Convert a string describing either a corelist or coremask into an array of core ids.
+ *
+ * This function will fill the "cores" array up to "cores_len" with the core ids described
+ * in the "core_string". The string can either describe a corelist or a coremask, and
+ * will be parsed accordingly. The number of unique core ids in the string is then returned.
+ * For example:
+ * "1-4" is treated as a corelist and results in an array of [1,2,3,4] with 4 being returned
+ * "0xA1" is treated as a coremask and results in an array of [0,5,7] with 3 being returned
+ *
+ * In the case of an ambiguous string, the function will use the default_type parameter to
+ * decide.
+ *
+ * NOTE: if the length of the input array is insufficient to hold the number of core ids
+ * in "core_string" the input array is filled to capacity but the return value is the
+ * number of elements which would have been written to the array, had enough space been
+ * available. [This is similar to the behaviour of the snprintf function]. Because of
+ * this, the number of core values in the "core_string" may be determined by calling the
+ * function with a NULL array pointer and array length given as 0.
+ *
+ * @param core_string
+ *   A string describing the intended cores to be parsed.
+ * @param cores
+ *   An array where to store the core ids.
+ *   Array can be NULL if "cores_len" is 0.
+ * @param cores_len
+ *   The length of the "cores" array.
+ *   If the size is smaller than that needed to hold all cores from "core_string"
+ * @param default_type
+ *   How to treat ambiguous cases (e.g. '4' could be mask or list).
+ *   0: mask.
+ *   1: list.
+ * @return
+ *   n: the number of unique cores present in "core_string".
+ *   -1 if the string was invalid.
+ *   NOTE: if n > "cores_len", then only "cores_len" elements in the "cores" array are valid.
+ */
+__rte_experimental
+int
+rte_arg_parse_core_string(const char *core_string, uint16_t *cores, uint32_t cores_len,
+		int default_type);
 
 #ifdef __cplusplus
 }
diff --git a/lib/arg_parser/version.map b/lib/arg_parser/version.map
index b44d4b02b7..383b6bd0e9 100644
--- a/lib/arg_parser/version.map
+++ b/lib/arg_parser/version.map
@@ -8,4 +8,6 @@  EXPERIMENTAL {
 	# added in 24.03
 	rte_arg_parse_corelist;
 	rte_arg_parse_coremask;
+	rte_arg_parse_arg_type;
+	rte_arg_parse_core_string;
 };
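
The header comment in the patch describes snprintf-like return semantics: the
return value is the number of core ids the string contains, even when the
array is too small, so a first call with a NULL array and length 0 can be
used to size the buffer. The sketch below illustrates that calling pattern
with a hypothetical stand-in parser, demo_parse_coremask(), so that it can be
compiled without DPDK. It is not the library implementation, and
demo_parse_alloc() is likewise an illustrative helper, not part of the patch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a coremask parser. Writes at most cores_len
 * ids to "cores" and returns the total number of set bits in the mask,
 * or -EINVAL on bad input. "cores" may be NULL when cores_len is 0.
 */
static int
demo_parse_coremask(const char *mask, uint16_t *cores, uint32_t cores_len)
{
	uint32_t count = 0;
	size_t len, i;

	if (mask == NULL)
		return -EINVAL;
	while (isblank((unsigned char)*mask))
		mask++;
	if (mask[0] == '0' && (mask[1] == 'x' || mask[1] == 'X'))
		mask += 2;

	len = strlen(mask);
	/* 16384 hex digits cover every uint16_t core id. */
	if (len == 0 || len > (UINT16_MAX + 1) / 4)
		return -EINVAL;
	for (i = 0; i < len; i++)
		if (!isxdigit((unsigned char)mask[i]))
			return -EINVAL;

	/* Walk the hex digits from least significant (rightmost) upwards. */
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)mask[len - 1 - i];
		int val = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
		int bit;

		for (bit = 0; bit < 4; bit++) {
			if (!(val & (1 << bit)))
				continue;
			/* Store only while there is room, but keep counting. */
			if (count < cores_len)
				cores[count] = (uint16_t)(i * 4 + bit);
			count++;
		}
	}
	return (int)count;
}

/*
 * Two-call pattern: count first, then allocate and fill.
 * On success returns a malloc'd array and sets *n_out to its length;
 * otherwise returns NULL with *n_out set to 0 or a negative errno value.
 */
static uint16_t *
demo_parse_alloc(const char *mask, int *n_out)
{
	uint16_t *cores;
	int n = demo_parse_coremask(mask, NULL, 0);

	if (n <= 0) {
		*n_out = n;
		return NULL;
	}
	cores = malloc((size_t)n * sizeof(*cores));
	if (cores == NULL) {
		*n_out = -ENOMEM;
		return NULL;
	}
	*n_out = demo_parse_coremask(mask, cores, (uint32_t)n);
	return cores;
}
```

For "0xA1" the first call reports 3 and the second fills the array with
0, 5 and 7, matching the example in the header comment. Passing a
two-element array still returns 3, with only the first two entries valid.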