[v8,2/4] examples/l3fwd-power: simple app update for new API

Message ID 1537191016-26330-2-git-send-email-liang.j.ma@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [v8,1/4] lib/librte_power: traffic pattern aware power control |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Liang, Ma Sept. 17, 2018, 1:30 p.m. UTC
  Add the support for new traffic pattern aware power control
power management API.

Example:
./l3fwd-power -l xxx   -n 4   -w 0000:xx:00.0 -w 0000:xx:00.1 -- -p 0x3
-P --config="(0,0,xx),(1,0,xx)" --empty-poll="0,0,0" -l 14 -m 9 -h 1

Please Reference l3fwd-power document for all parameter except
empty-poll.

The option "l", "m", "h" are used to set the power index for
LOW, MED, HIGH power state. only is useful after enable empty-poll

--empty-poll="training_flag, med_threshold, high_threshold"

The option training_flag is used to enable/disable training mode.

The option med_threshold is used to indicate the empty poll threshold
of modest state which is customized by user.

The option high_threshold is used to indicate the empty poll threshold
of busy state which is customized by user.

Above three option default value is all 0.

Once enable empty-poll. System will apply the default parameter.
Training mode is disabled as default.

If training mode is triggered, there should not has any traffic
pass-through during training phase.
When training phase complete, system transfer to normal phase.

System will running with modest power stat at beginning.
If the system busyness percentage above 70%, then system will adjust
power state move to High power state. If the traffic become lower(eg. The
system busyness percentage drop below 30%), system will fallback
to the modest power state.

Example code use master thread to monitoring worker thread busyness.
the default timer resolution is 10ms.

ChangeLog:
v2 fix some coding style issues
v3 rename the API.
v6 re-work the API.
v7 no change.
v8 disable training as default option.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>

Reviewed-by: Lei Yao <lei.a.yao@intel.com>
---
 examples/l3fwd-power/Makefile    |   3 +
 examples/l3fwd-power/main.c      | 325 +++++++++++++++++++++++++++++++++++++--
 examples/l3fwd-power/meson.build |   1 +
 3 files changed, 312 insertions(+), 17 deletions(-)
  

Comments

Hunt, David Sept. 28, 2018, 11:19 a.m. UTC | #1
Hi Liang,

A few tweaks below:


On 17/9/2018 2:30 PM, Liang Ma wrote:
> Add the support for new traffic pattern aware power control
> power management API.
>
> Example:
> ./l3fwd-power -l xxx   -n 4   -w 0000:xx:00.0 -w 0000:xx:00.1 -- -p 0x3
> -P --config="(0,0,xx),(1,0,xx)" --empty-poll="0,0,0" -l 14 -m 9 -h 1
>
> Please Reference l3fwd-power document for all parameter except
> empty-poll.

The docs should probably include empty poll parameter. Suggest 
re-wording to

Please Reference l3fwd-power document for full parameter usage



> The option "l", "m", "h" are used to set the power index for
> LOW, MED, HIGH power state. only is useful after enable empty-poll
>
> --empty-poll="training_flag, med_threshold, high_threshold"
>
> The option training_flag is used to enable/disable training mode.
>
> The option med_threshold is used to indicate the empty poll threshold
> of modest state which is customized by user.
>
> The option high_threshold is used to indicate the empty poll threshold
> of busy state which is customized by user.
>
> Above three option default value is all 0.
>
> Once enable empty-poll. System will apply the default parameter.
> Training mode is disabled as default.

Suggest:

Once empty-poll is enabled, the system will apply the default parameters is no
other command line options are provided.



> If training mode is triggered, there should not has any traffic
> pass-through during training phase.

Suggest:
If training mode is enabled, the user should ensure that no traffic
is allowed to pass through the system.

> When training phase complete, system transfer to normal phase.

When training phase complete, the application transfer to normal operation



>
> System will running with modest power stat at beginning.

System will start running with the modest power mode.


> If the system busyness percentage above 70%, then system will adjust
> power state move to High power state. If the traffic become lower(eg. The
> system busyness percentage drop below 30%), system will fallback
> to the modest power state.

If the traffic goes above 70%, then system will move to High power state.
If the traffic drops below 30%, the system will fallback to the modest
power state.


> Example code use master thread to monitoring worker thread busyness.
> the default timer resolution is 10ms.
>
> ChangeLog:
> v2 fix some coding style issues
> v3 rename the API.
> v6 re-work the API.
> v7 no change.
> v8 disable training as default option.
>
> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
>
> Reviewed-by: Lei Yao <lei.a.yao@intel.com>
> ---
>   examples/l3fwd-power/Makefile    |   3 +
>   examples/l3fwd-power/main.c      | 325 +++++++++++++++++++++++++++++++++++++--
>   examples/l3fwd-power/meson.build |   1 +
>   3 files changed, 312 insertions(+), 17 deletions(-)
>
> diff --git a/examples/l3fwd-power/Makefile b/examples/l3fwd-power/Makefile
> index d7e39a3..772ec7b 100644
> --- a/examples/l3fwd-power/Makefile
> +++ b/examples/l3fwd-power/Makefile
> @@ -23,6 +23,8 @@ CFLAGS += -O3 $(shell pkg-config --cflags libdpdk)
>   LDFLAGS_SHARED = $(shell pkg-config --libs libdpdk)
>   LDFLAGS_STATIC = -Wl,-Bstatic $(shell pkg-config --static --libs libdpdk)
>   
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +
>   build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build
>   	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)
>   
> @@ -54,6 +56,7 @@ please change the definition of the RTE_TARGET environment variable)
>   all:
>   else
>   
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
>   CFLAGS += -O3
>   CFLAGS += $(WERROR_FLAGS)
>   
> diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
> index 68527d2..1465608 100644
> --- a/examples/l3fwd-power/main.c
> +++ b/examples/l3fwd-power/main.c
> @@ -43,6 +43,7 @@
>   #include <rte_timer.h>
>   #include <rte_power.h>
>   #include <rte_spinlock.h>
> +#include <rte_power_empty_poll.h>
>   
>   #include "perf_core.h"
>   #include "main.h"
> @@ -55,6 +56,8 @@
>   
>   /* 100 ms interval */
>   #define TIMER_NUMBER_PER_SECOND           10
> +/* (10ms) */
> +#define INTERVALS_PER_SECOND             100
>   /* 100000 us */
>   #define SCALING_PERIOD                    (1000000/TIMER_NUMBER_PER_SECOND)
>   #define SCALING_DOWN_TIME_RATIO_THRESHOLD 0.25
> @@ -117,6 +120,11 @@
>    */
>   #define RTE_TEST_RX_DESC_DEFAULT 1024
>   #define RTE_TEST_TX_DESC_DEFAULT 1024
> +#define EMPTY_POLL_MED_THRESHOLD 350000UL
> +#define EMPTY_POLL_HGH_THRESHOLD 580000UL

I'd suggest adding some explanation around these two numbers.
E.g.
/*
  * These two thresholds were decided on by running the training 
algorithm on
  * a 2.5GHz Xeon. These defaults can be overridden by supplying 
non-zero values
  * for the med_threshold and high_threshold parameters on the command line.
  */


> +
> +
> +
>   static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
>   static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
>   
> @@ -132,6 +140,14 @@ static uint32_t enabled_port_mask = 0;
>   static int promiscuous_on = 0;
>   /* NUMA is enabled by default. */
>   static int numa_on = 1;
> +/* emptypoll is disabled by default. */
> +static bool empty_poll_on;
> +static bool empty_poll_train;
> +volatile bool empty_poll_stop;
> +static struct  ep_params *ep_params;
> +static struct  ep_policy policy;
> +static long  ep_med_edpi, ep_hgh_edpi;
> +
>   static int parse_ptype; /**< Parse packet type using rx callback, and */
>   			/**< disabled by default */
>   
> @@ -330,6 +346,13 @@ static inline uint32_t power_idle_heuristic(uint32_t zero_rx_packet_count);
>   static inline enum freq_scale_hint_t power_freq_scaleup_heuristic( \
>   		unsigned int lcore_id, uint16_t port_id, uint16_t queue_id);
>   
> +static uint8_t  freq_tlb[] = {14, 9, 1};
> +

Maybe an explanation on where these numbers came from. E.g.
/*
  * These defaults are using the max frequency index (1), a medium index 
(9) and a
  * typical low frequency index (14). These can be adjusted to use different
  * indexes using the relevant command line parameters.
  */


> +static int is_done(void)
> +{
> +	return empty_poll_stop;
> +}
> +
>   /* exit signal handler */
>   static void
>   signal_exit_now(int sigtype)
> @@ -338,7 +361,15 @@ signal_exit_now(int sigtype)
>   	unsigned int portid;
>   	int ret;
>   
> +	RTE_SET_USED(lcore_id);
> +	RTE_SET_USED(portid);
> +	RTE_SET_USED(ret);
> +
>   	if (sigtype == SIGINT) {
> +		if (empty_poll_on)
> +			empty_poll_stop = true;
> +
> +
>   		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>   			if (rte_lcore_is_enabled(lcore_id) == 0)
>   				continue;
> @@ -351,16 +382,19 @@ signal_exit_now(int sigtype)
>   							"core%u\n", lcore_id);
>   		}
>   
> -		RTE_ETH_FOREACH_DEV(portid) {
> -			if ((enabled_port_mask & (1 << portid)) == 0)
> -				continue;
> +		if (!empty_poll_on) {
> +			RTE_ETH_FOREACH_DEV(portid) {
> +				if ((enabled_port_mask & (1 << portid)) == 0)
> +					continue;
>   
> -			rte_eth_dev_stop(portid);
> -			rte_eth_dev_close(portid);
> +				rte_eth_dev_stop(portid);
> +				rte_eth_dev_close(portid);
> +			}
>   		}
>   	}
>   
> -	rte_exit(EXIT_SUCCESS, "User forced exit\n");
> +	if (!empty_poll_on)
> +		rte_exit(EXIT_SUCCESS, "User forced exit\n");
>   }
>   
>   /*  Freqency scale down timer callback */
> @@ -825,7 +859,107 @@ static int event_register(struct lcore_conf *qconf)
>   
>   	return 0;
>   }
> +/* main processing loop */
> +static int
> +main_empty_poll_loop(__attribute__((unused)) void *dummy)
> +{
> +	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> +	unsigned int lcore_id;
> +	uint64_t prev_tsc, diff_tsc, cur_tsc;
> +	int i, j, nb_rx;
> +	uint8_t queueid;
> +	uint16_t portid;
> +	struct lcore_conf *qconf;
> +	struct lcore_rx_queue *rx_queue;
> +
> +	const uint64_t drain_tsc =
> +		(rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
> +
> +	prev_tsc = 0;
> +
> +	lcore_id = rte_lcore_id();
> +	qconf = &lcore_conf[lcore_id];
> +
> +	if (qconf->n_rx_queue == 0) {
> +		RTE_LOG(INFO, L3FWD_POWER, "lcore %u has nothing to do\n", lcore_id);
> +		return 0;
> +	}
> +
> +	for (i = 0; i < qconf->n_rx_queue; i++) {
> +		portid = qconf->rx_queue_list[i].port_id;
> +		queueid = qconf->rx_queue_list[i].queue_id;
> +		RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%u "
> +				"rxqueueid=%hhu\n", lcore_id, portid, queueid);
> +	}
> +
> +	while (!is_done()) {
> +		stats[lcore_id].nb_iteration_looped++;
> +
> +		cur_tsc = rte_rdtsc();
> +		/*
> +		 * TX burst queue drain
> +		 */
> +		diff_tsc = cur_tsc - prev_tsc;
> +		if (unlikely(diff_tsc > drain_tsc)) {
> +			for (i = 0; i < qconf->n_tx_port; ++i) {
> +				portid = qconf->tx_port_id[i];
> +				rte_eth_tx_buffer_flush(portid,
> +						qconf->tx_queue_id[portid],
> +						qconf->tx_buffer[portid]);
> +			}
> +			prev_tsc = cur_tsc;
> +		}
> +
> +		/*
> +		 * Read packet from RX queues
> +		 */
> +		for (i = 0; i < qconf->n_rx_queue; ++i) {
> +			rx_queue = &(qconf->rx_queue_list[i]);
> +			rx_queue->idle_hint = 0;
> +			portid = rx_queue->port_id;
> +			queueid = rx_queue->queue_id;
> +
> +			nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
> +					MAX_PKT_BURST);
> +
> +			stats[lcore_id].nb_rx_processed += nb_rx;
> +
> +			if (nb_rx == 0) {
> +
> +				rte_power_empty_poll_stat_update(lcore_id);
> +
> +				continue;
> +			} else {
> +				rte_power_poll_stat_update(lcore_id, nb_rx);
> +			}
> +
> +
> +			/* Prefetch first packets */
> +			for (j = 0; j < PREFETCH_OFFSET && j < nb_rx; j++) {
> +				rte_prefetch0(rte_pktmbuf_mtod(
> +							pkts_burst[j], void *));
> +			}
> +
> +			/* Prefetch and forward already prefetched packets */
> +			for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
> +				rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[
> +							j + PREFETCH_OFFSET], void *));
> +				l3fwd_simple_forward(pkts_burst[j], portid,
> +						qconf);
> +			}
>   
> +			/* Forward remaining prefetched packets */
> +			for (; j < nb_rx; j++) {
> +				l3fwd_simple_forward(pkts_burst[j], portid,
> +						qconf);
> +			}
> +
> +		}
> +
> +	}
> +
> +	return 0;
> +}
>   /* main processing loop */
>   static int
>   main_loop(__attribute__((unused)) void *dummy)
> @@ -1127,7 +1261,8 @@ print_usage(const char *prgname)
>   		"  --no-numa: optional, disable numa awareness\n"
>   		"  --enable-jumbo: enable jumbo frame"
>   		" which max packet len is PKTLEN in decimal (64-9600)\n"
> -		"  --parse-ptype: parse packet type by software\n",
> +		"  --parse-ptype: parse packet type by software\n"
> +		"  --empty=poll: enable empty poll detection\n",

typo: "empty=poll" should be "empty-poll"

I really think some info on what should be supplied with the empty-poll 
parameter
should be mentioned here
e.g.

--empty=poll "training_flag, high_threshold, med_threshold"



>   		prgname);
>   }
>   
> @@ -1220,7 +1355,55 @@ parse_config(const char *q_arg)
>   
>   	return 0;
>   }
> +static int
> +parse_ep_config(const char *q_arg)
> +{
> +	char s[256];
> +	const char *p = q_arg;
> +	char *end;
> +	int  num_arg;
> +
> +	char *str_fld[3];
> +
> +	int training_flag;
> +	int med_edpi;
> +	int hgh_edpi;
> +
> +	ep_med_edpi = EMPTY_POLL_MED_THRESHOLD;
> +	ep_hgh_edpi = EMPTY_POLL_MED_THRESHOLD;
> +
> +	snprintf(s, sizeof(s), "%s", p);
> +
> +	num_arg = rte_strsplit(s, sizeof(s), str_fld, 3, ',');
> +
> +	empty_poll_train = false;
> +
> +	if (num_arg == 0)
> +		return 0;
>   
> +	if (num_arg == 3) {
> +
> +		training_flag = strtoul(str_fld[0], &end, 0);
> +		med_edpi = strtoul(str_fld[1], &end, 0);
> +		hgh_edpi = strtoul(str_fld[2], &end, 0);
> +
> +		if (training_flag == 1)
> +			empty_poll_train = true;
> +
> +		if (med_edpi > 0)
> +			ep_med_edpi = med_edpi;
> +
> +		if (med_edpi > 0)
> +			ep_hgh_edpi = hgh_edpi;
> +
> +	} else {
> +
> +		return -1;
> +	}
> +
> +	return 0;
> +
> +}
>   #define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
>   
>   /* Parse the argument given in the command line of the application */
> @@ -1230,6 +1413,7 @@ parse_args(int argc, char **argv)
>   	int opt, ret;
>   	char **argvopt;
>   	int option_index;
> +	uint32_t limit;
>   	char *prgname = argv[0];
>   	static struct option lgopts[] = {
>   		{"config", 1, 0, 0},
> @@ -1237,13 +1421,14 @@ parse_args(int argc, char **argv)
>   		{"high-perf-cores", 1, 0, 0},
>   		{"no-numa", 0, 0, 0},
>   		{"enable-jumbo", 0, 0, 0},
> +		{"empty-poll", 1, 0, 0},
>   		{CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
>   		{NULL, 0, 0, 0}
>   	};
>   
>   	argvopt = argv;
>   
> -	while ((opt = getopt_long(argc, argvopt, "p:P",
> +	while ((opt = getopt_long(argc, argvopt, "p:l:m:h:P",
>   				lgopts, &option_index)) != EOF) {
>   
>   		switch (opt) {
> @@ -1260,7 +1445,18 @@ parse_args(int argc, char **argv)
>   			printf("Promiscuous mode selected\n");
>   			promiscuous_on = 1;
>   			break;
> -
> +		case 'l':
> +			limit = parse_max_pkt_len(optarg);
> +			freq_tlb[LOW] = limit;
> +			break;
> +		case 'm':
> +			limit = parse_max_pkt_len(optarg);
> +			freq_tlb[MED] = limit;
> +			break;
> +		case 'h':
> +			limit = parse_max_pkt_len(optarg);
> +			freq_tlb[HGH] = limit;
> +			break;
>   		/* long options */
>   		case 0:
>   			if (!strncmp(lgopts[option_index].name, "config", 6)) {
> @@ -1299,6 +1495,20 @@ parse_args(int argc, char **argv)
>   			}
>   
>   			if (!strncmp(lgopts[option_index].name,
> +						"empty-poll", 10)) {
> +				printf("empty-poll is enabled\n");
> +				empty_poll_on = true;
> +				ret = parse_ep_config(optarg);
> +
> +				if (ret) {
> +					printf("invalid empty poll config\n");
> +					print_usage(prgname);
> +					return -1;
> +				}
> +
> +			}
> +
> +			if (!strncmp(lgopts[option_index].name,
>   					"enable-jumbo", 12)) {
>   				struct option lenopts =
>   					{"max-pkt-len", required_argument, \
> @@ -1646,6 +1856,59 @@ init_power_library(void)
>   	}
>   	return ret;
>   }
> +static void
> +empty_poll_setup_timer(void)
> +{
> +	int lcore_id = rte_lcore_id();
> +	uint64_t hz = rte_get_timer_hz();
> +
> +	struct  ep_params *ep_ptr = ep_params;
> +
> +	ep_ptr->interval_ticks = hz / INTERVALS_PER_SECOND;
> +
> +	rte_timer_reset_sync(&ep_ptr->timer0,
> +			ep_ptr->interval_ticks,
> +			PERIODICAL,
> +			lcore_id,
> +			rte_empty_poll_detection,
> +			(void *)ep_ptr);
> +
> +}
> +static int
> +launch_timer(unsigned int lcore_id)
> +{
> +	int64_t prev_tsc = 0, cur_tsc, diff_tsc, cycles_10ms;
> +
> +	RTE_SET_USED(lcore_id);
> +
> +
> +	if (rte_get_master_lcore() != lcore_id) {
> +		rte_panic("timer on lcore:%d which is not master core:%d\n",
> +				lcore_id,
> +				rte_get_master_lcore());
> +	}
> +
> +	RTE_LOG(INFO, POWER, "Bring up the Timer\n");
> +
> +	empty_poll_setup_timer();
> +
> +	cycles_10ms = rte_get_timer_hz() / 100;
> +
> +	while (!is_done()) {
> +		cur_tsc = rte_rdtsc();
> +		diff_tsc = cur_tsc - prev_tsc;
> +		if (diff_tsc > cycles_10ms) {
> +			rte_timer_manage();
> +			prev_tsc = cur_tsc;
> +			cycles_10ms = rte_get_timer_hz() / 100;
> +		}
> +	}
> +
> +	RTE_LOG(INFO, POWER, "Timer_subsystem is done\n");
> +
> +	return 0;
> +}
> +
>   
>   int
>   main(int argc, char **argv)
> @@ -1828,13 +2091,15 @@ main(int argc, char **argv)
>   		if (rte_lcore_is_enabled(lcore_id) == 0)
>   			continue;
>   
> -		/* init timer structures for each enabled lcore */
> -		rte_timer_init(&power_timers[lcore_id]);
> -		hz = rte_get_timer_hz();
> -		rte_timer_reset(&power_timers[lcore_id],
> -			hz/TIMER_NUMBER_PER_SECOND, SINGLE, lcore_id,
> -						power_timer_cb, NULL);
> -
> +		if (empty_poll_on == false) {
> +			/* init timer structures for each enabled lcore */
> +			rte_timer_init(&power_timers[lcore_id]);
> +			hz = rte_get_timer_hz();
> +			rte_timer_reset(&power_timers[lcore_id],
> +					hz/TIMER_NUMBER_PER_SECOND,
> +					SINGLE, lcore_id,
> +					power_timer_cb, NULL);
> +		}
>   		qconf = &lcore_conf[lcore_id];
>   		printf("\nInitializing rx queues on lcore %u ... ", lcore_id );
>   		fflush(stdout);
> @@ -1905,12 +2170,38 @@ main(int argc, char **argv)
>   
>   	check_all_ports_link_status(enabled_port_mask);
>   
> +	if (empty_poll_on == true) {
> +
> +		if (empty_poll_train) {
> +			policy.state = TRAINING;
> +		} else {
> +			policy.state = MED_NORMAL;
> +			policy.med_base_edpi = ep_med_edpi;
> +			policy.hgh_base_edpi = ep_hgh_edpi;
> +		}
> +
> +		rte_power_empty_poll_stat_init(&ep_params, freq_tlb, &policy);
> +	}
> +
> +
>   	/* launch per-lcore init on every lcore */
> -	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
> +	if (empty_poll_on == false) {
> +		rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
> +	} else {
> +		empty_poll_stop = false;
> +		rte_eal_mp_remote_launch(main_empty_poll_loop, NULL, SKIP_MASTER);
> +	}
> +
> +	if (empty_poll_on == true)
> +		launch_timer(rte_lcore_id());
> +
>   	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
>   		if (rte_eal_wait_lcore(lcore_id) < 0)
>   			return -1;
>   	}
>   
> +	if (empty_poll_on)
> +		rte_power_empty_poll_stat_free();
> +
>   	return 0;
>   }
> diff --git a/examples/l3fwd-power/meson.build b/examples/l3fwd-power/meson.build
> index 20c8054..a3c5c2f 100644
> --- a/examples/l3fwd-power/meson.build
> +++ b/examples/l3fwd-power/meson.build
> @@ -9,6 +9,7 @@
>   if host_machine.system() != 'linux'
>   	build = false
>   endif
> +allow_experimental_apis = true
>   deps += ['power', 'timer', 'lpm', 'hash']
>   sources = files(
>   	'main.c', 'perf_core.c'


Checkpatch throws up some warnings:


### examples/l3fwd-power: simple app update for new API

WARNING:LONG_LINE: line over 80 characters
#201: FILE: examples/l3fwd-power/main.c:876:
+               (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * 
BURST_TX_DRAIN_US;

WARNING:LONG_LINE: line over 80 characters
#209: FILE: examples/l3fwd-power/main.c:884:
+               RTE_LOG(INFO, L3FWD_POWER, "lcore %u has nothing to 
do\n", lcore_id);

WARNING:LONG_LINE: line over 80 characters
#271: FILE: examples/l3fwd-power/main.c:946:
+                                                       j + 
PREFETCH_OFFSET], void *));

WARNING:LONG_LINE: line over 80 characters
#529: FILE: examples/l3fwd-power/main.c:2192:
+               rte_eal_mp_remote_launch(main_empty_poll_loop, NULL, 
SKIP_MASTER);

total: 0 errors, 4 warnings, 467 lines checked


Rgds,
Dave.
  
Liang, Ma Oct. 2, 2018, 10:18 a.m. UTC | #2
On 28 Sep 12:19, Hunt, David wrote:
> Hi Liang,
> 
> A few tweaks below:
> 
> 
> On 17/9/2018 2:30 PM, Liang Ma wrote:
> >Add the support for new traffic pattern aware power control
> >power management API.
> >
> >Example:
> >./l3fwd-power -l xxx   -n 4   -w 0000:xx:00.0 -w 0000:xx:00.1 -- -p 0x3
> >-P --config="(0,0,xx),(1,0,xx)" --empty-poll="0,0,0" -l 14 -m 9 -h 1
> >
> >Please Reference l3fwd-power document for all parameter except
> >empty-poll.
> 
> The docs should probably include empty poll parameter. Suggest 
> re-wording to
> 
> Please Reference l3fwd-power document for full parameter usage
> 
agree
> 
> 
> >The option "l", "m", "h" are used to set the power index for
> >LOW, MED, HIGH power state. only is useful after enable empty-poll
> >
> >--empty-poll="training_flag, med_threshold, high_threshold"
> >
> >The option training_flag is used to enable/disable training mode.
> >
> >The option med_threshold is used to indicate the empty poll threshold
> >of modest state which is customized by user.
> >
> >The option high_threshold is used to indicate the empty poll threshold
> >of busy state which is customized by user.
> >
> >Above three option default value is all 0.
> >
> >Once enable empty-poll. System will apply the default parameter.
> >Training mode is disabled as default.
> 
> Suggest:
> 
> Once empty-poll is enabled, the system will apply the default parameters is 
> no
> other command line options are provided.
> 
agree
> 
> 
> >If training mode is triggered, there should not has any traffic
> >pass-through during training phase.
> 
> Suggest:
> If training mode is enabled, the user should ensure that no traffic
> is allowed to pass through the system.
> 
> >When training phase complete, system transfer to normal phase.
> 
> When training phase complete, the application transfer to normal operation
> 
> 
agree
> 
> >
> >System will running with modest power stat at beginning.
> 
> System will start running with the modest power mode.
> 

> 
> >If the system busyness percentage above 70%, then system will adjust
> >power state move to High power state. If the traffic become lower(eg. The
> >system busyness percentage drop below 30%), system will fallback
> >to the modest power state.
> 
> If the traffic goes above 70%, then system will move to High power state.
> If the traffic drops below 30%, the system will fallback to the modest
> power state.
> 
> 
> >Example code use master thread to monitoring worker thread busyness.
> >the default timer resolution is 10ms.
> >
> >ChangeLog:
> >v2 fix some coding style issues
> >v3 rename the API.
> >v6 re-work the API.
> >v7 no change.
> >v8 disable training as default option.
> >
> >Signed-off-by: Liang Ma <liang.j.ma@intel.com>
> >
> >Reviewed-by: Lei Yao <lei.a.yao@intel.com>
> >---
> >  examples/l3fwd-power/Makefile    |   3 +
> >  examples/l3fwd-power/main.c      | 325 
> >  +++++++++++++++++++++++++++++++++++++--
> >  examples/l3fwd-power/meson.build |   1 +
> >  3 files changed, 312 insertions(+), 17 deletions(-)
> >
> >diff --git a/examples/l3fwd-power/Makefile b/examples/l3fwd-power/Makefile
> >index d7e39a3..772ec7b 100644
> >--- a/examples/l3fwd-power/Makefile
> >+++ b/examples/l3fwd-power/Makefile
> >@@ -23,6 +23,8 @@ CFLAGS += -O3 $(shell pkg-config --cflags libdpdk)
> >  LDFLAGS_SHARED = $(shell pkg-config --libs libdpdk)
> >  LDFLAGS_STATIC = -Wl,-Bstatic $(shell pkg-config --static --libs libdpdk)
> >  
> >+CFLAGS += -DALLOW_EXPERIMENTAL_API
> >+
> >  build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build
> >  	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)
> >  
> >@@ -54,6 +56,7 @@ please change the definition of the RTE_TARGET 
> >environment variable)
> >  all:
> >  else
> >  
> >+CFLAGS += -DALLOW_EXPERIMENTAL_API
> >  CFLAGS += -O3
> >  CFLAGS += $(WERROR_FLAGS)
> >  
> >diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
> >index 68527d2..1465608 100644
> >--- a/examples/l3fwd-power/main.c
> >+++ b/examples/l3fwd-power/main.c
> >@@ -43,6 +43,7 @@
> >  #include <rte_timer.h>
> >  #include <rte_power.h>
> >  #include <rte_spinlock.h>
> >+#include <rte_power_empty_poll.h>
> >  
> >  #include "perf_core.h"
> >  #include "main.h"
> >@@ -55,6 +56,8 @@
> >  
> >  /* 100 ms interval */
> >  #define TIMER_NUMBER_PER_SECOND           10
> >+/* (10ms) */
> >+#define INTERVALS_PER_SECOND             100
> >  /* 100000 us */
> >  #define SCALING_PERIOD                    
> >  (1000000/TIMER_NUMBER_PER_SECOND)
> >  #define SCALING_DOWN_TIME_RATIO_THRESHOLD 0.25
> >@@ -117,6 +120,11 @@
> >   */
> >  #define RTE_TEST_RX_DESC_DEFAULT 1024
> >  #define RTE_TEST_TX_DESC_DEFAULT 1024
> >+#define EMPTY_POLL_MED_THRESHOLD 350000UL
> >+#define EMPTY_POLL_HGH_THRESHOLD 580000UL
> 
> I'd suggest adding some explanation around these two numbers.
> E.g.
> /*
>  * These two thresholds were decided on by running the training 
> algorithm on
>  * a 2.5GHz Xeon. These defaults can be overridden by supplying 
> non-zero values
>  * for the med_threshold and high_threshold parameters on the command line.
>  */
> 
> 
> >+
> >+
> >+
> >  static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
> >  static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
> >  
> >@@ -132,6 +140,14 @@ static uint32_t enabled_port_mask = 0;
> >  static int promiscuous_on = 0;
> >  /* NUMA is enabled by default. */
> >  static int numa_on = 1;
> >+/* emptypoll is disabled by default. */
> >+static bool empty_poll_on;
> >+static bool empty_poll_train;
> >+volatile bool empty_poll_stop;
> >+static struct  ep_params *ep_params;
> >+static struct  ep_policy policy;
> >+static long  ep_med_edpi, ep_hgh_edpi;
> >+
> >  static int parse_ptype; /**< Parse packet type using rx callback, and */
> >  			/**< disabled by default */
> >  
> >@@ -330,6 +346,13 @@ static inline uint32_t power_idle_heuristic(uint32_t 
> >zero_rx_packet_count);
> >  static inline enum freq_scale_hint_t power_freq_scaleup_heuristic( \
> >  		unsigned int lcore_id, uint16_t port_id, uint16_t queue_id);
> >  
> >+static uint8_t  freq_tlb[] = {14, 9, 1};
> >+
> 
> Maybe an explanation on where these numbers came from. E.g.
> /*
>  * These defaults are using the max frequency index (1), a medium index 
> (9) and a
>  * typical low frequency index (14). These can be adjusted to use different
>  * indexes using the relevant command line parameters.
>  */
>
agree
> 
> >+static int is_done(void)
> >+{
> >+	return empty_poll_stop;
> >+}
> >+
> >  /* exit signal handler */
> >  static void
> >  signal_exit_now(int sigtype)
> >@@ -338,7 +361,15 @@ signal_exit_now(int sigtype)
> >  	unsigned int portid;
> >  	int ret;
> >  
> >+	RTE_SET_USED(lcore_id);
> >+	RTE_SET_USED(portid);
> >+	RTE_SET_USED(ret);
> >+
> >  	if (sigtype == SIGINT) {
> >+		if (empty_poll_on)
> >+			empty_poll_stop = true;
> >+
> >+
> >  		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> >  			if (rte_lcore_is_enabled(lcore_id) == 0)
> >  				continue;
> >@@ -351,16 +382,19 @@ signal_exit_now(int sigtype)
> >  							"core%u\n", 
> >  							lcore_id);
> >  		}
> >  
> >-		RTE_ETH_FOREACH_DEV(portid) {
> >-			if ((enabled_port_mask & (1 << portid)) == 0)
> >-				continue;
> >+		if (!empty_poll_on) {
> >+			RTE_ETH_FOREACH_DEV(portid) {
> >+				if ((enabled_port_mask & (1 << portid)) == 0)
> >+					continue;
> >  
> >-			rte_eth_dev_stop(portid);
> >-			rte_eth_dev_close(portid);
> >+				rte_eth_dev_stop(portid);
> >+				rte_eth_dev_close(portid);
> >+			}
> >  		}
> >  	}
> >  
> >-	rte_exit(EXIT_SUCCESS, "User forced exit\n");
> >+	if (!empty_poll_on)
> >+		rte_exit(EXIT_SUCCESS, "User forced exit\n");
> >  }
> >  
> >  /*  Freqency scale down timer callback */
> >@@ -825,7 +859,107 @@ static int event_register(struct lcore_conf *qconf)
> >  
> >  	return 0;
> >  }
> >+/* main processing loop */
> >+static int
> >+main_empty_poll_loop(__attribute__((unused)) void *dummy)
> >+{
> >+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> >+	unsigned int lcore_id;
> >+	uint64_t prev_tsc, diff_tsc, cur_tsc;
> >+	int i, j, nb_rx;
> >+	uint8_t queueid;
> >+	uint16_t portid;
> >+	struct lcore_conf *qconf;
> >+	struct lcore_rx_queue *rx_queue;
> >+
> >+	const uint64_t drain_tsc =
> >+		(rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * 
> >BURST_TX_DRAIN_US;
> >+
> >+	prev_tsc = 0;
> >+
> >+	lcore_id = rte_lcore_id();
> >+	qconf = &lcore_conf[lcore_id];
> >+
> >+	if (qconf->n_rx_queue == 0) {
> >+		RTE_LOG(INFO, L3FWD_POWER, "lcore %u has nothing to do\n", 
> >lcore_id);
> >+		return 0;
> >+	}
> >+
> >+	for (i = 0; i < qconf->n_rx_queue; i++) {
> >+		portid = qconf->rx_queue_list[i].port_id;
> >+		queueid = qconf->rx_queue_list[i].queue_id;
> >+		RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%u "
> >+				"rxqueueid=%hhu\n", lcore_id, portid, 
> >queueid);
> >+	}
> >+
> >+	while (!is_done()) {
> >+		stats[lcore_id].nb_iteration_looped++;
> >+
> >+		cur_tsc = rte_rdtsc();
> >+		/*
> >+		 * TX burst queue drain
> >+		 */
> >+		diff_tsc = cur_tsc - prev_tsc;
> >+		if (unlikely(diff_tsc > drain_tsc)) {
> >+			for (i = 0; i < qconf->n_tx_port; ++i) {
> >+				portid = qconf->tx_port_id[i];
> >+				rte_eth_tx_buffer_flush(portid,
> >+						qconf->tx_queue_id[portid],
> >+						qconf->tx_buffer[portid]);
> >+			}
> >+			prev_tsc = cur_tsc;
> >+		}
> >+
> >+		/*
> >+		 * Read packet from RX queues
> >+		 */
> >+		for (i = 0; i < qconf->n_rx_queue; ++i) {
> >+			rx_queue = &(qconf->rx_queue_list[i]);
> >+			rx_queue->idle_hint = 0;
> >+			portid = rx_queue->port_id;
> >+			queueid = rx_queue->queue_id;
> >+
> >+			nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
> >+					MAX_PKT_BURST);
> >+
> >+			stats[lcore_id].nb_rx_processed += nb_rx;
> >+
> >+			if (nb_rx == 0) {
> >+
> >+				rte_power_empty_poll_stat_update(lcore_id);
> >+
> >+				continue;
> >+			} else {
> >+				rte_power_poll_stat_update(lcore_id, nb_rx);
> >+			}
> >+
> >+
> >+			/* Prefetch first packets */
> >+			for (j = 0; j < PREFETCH_OFFSET && j < nb_rx; j++) {
> >+				rte_prefetch0(rte_pktmbuf_mtod(
> >+							pkts_burst[j], void 
> >*));
> >+			}
> >+
> >+			/* Prefetch and forward already prefetched packets */
> >+			for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
> >+				rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[
> >+							j + 
> >PREFETCH_OFFSET], void *));
> >+				l3fwd_simple_forward(pkts_burst[j], portid,
> >+						qconf);
> >+			}
> >  
> >+			/* Forward remaining prefetched packets */
> >+			for (; j < nb_rx; j++) {
> >+				l3fwd_simple_forward(pkts_burst[j], portid,
> >+						qconf);
> >+			}
> >+
> >+		}
> >+
> >+	}
> >+
> >+	return 0;
> >+}
> >  /* main processing loop */
> >  static int
> >  main_loop(__attribute__((unused)) void *dummy)
> >@@ -1127,7 +1261,8 @@ print_usage(const char *prgname)
> >  		"  --no-numa: optional, disable numa awareness\n"
> >  		"  --enable-jumbo: enable jumbo frame"
> >  		" which max packet len is PKTLEN in decimal (64-9600)\n"
> >-		"  --parse-ptype: parse packet type by software\n",
> >+		"  --parse-ptype: parse packet type by software\n"
> >+		"  --empty=poll: enable empty poll detection\n",
> 
> typo: "empty=poll" should be "empty-poll"
> 
> I really think some info on what should be supplied with the empty-poll 
> parameter
> should be mentioned here
> e.g.
> 
> --empty=poll "training_flag, high_threshold, med_threshold"
> 
> 
agree
> 
> >  		prgname);
> >  }
> >  
> >@@ -1220,7 +1355,55 @@ parse_config(const char *q_arg)
> >  
> >  	return 0;
> >  }
> >+static int
> >+parse_ep_config(const char *q_arg)
> >+{
> >+	char s[256];
> >+	const char *p = q_arg;
> >+	char *end;
> >+	int  num_arg;
> >+
> >+	char *str_fld[3];
> >+
> >+	int training_flag;
> >+	int med_edpi;
> >+	int hgh_edpi;
> >+
> >+	ep_med_edpi = EMPTY_POLL_MED_THRESHOLD;
> >+	ep_hgh_edpi = EMPTY_POLL_MED_THRESHOLD;
> >+
> >+	snprintf(s, sizeof(s), "%s", p);
> >+
> >+	num_arg = rte_strsplit(s, sizeof(s), str_fld, 3, ',');
> >+
> >+	empty_poll_train = false;
> >+
> >+	if (num_arg == 0)
> >+		return 0;
> >  
> >+	if (num_arg == 3) {
> >+
> >+		training_flag = strtoul(str_fld[0], &end, 0);
> >+		med_edpi = strtoul(str_fld[1], &end, 0);
> >+		hgh_edpi = strtoul(str_fld[2], &end, 0);
> >+
> >+		if (training_flag == 1)
> >+			empty_poll_train = true;
> >+
> >+		if (med_edpi > 0)
> >+			ep_med_edpi = med_edpi;
> >+
> >+		if (med_edpi > 0)
> >+			ep_hgh_edpi = hgh_edpi;
> >+
> >+	} else {
> >+
> >+		return -1;
> >+	}
> >+
> >+	return 0;
> >+
> >+}
> >  #define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
> >  
> >  /* Parse the argument given in the command line of the application */
> >@@ -1230,6 +1413,7 @@ parse_args(int argc, char **argv)
> >  	int opt, ret;
> >  	char **argvopt;
> >  	int option_index;
> >+	uint32_t limit;
> >  	char *prgname = argv[0];
> >  	static struct option lgopts[] = {
> >  		{"config", 1, 0, 0},
> >@@ -1237,13 +1421,14 @@ parse_args(int argc, char **argv)
> >  		{"high-perf-cores", 1, 0, 0},
> >  		{"no-numa", 0, 0, 0},
> >  		{"enable-jumbo", 0, 0, 0},
> >+		{"empty-poll", 1, 0, 0},
> >  		{CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
> >  		{NULL, 0, 0, 0}
> >  	};
> >  
> >  	argvopt = argv;
> >  
> >-	while ((opt = getopt_long(argc, argvopt, "p:P",
> >+	while ((opt = getopt_long(argc, argvopt, "p:l:m:h:P",
> >  				lgopts, &option_index)) != EOF) {
> >  
> >  		switch (opt) {
> >@@ -1260,7 +1445,18 @@ parse_args(int argc, char **argv)
> >  			printf("Promiscuous mode selected\n");
> >  			promiscuous_on = 1;
> >  			break;
> >-
> >+		case 'l':
> >+			limit = parse_max_pkt_len(optarg);
> >+			freq_tlb[LOW] = limit;
> >+			break;
> >+		case 'm':
> >+			limit = parse_max_pkt_len(optarg);
> >+			freq_tlb[MED] = limit;
> >+			break;
> >+		case 'h':
> >+			limit = parse_max_pkt_len(optarg);
> >+			freq_tlb[HGH] = limit;
> >+			break;
> >  		/* long options */
> >  		case 0:
> >  			if (!strncmp(lgopts[option_index].name, "config", 
> >  			6)) {
> >@@ -1299,6 +1495,20 @@ parse_args(int argc, char **argv)
> >  			}
> >  
> >  			if (!strncmp(lgopts[option_index].name,
> >+						"empty-poll", 10)) {
> >+				printf("empty-poll is enabled\n");
> >+				empty_poll_on = true;
> >+				ret = parse_ep_config(optarg);
> >+
> >+				if (ret) {
> >+					printf("invalid empty poll 
> >config\n");
> >+					print_usage(prgname);
> >+					return -1;
> >+				}
> >+
> >+			}
> >+
> >+			if (!strncmp(lgopts[option_index].name,
> >  					"enable-jumbo", 12)) {
> >  				struct option lenopts =
> >  					{"max-pkt-len", required_argument, \
> >@@ -1646,6 +1856,59 @@ init_power_library(void)
> >  	}
> >  	return ret;
> >  }
> >+static void
> >+empty_poll_setup_timer(void)
> >+{
> >+	int lcore_id = rte_lcore_id();
> >+	uint64_t hz = rte_get_timer_hz();
> >+
> >+	struct  ep_params *ep_ptr = ep_params;
> >+
> >+	ep_ptr->interval_ticks = hz / INTERVALS_PER_SECOND;
> >+
> >+	rte_timer_reset_sync(&ep_ptr->timer0,
> >+			ep_ptr->interval_ticks,
> >+			PERIODICAL,
> >+			lcore_id,
> >+			rte_empty_poll_detection,
> >+			(void *)ep_ptr);
> >+
> >+}
> >+static int
> >+launch_timer(unsigned int lcore_id)
> >+{
> >+	int64_t prev_tsc = 0, cur_tsc, diff_tsc, cycles_10ms;
> >+
> >+	RTE_SET_USED(lcore_id);
> >+
> >+
> >+	if (rte_get_master_lcore() != lcore_id) {
> >+		rte_panic("timer on lcore:%d which is not master core:%d\n",
> >+				lcore_id,
> >+				rte_get_master_lcore());
> >+	}
> >+
> >+	RTE_LOG(INFO, POWER, "Bring up the Timer\n");
> >+
> >+	empty_poll_setup_timer();
> >+
> >+	cycles_10ms = rte_get_timer_hz() / 100;
> >+
> >+	while (!is_done()) {
> >+		cur_tsc = rte_rdtsc();
> >+		diff_tsc = cur_tsc - prev_tsc;
> >+		if (diff_tsc > cycles_10ms) {
> >+			rte_timer_manage();
> >+			prev_tsc = cur_tsc;
> >+			cycles_10ms = rte_get_timer_hz() / 100;
> >+		}
> >+	}
> >+
> >+	RTE_LOG(INFO, POWER, "Timer_subsystem is done\n");
> >+
> >+	return 0;
> >+}
> >+
> >  
> >  int
> >  main(int argc, char **argv)
> >@@ -1828,13 +2091,15 @@ main(int argc, char **argv)
> >  		if (rte_lcore_is_enabled(lcore_id) == 0)
> >  			continue;
> >  
> >-		/* init timer structures for each enabled lcore */
> >-		rte_timer_init(&power_timers[lcore_id]);
> >-		hz = rte_get_timer_hz();
> >-		rte_timer_reset(&power_timers[lcore_id],
> >-			hz/TIMER_NUMBER_PER_SECOND, SINGLE, lcore_id,
> >-						power_timer_cb, NULL);
> >-
> >+		if (empty_poll_on == false) {
> >+			/* init timer structures for each enabled lcore */
> >+			rte_timer_init(&power_timers[lcore_id]);
> >+			hz = rte_get_timer_hz();
> >+			rte_timer_reset(&power_timers[lcore_id],
> >+					hz/TIMER_NUMBER_PER_SECOND,
> >+					SINGLE, lcore_id,
> >+					power_timer_cb, NULL);
> >+		}
> >  		qconf = &lcore_conf[lcore_id];
> >  		printf("\nInitializing rx queues on lcore %u ... ", lcore_id 
> >  		);
> >  		fflush(stdout);
> >@@ -1905,12 +2170,38 @@ main(int argc, char **argv)
> >  
> >  	check_all_ports_link_status(enabled_port_mask);
> >  
> >+	if (empty_poll_on == true) {
> >+
> >+		if (empty_poll_train) {
> >+			policy.state = TRAINING;
> >+		} else {
> >+			policy.state = MED_NORMAL;
> >+			policy.med_base_edpi = ep_med_edpi;
> >+			policy.hgh_base_edpi = ep_hgh_edpi;
> >+		}
> >+
> >+		rte_power_empty_poll_stat_init(&ep_params, freq_tlb, 
> >&policy);
> >+	}
> >+
> >+
> >  	/* launch per-lcore init on every lcore */
> >-	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
> >+	if (empty_poll_on == false) {
> >+		rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
> >+	} else {
> >+		empty_poll_stop = false;
> >+		rte_eal_mp_remote_launch(main_empty_poll_loop, NULL, 
> >SKIP_MASTER);
> >+	}
> >+
> >+	if (empty_poll_on == true)
> >+		launch_timer(rte_lcore_id());
> >+
> >  	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
> >  		if (rte_eal_wait_lcore(lcore_id) < 0)
> >  			return -1;
> >  	}
> >  
> >+	if (empty_poll_on)
> >+		rte_power_empty_poll_stat_free();
> >+
> >  	return 0;
> >  }
> >diff --git a/examples/l3fwd-power/meson.build 
> >b/examples/l3fwd-power/meson.build
> >index 20c8054..a3c5c2f 100644
> >--- a/examples/l3fwd-power/meson.build
> >+++ b/examples/l3fwd-power/meson.build
> >@@ -9,6 +9,7 @@
> >  if host_machine.system() != 'linux'
> >  	build = false
> >  endif
> >+allow_experimental_apis = true
> >  deps += ['power', 'timer', 'lpm', 'hash']
> >  sources = files(
> >  	'main.c', 'perf_core.c'
> 
> 
> Checkpatch throws up some warnings:
> 
> 
> ### examples/l3fwd-power: simple app update for new API
> 
> WARNING:LONG_LINE: line over 80 characters
> #201: FILE: examples/l3fwd-power/main.c:876:
> +               (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S 
> * BURST_TX_DRAIN_US;
> 
> WARNING:LONG_LINE: line over 80 characters
> #209: FILE: examples/l3fwd-power/main.c:884:
> +               RTE_LOG(INFO, L3FWD_POWER, "lcore %u has 
> nothing to do\n", lcore_id);
> 
> WARNING:LONG_LINE: line over 80 characters
> #271: FILE: examples/l3fwd-power/main.c:946:
> +                                                       j + 
> PREFETCH_OFFSET], void *));
> 
> WARNING:LONG_LINE: line over 80 characters
> #529: FILE: examples/l3fwd-power/main.c:2192:
> +               
> rte_eal_mp_remote_launch(main_empty_poll_loop, NULL, SKIP_MASTER);
> 
> total: 0 errors, 4 warnings, 467 lines checked
> 
> 
> Rgds,
> Dave.
> 
>
  

Patch

diff --git a/examples/l3fwd-power/Makefile b/examples/l3fwd-power/Makefile
index d7e39a3..772ec7b 100644
--- a/examples/l3fwd-power/Makefile
+++ b/examples/l3fwd-power/Makefile
@@ -23,6 +23,8 @@  CFLAGS += -O3 $(shell pkg-config --cflags libdpdk)
 LDFLAGS_SHARED = $(shell pkg-config --libs libdpdk)
 LDFLAGS_STATIC = -Wl,-Bstatic $(shell pkg-config --static --libs libdpdk)
 
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build
 	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)
 
@@ -54,6 +56,7 @@  please change the definition of the RTE_TARGET environment variable)
 all:
 else
 
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
 
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 68527d2..1465608 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -43,6 +43,7 @@ 
 #include <rte_timer.h>
 #include <rte_power.h>
 #include <rte_spinlock.h>
+#include <rte_power_empty_poll.h>
 
 #include "perf_core.h"
 #include "main.h"
@@ -55,6 +56,8 @@ 
 
 /* 100 ms interval */
 #define TIMER_NUMBER_PER_SECOND           10
+/* (10ms) */
+#define INTERVALS_PER_SECOND             100
 /* 100000 us */
 #define SCALING_PERIOD                    (1000000/TIMER_NUMBER_PER_SECOND)
 #define SCALING_DOWN_TIME_RATIO_THRESHOLD 0.25
@@ -117,6 +120,11 @@ 
  */
 #define RTE_TEST_RX_DESC_DEFAULT 1024
 #define RTE_TEST_TX_DESC_DEFAULT 1024
+#define EMPTY_POLL_MED_THRESHOLD 350000UL
+#define EMPTY_POLL_HGH_THRESHOLD 580000UL
+
+
+
 static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
 static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 
@@ -132,6 +140,14 @@  static uint32_t enabled_port_mask = 0;
 static int promiscuous_on = 0;
 /* NUMA is enabled by default. */
 static int numa_on = 1;
+/* emptypoll is disabled by default. */
+static bool empty_poll_on;
+static bool empty_poll_train;
+volatile bool empty_poll_stop;
+static struct  ep_params *ep_params;
+static struct  ep_policy policy;
+static long  ep_med_edpi, ep_hgh_edpi;
+
 static int parse_ptype; /**< Parse packet type using rx callback, and */
 			/**< disabled by default */
 
@@ -330,6 +346,13 @@  static inline uint32_t power_idle_heuristic(uint32_t zero_rx_packet_count);
 static inline enum freq_scale_hint_t power_freq_scaleup_heuristic( \
 		unsigned int lcore_id, uint16_t port_id, uint16_t queue_id);
 
+static uint8_t  freq_tlb[] = {14, 9, 1};
+
+static int is_done(void)
+{
+	return empty_poll_stop;
+}
+
 /* exit signal handler */
 static void
 signal_exit_now(int sigtype)
@@ -338,7 +361,15 @@  signal_exit_now(int sigtype)
 	unsigned int portid;
 	int ret;
 
+	RTE_SET_USED(lcore_id);
+	RTE_SET_USED(portid);
+	RTE_SET_USED(ret);
+
 	if (sigtype == SIGINT) {
+		if (empty_poll_on)
+			empty_poll_stop = true;
+
+
 		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
 			if (rte_lcore_is_enabled(lcore_id) == 0)
 				continue;
@@ -351,16 +382,19 @@  signal_exit_now(int sigtype)
 							"core%u\n", lcore_id);
 		}
 
-		RTE_ETH_FOREACH_DEV(portid) {
-			if ((enabled_port_mask & (1 << portid)) == 0)
-				continue;
+		if (!empty_poll_on) {
+			RTE_ETH_FOREACH_DEV(portid) {
+				if ((enabled_port_mask & (1 << portid)) == 0)
+					continue;
 
-			rte_eth_dev_stop(portid);
-			rte_eth_dev_close(portid);
+				rte_eth_dev_stop(portid);
+				rte_eth_dev_close(portid);
+			}
 		}
 	}
 
-	rte_exit(EXIT_SUCCESS, "User forced exit\n");
+	if (!empty_poll_on)
+		rte_exit(EXIT_SUCCESS, "User forced exit\n");
 }
 
 /*  Freqency scale down timer callback */
@@ -825,7 +859,107 @@  static int event_register(struct lcore_conf *qconf)
 
 	return 0;
 }
+/* main processing loop */
+static int
+main_empty_poll_loop(__attribute__((unused)) void *dummy)
+{
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	unsigned int lcore_id;
+	uint64_t prev_tsc, diff_tsc, cur_tsc;
+	int i, j, nb_rx;
+	uint8_t queueid;
+	uint16_t portid;
+	struct lcore_conf *qconf;
+	struct lcore_rx_queue *rx_queue;
+
+	const uint64_t drain_tsc =
+		(rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
+
+	prev_tsc = 0;
+
+	lcore_id = rte_lcore_id();
+	qconf = &lcore_conf[lcore_id];
+
+	if (qconf->n_rx_queue == 0) {
+		RTE_LOG(INFO, L3FWD_POWER, "lcore %u has nothing to do\n", lcore_id);
+		return 0;
+	}
+
+	for (i = 0; i < qconf->n_rx_queue; i++) {
+		portid = qconf->rx_queue_list[i].port_id;
+		queueid = qconf->rx_queue_list[i].queue_id;
+		RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%u "
+				"rxqueueid=%hhu\n", lcore_id, portid, queueid);
+	}
+
+	while (!is_done()) {
+		stats[lcore_id].nb_iteration_looped++;
+
+		cur_tsc = rte_rdtsc();
+		/*
+		 * TX burst queue drain
+		 */
+		diff_tsc = cur_tsc - prev_tsc;
+		if (unlikely(diff_tsc > drain_tsc)) {
+			for (i = 0; i < qconf->n_tx_port; ++i) {
+				portid = qconf->tx_port_id[i];
+				rte_eth_tx_buffer_flush(portid,
+						qconf->tx_queue_id[portid],
+						qconf->tx_buffer[portid]);
+			}
+			prev_tsc = cur_tsc;
+		}
+
+		/*
+		 * Read packet from RX queues
+		 */
+		for (i = 0; i < qconf->n_rx_queue; ++i) {
+			rx_queue = &(qconf->rx_queue_list[i]);
+			rx_queue->idle_hint = 0;
+			portid = rx_queue->port_id;
+			queueid = rx_queue->queue_id;
+
+			nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+					MAX_PKT_BURST);
+
+			stats[lcore_id].nb_rx_processed += nb_rx;
+
+			if (nb_rx == 0) {
+
+				rte_power_empty_poll_stat_update(lcore_id);
+
+				continue;
+			} else {
+				rte_power_poll_stat_update(lcore_id, nb_rx);
+			}
+
+
+			/* Prefetch first packets */
+			for (j = 0; j < PREFETCH_OFFSET && j < nb_rx; j++) {
+				rte_prefetch0(rte_pktmbuf_mtod(
+							pkts_burst[j], void *));
+			}
+
+			/* Prefetch and forward already prefetched packets */
+			for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
+				rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[
+							j + PREFETCH_OFFSET], void *));
+				l3fwd_simple_forward(pkts_burst[j], portid,
+						qconf);
+			}
 
+			/* Forward remaining prefetched packets */
+			for (; j < nb_rx; j++) {
+				l3fwd_simple_forward(pkts_burst[j], portid,
+						qconf);
+			}
+
+		}
+
+	}
+
+	return 0;
+}
 /* main processing loop */
 static int
 main_loop(__attribute__((unused)) void *dummy)
@@ -1127,7 +1261,8 @@  print_usage(const char *prgname)
 		"  --no-numa: optional, disable numa awareness\n"
 		"  --enable-jumbo: enable jumbo frame"
 		" which max packet len is PKTLEN in decimal (64-9600)\n"
-		"  --parse-ptype: parse packet type by software\n",
+		"  --parse-ptype: parse packet type by software\n"
+		"  --empty=poll: enable empty poll detection\n",
 		prgname);
 }
 
@@ -1220,7 +1355,55 @@  parse_config(const char *q_arg)
 
 	return 0;
 }
+static int
+parse_ep_config(const char *q_arg)
+{
+	char s[256];
+	const char *p = q_arg;
+	char *end;
+	int  num_arg;
+
+	char *str_fld[3];
+
+	int training_flag;
+	int med_edpi;
+	int hgh_edpi;
+
+	ep_med_edpi = EMPTY_POLL_MED_THRESHOLD;
+	ep_hgh_edpi = EMPTY_POLL_MED_THRESHOLD;
+
+	snprintf(s, sizeof(s), "%s", p);
+
+	num_arg = rte_strsplit(s, sizeof(s), str_fld, 3, ',');
+
+	empty_poll_train = false;
+
+	if (num_arg == 0)
+		return 0;
 
+	if (num_arg == 3) {
+
+		training_flag = strtoul(str_fld[0], &end, 0);
+		med_edpi = strtoul(str_fld[1], &end, 0);
+		hgh_edpi = strtoul(str_fld[2], &end, 0);
+
+		if (training_flag == 1)
+			empty_poll_train = true;
+
+		if (med_edpi > 0)
+			ep_med_edpi = med_edpi;
+
+		if (med_edpi > 0)
+			ep_hgh_edpi = hgh_edpi;
+
+	} else {
+
+		return -1;
+	}
+
+	return 0;
+
+}
 #define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
 
 /* Parse the argument given in the command line of the application */
@@ -1230,6 +1413,7 @@  parse_args(int argc, char **argv)
 	int opt, ret;
 	char **argvopt;
 	int option_index;
+	uint32_t limit;
 	char *prgname = argv[0];
 	static struct option lgopts[] = {
 		{"config", 1, 0, 0},
@@ -1237,13 +1421,14 @@  parse_args(int argc, char **argv)
 		{"high-perf-cores", 1, 0, 0},
 		{"no-numa", 0, 0, 0},
 		{"enable-jumbo", 0, 0, 0},
+		{"empty-poll", 1, 0, 0},
 		{CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
 		{NULL, 0, 0, 0}
 	};
 
 	argvopt = argv;
 
-	while ((opt = getopt_long(argc, argvopt, "p:P",
+	while ((opt = getopt_long(argc, argvopt, "p:l:m:h:P",
 				lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -1260,7 +1445,18 @@  parse_args(int argc, char **argv)
 			printf("Promiscuous mode selected\n");
 			promiscuous_on = 1;
 			break;
-
+		case 'l':
+			limit = parse_max_pkt_len(optarg);
+			freq_tlb[LOW] = limit;
+			break;
+		case 'm':
+			limit = parse_max_pkt_len(optarg);
+			freq_tlb[MED] = limit;
+			break;
+		case 'h':
+			limit = parse_max_pkt_len(optarg);
+			freq_tlb[HGH] = limit;
+			break;
 		/* long options */
 		case 0:
 			if (!strncmp(lgopts[option_index].name, "config", 6)) {
@@ -1299,6 +1495,20 @@  parse_args(int argc, char **argv)
 			}
 
 			if (!strncmp(lgopts[option_index].name,
+						"empty-poll", 10)) {
+				printf("empty-poll is enabled\n");
+				empty_poll_on = true;
+				ret = parse_ep_config(optarg);
+
+				if (ret) {
+					printf("invalid empty poll config\n");
+					print_usage(prgname);
+					return -1;
+				}
+
+			}
+
+			if (!strncmp(lgopts[option_index].name,
 					"enable-jumbo", 12)) {
 				struct option lenopts =
 					{"max-pkt-len", required_argument, \
@@ -1646,6 +1856,59 @@  init_power_library(void)
 	}
 	return ret;
 }
+static void
+empty_poll_setup_timer(void)
+{
+	int lcore_id = rte_lcore_id();
+	uint64_t hz = rte_get_timer_hz();
+
+	struct  ep_params *ep_ptr = ep_params;
+
+	ep_ptr->interval_ticks = hz / INTERVALS_PER_SECOND;
+
+	rte_timer_reset_sync(&ep_ptr->timer0,
+			ep_ptr->interval_ticks,
+			PERIODICAL,
+			lcore_id,
+			rte_empty_poll_detection,
+			(void *)ep_ptr);
+
+}
+static int
+launch_timer(unsigned int lcore_id)
+{
+	int64_t prev_tsc = 0, cur_tsc, diff_tsc, cycles_10ms;
+
+	RTE_SET_USED(lcore_id);
+
+
+	if (rte_get_master_lcore() != lcore_id) {
+		rte_panic("timer on lcore:%d which is not master core:%d\n",
+				lcore_id,
+				rte_get_master_lcore());
+	}
+
+	RTE_LOG(INFO, POWER, "Bring up the Timer\n");
+
+	empty_poll_setup_timer();
+
+	cycles_10ms = rte_get_timer_hz() / 100;
+
+	while (!is_done()) {
+		cur_tsc = rte_rdtsc();
+		diff_tsc = cur_tsc - prev_tsc;
+		if (diff_tsc > cycles_10ms) {
+			rte_timer_manage();
+			prev_tsc = cur_tsc;
+			cycles_10ms = rte_get_timer_hz() / 100;
+		}
+	}
+
+	RTE_LOG(INFO, POWER, "Timer_subsystem is done\n");
+
+	return 0;
+}
+
 
 int
 main(int argc, char **argv)
@@ -1828,13 +2091,15 @@  main(int argc, char **argv)
 		if (rte_lcore_is_enabled(lcore_id) == 0)
 			continue;
 
-		/* init timer structures for each enabled lcore */
-		rte_timer_init(&power_timers[lcore_id]);
-		hz = rte_get_timer_hz();
-		rte_timer_reset(&power_timers[lcore_id],
-			hz/TIMER_NUMBER_PER_SECOND, SINGLE, lcore_id,
-						power_timer_cb, NULL);
-
+		if (empty_poll_on == false) {
+			/* init timer structures for each enabled lcore */
+			rte_timer_init(&power_timers[lcore_id]);
+			hz = rte_get_timer_hz();
+			rte_timer_reset(&power_timers[lcore_id],
+					hz/TIMER_NUMBER_PER_SECOND,
+					SINGLE, lcore_id,
+					power_timer_cb, NULL);
+		}
 		qconf = &lcore_conf[lcore_id];
 		printf("\nInitializing rx queues on lcore %u ... ", lcore_id );
 		fflush(stdout);
@@ -1905,12 +2170,38 @@  main(int argc, char **argv)
 
 	check_all_ports_link_status(enabled_port_mask);
 
+	if (empty_poll_on == true) {
+
+		if (empty_poll_train) {
+			policy.state = TRAINING;
+		} else {
+			policy.state = MED_NORMAL;
+			policy.med_base_edpi = ep_med_edpi;
+			policy.hgh_base_edpi = ep_hgh_edpi;
+		}
+
+		rte_power_empty_poll_stat_init(&ep_params, freq_tlb, &policy);
+	}
+
+
 	/* launch per-lcore init on every lcore */
-	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
+	if (empty_poll_on == false) {
+		rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
+	} else {
+		empty_poll_stop = false;
+		rte_eal_mp_remote_launch(main_empty_poll_loop, NULL, SKIP_MASTER);
+	}
+
+	if (empty_poll_on == true)
+		launch_timer(rte_lcore_id());
+
 	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
 		if (rte_eal_wait_lcore(lcore_id) < 0)
 			return -1;
 	}
 
+	if (empty_poll_on)
+		rte_power_empty_poll_stat_free();
+
 	return 0;
 }
diff --git a/examples/l3fwd-power/meson.build b/examples/l3fwd-power/meson.build
index 20c8054..a3c5c2f 100644
--- a/examples/l3fwd-power/meson.build
+++ b/examples/l3fwd-power/meson.build
@@ -9,6 +9,7 @@ 
 if host_machine.system() != 'linux'
 	build = false
 endif
+allow_experimental_apis = true
 deps += ['power', 'timer', 'lpm', 'hash']
 sources = files(
 	'main.c', 'perf_core.c'