From patchwork Thu Jun 7 07:37:00 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Hunt, David"
X-Patchwork-Id: 40770
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B74CA1B3F0; Thu, 7 Jun 2018 16:39:55 +0200 (CEST)
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 1A86F1B1E3 for ; Thu, 7 Jun 2018 16:39:52 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Jun 2018 07:39:52 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.49,486,1520924400"; d="scan'208";a="62111496"
Received: from silpixa00399952.ir.intel.com (HELO silpixa00399952.ger.corp.intel.com) ([10.237.223.64]) by fmsmga001.fm.intel.com with ESMTP; 07 Jun 2018 07:39:51 -0700
From: David Hunt
To: dev@dpdk.org
Cc: david.hunt@intel.com
Date: Thu, 7 Jun 2018 08:37:00 +0100
Message-Id: <20180607073705.32895-2-david.hunt@intel.com>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180607073705.32895-1-david.hunt@intel.com>
References: <20180607073705.32895-1-david.hunt@intel.com>
Subject: [dpdk-dev] [PATCH v1 1/6] examples/vm_power: add check for port count
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

If we don't pass any ports to the app, we don't need to create any mempools, and we don't need to init any ports.
Signed-off-by: David Hunt --- examples/vm_power_manager/main.c | 81 +++++++++++++++++--------------- 1 file changed, 43 insertions(+), 38 deletions(-) diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c index c9805a461..043b374bc 100644 --- a/examples/vm_power_manager/main.c +++ b/examples/vm_power_manager/main.c @@ -280,51 +280,56 @@ main(int argc, char **argv) nb_ports = rte_eth_dev_count_avail(); - mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports, - MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id()); + if (nb_ports > 0) { + mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", + NUM_MBUFS * nb_ports, MBUF_CACHE_SIZE, 0, + RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id()); - if (mbuf_pool == NULL) - rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n"); + if (mbuf_pool == NULL) + rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n"); - /* Initialize ports. */ - RTE_ETH_FOREACH_DEV(portid) { - struct ether_addr eth; - int w, j; - int ret; + /* Initialize ports. 
*/ + RTE_ETH_FOREACH_DEV(portid) { + struct ether_addr eth; + int w, j; + int ret; - if ((enabled_port_mask & (1 << portid)) == 0) - continue; + if ((enabled_port_mask & (1 << portid)) == 0) + continue; - eth.addr_bytes[0] = 0xe0; - eth.addr_bytes[1] = 0xe0; - eth.addr_bytes[2] = 0xe0; - eth.addr_bytes[3] = 0xe0; - eth.addr_bytes[4] = portid + 0xf0; + eth.addr_bytes[0] = 0xe0; + eth.addr_bytes[1] = 0xe0; + eth.addr_bytes[2] = 0xe0; + eth.addr_bytes[3] = 0xe0; + eth.addr_bytes[4] = portid + 0xf0; - if (port_init(portid, mbuf_pool) != 0) - rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n", + if (port_init(portid, mbuf_pool) != 0) + rte_exit(EXIT_FAILURE, + "Cannot init port %"PRIu8 "\n", portid); - for (w = 0; w < MAX_VFS; w++) { - eth.addr_bytes[5] = w + 0xf0; - - ret = rte_pmd_ixgbe_set_vf_mac_addr(portid, - w, &eth); - if (ret == -ENOTSUP) - ret = rte_pmd_i40e_set_vf_mac_addr(portid, - w, &eth); - if (ret == -ENOTSUP) - ret = rte_pmd_bnxt_set_vf_mac_addr(portid, - w, &eth); - - switch (ret) { - case 0: - printf("Port %d VF %d MAC: ", - portid, w); - for (j = 0; j < 6; j++) { - printf("%02x", eth.addr_bytes[j]); - if (j < 5) - printf(":"); + for (w = 0; w < MAX_VFS; w++) { + eth.addr_bytes[5] = w + 0xf0; + + ret = rte_pmd_ixgbe_set_vf_mac_addr(portid, + w, &eth); + if (ret == -ENOTSUP) + ret = rte_pmd_i40e_set_vf_mac_addr( + portid, w, &eth); + if (ret == -ENOTSUP) + ret = rte_pmd_bnxt_set_vf_mac_addr( + portid, w, &eth); + + switch (ret) { + case 0: + printf("Port %d VF %d MAC: ", + portid, w); + for (j = 0; j < 5; j++) { + printf("%02x:", + eth.addr_bytes[j]); + } + printf("%02x\n", eth.addr_bytes[5]); + break; } printf("\n"); break;

From patchwork Thu Jun 7 07:37:01 2018
X-Patchwork-Submitter: "Hunt, David"
X-Patchwork-Id: 40771
X-Patchwork-Delegate: thomas@monjalon.net
From: David Hunt
To: dev@dpdk.org
Cc: david.hunt@intel.com
Date: Thu, 7 Jun 2018 08:37:01 +0100
Message-Id: <20180607073705.32895-3-david.hunt@intel.com>
In-Reply-To: <20180607073705.32895-1-david.hunt@intel.com>
References: <20180607073705.32895-1-david.hunt@intel.com>
Subject: [dpdk-dev] [PATCH v1 2/6] examples/vm_power: add core list parameter

Add the '-l' command line parameter (long form: --core-list). The user can now pass --core-list=4,6,8-10 and it will be expanded to 4,6,8,9,10 using the parse function provided in parse.c (parse_set). This list of cores is then used to enable out-of-band monitoring, which scales these cores up and down based on the ratio of branch misses to branch hits. The ratio will be low when a poll loop is spinning with no packets being received, so the frequency will be scaled down.
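
As a rough illustration of the expansion described above, the following standalone C sketch turns a list such as 4,6,8-10 into per-core flags. It is not the parse_set() from parse.c: the name expand_core_list and its return convention (the number of cores marked) are assumptions made for this example only.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the patch's parse_set): set set[i] = 1 for every
 * core i named in a list such as "4,6,8-10". Returns the number of cores
 * marked, or -1 on malformed input or a core number >= num.
 */
int
expand_core_list(const char *input, uint16_t set[], unsigned int num)
{
    const char *p = input;
    int count = 0;

    memset(set, 0, num * sizeof(set[0]));

    for (;;) {
        char *end;
        unsigned long lo, hi, i;

        /* every element must start with a digit */
        if (!isdigit((unsigned char)*p))
            return -1;
        errno = 0;
        lo = strtoul(p, &end, 10);
        if (errno != 0)
            return -1;
        hi = lo;

        /* optional "-<hi>" turns a single number into a range */
        if (*end == '-') {
            p = end + 1;
            if (!isdigit((unsigned char)*p))
                return -1;
            errno = 0;
            hi = strtoul(p, &end, 10);
            if (errno != 0)
                return -1;
        }
        if (lo > hi || hi >= num)
            return -1;

        for (i = lo; i <= hi; i++) {
            if (!set[i])
                count++;
            set[i] = 1;
        }

        if (*end == '\0')
            return count;
        if (*end != ',')
            return -1;
        p = end + 1;
    }
}
```

Called with "4,6,8-10" and a 16-entry array, this sketch marks entries 4, 6, 8, 9 and 10 and leaves the rest at zero. Unlike the patch, it rejects descending ranges rather than swapping the bounds.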
Also, as part of this change, we introduce a core_info struct which keeps information on each core in the system, and whether we're doing out-of-band monitoring on them.
Signed-off-by: David Hunt --- examples/vm_power_manager/Makefile | 2 +- examples/vm_power_manager/main.c | 34 ++++++++- examples/vm_power_manager/parse.c | 93 +++++++++++++++++++++++ examples/vm_power_manager/parse.h | 20 +++++ examples/vm_power_manager/power_manager.c | 31 ++++++++ examples/vm_power_manager/power_manager.h | 20 +++++ 6 files changed, 197 insertions(+), 3 deletions(-) create mode 100644 examples/vm_power_manager/parse.c create mode 100644 examples/vm_power_manager/parse.h diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile index ef2a9f959..0c925967c 100644 --- a/examples/vm_power_manager/Makefile +++ b/examples/vm_power_manager/Makefile @@ -19,7 +19,7 @@ APP = vm_power_mgr # all source are stored in SRCS-y SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c -SRCS-y += channel_monitor.c +SRCS-y += channel_monitor.c parse.c CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/ CFLAGS += $(WERROR_FLAGS) diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c index 043b374bc..cc2a1289c 100644 --- a/examples/vm_power_manager/main.c +++ b/examples/vm_power_manager/main.c @@ -29,6 +29,7 @@ #include "channel_monitor.h" #include "power_manager.h" #include "vm_power_cli.h" +#include "parse.h" #include #include #include @@ -135,18 +136,22 @@ parse_portmask(const char *portmask) static int parse_args(int argc, char **argv) { - int opt, ret; + int opt, ret, cnt, i; char **argvopt; + uint16_t *oob_enable; int option_index; char *prgname = argv[0]; + struct core_info *ci; static struct option lgopts[] = { { "mac-updating", no_argument, 0, 1}, { "no-mac-updating", no_argument, 0, 0}, + { "core-list", optional_argument, 0, 'l'}, {NULL, 0, 0, 0} }; argvopt = argv; + ci = get_core_info(); - while ((opt = getopt_long(argc, argvopt, 
"p:q:T:", + while ((opt = getopt_long(argc, argvopt, "l:p:q:T:", lgopts, &option_index)) != EOF) { switch (opt) { @@ -158,6 +163,27 @@ parse_args(int argc, char **argv) return -1; } break; + case 'l': + oob_enable = malloc(ci->core_count * sizeof(uint16_t)); + if (oob_enable == NULL) { + printf("Error - Unable to allocate memory\n"); + return -1; + } + cnt = parse_set(optarg, oob_enable, ci->core_count); + if (cnt < 0) { + printf("Invalid core-list - [%s]\n", + optarg); + break; + } + for (i = 0; i < ci->core_count; i++) { + if (oob_enable[i]) { + printf("***Using core %d\n", i); + ci->cd[i].oob_enabled = 1; + ci->cd[i].global_enabled_cpus = 1; + } + } + free(oob_enable); + break; /* long options */ case 0: break; @@ -263,6 +289,10 @@ main(int argc, char **argv) uint16_t portid; + ret = core_info_init(); + if (ret < 0) + rte_panic("Cannot allocate core info\n"); + ret = rte_eal_init(argc, argv); if (ret < 0) rte_panic("Cannot init EAL\n"); diff --git a/examples/vm_power_manager/parse.c b/examples/vm_power_manager/parse.c new file mode 100644 index 000000000..9de15c4a7 --- /dev/null +++ b/examples/vm_power_manager/parse.c @@ -0,0 +1,93 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2014 Intel Corporation. + * Copyright(c) 2014 6WIND S.A. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "parse.h" + +/* + * Parse elem, the elem could be single number/range or group + * 1) A single number elem, it's just a simple digit. e.g. 9 + * 2) A single range elem, two digits with a '-' between. e.g. 2-6 + * 3) A group elem, combines multiple 1) or 2) e.g 0,2-4,6 + * Within group, '-' used for a range separator; + * ',' used for a single number. 
+ */ +int +parse_set(const char *input, uint16_t set[], unsigned int num) +{ + unsigned int idx; + const char *str = input; + char *end = NULL; + unsigned int min, max; + + memset(set, 0, num * sizeof(uint16_t)); + + while (isblank(*str)) + str++; + + /* only digit or left bracket is qualify for start point */ + if (!isdigit(*str) || *str == '\0') + return -1; + + while (isblank(*str)) + str++; + if (*str == '\0') + return -1; + + min = num; + do { + + /* go ahead to the first digit */ + while (isblank(*str)) + str++; + if (!isdigit(*str)) + return -1; + + /* get the digit value */ + errno = 0; + idx = strtoul(str, &end, 10); + if (errno || end == NULL || idx >= num) + return -1; + + /* go ahead to separator '-' and ',' */ + while (isblank(*end)) + end++; + if (*end == '-') { + if (min == num) + min = idx; + else /* avoid continuous '-' */ + return -1; + } else if ((*end == ',') || (*end == '\0')) { + max = idx; + + if (min == num) + min = idx; + + for (idx = RTE_MIN(min, max); + idx <= RTE_MAX(min, max); idx++) { + set[idx] = 1; + } + min = num; + } else + return -1; + + str = end + 1; + } while (*end != '\0'); + + return str - input; +} diff --git a/examples/vm_power_manager/parse.h b/examples/vm_power_manager/parse.h new file mode 100644 index 000000000..a5971e9a2 --- /dev/null +++ b/examples/vm_power_manager/parse.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2018 Intel Corporation + */ + +#ifndef PARSE_H_ +#define PARSE_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +int +parse_set(const char *, uint16_t [], unsigned int); + +#ifdef __cplusplus +} +#endif + + +#endif /* PARSE_H_ */ diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c index 35db25591..a7849e48a 100644 --- a/examples/vm_power_manager/power_manager.c +++ b/examples/vm_power_manager/power_manager.c @@ -12,6 +12,7 @@ #include #include +#include #include #include @@ -54,6 +55,7 @@ struct freq_info { static struct 
freq_info global_core_freq_info[POWER_MGR_MAX_CPUS]; +struct core_info ci; static uint64_t global_enabled_cpus; #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id" @@ -76,6 +78,35 @@ set_host_cpus_mask(void) return num_cpus; } +struct core_info * +get_core_info(void) +{ + return &ci; +} + +int +core_info_init(void) +{ + struct core_info *ci; + int i; + + ci = get_core_info(); + + ci->core_count = get_nprocs_conf(); + ci->cd = malloc(ci->core_count * sizeof(struct core_details)); + if (!ci->cd) { + RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info."); + return -1; + } + for (i = 0; i < ci->core_count; i++) { + ci->cd[i].global_enabled_cpus = 1; + ci->cd[i].oob_enabled = 0; + ci->cd[i].msr_fd = 0; + } + printf("%d cores in system\n", ci->core_count); + return 0; +} + int power_manager_init(void) { diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h index 8a8a84aa4..45385de37 100644 --- a/examples/vm_power_manager/power_manager.h +++ b/examples/vm_power_manager/power_manager.h @@ -8,6 +8,26 @@ #ifdef __cplusplus extern "C" { #endif +struct core_details { + uint64_t last_branches; + uint64_t last_branch_misses; + uint16_t global_enabled_cpus; + uint16_t oob_enabled; + int msr_fd; +}; + +struct core_info { + uint16_t core_count; + struct core_details *cd; +}; + +struct core_info * +get_core_info(void); + +int +core_info_init(void); + +#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1 /* Maximum number of CPUS to manage */ #define POWER_MGR_MAX_CPUS 64

From patchwork Thu Jun 7 07:37:02 2018
X-Patchwork-Submitter: "Hunt, David"
X-Patchwork-Id: 40772
X-Patchwork-Delegate: thomas@monjalon.net
From: David Hunt
To: dev@dpdk.org
Cc: david.hunt@intel.com
Date: Thu, 7 Jun 2018 08:37:02 +0100
Message-Id: <20180607073705.32895-4-david.hunt@intel.com>
In-Reply-To: <20180607073705.32895-1-david.hunt@intel.com>
References: <20180607073705.32895-1-david.hunt@intel.com>
Subject: [dpdk-dev] [PATCH v1 3/6] examples/vm_power: add oob monitoring functions

This patch introduces the out-of-band (oob) core monitoring functions. The functions are similar to the channel manager functions. There are functions to add and remove cores from the list of cores being monitored, and functions to initialise the monitor setup, run the monitor thread, and exit the monitor.

The monitor thread runs on its own lcore, and is separate from the channel monitor, which is epoll based. This thread is timer based. It loops through all monitored cores, calculates the branch ratio, scales the core up or down, then sleeps for an interval (~250 us). The branch counters are read with a pread on the /dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
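
The per-core decision described above can be sketched as a pure function. This is an illustration under stated assumptions, not the code in oob_monitor_x86.c: the names scale_action and decide_scaling are hypothetical, and the threshold and minimum sample size are passed as parameters here rather than taken from #defines. The ratio is taken as branch misses per 100 branch hits, and a low ratio (a spinning poll loop) leads to scaling down.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch. */
enum scale_action { SCALE_SKIP, SCALE_DOWN, SCALE_UP };

/*
 * Decide what to do with one core, given the previous and current readings
 * of its branch-hit and branch-miss counters.
 *
 * threshold_pct: misses per 100 hits below which the core is scaled down.
 * min_hits:      fewer new hits than this is treated as "no workload".
 */
enum scale_action
decide_scaling(uint64_t prev_hits, uint64_t cur_hits,
               uint64_t prev_misses, uint64_t cur_misses,
               double threshold_pct, uint64_t min_hits)
{
    uint64_t hits, misses;
    double ratio;

    /* No forward progress usually means a counter wrapped: skip this round. */
    if (cur_hits <= prev_hits || cur_misses <= prev_misses)
        return SCALE_SKIP;

    hits = cur_hits - prev_hits;
    misses = cur_misses - prev_misses;

    /* Too few branches in the interval to say anything useful. */
    if (hits < min_hits)
        return SCALE_SKIP;

    ratio = (double)misses * 100.0 / (double)hits;
    return ratio < threshold_pct ? SCALE_DOWN : SCALE_UP;
}
```

Keeping the decision free of I/O means it can be exercised with made-up counter values, without the msr module or root access; a monitor loop would read the two counters, call a function like this, and only then invoke the frequency scaling calls.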
Also, since the msr.h file has been made unavailable in recent kernels, we include #defines for the relevant MSRs in the code. The Makefile switches between x86 and non-x86 platforms, and compiles stub functions for non-x86 platforms.
Signed-off-by: David Hunt --- examples/vm_power_manager/Makefile | 5 + examples/vm_power_manager/oob_monitor.h | 68 +++++ examples/vm_power_manager/oob_monitor_nop.c | 38 +++ examples/vm_power_manager/oob_monitor_x86.c | 282 ++++++++++++++++++++ 4 files changed, 393 insertions(+) create mode 100644 examples/vm_power_manager/oob_monitor.h create mode 100644 examples/vm_power_manager/oob_monitor_nop.c create mode 100644 examples/vm_power_manager/oob_monitor_x86.c diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile index 0c925967c..13a5205ba 100644 --- a/examples/vm_power_manager/Makefile +++ b/examples/vm_power_manager/Makefile @@ -20,6 +20,11 @@ APP = vm_power_mgr # all source are stored in SRCS-y SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c SRCS-y += channel_monitor.c parse.c +ifeq ($(CONFIG_RTE_ARCH_X86_64),y) +SRCS-y += oob_monitor_x86.c +else +SRCS-y += oob_monitor_nop.c +endif CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/ CFLAGS += $(WERROR_FLAGS) diff --git a/examples/vm_power_manager/oob_monitor.h b/examples/vm_power_manager/oob_monitor.h new file mode 100644 index 000000000..b96e08df7 --- /dev/null +++ b/examples/vm_power_manager/oob_monitor.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2018 Intel Corporation + */ + +#ifndef OOB_MONITOR_H_ +#define OOB_MONITOR_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * Setup the Branch Monitor resources required to initialize epoll. + * Must be called first before calling other functions. + * + * @return + * - 0 on success. + * - Negative on error. + */ +int branch_monitor_init(void); + +/** + * Run the OOB branch monitor, loops forever on on epoll_wait. 
+ * + * + * @return + * None + */ +void run_branch_monitor(void); + +/** + * Exit the OOB Branch Monitor. + * + * @return + * None + */ +void branch_monitor_exit(void); + +/** + * Add a core to the list of cores to monitor. + * + * @param core + * Core Number + * + * @return + * - 0 on success. + * - Negative on error. + */ +int add_core_to_monitor(int core); + +/** + * Remove a previously added core from core list. + * + * @param core + * Core Number + * + * @return + * - 0 on success. + * - Negative on error. + */ +int remove_core_from_monitor(int core); + +#ifdef __cplusplus +} +#endif + + +#endif /* OOB_MONITOR_H_ */ diff --git a/examples/vm_power_manager/oob_monitor_nop.c b/examples/vm_power_manager/oob_monitor_nop.c new file mode 100644 index 000000000..7e7b8bc14 --- /dev/null +++ b/examples/vm_power_manager/oob_monitor_nop.c @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2014 Intel Corporation + */ + +#include "oob_monitor.h" + +void branch_monitor_exit(void) +{ +} + +__attribute__((unused)) static float +apply_policy(__attribute__((unused)) int core) +{ + return 0.0; +} + +int +add_core_to_monitor(__attribute__((unused)) int core) +{ + return 0; +} + +int +remove_core_from_monitor(__attribute__((unused)) int core) +{ + return 0; +} + +int +branch_monitor_init(void) +{ + return 0; +} + +void +run_branch_monitor(void) +{ +} diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c new file mode 100644 index 000000000..485ec5e3f --- /dev/null +++ b/examples/vm_power_manager/oob_monitor_x86.c @@ -0,0 +1,282 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2018 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include +#include "oob_monitor.h" +#include "power_manager.h" +#include "channel_manager.h" + 
+#include +#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1 + +#define MAX_EVENTS 256 + +static volatile unsigned run_loop = 1; +static uint64_t g_branches, g_branch_misses; +static int g_active; + +void branch_monitor_exit(void) +{ + run_loop = 0; +} + +/* Number of microseconds between each poll */ +#define INTERVAL 100 +#define PRINT_LOOP_COUNT (1000000/INTERVAL) +#define RATIO_THRESHOLD 0.03 +#define IA32_PERFEVTSEL0 0x186 +#define IA32_PERFEVTSEL1 0x187 +#define IA32_PERFCTR0 0xc1 +#define IA32_PERFCTR1 0xc2 +#define IA32_PERFEVT_BRANCH_HITS 0x05300c4 +#define IA32_PERFEVT_BRANCH_MISS 0x05300c5 + +static float +apply_policy(int core) +{ + struct core_info *ci; + uint64_t counter; + uint64_t branches, branch_misses; + uint32_t last_branches, last_branch_misses; + int hits_diff, miss_diff; + float ratio; + int ret; + + g_active = 0; + ci = get_core_info(); + + last_branches = ci->cd[core].last_branches; + last_branch_misses = ci->cd[core].last_branch_misses; + + ret = pread(ci->cd[core].msr_fd, &counter, + sizeof(counter), IA32_PERFCTR0); + if (ret < 0) + RTE_LOG(ERR, POWER_MANAGER, + "unable to read counter for core %u\n", + core); + branches = counter; + + ret = pread(ci->cd[core].msr_fd, &counter, + sizeof(counter), IA32_PERFCTR1); + if (ret < 0) + RTE_LOG(ERR, POWER_MANAGER, + "unable to read counter for core %u\n", + core); + branch_misses = counter; + + + ci->cd[core].last_branches = branches; + ci->cd[core].last_branch_misses = branch_misses; + + hits_diff = (int)branches - (int)last_branches; + if (hits_diff <= 0) { + /* Likely a counter overflow condition, skip this round */ + return -1.0; + } + + miss_diff = (int)branch_misses - (int)last_branch_misses; + if (miss_diff <= 0) { + /* Likely a counter overflow condition, skip this round */ + return -1.0; + } + + g_branches = hits_diff; + g_branch_misses = miss_diff; + + if (hits_diff < (INTERVAL*100)) { + /* Likely no workload running on this core. Skip. 
*/ + return -1.0; + } + + ratio = (float)miss_diff * (float)100 / (float)hits_diff; + + if (ratio < RATIO_THRESHOLD) + power_manager_scale_core_min(core); + else + power_manager_scale_core_max(core); + + g_active = 1; + return ratio; +} + +int +add_core_to_monitor(int core) +{ + struct core_info *ci; + char proc_file[UNIX_PATH_MAX]; + int ret; + + ci = get_core_info(); + + if (core < ci->core_count) { + long setup; + + snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core); + ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC); + if (ci->cd[core].msr_fd < 0) { + RTE_LOG(ERR, POWER_MANAGER, + "Error opening MSR file for core %d " + "(is msr kernel module loaded?)\n", + core); + return -1; + } + /* + * Set up branch counters + */ + setup = IA32_PERFEVT_BRANCH_HITS; + ret = pwrite(ci->cd[core].msr_fd, &setup, + sizeof(setup), IA32_PERFEVTSEL0); + if (ret < 0) { + RTE_LOG(ERR, POWER_MANAGER, + "unable to set counter for core %u\n", + core); + return ret; + } + setup = IA32_PERFEVT_BRANCH_MISS; + ret = pwrite(ci->cd[core].msr_fd, &setup, + sizeof(setup), IA32_PERFEVTSEL1); + if (ret < 0) { + RTE_LOG(ERR, POWER_MANAGER, + "unable to set counter for core %u\n", + core); + return ret; + } + /* + * Close the file and re-open as read only so + * as not to hog the resource + */ + close(ci->cd[core].msr_fd); + ci->cd[core].msr_fd = open(proc_file, O_RDONLY); + if (ci->cd[core].msr_fd < 0) { + RTE_LOG(ERR, POWER_MANAGER, + "Error opening MSR file for core %d " + "(is msr kernel module loaded?)\n", + core); + return -1; + } + ci->cd[core].oob_enabled = 1; + } + return 0; +} + +int +remove_core_from_monitor(int core) +{ + struct core_info *ci; + char proc_file[UNIX_PATH_MAX]; + int ret; + + ci = get_core_info(); + + if (ci->cd[core].oob_enabled) { + long setup; + + /* + * close the msr file, then reopen rw so we can + * disable the counters + */ + if (ci->cd[core].msr_fd != 0) + close(ci->cd[core].msr_fd); + snprintf(proc_file, UNIX_PATH_MAX, "/dev/cpu/%d/msr", core); + 
ci->cd[core].msr_fd = open(proc_file, O_RDWR | O_SYNC); + if (ci->cd[core].msr_fd < 0) { + RTE_LOG(ERR, POWER_MANAGER, + "Error opening MSR file for core %d " + "(is msr kernel module loaded?)\n", + core); + return -1; + } + setup = 0x0; /* clear event */ + ret = pwrite(ci->cd[core].msr_fd, &setup, + sizeof(setup), IA32_PERFEVTSEL0); + if (ret < 0) { + RTE_LOG(ERR, POWER_MANAGER, + "unable to set counter for core %u\n", + core); + return ret; + } + setup = 0x0; /* clear event */ + ret = pwrite(ci->cd[core].msr_fd, &setup, + sizeof(setup), IA32_PERFEVTSEL1); + if (ret < 0) { + RTE_LOG(ERR, POWER_MANAGER, + "unable to set counter for core %u\n", + core); + return ret; + } + + close(ci->cd[core].msr_fd); + ci->cd[core].msr_fd = 0; + ci->cd[core].oob_enabled = 0; + } + return 0; +} + +int +branch_monitor_init(void) +{ + return 0; +} + +void +run_branch_monitor(void) +{ + struct core_info *ci; + int print = 0; + float ratio; + int printed; + int reads = 0; + + ci = get_core_info(); + + while (run_loop) { + + if (!run_loop) + break; + usleep(INTERVAL); + int j; + print++; + printed = 0; + for (j = 0; j < ci->core_count; j++) { + if (ci->cd[j].oob_enabled) { + ratio = apply_policy(j); + if ((print > PRINT_LOOP_COUNT) && (g_active)) { + printf(" %d: %.4f {%lu} {%d}", j, + ratio, g_branches, + reads); + printed = 1; + reads = 0; + } else { + reads++; + } + } + } + if (print > PRINT_LOOP_COUNT) { + if (printed) + printf("\n"); + print = 0; + } + } +}

From patchwork Thu Jun 7 07:37:03 2018
X-Patchwork-Submitter: "Hunt, David"
X-Patchwork-Id: 40773
X-Patchwork-Delegate: thomas@monjalon.net
From: David Hunt
To: dev@dpdk.org
Cc: david.hunt@intel.com
Date: Thu, 7 Jun 2018 08:37:03 +0100
Message-Id: <20180607073705.32895-5-david.hunt@intel.com>
In-Reply-To: <20180607073705.32895-1-david.hunt@intel.com>
References: <20180607073705.32895-1-david.hunt@intel.com>
Subject: [dpdk-dev] [PATCH v1 4/6] examples/vm_power: allow greater than 64 cores

To facilitate more info per core, change the global_enabled_cpus mask from a uint64_t to an array. This also removes the 64-core limit, allocating the array at run-time based on the number of cores found in the system.
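
A minimal sketch of the idea, using hypothetical names (core_table and its helper functions) that do not appear in the patch: a per-core array sized at run-time replaces the 64-bit mask, so core numbers above 63 can be tracked and every lookup is bounds-checked against the actual core count.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-core state kept by the patch. */
struct core_table {
    unsigned int count;   /* number of entries in enabled[] */
    uint16_t *enabled;    /* enabled[i] != 0 means core i is managed */
};

/* Allocate one zeroed entry per core; returns 0 on success, -1 on failure. */
int
core_table_init(struct core_table *t, unsigned int count)
{
    t->count = 0;
    t->enabled = calloc(count, sizeof(*t->enabled));
    if (t->enabled == NULL)
        return -1;
    t->count = count;
    return 0;
}

/* Enable or disable a core; returns -1 if the core number is out of range. */
int
core_table_set(struct core_table *t, unsigned int core, int on)
{
    if (core >= t->count)
        return -1;
    t->enabled[core] = on ? 1 : 0;
    return 0;
}

/* Out-of-range cores are reported as disabled. */
int
core_table_is_enabled(const struct core_table *t, unsigned int core)
{
    return core < t->count && t->enabled[core] != 0;
}

void
core_table_free(struct core_table *t)
{
    free(t->enabled);
    t->enabled = NULL;
    t->count = 0;
}
```

In this sketch the entry count is supplied by the caller; the series itself obtains it from get_nprocs_conf() in core_info_init().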
Signed-off-by: David Hunt --- examples/vm_power_manager/power_manager.c | 115 +++++++++++----------- 1 file changed, 58 insertions(+), 57 deletions(-) diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c index a7849e48a..4bdde23da 100644 --- a/examples/vm_power_manager/power_manager.c +++ b/examples/vm_power_manager/power_manager.c @@ -19,14 +19,14 @@ #include #include +#include "channel_manager.h" #include "power_manager.h" - -#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1 +#include "oob_monitor.h" #define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \ - if (core_num >= POWER_MGR_MAX_CPUS) \ + if (core_num >= ci.core_count) \ return -1; \ - if (!(global_enabled_cpus & (1ULL << core_num))) \ + if (!(ci.cd[core_num].global_enabled_cpus)) \ return -1; \ rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \ ret = rte_power_freq_##DIRECTION(core_num); \ @@ -37,7 +37,7 @@ int i; \ for (i = 0; core_mask; core_mask &= ~(1 << i++)) { \ if ((core_mask >> i) & 1) { \ - if (!(global_enabled_cpus & (1ULL << i))) \ + if (!(ci.cd[i].global_enabled_cpus)) \ continue; \ rte_spinlock_lock(&global_core_freq_info[i].power_sl); \ if (rte_power_freq_##DIRECTION(i) != 1) \ @@ -56,28 +56,9 @@ struct freq_info { static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS]; struct core_info ci; -static uint64_t global_enabled_cpus; #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id" -static unsigned -set_host_cpus_mask(void) -{ - char path[PATH_MAX]; - unsigned i; - unsigned num_cpus = 0; - - for (i = 0; i < POWER_MGR_MAX_CPUS; i++) { - snprintf(path, sizeof(path), SYSFS_CPU_PATH, i); - if (access(path, F_OK) == 0) { - global_enabled_cpus |= 1ULL << i; - num_cpus++; - } else - return num_cpus; - } - return num_cpus; -} - struct core_info * get_core_info(void) { @@ -110,38 +91,45 @@ core_info_init(void) int power_manager_init(void) { - unsigned int i, num_cpus, num_freqs; - uint64_t cpu_mask; + 
unsigned int i, num_cpus = 0, num_freqs = 0; int ret = 0; + struct core_info *ci; + + rte_power_set_env(PM_ENV_ACPI_CPUFREQ); - num_cpus = set_host_cpus_mask(); - if (num_cpus == 0) { - RTE_LOG(ERR, POWER_MANAGER, "Unable to detected host CPUs, please " - "ensure that sufficient privileges exist to inspect sysfs\n"); + ci = get_core_info(); + if (!ci) { + RTE_LOG(ERR, POWER_MANAGER, + "Failed to get core info!\n"); return -1; } - rte_power_set_env(PM_ENV_ACPI_CPUFREQ); - cpu_mask = global_enabled_cpus; - for (i = 0; cpu_mask; cpu_mask &= ~(1 << i++)) { - if (rte_power_init(i) < 0) - RTE_LOG(ERR, POWER_MANAGER, - "Unable to initialize power manager " - "for core %u\n", i); - num_freqs = rte_power_freqs(i, global_core_freq_info[i].freqs, + + for (i = 0; i < ci->core_count; i++) { + if (ci->cd[i].global_enabled_cpus) { + if (rte_power_init(i) < 0) + RTE_LOG(ERR, POWER_MANAGER, + "Unable to initialize power manager " + "for core %u\n", i); + num_cpus++; + num_freqs = rte_power_freqs(i, + global_core_freq_info[i].freqs, RTE_MAX_LCORE_FREQS); - if (num_freqs == 0) { - RTE_LOG(ERR, POWER_MANAGER, - "Unable to get frequency list for core %u\n", - i); - global_enabled_cpus &= ~(1 << i); - num_cpus--; - ret = -1; + if (num_freqs == 0) { + RTE_LOG(ERR, POWER_MANAGER, + "Unable to get frequency list for core %u\n", + i); + ci->cd[i].oob_enabled = 0; + ret = -1; + } + global_core_freq_info[i].num_freqs = num_freqs; + + rte_spinlock_init(&global_core_freq_info[i].power_sl); } - global_core_freq_info[i].num_freqs = num_freqs; - rte_spinlock_init(&global_core_freq_info[i].power_sl); + if (ci->cd[i].oob_enabled) + add_core_to_monitor(i); } - RTE_LOG(INFO, POWER_MANAGER, "Detected %u host CPUs , enabled core mask:" - " 0x%"PRIx64"\n", num_cpus, global_enabled_cpus); + RTE_LOG(INFO, POWER_MANAGER, "Managing %u cores out of %u available host cores\n", + num_cpus, ci->core_count); return ret; } @@ -156,7 +144,7 @@ power_manager_get_current_frequency(unsigned core_num) core_num, 
POWER_MGR_MAX_CPUS-1); return -1; } - if (!(global_enabled_cpus & (1ULL << core_num))) + if (!(ci.cd[core_num].global_enabled_cpus)) return 0; rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); @@ -175,15 +163,26 @@ power_manager_exit(void) { unsigned int i; int ret = 0; + struct core_info *ci; - for (i = 0; global_enabled_cpus; global_enabled_cpus &= ~(1 << i++)) { - if (rte_power_exit(i) < 0) { - RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager " - "for core %u\n", i); - ret = -1; + ci = get_core_info(); + if (!ci) { + RTE_LOG(ERR, POWER_MANAGER, + "Failed to get core info!\n"); + return -1; + } + + for (i = 0; i < ci->core_count; i++) { + if (ci->cd[i].global_enabled_cpus) { + if (rte_power_exit(i) < 0) { + RTE_LOG(ERR, POWER_MANAGER, "Unable to shutdown power manager " + "for core %u\n", i); + ret = -1; + } + ci->cd[i].global_enabled_cpus = 0; } + remove_core_from_monitor(i); } - global_enabled_cpus = 0; return ret; } @@ -299,10 +298,12 @@ int power_manager_scale_core_med(unsigned int core_num) { int ret = 0; + struct core_info *ci; + ci = get_core_info(); if (core_num >= POWER_MGR_MAX_CPUS) return -1; - if (!(global_enabled_cpus & (1ULL << core_num))) + if (!(ci->cd[core_num].global_enabled_cpus)) return -1; rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); ret = rte_power_set_freq(core_num, From patchwork Thu Jun 7 07:37:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Hunt, David" X-Patchwork-Id: 40774 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 3F0641B622; Thu, 7 Jun 2018 16:40:02 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 8D33A1B3F5 for ; Thu, 7 Jun 2018 16:39:56 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment 
in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Jun 2018 07:39:55 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.49,486,1520924400"; d="scan'208";a="62111513" Received: from silpixa00399952.ir.intel.com (HELO silpixa00399952.ger.corp.intel.com) ([10.237.223.64]) by fmsmga001.fm.intel.com with ESMTP; 07 Jun 2018 07:39:55 -0700 From: David Hunt To: dev@dpdk.org Cc: david.hunt@intel.com Date: Thu, 7 Jun 2018 08:37:04 +0100 Message-Id: <20180607073705.32895-6-david.hunt@intel.com> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180607073705.32895-1-david.hunt@intel.com> References: <20180607073705.32895-1-david.hunt@intel.com> Subject: [dpdk-dev] [PATCH v1 5/6] examples/vm_power: add thread for oob core monitor X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Change the app to require three cores, as the third core will be used to run the out-of-band (oob) monitoring thread. 
Signed-off-by: David Hunt --- examples/vm_power_manager/main.c | 37 +++++++++++++++++++++++++++++--- 1 file changed, 34 insertions(+), 3 deletions(-) diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c index cc2a1289c..4c6b5a990 100644 --- a/examples/vm_power_manager/main.c +++ b/examples/vm_power_manager/main.c @@ -29,6 +29,7 @@ #include "channel_monitor.h" #include "power_manager.h" #include "vm_power_cli.h" +#include "oob_monitor.h" #include "parse.h" #include #include @@ -269,6 +270,17 @@ run_monitor(__attribute__((unused)) void *arg) return 0; } +static int +run_core_monitor(__attribute__((unused)) void *arg) +{ + if (branch_monitor_init() < 0) { + printf("Unable to initialize core monitor\n"); + return -1; + } + run_branch_monitor(); + return 0; +} + static void sig_handler(int signo) { @@ -287,12 +299,15 @@ main(int argc, char **argv) unsigned int nb_ports; struct rte_mempool *mbuf_pool; uint16_t portid; + struct core_info *ci; ret = core_info_init(); if (ret < 0) rte_panic("Cannot allocate core info\n"); + ci = get_core_info(); + ret = rte_eal_init(argc, argv); if (ret < 0) rte_panic("Cannot init EAL\n"); @@ -367,16 +382,23 @@ main(int argc, char **argv) } } + check_all_ports_link_status(enabled_port_mask); + lcore_id = rte_get_next_lcore(-1, 1, 0); if (lcore_id == RTE_MAX_LCORE) { - RTE_LOG(ERR, EAL, "A minimum of two cores are required to run " + RTE_LOG(ERR, EAL, "A minimum of three cores are required to run " "application\n"); return 0; } - - check_all_ports_link_status(enabled_port_mask); + printf("Running channel monitor on lcore id %d\n", lcore_id); rte_eal_remote_launch(run_monitor, NULL, lcore_id); + lcore_id = rte_get_next_lcore(lcore_id, 1, 0); + if (lcore_id == RTE_MAX_LCORE) { + RTE_LOG(ERR, EAL, "A minimum of three cores are required to run " + "application\n"); + return 0; + } if (power_manager_init() < 0) { printf("Unable to initialize power manager\n"); return -1; @@ -385,8 +407,17 @@ main(int argc, char **argv) 
printf("Unable to initialize channel manager\n"); return -1; } + + printf("Running core monitor on lcore id %d\n", lcore_id); + rte_eal_remote_launch(run_core_monitor, NULL, lcore_id); + run_cli(NULL); + branch_monitor_exit(); + rte_eal_mp_wait_lcore(); + + free(ci->cd); + return 0; } From patchwork Thu Jun 7 07:37:05 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Hunt, David" X-Patchwork-Id: 40775 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B005E1B630; Thu, 7 Jun 2018 16:40:03 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 2995C1B405 for ; Thu, 7 Jun 2018 16:39:56 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Jun 2018 07:39:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.49,486,1520924400"; d="scan'208";a="62111518" Received: from silpixa00399952.ir.intel.com (HELO silpixa00399952.ger.corp.intel.com) ([10.237.223.64]) by fmsmga001.fm.intel.com with ESMTP; 07 Jun 2018 07:39:55 -0700 From: David Hunt To: dev@dpdk.org Cc: david.hunt@intel.com Date: Thu, 7 Jun 2018 08:37:05 +0100 Message-Id: <20180607073705.32895-7-david.hunt@intel.com> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180607073705.32895-1-david.hunt@intel.com> References: <20180607073705.32895-1-david.hunt@intel.com> Subject: [dpdk-dev] [PATCH v1 6/6] examples/vm_power: add port-list to command line X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" add in 
the long form of -p, which is --port-list Signed-off-by: David Hunt --- examples/vm_power_manager/main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c index 4c6b5a990..4088861f1 100644 --- a/examples/vm_power_manager/main.c +++ b/examples/vm_power_manager/main.c @@ -147,6 +147,7 @@ parse_args(int argc, char **argv) { "mac-updating", no_argument, 0, 1}, { "no-mac-updating", no_argument, 0, 0}, { "core-list", optional_argument, 0, 'l'}, + { "port-list", optional_argument, 0, 'p'}, {NULL, 0, 0, 0} }; argvopt = argv;
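Note on the --port-list entry above: it is registered with optional_argument, so getopt_long() only attaches a value written as --port-list=0,1. With the space-separated form (--port-list 0,1), optarg is NULL and the value is treated as a separate non-option argument. The standalone sketch below illustrates this behaviour. It is not part of the patch: the helper name find_port_list and the short-option string "p:" are assumptions made for the example, and the real parse_args() handles more options.

```c
#include <getopt.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only (not in the patch).
 * Returns the value passed to -p / --port-list, or NULL when the
 * option is absent or was given without an attached value.
 */
static const char *
find_port_list(int argc, char **argv)
{
	/* Same table shape as in parse_args(), reduced to one entry. */
	static const struct option lgopts[] = {
		{ "port-list", optional_argument, 0, 'p' },
		{ NULL, 0, 0, 0 }
	};
	const char *ports = NULL;
	int opt;

	optind = 1;	/* restart scanning so the helper can be called again */
	opterr = 0;	/* keep getopt quiet about options this sketch ignores */

	/* "p:" is an assumed short-option string: -p takes a required value. */
	while ((opt = getopt_long(argc, argv, "p:", lgopts, NULL)) != -1) {
		if (opt == 'p')
			ports = optarg;	/* NULL for a bare "--port-list" */
	}
	return ports;
}
```

Under these assumptions, both "-p 0,1" and "--port-list=0,1" give the helper the string "0,1", while "--port-list 0,1" gives NULL. If the space-separated long form should also work, required_argument may be the better fit, depending on what the existing --core-list handling expects.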