From patchwork Mon Jul 17 17:15:00 2017
X-Patchwork-Submitter: ilia.kurakin@intel.com
X-Patchwork-Id: 26992
From: ilia.kurakin@intel.com
To: dev@dpdk.org
Cc: jerin.jacob@caviumnetworks.com, konstantin.ananyev@intel.com,
 keith.wiles@intel.com, dmitry.galanov@intel.com, Ilia Kurakin
Date: Mon, 17 Jul 2017 20:15:00 +0300
Message-Id: <1500311700-5298-1-git-send-email-ilia.kurakin@intel.com>
In-Reply-To: <1499795286-31826-1-git-send-email-ilia.kurakin@intel.com>
References: <1499795286-31826-1-git-send-email-ilia.kurakin@intel.com>
Subject: [dpdk-dev] [PATCH v4] ether: add support for vtune task tracing

From: Ilia Kurakin

The patch adds tracing of loop iterations that yielded no packets in a
DPDK application.
It uses the ITT task API:
https://software.intel.com/en-us/node/544206

The expected workflow is that the user has the ITT library and header on
the machine and rebuilds DPDK with additional make parameters:

make EXTRA_CFLAGS=-I EXTRA_LDLIBS="-L -littnotify"

Signed-off-by: Ilia Kurakin
---

-V2 change:
ITT tasks collection is moved to rx callback

-V3 change:
rte_ethdev_profile.c created, all profile specific code moved there.
Added generic profile function

-V4 change:
checkpatch issues fixed
Added documentation topic

 config/common_base                    |   1 +
 doc/guides/prog_guide/profile_app.rst |  31 +++++++
 lib/librte_ether/Makefile             |   1 +
 lib/librte_ether/rte_ethdev.c         |   4 +
 lib/librte_ether/rte_ethdev_profile.c | 156 ++++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev_profile.h |  52 ++++++++++++
 6 files changed, 245 insertions(+)
 create mode 100644 lib/librte_ether/rte_ethdev_profile.c
 create mode 100644 lib/librte_ether/rte_ethdev_profile.h

diff --git a/config/common_base b/config/common_base
index 8ae6e92..dda51db 100644
--- a/config/common_base
+++ b/config/common_base
@@ -136,6 +136,7 @@ CONFIG_RTE_MAX_QUEUES_PER_PORT=1024
 CONFIG_RTE_LIBRTE_IEEE1588=n
 CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
+CONFIG_RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS=n
 
 #
 # Turn off Tx preparation stage
diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst
index 54b546a..13b373c 100644
--- a/doc/guides/prog_guide/profile_app.rst
+++ b/doc/guides/prog_guide/profile_app.rst
@@ -59,6 +59,37 @@ Refer to the
 for details about application profiling.
 
 
+Profiling wasted iterations with ITT
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Iterations which yielded no RX packets (wasted loop iterations) can be analyzed
+using Intel VTune Amplifier. This profiling employs the
+`Instrumentation and Tracing Technology (ITT) API
+`_
+, shipped with VTune, and requires no changes to a DPDK application.
+
+To trace wasted iterations on RX queues, first reconfigure DPDK with
+``CONFIG_RTE_ETHDEV_RXTX_CALLBACKS`` and
+``CONFIG_RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS`` enabled.
+
+Then rebuild DPDK, specifying the paths to the ITT header and library, which
+are found in the *include* and *lib* directories of any VTune distribution:
+
+.. code-block:: console
+
+    make EXTRA_CFLAGS=-I \
+         EXTRA_LDLIBS="-L -littnotify"
+
+Finally, to see wasted iterations in your performance analysis results, select
+the *"Analyze user tasks, events, and counters"* checkbox in VTune's
+*"Analysis Type"* tab when configuring the analysis in the VTune GUI. When
+running VTune from the command line, pass ``-knob enable-user-tasks=true``.
+
+Collected regions of wasted iterations are marked on VTune's timeline
+as ordinary ITT tasks. These tasks have predefined names containing the
+Ethernet device and RX queue identifiers.
+
+
 Profiling on ARM64
 ------------------
 
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index db692ae..7224a11 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -46,6 +46,7 @@ LIBABIVER := 6
 SRCS-y += rte_ethdev.c
 SRCS-y += rte_flow.c
 SRCS-y += rte_tm.c
+SRCS-y += rte_ethdev_profile.c
 
 #
 # Export include files
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a1b7447..2eba36e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -67,6 +67,7 @@
 
 #include "rte_ether.h"
 #include "rte_ethdev.h"
+#include "rte_ethdev_profile.h"
 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
@@ -825,6 +826,9 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		return diag;
 	}
 
+	/* See rte_ethdev_profile.h to find comments on code below. */
+	rte_eth_profile_rx_init(port_id, dev);
+
 	return 0;
 }
 
diff --git a/lib/librte_ether/rte_ethdev_profile.c b/lib/librte_ether/rte_ethdev_profile.c
new file mode 100644
index 0000000..8884175
--- /dev/null
+++ b/lib/librte_ether/rte_ethdev_profile.c
@@ -0,0 +1,156 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include
+
+#include "rte_ethdev_profile.h"
+
+/**
+ * This conditional block enables RX queues profiling by tracking wasted
+ * iterations, i.e. iterations which yielded no RX packets. Profiling is
+ * performed using the Instrumentation and Tracing Technology (ITT) API,
+ * employed by the Intel VTune TM Amplifier.
+ */
+#ifdef RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS
+
+#include
+
+#define ITT_MAX_NAME_LEN (100)
+
+/**
+ * Auxiliary ITT structure belonging to an Ethernet device and used to:
+ *   - track RX queue state to determine whether it is wasting loop iterations
+ *   - begin or end ITT task using task domain and task name (handle)
+ */
+struct itt_profile_rx_data {
+	/**
+	 * ITT domains for each queue.
+	 */
+	__itt_domain *domains[RTE_MAX_QUEUES_PER_PORT];
+	/**
+	 * ITT task names for each queue.
+	 */
+	__itt_string_handle *handles[RTE_MAX_QUEUES_PER_PORT];
+	/**
+	 * Flags indicating the queues state. Possible values:
+	 *   1 - queue is wasting iterations,
+	 *   0 - otherwise.
+	 */
+	uint8_t queue_state[RTE_MAX_QUEUES_PER_PORT];
+};
+
+/**
+ * The pool of *itt_profile_rx_data* structures.
+ */
+struct itt_profile_rx_data itt_rx_data[RTE_MAX_ETHPORTS];
+
+
+/**
+ * This callback function manages ITT tasks collection on given port and queue.
+ * It must be registered with rte_eth_add_rx_callback() to be called from
+ * rte_eth_rx_burst(). To find more comments see rte_rx_callback_fn function
+ * type declaration.
+ */
+static uint16_t
+collect_itt_rx_burst_cb(uint8_t port_id, uint16_t queue_id,
+	__rte_unused struct rte_mbuf *pkts[], uint16_t nb_pkts,
+	__rte_unused uint16_t max_pkts, __rte_unused void *user_param)
+{
+	if (unlikely(nb_pkts == 0)) {
+		if (!itt_rx_data[port_id].queue_state[queue_id]) {
+			__itt_task_begin(
+				itt_rx_data[port_id].domains[queue_id],
+				__itt_null, __itt_null,
+				itt_rx_data[port_id].handles[queue_id]);
+			itt_rx_data[port_id].queue_state[queue_id] = 1;
+		}
+	} else {
+		if (unlikely(itt_rx_data[port_id].queue_state[queue_id])) {
+			__itt_task_end(
+				itt_rx_data[port_id].domains[queue_id]);
+			itt_rx_data[port_id].queue_state[queue_id] = 0;
+		}
+	}
+	return nb_pkts;
+}
+
+/**
+ * Initialization of itt_profile_rx_data for a given Ethernet device.
+ * This function must be invoked when ethernet device is being configured.
+ * Result will be stored in the global array *itt_rx_data*.
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param port_name
+ *  The name of the Ethernet device.
+ * @param rx_queue_num
+ *  The number of RX queues on specified port.
+ */
+static inline void
+itt_profile_rx_init(uint8_t port_id, char *port_name, uint16_t rx_queue_num)
+{
+	uint16_t q_id;
+
+	for (q_id = 0; q_id < rx_queue_num; ++q_id) {
+		char domain_name[ITT_MAX_NAME_LEN];
+
+		snprintf(domain_name, sizeof(domain_name),
+			"RXBurst.WastedIterations.Port_%s.Queue_%d",
+			port_name, q_id);
+		itt_rx_data[port_id].domains[q_id]
+			= __itt_domain_create(domain_name);
+
+		char task_name[ITT_MAX_NAME_LEN];
+
+		snprintf(task_name, sizeof(task_name),
+			"port id: %d; queue id: %d",
+			port_id, q_id);
+		itt_rx_data[port_id].handles[q_id]
+			= __itt_string_handle_create(task_name);
+
+		itt_rx_data[port_id].queue_state[q_id] = 0;
+
+		rte_eth_add_rx_callback(
+			port_id, q_id, collect_itt_rx_burst_cb, NULL);
+	}
+}
+#endif /* RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS */
+
+void
+rte_eth_profile_rx_init(__rte_unused uint8_t port_id,
+	__rte_unused struct rte_eth_dev *dev)
+{
+#ifdef RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS
+	itt_profile_rx_init(port_id, dev->data->name, dev->data->nb_rx_queues);
+#endif
+}
diff --git a/lib/librte_ether/rte_ethdev_profile.h b/lib/librte_ether/rte_ethdev_profile.h
new file mode 100644
index 0000000..1eb72bd
--- /dev/null
+++ b/lib/librte_ether/rte_ethdev_profile.h
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETHDEV_PROFILE_H_
+#define _RTE_ETHDEV_PROFILE_H_
+
+#include "rte_ethdev.h"
+
+/**
+ * Initialization of profiling RX queues for the Ethernet device.
+ * Implementation of this function depends on chosen profiling method,
+ * defined in configs.
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param dev
+ *  Pointer to struct rte_eth_dev corresponding to given port_id.
+ */
+void
+rte_eth_profile_rx_init(uint8_t port_id, struct rte_eth_dev *dev);
+
+#endif
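
For readers who want to look at the callback's begin/end logic in isolation,
here is a minimal standalone sketch of the per-queue state machine used by
collect_itt_rx_burst_cb(). It is not part of the patch. The names
sketch_trace, sketch_rx_cb, sketch_replay and SKETCH_MAX_QUEUES are
hypothetical, and the two counters only stand in for __itt_task_begin() and
__itt_task_end() so that the sketch builds without the ITT library or DPDK.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bound; the patch sizes its arrays with RTE_MAX_QUEUES_PER_PORT. */
#define SKETCH_MAX_QUEUES 4

/*
 * Hypothetical stand-in for the ITT state kept per port in the patch.
 * The counters replace the real __itt_task_begin()/__itt_task_end() calls.
 */
struct sketch_trace {
	unsigned int begun;                 /* number of "task begin" events */
	unsigned int ended;                 /* number of "task end" events */
	uint8_t wasting[SKETCH_MAX_QUEUES]; /* 1 while a task is open on the queue */
};

/*
 * Same decision logic as the RX callback in the patch: open a task on the
 * first empty burst, close it on the first non-empty burst, and do nothing
 * while the queue stays in the same state.
 */
static uint16_t
sketch_rx_cb(struct sketch_trace *t, uint16_t queue_id, uint16_t nb_pkts)
{
	assert(queue_id < SKETCH_MAX_QUEUES);

	if (nb_pkts == 0) {
		if (!t->wasting[queue_id]) {
			t->begun++;
			t->wasting[queue_id] = 1;
		}
	} else if (t->wasting[queue_id]) {
		t->ended++;
		t->wasting[queue_id] = 0;
	}
	return nb_pkts;
}

/* Feed a recorded sequence of burst sizes for one queue through the callback. */
static void
sketch_replay(struct sketch_trace *t, uint16_t queue_id,
	const uint16_t *bursts, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		sketch_rx_cb(t, queue_id, bursts[i]);
}
```

Replaying the burst sizes {0, 0, 0, 4, 0, 0, 32, 8, 0} on one queue opens
three tasks and closes two, leaving the last one open. Consecutive empty polls
are folded into a single task instead of producing one task per iteration,
which keeps the number of trace events small on an idle queue.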