From patchwork Fri Sep 4 10:18:55 2020
X-Patchwork-Submitter: "Liang, Ma"
X-Patchwork-Id: 76554
X-Patchwork-Delegate: thomas@monjalon.net
From: Liang Ma
To: dev@dpdk.org
Cc: david.hunt@intel.com, anatoly.burakov@intel.com, Liang Ma
Date: Fri, 4 Sep 2020 11:18:55 +0100
Message-Id: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com>
In-Reply-To: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com>
References: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com>
Subject: [dpdk-dev] [PATCH v3 1/6] eal: add power management intrinsics

Add two new power management intrinsics, and provide an implementation
in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions are
implemented as raw byte opcodes because there is not yet widespread
compiler support for these instructions.

The power management instructions provide an architecture-specific
function to either wait until a specified TSC timestamp is reached, or
optionally wait until either a TSC timestamp is reached or a memory
location is written to. The monitor function also provides an optional
comparison, to avoid sleeping when the expected write has already
happened, and no more writes are expected.
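As a usage illustration (not part of this patch), a polling loop might use
the new intrinsics roughly as follows. The monitored address, the expected
value/mask and the one-millisecond budget are all hypothetical:

#include <rte_cycles.h>
#include <rte_power_intrinsics.h>

/* Sketch only: monitor a descriptor status word and sleep in C0.2 until
 * it is written, until the expected bit is already set, or until roughly
 * 1 ms worth of TSC cycles have elapsed.
 */
static void
wait_for_write(const volatile uint64_t *status_addr)
{
	const uint64_t tsc_deadline = rte_get_tsc_cycles() +
			rte_get_tsc_hz() / 1000;

	rte_power_monitor(status_addr, 1 /* expected value */,
			1 /* value mask */, 0 /* power state: C0.2 */,
			tsc_deadline);
}

rte_power_pause() can be used the same way, with only a power state and a
TSC deadline, when there is no memory location to monitor.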
Signed-off-by: Liang Ma
Signed-off-by: Anatoly Burakov
---
 .../include/generic/rte_power_intrinsics.h |  64 ++++++++
 lib/librte_eal/include/meson.build         |   1 +
 lib/librte_eal/x86/include/meson.build     |   1 +
 lib/librte_eal/x86/include/rte_cpuflags.h  |   2 +
 .../x86/include/rte_power_intrinsics.h     | 143 ++++++++++++++++++
 lib/librte_eal/x86/rte_cpuflags.c          |   2 +
 6 files changed, 213 insertions(+)
 create mode 100644 lib/librte_eal/include/generic/rte_power_intrinsics.h
 create mode 100644 lib/librte_eal/x86/include/rte_power_intrinsics.h

diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
new file mode 100644
index 0000000000..cd7f8070ac
--- /dev/null
+++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_POWER_INTRINSIC_H_
+#define _RTE_POWER_INTRINSIC_H_
+
+#include <stdint.h>
+
+/**
+ * @file
+ * Advanced power management operations.
+ *
+ * This file defines APIs for advanced power management,
+ * which are architecture-dependent.
+ */
+
+/**
+ * Monitor specific address for changes. This will cause the CPU to enter an
+ * architecture-defined optimized power state until either the specified
+ * memory address is written to, or a certain TSC timestamp is reached.
+ *
+ * Additionally, an `expected` 64-bit value and 64-bit mask are provided. If
+ * mask is non-zero, the current value pointed to by the `p` pointer will be
+ * checked against the expected value, and if they match, the entering of
+ * optimized power state may be aborted.
+ *
+ * @param p
+ *   Address to monitor for changes. Must be aligned on a 64-byte boundary.
+ * @param expected_value
+ *   Before attempting the monitoring, the `p` address may be read and compared
+ *   against this value. If `value_mask` is zero, this step will be skipped.
+ * @param value_mask
+ *   The 64-bit mask to use to extract current value from `p`.
+ * @param state
+ *   Architecture-dependent optimized power state number
+ * @param tsc_timestamp
+ *   Maximum TSC timestamp to wait for. Note that the wait behavior is
+ *   architecture-dependent.
+ *
+ * @return
+ *   Architecture-dependent return value.
+ */
+static inline int rte_power_monitor(const volatile void *p,
+		const uint64_t expected_value, const uint64_t value_mask,
+		const uint32_t state, const uint64_t tsc_timestamp);
+
+/**
+ * Enter an architecture-defined optimized power state until a certain TSC
+ * timestamp is reached.
+ *
+ * @param state
+ *   Architecture-dependent optimized power state number
+ * @param tsc_timestamp
+ *   Maximum TSC timestamp to wait for. Note that the wait behavior is
+ *   architecture-dependent.
+ *
+ * @return
+ *   Architecture-dependent return value.
+ */ +static inline int rte_power_pause(const uint32_t state, + const uint64_t tsc_timestamp); + +#endif /* _RTE_POWER_INTRINSIC_H_ */ diff --git a/lib/librte_eal/include/meson.build b/lib/librte_eal/include/meson.build index cd09027958..3a12e87e19 100644 --- a/lib/librte_eal/include/meson.build +++ b/lib/librte_eal/include/meson.build @@ -60,6 +60,7 @@ generic_headers = files( 'generic/rte_memcpy.h', 'generic/rte_pause.h', 'generic/rte_prefetch.h', + 'generic/rte_power_intrinsics.h', 'generic/rte_rwlock.h', 'generic/rte_spinlock.h', 'generic/rte_ticketlock.h', diff --git a/lib/librte_eal/x86/include/meson.build b/lib/librte_eal/x86/include/meson.build index f0e998c2fe..494a8142a2 100644 --- a/lib/librte_eal/x86/include/meson.build +++ b/lib/librte_eal/x86/include/meson.build @@ -13,6 +13,7 @@ arch_headers = files( 'rte_io.h', 'rte_memcpy.h', 'rte_prefetch.h', + 'rte_power_intrinsics.h', 'rte_pause.h', 'rte_rtm.h', 'rte_rwlock.h', diff --git a/lib/librte_eal/x86/include/rte_cpuflags.h b/lib/librte_eal/x86/include/rte_cpuflags.h index c1d20364d1..5041a830a7 100644 --- a/lib/librte_eal/x86/include/rte_cpuflags.h +++ b/lib/librte_eal/x86/include/rte_cpuflags.h @@ -132,6 +132,8 @@ enum rte_cpu_flag_t { RTE_CPUFLAG_MOVDIR64B, /**< Direct Store Instructions 64B */ RTE_CPUFLAG_AVX512VP2INTERSECT, /**< AVX512 Two Register Intersection */ + /**< UMWAIT/TPAUSE Instructions */ + RTE_CPUFLAG_WAITPKG, /**< UMINITOR/UMWAIT/TPAUSE */ /* The last item */ RTE_CPUFLAG_NUMFLAGS, /**< This should always be the last! */ }; diff --git a/lib/librte_eal/x86/include/rte_power_intrinsics.h b/lib/librte_eal/x86/include/rte_power_intrinsics.h new file mode 100644 index 0000000000..6dd1cdc939 --- /dev/null +++ b/lib/librte_eal/x86/include/rte_power_intrinsics.h @@ -0,0 +1,143 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#ifndef _RTE_POWER_INTRINSIC_X86_64_H_ +#define _RTE_POWER_INTRINSIC_X86_64_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +#include "generic/rte_power_intrinsics.h" + +/** + * Monitor specific address for changes. This will cause the CPU to enter an + * architecture-defined optimized power state until either the specified + * memory address is written to, or a certain TSC timestamp is reached. + * + * Additionally, an `expected` 64-bit value and 64-bit mask are provided. If + * mask is non-zero, the current value pointed to by the `p` pointer will be + * checked against the expected value, and if they match, the entering of + * optimized power state may be aborted. + * + * This function uses UMONITOR/UMWAIT instructions. For more information about + * their usage, please refer to Intel(R) 64 and IA-32 Architectures Software + * Developer's Manual. + * + * @param p + * Address to monitor for changes. Must be aligned on an 64-byte boundary. + * @param expected_value + * Before attempting the monitoring, the `p` address may be read and compared + * against this value. If `value_mask` is zero, this step will be skipped. + * @param value_mask + * The 64-bit mask to use to extract current value from `p`. + * @param state + * Architecture-dependent optimized power state number. Can be 0 (C0.2) or + * 1 (C0.1). + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. + * + * @return + * - 1 if wakeup was due to TSC timeout expiration. + * - 0 if wakeup was due to memory write or other reasons. 
+ */ +static inline int rte_power_monitor(const volatile void *p, + const uint64_t expected_value, const uint64_t value_mask, + const uint32_t state, const uint64_t tsc_timestamp) +{ + const uint32_t tsc_l = (uint32_t)tsc_timestamp; + const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32); + /* the rflags need match native register size */ +#ifdef RTE_ARCH_I686 + uint32_t rflags; +#else + uint64_t rflags; +#endif + /* + * we're using raw byte codes for now as only the newest compiler + * versions support this instruction natively. + */ + + /* set address for UMONITOR */ + asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;" + : + : "D"(p)); + + if (value_mask) { + const uint64_t cur_value = *(const volatile uint64_t *)p; + const uint64_t masked = cur_value & value_mask; + /* if the masked value is already matching, abort */ + if (masked == expected_value) + return 0; + } + /* execute UMWAIT */ + asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;\n" + /* + * UMWAIT sets CF flag in RFLAGS, so PUSHF to push them + * onto the stack, then pop them back into `rflags` so that + * we can read it. + */ + "pushf;\n" + "pop %0;\n" + : "=r"(rflags) + : "D"(state), "a"(tsc_l), "d"(tsc_h)); + + /* we're interested in the first bit (the carry flag) */ + return rflags & 0x1; +} + +/** + * Enter an architecture-defined optimized power state until a certain TSC + * timestamp is reached. + * + * This function uses TPAUSE instruction. For more information about its usage, + * please refer to Intel(R) 64 and IA-32 Architectures Software Developer's + * Manual. + * + * @param state + * Architecture-dependent optimized power state number. Can be 0 (C0.2) or + * 1 (C0.1). + * @param tsc_timestamp + * Maximum TSC timestamp to wait for. + * + * @return + * - 1 if wakeup was due to TSC timeout expiration. + * - 0 if wakeup was due to other reasons. + */ +static inline int rte_power_pause(const uint32_t state, + const uint64_t tsc_timestamp) +{ + const uint32_t tsc_l = (uint32_t)tsc_timestamp; + const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32); + /* the rflags need match native register size */ +#ifdef RTE_ARCH_I686 + uint32_t rflags; +#else + uint64_t rflags; +#endif + + /* execute TPAUSE */ + asm volatile(".byte 0x66, 0x0f, 0xae, 0xf7;\n" + /* + * TPAUSE sets CF flag in RFLAGS, so PUSHF to push them + * onto the stack, then pop them back into `rflags` so that + * we can read it. 
+	 */
+	"pushf;\n"
+	"pop %0;\n"
+	: "=r"(rflags)
+	: "D"(state), "a"(tsc_l), "d"(tsc_h));
+
+	/* we're interested in the first bit (the carry flag) */
+	return rflags & 0x1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_POWER_INTRINSIC_X86_64_H_ */
diff --git a/lib/librte_eal/x86/rte_cpuflags.c b/lib/librte_eal/x86/rte_cpuflags.c
index 30439e7951..0325c4b93b 100644
--- a/lib/librte_eal/x86/rte_cpuflags.c
+++ b/lib/librte_eal/x86/rte_cpuflags.c
@@ -110,6 +110,8 @@ const struct feature_entry rte_cpu_feature_table[] = {
 	FEAT_DEF(AVX512F, 0x00000007, 0, RTE_REG_EBX, 16)
 	FEAT_DEF(RDSEED, 0x00000007, 0, RTE_REG_EBX, 18)
+	FEAT_DEF(WAITPKG, 0x00000007, 0, RTE_REG_ECX, 5)
+
 	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, RTE_REG_ECX, 0)
 	FEAT_DEF(LZCNT, 0x80000001, 0, RTE_REG_ECX, 4)

From patchwork Fri Sep 4 10:18:56 2020
X-Patchwork-Submitter: "Liang, Ma"
X-Patchwork-Id: 76555
X-Patchwork-Delegate: thomas@monjalon.net
From: Liang Ma
To: dev@dpdk.org
Cc: david.hunt@intel.com, anatoly.burakov@intel.com, Liang Ma
Date: Fri, 4 Sep 2020 11:18:56 +0100
Message-Id: <1599214740-3927-2-git-send-email-liang.j.ma@intel.com>
In-Reply-To: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com>
References: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com> <1599214740-3927-1-git-send-email-liang.j.ma@intel.com>
Subject: [dpdk-dev] [PATCH v3 2/6] ethdev: add simple power management API

Add a simple API that allows an ethdev to report the last available
queue descriptor address from the PMD. Also include the related
internal structure update.
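To illustrate the intended contract of the new callback (illustration
only, not part of this patch), a PMD implementation might look roughly
like the following. The queue/descriptor types and the DD-bit macro are
placeholders, loosely modelled on the driver patches later in this series:

struct my_rx_desc {
	uint64_t status;
	/* ... rest of the descriptor ... */
};

struct my_rx_queue {
	struct my_rx_desc *rx_ring;
	uint16_t rx_tail;
	/* ... */
};

#define MYDRV_RXD_STAT_DD 0x1ULL /* placeholder "descriptor done" bit */

static int
mydrv_next_rx_desc(void *rx_queue, volatile void **tail_desc_addr,
		uint64_t *expected, uint64_t *mask)
{
	struct my_rx_queue *rxq = rx_queue;
	volatile struct my_rx_desc *rxdp = &rxq->rx_ring[rxq->rx_tail];

	/* the power management code will monitor this status word */
	*tail_desc_addr = &rxdp->status;

	/* sleeping can be skipped if the DD bit is already set */
	*expected = MYDRV_RXD_STAT_DD;
	*mask = MYDRV_RXD_STAT_DD;

	return 0;
}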
Signed-off-by: Liang Ma
Signed-off-by: Anatoly Burakov
---
 lib/librte_ethdev/rte_ethdev.h      | 22 ++++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h | 46 +++++++++++++++++++++++++++--
 2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 70295d7ab7..d9312d3e11 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -157,6 +157,7 @@ extern "C" {
 #include
 #include
 #include
+#include

 #include "rte_ethdev_trace_fp.h"
 #include "rte_dev_info.h"

@@ -775,6 +776,7 @@ rte_eth_rss_hf_refine(uint64_t rss_hf)
 /** Maximum nb. of vlan per mirror rule */
 #define ETH_MIRROR_MAX_VLANS 64

+#define ETH_EMPTYPOLL_MAX 512 /**< Empty poll number threshold */
 #define ETH_MIRROR_VIRTUAL_POOL_UP 0x01 /**< Virtual Pool uplink Mirroring. */
 #define ETH_MIRROR_UPLINK_PORT 0x02 /**< Uplink Port Mirroring. */
 #define ETH_MIRROR_DOWNLINK_PORT 0x04 /**< Downlink Port Mirroring. */

@@ -1602,6 +1604,26 @@ enum rte_eth_dev_state {
 	RTE_ETH_DEV_REMOVED,
 };

+#define RTE_ETH_PAUSE_NUM 64 /* How many times to pause */
+/**
+ * Possible power management states of an ethdev port.
+ */
+enum rte_eth_dev_power_mgmt_state {
+	/** Device power management is disabled. */
+	RTE_ETH_DEV_POWER_MGMT_DISABLED = 0,
+	/** Device power management is enabled. */
+	RTE_ETH_DEV_POWER_MGMT_ENABLED,
+};
+
+enum rte_eth_dev_power_mgmt_cb_mode {
+	/** WAIT callback mode. */
+	RTE_ETH_DEV_POWER_MGMT_CB_WAIT = 1,
+	/** PAUSE callback mode. */
+	RTE_ETH_DEV_POWER_MGMT_CB_PAUSE,
+	/** Freq Scaling callback mode. */
+	RTE_ETH_DEV_POWER_MGMT_CB_SCALE,
+};
+
 struct rte_eth_dev_sriov {
 	uint8_t active;        /**< SRIOV is active with 16, 32 or 64 pools */
 	uint8_t nb_q_per_pool; /**< rx queue number per pool */

diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 32407dd418..16e54bb4e4 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -603,6 +603,30 @@ typedef int (*eth_tx_hairpin_queue_setup_t)
 	 uint16_t nb_tx_desc, const struct rte_eth_hairpin_conf *hairpin_conf);

+/**
+ * @internal
+ * Get the next RX ring descriptor address.
+ *
+ * @param rxq
+ *   ethdev queue pointer.
+ * @param tail_desc_addr
+ *   pointer to the variable that will hold the descriptor address.
+ * @param expected
+ *   pointer to the value expected when the descriptor has been written to.
+ * @param mask
+ *   pointer to the comparison bitmask for the expected value.
+ * @return
+ *   Negative errno value on error, 0 on success.
+ *
+ * @retval 0
+ *   Success.
+ * @retval -EINVAL
+ *   Failed to get descriptor address.
+ */
+typedef int (*eth_next_rx_desc_t)
+	(void *rxq, volatile void **tail_desc_addr,
+	 uint64_t *expected, uint64_t *mask);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -752,6 +776,8 @@ struct eth_dev_ops {
 	/**< Set up device RX hairpin queue. */
 	eth_tx_hairpin_queue_setup_t tx_hairpin_queue_setup;
 	/**< Set up device TX hairpin queue. */
+	eth_next_rx_desc_t next_rx_desc;
+	/**< Get next RX ring descriptor address. */
 };

 /**
@@ -768,6 +794,14 @@ struct rte_eth_rxtx_callback {
 	void *param;
 };

+/**
+ * @internal
+ * Structure used to hold counters for empty poll
+ */
+struct rte_eth_ep_stat {
+	uint64_t num;
+} __rte_cache_aligned;
+
 /**
  * @internal
  * The generic data structure associated with each ethernet device.
@@ -807,8 +841,16 @@ struct rte_eth_dev {
 	enum rte_eth_dev_state state; /**< Flag indicating the port state */
 	void *security_ctx; /**< Context for security ops */

-	uint64_t reserved_64s[4]; /**< Reserved for future fields */
-	void *reserved_ptrs[4];   /**< Reserved for future fields */
+	/**< Flag indicating the port power state */
+	enum rte_eth_dev_power_mgmt_state pwr_mgmt_state;
+	/**< Power mgmt Callback mode */
+	enum rte_eth_dev_power_mgmt_cb_mode cb_mode;
+	uint64_t reserved_64s[3]; /**< Reserved for future fields */
+
+	/**< Empty poll counters */
+	struct rte_eth_ep_stat *empty_poll_stats;
+	const struct rte_eth_rxtx_callback *cur_pwr_cb;
+	void *reserved_ptrs[2]; /**< Reserved for future fields */
 } __rte_cache_aligned;

 struct rte_eth_dev_sriov;

From patchwork Fri Sep 4 10:18:57 2020
X-Patchwork-Submitter: "Liang, Ma"
X-Patchwork-Id: 76556
X-Patchwork-Delegate: thomas@monjalon.net
From: Liang Ma
To: dev@dpdk.org
Cc: david.hunt@intel.com, anatoly.burakov@intel.com, Liang Ma
Date: Fri, 4 Sep 2020 11:18:57 +0100
Message-Id: <1599214740-3927-3-git-send-email-liang.j.ma@intel.com>
In-Reply-To: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com>
References: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com> <1599214740-3927-1-git-send-email-liang.j.ma@intel.com>
Subject: [dpdk-dev] [PATCH v3 3/6] power: add simple power management API and callback

Add a simple on/off switch that will enable saving power when no
packets are arriving. It is based on counting the number of empty
polls and, when the number reaches a certain threshold, entering an
architecture-defined optimized power state that will either wait until
a TSC timestamp expires, or until packets arrive.

This API is limited to the 1 core / 1 port / 1 queue use case, as there
is no coordination between queues/cores in ethdev. Mapping one port to
multiple cores will be supported in the next version.

This design leverages the RX callback mechanism, which allows three
different power management methodologies to coexist:

1. umwait/umonitor
   The TSC timestamp is automatically calculated using the current link
   speed and RX descriptor ring size, such that the sleep time is not
   longer than it would take for a NIC to fill its entire RX descriptor
   ring.

2. Pause instruction
   Instead of moving the core into a deeper C-state, this lightweight
   method uses the PAUSE instruction to relieve the processor from busy
   polling.

3. Frequency scaling
   Reuses the existing rte_power library to scale the core frequency
   up/down depending on traffic volume.
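For context (illustration only, not part of this patch), an application
would be expected to use the new API roughly as follows. Port 0 and the
WAIT mode are arbitrary choices, and error handling is omitted:

#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_power.h>

static void
lcore_rx_loop(uint16_t port_id)
{
	const unsigned int lcore_id = rte_lcore_id();

	/* attach the power-saving RX callback (UMWAIT mode) to queue 0 */
	rte_power_pmd_mgmt_enable(lcore_id, port_id,
			RTE_ETH_DEV_POWER_MGMT_CB_WAIT);

	/* the usual rte_eth_rx_burst() loop runs unchanged; the RX
	 * callback decides when to enter an optimized power state */

	rte_power_pmd_mgmt_disable(lcore_id, port_id);
}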
Signed-off-by: Liang Ma
Signed-off-by: Anatoly Burakov
---
 lib/librte_power/meson.build           |   3 +-
 lib/librte_power/rte_power.h           |  38 +++++
 lib/librte_power/rte_power_pmd_mgmt.c  | 184 +++++++++++++++++++++++++
 lib/librte_power/rte_power_version.map |   4 +
 4 files changed, 228 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c

diff --git a/lib/librte_power/meson.build b/lib/librte_power/meson.build
index 78c031c943..44b01afce2 100644
--- a/lib/librte_power/meson.build
+++ b/lib/librte_power/meson.build
@@ -9,6 +9,7 @@ sources = files('rte_power.c', 'power_acpi_cpufreq.c',
 		'power_kvm_vm.c', 'guest_channel.c',
 		'rte_power_empty_poll.c',
 		'power_pstate_cpufreq.c',
+		'rte_power_pmd_mgmt.c',
 		'power_common.c')
 headers = files('rte_power.h','rte_power_empty_poll.h')
-deps += ['timer']
+deps += ['timer', 'ethdev']

diff --git a/lib/librte_power/rte_power.h b/lib/librte_power/rte_power.h
index bbbde4dfb4..06d5a9984f 100644
--- a/lib/librte_power/rte_power.h
+++ b/lib/librte_power/rte_power.h
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include

 #ifdef __cplusplus
 extern "C" {
@@ -97,6 +98,43 @@ int rte_power_init(unsigned int lcore_id);
  */
 int rte_power_exit(unsigned int lcore_id);

+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Enable device power management.
+ * @param lcore_id
+ *   lcore id.
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mode
+ *   The power management callback function mode.
+ * @return
+ *   0 on success
+ *   <0 on error
+ */
+__rte_experimental
+int rte_power_pmd_mgmt_enable(unsigned int lcore_id,
+				uint16_t port_id,
+				enum rte_eth_dev_power_mgmt_cb_mode mode);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Disable device power management.
+ * @param lcore_id
+ *   lcore id.
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ *
+ * @return
+ *   0 on success
+ *   <0 on error
+ */
+__rte_experimental
+int rte_power_pmd_mgmt_disable(unsigned int lcore_id, uint16_t port_id);
+
 /**
  * Get the available frequencies of a specific lcore.
  * Function pointer definition.
Review each environments diff --git a/lib/librte_power/rte_power_pmd_mgmt.c b/lib/librte_power/rte_power_pmd_mgmt.c new file mode 100644 index 0000000000..a445153ede --- /dev/null +++ b/lib/librte_power/rte_power_pmd_mgmt.c @@ -0,0 +1,184 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2020 Intel Corporation + */ + +#include +#include +#include +#include +#include + +#include "rte_power.h" + + + +static uint16_t +rte_power_mgmt_umwait(uint16_t port_id, uint16_t qidx, + struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, + uint16_t max_pkts __rte_unused, void *_ __rte_unused) +{ + + struct rte_eth_dev *dev = &rte_eth_devices[port_id]; + + if (unlikely(nb_rx == 0)) { + dev->empty_poll_stats[qidx].num++; + if (unlikely(dev->empty_poll_stats[qidx].num > + ETH_EMPTYPOLL_MAX)) { + volatile void *target_addr; + uint64_t expected, mask; + uint16_t ret; + + /* + * get address of next descriptor in the RX + * ring for this queue, as well as expected + * value and a mask. + */ + ret = (*dev->dev_ops->next_rx_desc) + (dev->data->rx_queues[qidx], + &target_addr, &expected, &mask); + if (ret == 0) + /* -1ULL is maximum value for TSC */ + rte_power_monitor(target_addr, + expected, mask, + 0, -1ULL); + } + } else + dev->empty_poll_stats[qidx].num = 0; + + return nb_rx; +} + +static uint16_t +rte_power_mgmt_pause(uint16_t port_id, uint16_t qidx, + struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, + uint16_t max_pkts __rte_unused, void *_ __rte_unused) +{ + struct rte_eth_dev *dev = &rte_eth_devices[port_id]; + + int i; + + if (unlikely(nb_rx == 0)) { + + dev->empty_poll_stats[qidx].num++; + + if (unlikely(dev->empty_poll_stats[qidx].num > + ETH_EMPTYPOLL_MAX)) { + + for (i = 0; i < RTE_ETH_PAUSE_NUM; i++) + rte_pause(); + + } + } else + dev->empty_poll_stats[qidx].num = 0; + + return nb_rx; +} + +static uint16_t +rte_power_mgmt_scalefreq(uint16_t port_id, uint16_t qidx, + struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx, + uint16_t max_pkts __rte_unused, void *_ __rte_unused) +{ + struct rte_eth_dev *dev = &rte_eth_devices[port_id]; + + if (unlikely(nb_rx == 0)) { + dev->empty_poll_stats[qidx].num++; + if (unlikely(dev->empty_poll_stats[qidx].num > + ETH_EMPTYPOLL_MAX)) { + + /*scale down freq */ + rte_power_freq_min(rte_lcore_id()); + + } + } else { + dev->empty_poll_stats[qidx].num = 0; + /* scal up freq */ + rte_power_freq_max(rte_lcore_id()); + } + + return nb_rx; +} + +int +rte_power_pmd_mgmt_enable(unsigned int lcore_id, + uint16_t port_id, + enum rte_eth_dev_power_mgmt_cb_mode mode) +{ + struct rte_eth_dev *dev; + + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); + dev = &rte_eth_devices[port_id]; + + if (dev->pwr_mgmt_state == RTE_ETH_DEV_POWER_MGMT_ENABLED) + return -EINVAL; + /* allocate memory for empty poll stats */ + dev->empty_poll_stats = rte_malloc_socket(NULL, + sizeof(struct rte_eth_ep_stat) + * RTE_MAX_QUEUES_PER_PORT, + 0, dev->data->numa_node); + if (dev->empty_poll_stats == NULL) + return -ENOMEM; + + switch (mode) { + case RTE_ETH_DEV_POWER_MGMT_CB_WAIT: + if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG)) + return -ENOTSUP; + dev->cur_pwr_cb = rte_eth_add_rx_callback(port_id, 0, + rte_power_mgmt_umwait, NULL); + break; + case RTE_ETH_DEV_POWER_MGMT_CB_SCALE: + /* init scale freq */ + if (rte_power_init(lcore_id)) + return -EINVAL; + dev->cur_pwr_cb = rte_eth_add_rx_callback(port_id, 0, + rte_power_mgmt_scalefreq, NULL); + break; + case RTE_ETH_DEV_POWER_MGMT_CB_PAUSE: + dev->cur_pwr_cb = rte_eth_add_rx_callback(port_id, 0, + rte_power_mgmt_pause, NULL); 
+ break; + } + + dev->cb_mode = mode; + dev->pwr_mgmt_state = RTE_ETH_DEV_POWER_MGMT_ENABLED; + return 0; +} + +int +rte_power_pmd_mgmt_disable(unsigned int lcore_id, + uint16_t port_id) +{ + struct rte_eth_dev *dev; + + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); + dev = &rte_eth_devices[port_id]; + + /*add flag check */ + + if (dev->pwr_mgmt_state == RTE_ETH_DEV_POWER_MGMT_DISABLED) + return -EINVAL; + + /* rte_free ignores NULL so safe to call without checks */ + rte_free(dev->empty_poll_stats); + + switch (dev->cb_mode) { + case RTE_ETH_DEV_POWER_MGMT_CB_WAIT: + case RTE_ETH_DEV_POWER_MGMT_CB_PAUSE: + rte_eth_remove_rx_callback(port_id, 0, + dev->cur_pwr_cb); + break; + case RTE_ETH_DEV_POWER_MGMT_CB_SCALE: + rte_power_freq_max(lcore_id); + rte_eth_remove_rx_callback(port_id, 0, + dev->cur_pwr_cb); + if (rte_power_exit(lcore_id)) + return -EINVAL; + break; + } + + dev->pwr_mgmt_state = RTE_ETH_DEV_POWER_MGMT_DISABLED; + dev->cur_pwr_cb = NULL; + dev->cb_mode = 0; + + return 0; +} diff --git a/lib/librte_power/rte_power_version.map b/lib/librte_power/rte_power_version.map index 00ee5753e2..ade83cfd4f 100644 --- a/lib/librte_power/rte_power_version.map +++ b/lib/librte_power/rte_power_version.map @@ -34,4 +34,8 @@ EXPERIMENTAL { rte_power_guest_channel_receive_msg; rte_power_poll_stat_fetch; rte_power_poll_stat_update; + # added in 20.08 + rte_power_pmd_mgmt_disable; + rte_power_pmd_mgmt_enable; + }; From patchwork Fri Sep 4 10:18:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Liang, Ma" X-Patchwork-Id: 76557 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id AD382A04C5; Fri, 4 Sep 2020 12:19:34 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0114D1C10F; Fri, 4 Sep 2020 12:19:12 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id AB4121C0D0 for ; Fri, 4 Sep 2020 12:19:08 +0200 (CEST) IronPort-SDR: cuUBgH1X07UQ9J5PYr400aTM+piRlZSZDoBfxaJ3/2TNR5c+jciw2Kry6Q8qUInIYKIwwojhdU C7UT4Map8Auw== X-IronPort-AV: E=McAfee;i="6000,8403,9733"; a="156984481" X-IronPort-AV: E=Sophos;i="5.76,389,1592895600"; d="scan'208";a="156984481" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 03:19:07 -0700 IronPort-SDR: xcFNHiXT2KTLRTjQdMPTSBAqqoFe3mHczjV8uFvOCtSXrGY0Rg4+EFgypOf3RHMpbIayL8FV+c dw5AiBz9OHhA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,389,1592895600"; d="scan'208";a="503455312" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga005.fm.intel.com with ESMTP; 04 Sep 2020 03:19:06 -0700 Received: from sivswdev09.ir.intel.com (sivswdev09.ir.intel.com [10.237.217.48]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id 084AJ6EU030504; Fri, 4 Sep 2020 11:19:06 +0100 Received: from sivswdev09.ir.intel.com (localhost [127.0.0.1]) by sivswdev09.ir.intel.com with ESMTP id 084AJ6OO004025; Fri, 4 Sep 2020 11:19:06 +0100 Received: (from lma25@localhost) by sivswdev09.ir.intel.com with LOCAL id 084AJ6cD004021; Fri, 4 Sep 2020 11:19:06 +0100 From: Liang Ma To: dev@dpdk.org Cc: david.hunt@intel.com, anatoly.burakov@intel.com, Liang Ma Date: Fri, 4 Sep 
2020 11:18:58 +0100 Message-Id: <1599214740-3927-4-git-send-email-liang.j.ma@intel.com> X-Mailer: git-send-email 1.7.7.4 In-Reply-To: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> References: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com> <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v3 4/6] net/ixgbe: implement power management API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Implement support for the power management API by implementing a `next_rx_desc` function that will return an address of an RX ring's status bit. Signed-off-by: Anatoly Burakov Signed-off-by: Liang Ma --- drivers/net/ixgbe/ixgbe_ethdev.c | 1 + drivers/net/ixgbe/ixgbe_rxtx.c | 22 ++++++++++++++++++++++ drivers/net/ixgbe/ixgbe_rxtx.h | 2 ++ 3 files changed, 25 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c index fd0cb9b0e2..618fc15732 100644 --- a/drivers/net/ixgbe/ixgbe_ethdev.c +++ b/drivers/net/ixgbe/ixgbe_ethdev.c @@ -592,6 +592,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = { .udp_tunnel_port_del = ixgbe_dev_udp_tunnel_port_del, .tm_ops_get = ixgbe_tm_ops_get, .tx_done_cleanup = ixgbe_dev_tx_done_cleanup, + .next_rx_desc = ixgbe_next_rx_desc, }; /* diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index 977ecf5137..d1d015deae 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1366,6 +1366,28 @@ const uint32_t RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP, }; +int ixgbe_next_rx_desc(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask) +{ + volatile union ixgbe_adv_rx_desc *rxdp; + struct ixgbe_rx_queue *rxq = rx_queue; + uint16_t desc; + + desc = rxq->rx_tail; + rxdp = &rxq->rx_ring[desc]; + /* watch for changes in status bit */ + *tail_desc_addr = &rxdp->wb.upper.status_error; + + /* + * we expect the DD bit to be set to 1 if this descriptor was already + * written to. + */ + *expected = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD); + *mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD); + + return 0; +} + /* @note: fix ixgbe_dev_supported_ptypes_get() if any change here. 
*/ static inline uint32_t ixgbe_rxd_pkt_info_to_pkt_type(uint32_t pkt_info, uint16_t ptype_mask) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h index 7e09291b22..826f451bee 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.h +++ b/drivers/net/ixgbe/ixgbe_rxtx.h @@ -299,5 +299,7 @@ uint64_t ixgbe_get_tx_port_offloads(struct rte_eth_dev *dev); uint64_t ixgbe_get_rx_queue_offloads(struct rte_eth_dev *dev); uint64_t ixgbe_get_rx_port_offloads(struct rte_eth_dev *dev); uint64_t ixgbe_get_tx_queue_offloads(struct rte_eth_dev *dev); +int ixgbe_next_rx_desc(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask); #endif /* _IXGBE_RXTX_H_ */ From patchwork Fri Sep 4 10:18:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Liang, Ma" X-Patchwork-Id: 76558 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 5D658A04C5; Fri, 4 Sep 2020 12:19:45 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 3E9491C120; Fri, 4 Sep 2020 12:19:14 +0200 (CEST) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 47B851C0D9 for ; Fri, 4 Sep 2020 12:19:10 +0200 (CEST) IronPort-SDR: SB7v0aoF8cT31NKgQSYB65xrAZcBmCKHnIN5leQVAxBJzZ4keeM89V+Y+rDqwiUF17nmGjqztn yhnUl4OkwSvQ== X-IronPort-AV: E=McAfee;i="6000,8403,9733"; a="175774419" X-IronPort-AV: E=Sophos;i="5.76,389,1592895600"; d="scan'208";a="175774419" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 03:19:09 -0700 IronPort-SDR: m5U9V01PZKgry+woHN0OrfosgG735DH6CM7JiZe73VOyg0iKk27NCyGT0Fgll6/grHnmMbg43z pwBJe7bWOPAg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,389,1592895600"; d="scan'208";a="334810010" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by fmsmga002.fm.intel.com with ESMTP; 04 Sep 2020 03:19:08 -0700 Received: from sivswdev09.ir.intel.com (sivswdev09.ir.intel.com [10.237.217.48]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id 084AJ8wp030569; Fri, 4 Sep 2020 11:19:08 +0100 Received: from sivswdev09.ir.intel.com (localhost [127.0.0.1]) by sivswdev09.ir.intel.com with ESMTP id 084AJ8Zv004034; Fri, 4 Sep 2020 11:19:08 +0100 Received: (from lma25@localhost) by sivswdev09.ir.intel.com with LOCAL id 084AJ83H004030; Fri, 4 Sep 2020 11:19:08 +0100 From: Liang Ma To: dev@dpdk.org Cc: david.hunt@intel.com, anatoly.burakov@intel.com, Liang Ma Date: Fri, 4 Sep 2020 11:18:59 +0100 Message-Id: <1599214740-3927-5-git-send-email-liang.j.ma@intel.com> X-Mailer: git-send-email 1.7.7.4 In-Reply-To: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> References: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com> <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v3 5/6] net/i40e: implement power management API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Implement support for the power management API by implementing a `next_rx_desc` function that will return an address of an 
RX ring's status bit. Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov --- drivers/net/i40e/i40e_ethdev.c | 1 + drivers/net/i40e/i40e_rxtx.c | 23 +++++++++++++++++++++++ drivers/net/i40e/i40e_rxtx.h | 2 ++ 3 files changed, 26 insertions(+) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 11c02b1888..94e9298d7c 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -517,6 +517,7 @@ static const struct eth_dev_ops i40e_eth_dev_ops = { .mtu_set = i40e_dev_mtu_set, .tm_ops_get = i40e_tm_ops_get, .tx_done_cleanup = i40e_tx_done_cleanup, + .next_rx_desc = i40e_next_rx_desc, }; /* store statistics names and its offset in stats structure */ diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index fe7f9200c1..9d7eea8aed 100644 --- a/drivers/net/i40e/i40e_rxtx.c +++ b/drivers/net/i40e/i40e_rxtx.c @@ -71,6 +71,29 @@ #define I40E_TX_OFFLOAD_NOTSUP_MASK \ (PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK) +int +i40e_next_rx_desc(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask) +{ + struct i40e_rx_queue *rxq = rx_queue; + volatile union i40e_rx_desc *rxdp; + uint16_t desc; + + desc = rxq->rx_tail; + rxdp = &rxq->rx_ring[desc]; + /* watch for changes in status bit */ + *tail_desc_addr = &rxdp->wb.qword1.status_error_len; + + /* + * we expect the DD bit to be set to 1 if this descriptor was already + * written to. + */ + *expected = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT); + *mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT); + + return 0; +} + static inline void i40e_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union i40e_rx_desc *rxdp) { diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h index 57d7b4160b..bfda5b6ad3 100644 --- a/drivers/net/i40e/i40e_rxtx.h +++ b/drivers/net/i40e/i40e_rxtx.h @@ -248,6 +248,8 @@ uint16_t i40e_recv_scattered_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts); uint16_t i40e_xmit_pkts_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts); +int i40e_next_rx_desc(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *value); /* For each value it means, datasheet of hardware can tell more details * From patchwork Fri Sep 4 10:19:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Liang, Ma" X-Patchwork-Id: 76559 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7CAF6A04C5; Fri, 4 Sep 2020 12:19:53 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 1B4831C127; Fri, 4 Sep 2020 12:19:15 +0200 (CEST) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id B5E181C115 for ; Fri, 4 Sep 2020 12:19:12 +0200 (CEST) IronPort-SDR: XstCbjb28Ogpi5X3P8a8+9s3Qbcb+vDaRVOPZoWksC/V6KyZMBkWLjgyaiRy9ranhoAKR2Km7C unv0Uzw4y24g== X-IronPort-AV: E=McAfee;i="6000,8403,9733"; a="242540263" X-IronPort-AV: E=Sophos;i="5.76,389,1592895600"; d="scan'208";a="242540263" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Sep 2020 03:19:11 -0700 IronPort-SDR: 
UTv0STQ/J7AimuqGK3F4cShzVGViTMPR8NiCWNnXdoBuuPcG/Z/TMCS/UkhYatOyg7x2koaCLG tPtkz7O1BgGQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,389,1592895600"; d="scan'208";a="342145655" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga007.jf.intel.com with ESMTP; 04 Sep 2020 03:19:10 -0700 Received: from sivswdev09.ir.intel.com (sivswdev09.ir.intel.com [10.237.217.48]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id 084AJANL030576; Fri, 4 Sep 2020 11:19:10 +0100 Received: from sivswdev09.ir.intel.com (localhost [127.0.0.1]) by sivswdev09.ir.intel.com with ESMTP id 084AJ95A004057; Fri, 4 Sep 2020 11:19:09 +0100 Received: (from lma25@localhost) by sivswdev09.ir.intel.com with LOCAL id 084AJ91q004053; Fri, 4 Sep 2020 11:19:09 +0100 From: Liang Ma To: dev@dpdk.org Cc: david.hunt@intel.com, anatoly.burakov@intel.com, Liang Ma Date: Fri, 4 Sep 2020 11:19:00 +0100 Message-Id: <1599214740-3927-6-git-send-email-liang.j.ma@intel.com> X-Mailer: git-send-email 1.7.7.4 In-Reply-To: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> References: <1597141666-20621-1-git-send-email-liang.j.ma@intel.com> <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v3 6/6] net/ice: implement power management API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Implement support for the power management API by implementing a `next_rx_desc` function that will return an address of an RX ring's status bit. Signed-off-by: Liang Ma Signed-off-by: Anatoly Burakov --- drivers/net/ice/ice_ethdev.c | 1 + drivers/net/ice/ice_rxtx.c | 23 +++++++++++++++++++++++ drivers/net/ice/ice_rxtx.h | 2 ++ 3 files changed, 26 insertions(+) diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index 8d435e8892..7d7e1dcbac 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -212,6 +212,7 @@ static const struct eth_dev_ops ice_eth_dev_ops = { .udp_tunnel_port_add = ice_dev_udp_tunnel_port_add, .udp_tunnel_port_del = ice_dev_udp_tunnel_port_del, .tx_done_cleanup = ice_tx_done_cleanup, + .next_rx_desc = ice_next_rx_desc, }; /* store statistics names and its offset in stats structure */ diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index 2e1f06d2c0..c043181ceb 100644 --- a/drivers/net/ice/ice_rxtx.c +++ b/drivers/net/ice/ice_rxtx.c @@ -24,6 +24,29 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_mask; uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask; uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask; +int ice_next_rx_desc(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask) +{ + volatile union ice_rx_flex_desc *rxdp; + struct ice_rx_queue *rxq = rx_queue; + uint16_t desc; + + desc = rxq->rx_tail; + rxdp = &rxq->rx_ring[desc]; + /* watch for changes in status bit */ + *tail_desc_addr = &rxdp->wb.status_error0; + + /* + * we expect the DD bit to be set to 1 if this descriptor was already + * written to. 
+ */ + *expected = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S); + *mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S); + + return 0; +} + + static inline uint64_t ice_rxdid_to_proto_xtr_ol_flag(uint8_t rxdid) { diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/ice/ice_rxtx.h index 2fdcfb7d04..7eb6fa904e 100644 --- a/drivers/net/ice/ice_rxtx.h +++ b/drivers/net/ice/ice_rxtx.h @@ -202,5 +202,7 @@ uint16_t ice_xmit_pkts_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts); int ice_fdir_programming(struct ice_pf *pf, struct ice_fltr_desc *fdir_desc); int ice_tx_done_cleanup(void *txq, uint32_t free_cnt); +int ice_next_rx_desc(void *rx_queue, volatile void **tail_desc_addr, + uint64_t *expected, uint64_t *mask); #endif /* _ICE_RXTX_H_ */