Message ID | 1603810749-22285-1-git-send-email-liang.j.ma@intel.com (mailing list archive) |
---|---|
Headers |
From: Liang Ma <liang.j.ma@intel.com> To: dev@dpdk.org Cc: ruifeng.wang@arm.com, haiyue.wang@intel.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, david.hunt@intel.com, jerinjacobk@gmail.com, nhorman@tuxdriver.com, thomas@monjalon.net, timothy.mcdaniel@intel.com, gage.eads@intel.com, mw@semihalf.com, gtzalik@amazon.com, ajit.khaparde@broadcom.com, hkalra@marvell.com, johndale@cisco.com, xavier.huwei@huawei.com, xuanziyang2@huawei.com, matan@nvidia.com, yongwang@vmware.com Date: Tue, 27 Oct 2020 14:59:00 +0000 Message-Id: <1603810749-22285-1-git-send-email-liang.j.ma@intel.com> In-Reply-To: <1603494392-7181-1-git-send-email-liang.j.ma@intel.com> References: <1603494392-7181-1-git-send-email-liang.j.ma@intel.com> Subject: [dpdk-dev] [PATCH v10 0/9] Add PMD power mgmt List-Id: DPDK patches and discussions <dev.dpdk.org> |
Series |
Add PMD power mgmt
|
|
Message
Liang, Ma
Oct. 27, 2020, 2:59 p.m. UTC
This patchset proposes a simple API that lets Ethernet drivers put the CPU into a power-optimized state while waiting for packets to arrive, along with a set of generic intrinsics that facilitate that. This is achieved through cooperation with the NIC driver, which tells us the address of the wake-up event so that we can wait for writes to it. On IA, this is implemented with the UMONITOR/UMWAIT instructions. They are used in their raw opcode form because there is no widespread compiler support for them yet. Still, the API is made generic enough to support other architectures, should they implement similar instructions.

To achieve power savings, the mechanism is very simple: we count empty polls, and once a certain threshold is reached, we get the address of the next RX ring descriptor from the NIC driver, arm the monitoring hardware, and enter a power-optimized state. We then wake up when either a timeout expires or a write occurs (or, more generally, whenever the CPU decides to wake up - this is platform-specific), and proceed as normal. The empty-poll counter is reset whenever we actually receive packets, so we only go to sleep when we know nothing is going on. The mechanism is generic and can be used with any write-back descriptor.

Why are we putting this into ethdev as opposed to leaving it up to the application? Our customers specifically requested a way to do it with minimal changes to application code. The current approach lets them just flip a switch and automatically get power savings.

- Only 1:1 core-to-queue mapping is supported, meaning that each lcore must handle RX on at most a single queue
- Three policy types are supported: UMWAIT/PAUSE/Frequency_Scale
- Power management is enabled per-queue
- The API doesn't extend to other device types

Liang Ma (9):
  eal: add new x86 cpuid support for WAITPKG
  eal: add power management intrinsics
  eal: add intrinsics support check infrastructure
  ethdev: add simple power management API
  power: add PMD power management API and callback
  net/ixgbe: implement power management API
  net/i40e: implement power management API
  net/ice: implement power management API
  examples/l3fwd-power: enable PMD power mgmt

 doc/guides/prog_guide/power_man.rst           |  48 +++
 doc/guides/rel_notes/release_20_11.rst        |  15 +
 .../sample_app_ug/l3_forward_power_man.rst    |  13 +
 drivers/net/i40e/i40e_ethdev.c                |   1 +
 drivers/net/i40e/i40e_rxtx.c                  |  26 ++
 drivers/net/i40e/i40e_rxtx.h                  |   2 +
 drivers/net/ice/ice_ethdev.c                  |   1 +
 drivers/net/ice/ice_rxtx.c                    |  26 ++
 drivers/net/ice/ice_rxtx.h                    |   2 +
 drivers/net/ixgbe/ixgbe_ethdev.c              |   1 +
 drivers/net/ixgbe/ixgbe_rxtx.c                |  25 ++
 drivers/net/ixgbe/ixgbe_rxtx.h                |   2 +
 examples/l3fwd-power/main.c                   |  46 ++-
 lib/librte_eal/arm/include/meson.build        |   1 +
 .../arm/include/rte_power_intrinsics.h        |  60 ++++
 lib/librte_eal/arm/rte_cpuflags.c             |   6 +
 lib/librte_eal/include/generic/rte_cpuflags.h |  26 ++
 .../include/generic/rte_power_intrinsics.h    | 123 +++++++
 lib/librte_eal/include/meson.build            |   1 +
 lib/librte_eal/ppc/include/meson.build        |   1 +
 .../ppc/include/rte_power_intrinsics.h        |  60 ++++
 lib/librte_eal/ppc/rte_cpuflags.c             |   7 +
 lib/librte_eal/version.map                    |   1 +
 lib/librte_eal/x86/include/meson.build        |   1 +
 lib/librte_eal/x86/include/rte_cpuflags.h     |   1 +
 .../x86/include/rte_power_intrinsics.h        | 135 ++++++++
 lib/librte_eal/x86/rte_cpuflags.c             |  14 +
 lib/librte_ethdev/rte_ethdev.c                |  23 ++
 lib/librte_ethdev/rte_ethdev.h                |  33 ++
 lib/librte_ethdev/rte_ethdev_driver.h         |  28 ++
 lib/librte_ethdev/version.map                 |   1 +
 lib/librte_power/meson.build                  |   5 +-
 lib/librte_power/rte_power_pmd_mgmt.c         | 320 ++++++++++++++++++
 lib/librte_power/rte_power_pmd_mgmt.h         |  92 +++++
 lib/librte_power/version.map                  |   4 +
 35 files changed, 1148 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_eal/arm/include/rte_power_intrinsics.h
 create mode 100644 lib/librte_eal/include/generic/rte_power_intrinsics.h
 create mode 100644 lib/librte_eal/ppc/include/rte_power_intrinsics.h
 create mode 100644 lib/librte_eal/x86/include/rte_power_intrinsics.h
 create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c
 create mode 100644 lib/librte_power/rte_power_pmd_mgmt.h
Comments
Please be more patient. I asked some questions on v9 (the ethdev API is not generic enough), and you sent a v10 in the same minute you replied, without making sure I agreed with your answers.
On Tue, Oct 27, 2020 at 7:59 AM Liang Ma <liang.j.ma@intel.com> wrote:
>
> This patchset proposes a simple API for Ethernet drivers
> to cause the CPU to enter a power-optimized state while
> waiting for packets to arrive, along with a set of
> generic intrinsics that facilitate that. This is achieved
> through cooperation with the NIC driver that will allow
> us to know address of wake up event, and wait for writes
> on it.

Is the wake event the same as the ring status or interrupt status register? So in a way the PMD is passing the address of the next ring descriptor? So that instead of the PMD polling it, the application peeks at it and, when it is ready, asks the PMD to actually process the packet(s)?

> [description of UMONITOR/UMWAIT and the empty-poll mechanism snipped]
>
> Why are we putting it into ethdev as opposed to leaving
> this up to the application? Our customers specifically
> requested a way to do it wit minimal changes to the
> application code. The current approach allows to just
> flip a switch and automatically have power savings.

The application still has to know the address of the wake-up event, right? And then it will need the logic to count empty polls and the threshold? Will this be done by the application or something else?

> [feature summary, patch list and diffstat snipped]
On 27 Oct 13:53, Ajit Khaparde wrote:
> On Tue, Oct 27, 2020 at 7:59 AM Liang Ma <liang.j.ma@intel.com> wrote:
> > [cover letter snipped]
>
> Is the wake event the same as ring status or interrupt status register?

The wake event is any change to the monitored address range. That can be a ring descriptor (e.g. done bits), but it is not limited to that.

> So in a way the PMD is passing the address of the next ring descriptor?
> So that instead of the PMD polling it, the application peeks at it and
> when ready asks the PMD to actually process the packet(s)?

No. After the PMD passes back the address, the processor monitors and waits on that address range (i.e. it moves to a sleep sub-state). Any change to the address range caused by another source (another processor, the NIC, etc.) wakes the processor, which then starts polling again.

> The application still has to know address of wake up event. Right?
> And then it will need the logic to count empty polls and the threshold?
> This will be done by application or something else?

The empty-poll counting is done by the power library callback function, which is included in patch 5. The application doesn't need to change anything besides its initialization code (calling the enable/disable API). Please see patch 9, which demonstrates how to use this with l3fwd-power.

> [remainder of cover letter, patch list and diffstat snipped]
Hi Thomas,
I think I addressed all of the questions in relation to v9. I don't think I can solve the issue of a generic API on my own. On the Community Call last week, Jerin also said that a generic solution was investigated but that a single solution wasn't feasible. This API is experimental, and other vendor support can be added as needed. If there are any other open issues, please let me know.

Regards,
Liang

On 27 Oct 17:02, Thomas Monjalon wrote:
> Please be more patient.
> I asked some questions on v9 (ethdev API is not generic enough),
> and you send a v10 in the same minute you reply,
> without making sure I agree with your answers.
On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote:
>
> Hi Thomas,
> I think I addressed all of the questions in relation to V9. I don't think
> I can solve the issue of a generic API on my own. From the Community Call
> last week Jerin also said that a generic was investigated but that a
> single solution wasn't feasible.

I think, from the architecture point of view, the specific functionality of UMONITOR may not be abstractable. But from the ethdev callback point of view, can it be abstracted in such a way that packet notification is made available by the driver through checking the interrupt status register or ring descriptor location, etc.? That is, use the callback as a notification mechanism rather than defining the memory-based scheme that UMONITOR expects? Or similar thoughts on abstraction.

> This API is experimental and other vendor support can be added as needed.
> If there are any other open issue let me know?
>
> [rest snipped]
28/10/2020 14:49, Jerin Jacob:
> On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote:
> > [snipped]
>
> I think, From the architecture point of view, the specific
> functionally of UMONITOR may not be abstracted.
> But from the ethdev callback point of view, Can it be abstracted in
> such a way that packet notification available through
> checking interrupt status register or ring descriptor location, etc by
> the driver. Use that callback as a notification mechanism rather
> than defining a memory-based scheme that UMONITOR expects? or similar
> thoughts on abstraction.

I agree with Jerin.
The ethdev API is the blocking problem.
First problem: it is not well explained in doxygen.
Second problem: it is probably not generic enough (if we understand it well).

> > This API is experimental and other vendor support can be added as needed.
> > If there are any other open issue let me know?

Being experimental is not an excuse to throw something which is not satisfying.
> 28/10/2020 14:49, Jerin Jacob:
> > [Jerin's suggestion to abstract the callback as a notification
> > mechanism snipped]

I think there is probably some sort of misunderstanding. This API is not about providing async notification when the next packet arrives. It is about putting the core to sleep until some event (or a timeout) happens. From my perspective the closest analogy is cond_timedwait(). So we need the PMD to tell us the address of the "condition variable" we should sleep on.

> I agree with Jerin.
> The ethdev API is the blocking problem.
> First problem: it is not well explained in doxygen.
> Second problem: it is probably not generic enough (if we understand it well)

It is an address to sleep(/wake up) on, plus an expected value. Honestly, I can't think of anything more generic than that. If you guys have something particular in mind, please share.

> > This API is experimental and other vendor support can be added as needed.
> > If there are any other open issue let me know?
>
> Being experimental is not an excuse to throw something
> which is not satisfying.
On Wed, Oct 28, 2020 at 8:27 PM Ananyev, Konstantin <konstantin.ananyev@intel.com> wrote:
>
> I think there is probably some sort of misunderstanding.
> This API is not about providing acync notification when next packet arrives.
> This is about to putting core to sleep till some event (or timeout) happens.
> From my perspective the closest analogy: cond_timedwait().
> So we need PMD to tell us what will be the address of the condition variable
> we should sleep on.
>
> It is an address to sleep(/wakeup) on, plus expected value.
> Honestly, I can't think-up of anything even more generic then that.
> If you guys have something particular in mind - please share.

Current PMD callback:

typedef int (*eth_get_wake_addr_t)(void *rxq,
		volatile void **tail_desc_addr,
		uint64_t *expected, uint64_t *mask, uint8_t *data_sz);

Can we make it:

typedef void (*core_sleep_t)(void *rxq)

If we do such an abstraction and move the "polling on memory by HW/CPU" into the driver using a helper function, then I can think of abstracting it in some way in all PMDs.

Note: core_sleep_t can take some more arguments, such as an enumerated policy, if something more needs to be pushed to the driver.

Thoughts?

> [earlier quotes snipped]
On 28 Oct 20:44, Jerin Jacob wrote:
> Current PMD callback:
> typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void
> **tail_desc_addr, uint64_t *expected, uint64_t *mask, uint8_t
> *data_sz);
>
> Can we make it as
> typedef void (*core_sleep_t)(void *rxq)

How about:

void (*eth_core_sleep_helper_t)(void *rxq, enum scheme, void *parameter)

This way, the PMD can cast the parameter according to the scheme, e.g. if the scheme is MEM_MONITOR, cast it the umwait way. However, this introduces another problem: we would need to add a PMD query callback to figure out whether the PMD supports a given scheme.

> if we do such abstraction and "move the polling on memory by HW/CPU"
> to the driver using a helper function then
> I can think of abstracting in some way in all PMDs.
>
> Note: core_sleep_t can take some more arguments such as enumerated
> policy if something more needs to be pushed to the driver.
>
> Thoughts?
>
> [earlier quotes snipped]
> Current PMD callback:
> typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void
> **tail_desc_addr, uint64_t *expected, uint64_t *mask, uint8_t
> *data_sz);
>
> Can we make it as
> typedef void (*core_sleep_t)(void *rxq)
>
> if we do such abstraction and "move the polling on memory by HW/CPU"
> to the driver using a helper function then
> I can think of abstracting in some way in all PMDs.

Ok, I see, thanks for the explanation. From my perspective, the main disadvantage of such an approach is that it can't be extended easily. If/when we have the ability for a core to sleep/wake up on multiple events (multiple addresses), we will have to rework that API again.

> [earlier quotes snipped]
On Wed, Oct 28, 2020 at 9:00 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > On 28 Oct 20:44, Jerin Jacob wrote: > > On Wed, Oct 28, 2020 at 8:27 PM Ananyev, Konstantin > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > Hi Thomas, > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > functionally of UMONITOR may not be abstracted. > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > such a way that packet notification available through > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > the driver. Use that callback as a notification mechanism rather > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > thoughts on abstraction. > > > > > > I think there is probably some sort of misunderstanding. > > > This API is not about providing acync notification when next packet arrives. > > > This is about to putting core to sleep till some event (or timeout) happens. > > > From my perspective the closest analogy: cond_timedwait(). > > > So we need PMD to tell us what will be the address of the condition variable > > > we should sleep on. > > > > > > > I agree with Jerin. > > > > The ethdev API is the blocking problem. > > > > First problem: it is not well explained in doxygen. > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > Honestly, I can't think-up of anything even more generic then that. 
> > > If you guys have something particular in mind - please share. > > > > Current PMD callback: > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > *data_sz); > > > > Can we make it as > > typedef void (*core_sleep_t)(void *rxq) > How about void (*eth_core_sleep_helper_t)(void *rxq, enum scheme, void *paramter) > by this way, PMD can cast the parameter accorind to the scheme. > e.g. if scheme MEM_MONITOR then cast to umwait way. > however, this will introduce another problem. > we need add PMD query callback to figure out if PMD support this scheme. I thought the scheme/policy would be something the "application cares" about, like below, not arch specifics: 1) wake up on first packet, 2) wake me up on first packet or after a 100 us timeout, etc. Yes. We can have a query on the type of policies supported. > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > to the driver using a helper function then > > I can think of abstracting in some way in all PMDs. > > > > Note: core_sleep_t can take some more arguments such as enumerated > > policy if something more needs to be pushed to the driver. > > > > Thoughts? > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > Being experimental is not an excuse to throw something > > > > which is not satisfying. > > > > > > > > > > >
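[Archive editor's note: to make the enumerated-scheme idea above concrete, here is a minimal, self-contained C sketch. Every name in it (`eth_sleep_scheme`, `eth_core_sleep_helper_t`, `eth_scheme_supported_t`, the toy PMD functions) is hypothetical and exists only for illustration; none of this is actual DPDK code.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of the scheme-enumerated sleep helper discussed
 * in the thread; none of these names exist in DPDK. */
enum eth_sleep_scheme {
	ETH_SLEEP_MEM_MONITOR,	/* UMONITOR/UMWAIT-style address watch */
	ETH_SLEEP_INT_WAKEUP,	/* wake-up via interrupt status register */
	ETH_SLEEP_SCHEME_MAX
};

/* Parameter block the PMD casts to when scheme == ETH_SLEEP_MEM_MONITOR. */
struct eth_mem_monitor_param {
	volatile void *addr;	/* descriptor address to monitor */
	uint64_t expected;	/* value indicating "packet arrived" */
	uint64_t mask;		/* bits of *addr to compare */
	uint8_t data_sz;	/* width of the monitored location */
};

/* Query callback: does this PMD/queue support the given scheme? */
typedef int (*eth_scheme_supported_t)(void *rxq, enum eth_sleep_scheme s);
/* Sleep helper: the PMD casts 'param' according to 'scheme'. */
typedef void (*eth_core_sleep_helper_t)(void *rxq,
					enum eth_sleep_scheme scheme,
					void *param);

/* Toy PMD that only supports memory monitoring. */
static int
toy_scheme_supported(void *rxq, enum eth_sleep_scheme s)
{
	(void)rxq;
	return s == ETH_SLEEP_MEM_MONITOR;
}

static uint64_t last_monitored_expected; /* observable side effect for demo */

static void
toy_sleep_helper(void *rxq, enum eth_sleep_scheme scheme, void *param)
{
	(void)rxq;
	if (scheme == ETH_SLEEP_MEM_MONITOR) {
		struct eth_mem_monitor_param *p = param; /* cast per scheme */
		/* A real PMD would issue UMONITOR(p->addr)/UMWAIT here. */
		last_monitored_expected = p->expected;
	}
}
```

This also shows the extra cost Liang mentions: the per-scheme cast only works if the application can first query which schemes the PMD supports.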
On Wed, Oct 28, 2020 at 9:04 PM Ananyev, Konstantin <konstantin.ananyev@intel.com> wrote: > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > Hi Thomas, > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > functionally of UMONITOR may not be abstracted. > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > such a way that packet notification available through > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > the driver. Use that callback as a notification mechanism rather > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > thoughts on abstraction. > > > > > > I think there is probably some sort of misunderstanding. > > > This API is not about providing acync notification when next packet arrives. > > > This is about to putting core to sleep till some event (or timeout) happens. > > > From my perspective the closest analogy: cond_timedwait(). > > > So we need PMD to tell us what will be the address of the condition variable > > > we should sleep on. > > > > > > > I agree with Jerin. > > > > The ethdev API is the blocking problem. > > > > First problem: it is not well explained in doxygen. > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > Honestly, I can't think-up of anything even more generic then that. > > > If you guys have something particular in mind - please share. 
> > > > Current PMD callback: > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > *data_sz); > > > > Can we make it as > > typedef void (*core_sleep_t)(void *rxq) > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > to the driver using a helper function then > > I can think of abstracting in some way in all PMDs. > > Ok I see, thanks for explanation. > From my perspective main disadvantage of such approach - > it can't be extended easily. > If/when will have an ability for core to sleep/wake-up on multiple events > (multiple addresses) will have to either rework that API again. I think, we can enumerate the policies and pass the associated structures as input to the driver. > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > policy if something more needs to be pushed to the driver. > > > > Thoughts? > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > Being experimental is not an excuse to throw something > > > > which is not satisfying. > > > > > > > > > > >
On 28 Oct 21:06, Jerin Jacob wrote: > On Wed, Oct 28, 2020 at 9:00 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > On 28 Oct 20:44, Jerin Jacob wrote: > > > On Wed, Oct 28, 2020 at 8:27 PM Ananyev, Konstantin > > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > Hi Thomas, > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > functionally of UMONITOR may not be abstracted. > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > such a way that packet notification available through > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > thoughts on abstraction. > > > > > > > > I think there is probably some sort of misunderstanding. > > > > This API is not about providing acync notification when next packet arrives. > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > From my perspective the closest analogy: cond_timedwait(). > > > > So we need PMD to tell us what will be the address of the condition variable > > > > we should sleep on. > > > > > > > > > I agree with Jerin. > > > > > The ethdev API is the blocking problem. > > > > > First problem: it is not well explained in doxygen. 
> > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > Honestly, I can't think-up of anything even more generic then that. > > > > If you guys have something particular in mind - please share. > > > > > > Current PMD callback: > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > *data_sz); > > > > > > Can we make it as > > > typedef void (*core_sleep_t)(void *rxq) > > How about void (*eth_core_sleep_helper_t)(void *rxq, enum scheme, void *paramter) > > by this way, PMD can cast the parameter accorind to the scheme. > > e.g. if scheme MEM_MONITOR then cast to umwait way. > > however, this will introduce another problem. > > we need add PMD query callback to figure out if PMD support this scheme. > > I thought scheme/policy something that "application cares" like below > not arch specifics > 1) wakeup up on first packet, > 2) wake me up on first packet or timeout 100 us etc I need to clarify the current design a bit. The proposed API just gets the target address. The API itself (including the PMD callback) will not demand that the processor idle/sleep; we use the post-RX callback to do that. For the ethdev layer, this API is only used to fetch the target address from the PMD, which makes minimal impact on the existing code. > Yes. We can have query on type of the policies supported. > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > to the driver using a helper function then > > > I can think of abstracting in some way in all PMDs. > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > policy if something more needs to be pushed to the driver. > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know?
> > > > > > > > > > Being experimental is not an excuse to throw something > > > > > which is not satisfying. > > > > > > > > > > > > > >
> -----Original Message----- > From: Jerin Jacob <jerinjacobk@gmail.com> > Sent: Wednesday, October 28, 2020 3:40 PM > To: Ananyev, Konstantin <konstantin.ananyev@intel.com> > Cc: Thomas Monjalon <thomas@monjalon.net>; Ma, Liang J <liang.j.ma@intel.com>; dpdk-dev <dev@dpdk.org>; Ruifeng Wang (Arm > Technology China) <ruifeng.wang@arm.com>; Wang, Haiyue <haiyue.wang@intel.com>; Richardson, Bruce > <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>; Neil Horman <nhorman@tuxdriver.com>; McDaniel, Timothy > <timothy.mcdaniel@intel.com>; Eads, Gage <gage.eads@intel.com>; Marcin Wojtas <mw@semihalf.com>; Guy Tzalik > <gtzalik@amazon.com>; Ajit Khaparde <ajit.khaparde@broadcom.com>; Harman Kalra <hkalra@marvell.com>; John Daley > <johndale@cisco.com>; Wei Hu (Xavier <xavier.huwei@huawei.com>; Ziyang Xuan <xuanziyang2@huawei.com>; matan@nvidia.com; Yong > Wang <yongwang@vmware.com>; david.marchand@redhat.com > Subject: Re: [PATCH v10 0/9] Add PMD power mgmt > > On Wed, Oct 28, 2020 at 9:04 PM Ananyev, Konstantin > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > Hi Thomas, > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > functionally of UMONITOR may not be abstracted. > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > such a way that packet notification available through > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > the driver. 
Use that callback as a notification mechanism rather > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > thoughts on abstraction. > > > > > > > > I think there is probably some sort of misunderstanding. > > > > This API is not about providing acync notification when next packet arrives. > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > From my perspective the closest analogy: cond_timedwait(). > > > > So we need PMD to tell us what will be the address of the condition variable > > > > we should sleep on. > > > > > > > > > I agree with Jerin. > > > > > The ethdev API is the blocking problem. > > > > > First problem: it is not well explained in doxygen. > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > Honestly, I can't think-up of anything even more generic then that. > > > > If you guys have something particular in mind - please share. > > > > > > Current PMD callback: > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > *data_sz); > > > > > > Can we make it as > > > typedef void (*core_sleep_t)(void *rxq) > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > to the driver using a helper function then > > > I can think of abstracting in some way in all PMDs. > > > > Ok I see, thanks for explanation. > > From my perspective main disadvantage of such approach - > > it can't be extended easily. > > If/when will have an ability for core to sleep/wake-up on multiple events > > (multiple addresses) will have to either rework that API again. > > I think, we can enumerate the policies and pass the associated > structures as input to the driver. 
What I am trying to say: with that API we will not be able to wait for events from multiple devices (HW queues). I.E. something like that: get_wake_addr(port=X, ..., &addr[0], ...); get_wake_addr(port=Y,..., &addr[1],...); wait_on_multi(addr, 2); wouldn't be possible. > > > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > policy if something more needs to be pushed to the driver. > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > which is not satisfying. > > > > > > > > > > > > > >
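[Archive editor's note: the multi-device wait Konstantin sketches above can be made concrete with the following self-contained C illustration. `get_wake_addr()` and `wait_on_multi()` are hypothetical names from his pseudo-code, not real DPDK APIs; the point is that an address-returning API composes into a multi-wait, while an opaque `core_sleep(rxq)` callback does not.]

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a RX queue: one descriptor word the NIC would write. */
struct toy_queue {
	volatile uint64_t tail;
};

/* Per-queue "get wake address": returns the location to watch.
 * This is the shape of eth_get_wake_addr_t, minus expected/mask. */
static volatile uint64_t *
get_wake_addr(struct toy_queue *q)
{
	return &q->tail;
}

/* Software fallback for waiting on several addresses: spin until any
 * of them becomes non-zero and return the index that fired.  A future
 * multi-address hardware monitor could replace this loop; a single
 * UMONITOR cannot, since it watches only one address range. */
static int
wait_on_multi(volatile uint64_t *const addr[], int n)
{
	for (;;) {
		for (int i = 0; i < n; i++)
			if (*addr[i] != 0)
				return i;
		/* real code would pause/relax the core here */
	}
}
```

Usage mirrors the pseudo-code in the mail: fetch one address per port, then block on the whole set.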
On Wed, Oct 28, 2020 at 9:19 PM Ananyev, Konstantin <konstantin.ananyev@intel.com> wrote: > > > > > -----Original Message----- > > From: Jerin Jacob <jerinjacobk@gmail.com> > > Sent: Wednesday, October 28, 2020 3:40 PM > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Ma, Liang J <liang.j.ma@intel.com>; dpdk-dev <dev@dpdk.org>; Ruifeng Wang (Arm > > Technology China) <ruifeng.wang@arm.com>; Wang, Haiyue <haiyue.wang@intel.com>; Richardson, Bruce > > <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>; Neil Horman <nhorman@tuxdriver.com>; McDaniel, Timothy > > <timothy.mcdaniel@intel.com>; Eads, Gage <gage.eads@intel.com>; Marcin Wojtas <mw@semihalf.com>; Guy Tzalik > > <gtzalik@amazon.com>; Ajit Khaparde <ajit.khaparde@broadcom.com>; Harman Kalra <hkalra@marvell.com>; John Daley > > <johndale@cisco.com>; Wei Hu (Xavier <xavier.huwei@huawei.com>; Ziyang Xuan <xuanziyang2@huawei.com>; matan@nvidia.com; Yong > > Wang <yongwang@vmware.com>; david.marchand@redhat.com > > Subject: Re: [PATCH v10 0/9] Add PMD power mgmt > > > > On Wed, Oct 28, 2020 at 9:04 PM Ananyev, Konstantin > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > functionally of UMONITOR may not be abstracted. 
> > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > such a way that packet notification available through > > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > > thoughts on abstraction. > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > This API is not about providing acync notification when next packet arrives. > > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > So we need PMD to tell us what will be the address of the condition variable > > > > > we should sleep on. > > > > > > > > > > > I agree with Jerin. > > > > > > The ethdev API is the blocking problem. > > > > > > First problem: it is not well explained in doxygen. > > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > If you guys have something particular in mind - please share. > > > > > > > > Current PMD callback: > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > *data_sz); > > > > > > > > Can we make it as > > > > typedef void (*core_sleep_t)(void *rxq) > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > > to the driver using a helper function then > > > > I can think of abstracting in some way in all PMDs. > > > > > > Ok I see, thanks for explanation. > > > From my perspective main disadvantage of such approach - > > > it can't be extended easily. 
> > > If/when will have an ability for core to sleep/wake-up on multiple events > > > (multiple addresses) will have to either rework that API again. > > > > I think, we can enumerate the policies and pass the associated > > structures as input to the driver. > > What I am trying to say: with that API we will not be able to wait > for events from multiple devices (HW queues). > I.E. something like that: > > get_wake_addr(port=X, ..., &addr[0], ...); > get_wake_addr(port=Y,..., &addr[1],...); > wait_on_multi(addr, 2); > > wouldn't be possible. I see. But the current implementation dictates that only one queue is bound to a core. Right? > > > > > > > > > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > > policy if something more needs to be pushed to the driver. > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > which is not satisfying. > > > > > > > > > > > > > > > > >
On Wed, Oct 28, 2020 at 9:14 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > On 28 Oct 21:06, Jerin Jacob wrote: > > On Wed, Oct 28, 2020 at 9:00 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > On 28 Oct 20:44, Jerin Jacob wrote: > > > > On Wed, Oct 28, 2020 at 8:27 PM Ananyev, Konstantin > > > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > > > > > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > functionally of UMONITOR may not be abstracted. > > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > such a way that packet notification available through > > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > > thoughts on abstraction. > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > This API is not about providing acync notification when next packet arrives. > > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > So we need PMD to tell us what will be the address of the condition variable > > > > > we should sleep on. > > > > > > > > > > > I agree with Jerin. > > > > > > The ethdev API is the blocking problem. 
> > > > > > First problem: it is not well explained in doxygen. > > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > If you guys have something particular in mind - please share. > > > > > > > > Current PMD callback: > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > *data_sz); > > > > > > > > Can we make it as > > > > typedef void (*core_sleep_t)(void *rxq) > > > How about void (*eth_core_sleep_helper_t)(void *rxq, enum scheme, void *paramter) > > > by this way, PMD can cast the parameter accorind to the scheme. > > > e.g. if scheme MEM_MONITOR then cast to umwait way. > > > however, this will introduce another problem. > > > we need add PMD query callback to figure out if PMD support this scheme. > > > > I thought scheme/policy something that "application cares" like below > > not arch specifics > > 1) wakeup up on first packet, > > 2) wake me up on first packet or timeout 100 us etc > I need clarify about current design a bit. the purposed API just get target address. > the API itself(include the PMD callback) will not demand the processor to idle/sleep. > we use the post rx callback to do that. for ethdev layer, this API only is used to > fetch target address from PMD, which make minmal impact to existing code. I understand that. But if we move that logic to common code in ethdev as a set of internal PMD helper functions (helper functions for a class of devices and/or arch), then it should be possible. Right? I do understand that it will call for some rework in the code. I will leave the specifics up to the ethdev maintainers. > > > Yes. We can have query on type of the policies supported.
> > > > > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > > to the driver using a helper function then > > > > I can think of abstracting in some way in all PMDs. > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > > policy if something more needs to be pushed to the driver. > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > which is not satisfying. > > > > > > > > > > > > > > > > >
On 28 Oct 21:31, Jerin Jacob wrote: > On Wed, Oct 28, 2020 at 9:14 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > On 28 Oct 21:06, Jerin Jacob wrote: > > > On Wed, Oct 28, 2020 at 9:00 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > On 28 Oct 20:44, Jerin Jacob wrote: > > > > > On Wed, Oct 28, 2020 at 8:27 PM Ananyev, Konstantin > > > > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > > functionally of UMONITOR may not be abstracted. > > > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > > such a way that packet notification available through > > > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > > > thoughts on abstraction. > > > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > > This API is not about providing acync notification when next packet arrives. > > > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > > So we need PMD to tell us what will be the address of the condition variable > > > > > > we should sleep on. > > > > > > > > > > > > > I agree with Jerin. 
> > > > > > > The ethdev API is the blocking problem. > > > > > > > First problem: it is not well explained in doxygen. > > > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > > If you guys have something particular in mind - please share. > > > > > > > > > > Current PMD callback: > > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > > *data_sz); > > > > > > > > > > Can we make it as > > > > > typedef void (*core_sleep_t)(void *rxq) > > > > How about void (*eth_core_sleep_helper_t)(void *rxq, enum scheme, void *paramter) > > > > by this way, PMD can cast the parameter accorind to the scheme. > > > > e.g. if scheme MEM_MONITOR then cast to umwait way. > > > > however, this will introduce another problem. > > > > we need add PMD query callback to figure out if PMD support this scheme. > > > > > > I thought scheme/policy something that "application cares" like below > > > not arch specifics > > > 1) wakeup up on first packet, > > > 2) wake me up on first packet or timeout 100 us etc > > I need clarify about current design a bit. the purposed API just get target address. > > the API itself(include the PMD callback) will not demand the processor to idle/sleep. > > we use the post rx callback to do that. for ethdev layer, this API only is used to > > fetch target address from PMD, which make minmal impact to existing code. > > I understand that. But if we move that logic to common code in ethdev > as a set of internal > PMD helper functions(Helper functions for class of devices and/or > arch) then it should be > possible. Right? Although that's possible, the power management logic would be duplicated in all related PMDs.
The advantage of the current design is that it decouples the power management logic from the PMD, needing only minimal help from the PMD. There are already 3 schemes, which are implemented as post-RX callbacks. Putting all the power management logic into the PMD is way too heavy. The current design is also very easy to extend: any new scheme has no impact on the ethdev layer unless it needs some kind of help from the PMD. We have another 2 schemes, PAUSE and frequency scaling, which do not need any PMD help. > I do understand that, It will call for some rework in the code. I will > leave up to ethdev maintainers > on specifics. > > > > > > > Yes. We can have query on type of the policies supported. > > > > > > > > > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > > > to the driver using a helper function then > > > > > I can think of abstracting in some way in all PMDs. > > > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > > > policy if something more needs to be pushed to the driver. > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > > which is not satisfying. > > > > > > > > > > > > > > > > > > > >
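[Archive editor's note: the post-RX-callback design Liang defends here can be sketched as follows. The callback shape loosely mirrors DPDK's `rte_rx_callback_fn` idea (a hook run after each RX burst), but every name and the threshold logic below are illustrative stand-ins, not the actual patch code.]

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the decoupled design: power management lives in a hook run
 * after every RX burst, so the PMD itself needs no sleep logic at all.
 * Hypothetical names throughout. */
struct pmgmt_state {
	unsigned int empty_polls;	/* consecutive zero-packet bursts */
	unsigned int threshold;		/* sleep after this many empty polls */
	unsigned int sleeps;		/* how often we (would) have slept */
};

/* Post-RX hook: count empty polls; once the threshold is hit, a real
 * implementation would fetch the wake address from the PMD and issue
 * UMWAIT -- or, for the schemes needing no PMD help, execute PAUSE or
 * scale down the core frequency. */
static uint16_t
pmgmt_rx_callback(uint16_t nb_rx, void *user_param)
{
	struct pmgmt_state *s = user_param;

	if (nb_rx == 0) {
		if (++s->empty_polls >= s->threshold) {
			s->sleeps++;		/* placeholder for the sleep */
			s->empty_polls = 0;
		}
	} else {
		s->empty_polls = 0;		/* traffic resumed: stay awake */
	}
	return nb_rx;
}
```

This is why only the MEM_MONITOR scheme needs the address-fetch callback from the PMD: the empty-poll accounting and the PAUSE/frequency-scaling actions are entirely PMD-agnostic.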
> > On Wed, Oct 28, 2020 at 9:19 PM Ananyev, Konstantin > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > -----Original Message----- > > > From: Jerin Jacob <jerinjacobk@gmail.com> > > > Sent: Wednesday, October 28, 2020 3:40 PM > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Ma, Liang J <liang.j.ma@intel.com>; dpdk-dev <dev@dpdk.org>; Ruifeng Wang (Arm > > > Technology China) <ruifeng.wang@arm.com>; Wang, Haiyue <haiyue.wang@intel.com>; Richardson, Bruce > > > <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>; Neil Horman <nhorman@tuxdriver.com>; McDaniel, Timothy > > > <timothy.mcdaniel@intel.com>; Eads, Gage <gage.eads@intel.com>; Marcin Wojtas <mw@semihalf.com>; Guy Tzalik > > > <gtzalik@amazon.com>; Ajit Khaparde <ajit.khaparde@broadcom.com>; Harman Kalra <hkalra@marvell.com>; John Daley > > > <johndale@cisco.com>; Wei Hu (Xavier <xavier.huwei@huawei.com>; Ziyang Xuan <xuanziyang2@huawei.com>; matan@nvidia.com; > Yong > > > Wang <yongwang@vmware.com>; david.marchand@redhat.com > > > Subject: Re: [PATCH v10 0/9] Add PMD power mgmt > > > > > > On Wed, Oct 28, 2020 at 9:04 PM Ananyev, Konstantin > > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > > > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From > the > > > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > > functionally of UMONITOR may not be abstracted. 
> > > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > > such a way that packet notification available through > > > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > > > thoughts on abstraction. > > > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > > This API is not about providing acync notification when next packet arrives. > > > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > > So we need PMD to tell us what will be the address of the condition variable > > > > > > we should sleep on. > > > > > > > > > > > > > I agree with Jerin. > > > > > > > The ethdev API is the blocking problem. > > > > > > > First problem: it is not well explained in doxygen. > > > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > > If you guys have something particular in mind - please share. > > > > > > > > > > Current PMD callback: > > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > > *data_sz); > > > > > > > > > > Can we make it as > > > > > typedef void (*core_sleep_t)(void *rxq) > > > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > > > to the driver using a helper function then > > > > > I can think of abstracting in some way in all PMDs. > > > > > > > > Ok I see, thanks for explanation. 
> > > > From my perspective main disadvantage of such approach - > > > > it can't be extended easily. > > > > If/when will have an ability for core to sleep/wake-up on multiple events > > > > (multiple addresses) will have to either rework that API again. > > > > > > I think, we can enumerate the policies and pass the associated > > > structures as input to the driver. > > > > What I am trying to say: with that API we will not be able to wait > > for events from multiple devices (HW queues). > > I.E. something like that: > > > > get_wake_addr(port=X, ..., &addr[0], ...); > > get_wake_addr(port=Y,..., &addr[1],...); > > wait_on_multi(addr, 2); > > > > wouldn't be possible. > > I see. But the current implementation dictates the only queue bound to > a core. Right? Yes, the current implementation of rte_power_monitor() supports only a single address. Though the proposed API for both ethdev (get_wake_addr) and power (rte_power_pmd_mgmt_queue_enable) doesn't dictate one-to-one mapping as the only possible usage model. > > > > > > > > > > > > > > > > > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > > > policy if something more needs to be pushed to the driver. > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > > which is not satisfying. > > > > > > > > > > > > > > > > > > > >
On 28 Oct 21:27, Jerin Jacob wrote: > On Wed, Oct 28, 2020 at 9:19 PM Ananyev, Konstantin > <konstantin.ananyev@intel.com> wrote: > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > > functionally of UMONITOR may not be abstracted. > > > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > > such a way that packet notification available through > > > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > > > thoughts on abstraction. > > > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > > This API is not about providing acync notification when next packet arrives. > > > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > > So we need PMD to tell us what will be the address of the condition variable > > > > > > we should sleep on. > > > > > > > > > > > > > I agree with Jerin. > > > > > > > The ethdev API is the blocking problem. > > > > > > > First problem: it is not well explained in doxygen. > > > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. 
> > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > > If you guys have something particular in mind - please share. > > > > > > > > > > Current PMD callback: > > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > > *data_sz); > > > > > > > > > > Can we make it as > > > > > typedef void (*core_sleep_t)(void *rxq) > > > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > > > to the driver using a helper function then > > > > > I can think of abstracting in some way in all PMDs. > > > > > > > > Ok I see, thanks for explanation. > > > > From my perspective main disadvantage of such approach - > > > > it can't be extended easily. > > > > If/when will have an ability for core to sleep/wake-up on multiple events > > > > (multiple addresses) will have to either rework that API again. > > > > > > I think, we can enumerate the policies and pass the associated > > > structures as input to the driver. > > > > What I am trying to say: with that API we will not be able to wait > > for events from multiple devices (HW queues). > > I.E. something like that: > > > > get_wake_addr(port=X, ..., &addr[0], ...); > > get_wake_addr(port=Y,..., &addr[1],...); > > wait_on_multi(addr, 2); > > > > wouldn't be possible. > > I see. But the current implementation dictates the only queue bound to > a core. Right? The current implementation only supports a 1:1 queue/core mapping because of the limitation of umwait/umonitor, which cannot work with multiple address ranges. However, other schemes like PAUSE/frequency scaling have no such limitation. The proposed API itself doesn't limit the mapping to 1:1 queue/core. > > > > > > > > > > > > > > > > > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > > > policy if something more needs to be pushed to the driver. > > > > > > > > > > Thoughts?
> > > > > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > > which is not satisfying. > > > > > > > > > > > > > > > > > > > >
> -----Original Message----- > From: Ma, Liang J <liang.j.ma@intel.com> > Sent: Wednesday, October 28, 2020 11:48 AM > To: Jerin Jacob <jerinjacobk@gmail.com> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon > <thomas@monjalon.net>; dpdk-dev <dev@dpdk.org>; Ruifeng Wang (Arm > Technology China) <ruifeng.wang@arm.com>; Wang, Haiyue > <haiyue.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; > Hunt, David <david.hunt@intel.com>; Neil Horman <nhorman@tuxdriver.com>; > McDaniel, Timothy <timothy.mcdaniel@intel.com>; Eads, Gage > <gage.eads@intel.com>; Marcin Wojtas <mw@semihalf.com>; Guy Tzalik > <gtzalik@amazon.com>; Ajit Khaparde <ajit.khaparde@broadcom.com>; > Harman Kalra <hkalra@marvell.com>; John Daley <johndale@cisco.com>; Wei > Hu (Xavier <xavier.huwei@huawei.com>; Ziyang Xuan > <xuanziyang2@huawei.com>; matan@nvidia.com; Yong Wang > <yongwang@vmware.com>; david.marchand@redhat.com > Subject: Re: [PATCH v10 0/9] Add PMD power mgmt > > On 28 Oct 21:27, Jerin Jacob wrote: > > On Wed, Oct 28, 2020 at 9:19 PM Ananyev, Konstantin > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma > <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > > I think I addressed all of the questions in relation to V9. I don't > think I can solve the issue of a generic API on my own. From the > > > > > > > > Community Call last week Jerin also said that a generic was > investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > > > functionally of UMONITOR may not be abstracted. 
> > > > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > > > such a way that packet notification available through > > > > > > > > > checking interrupt status register or ring descriptor location, etc > by > > > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > > > than defining a memory-based scheme that UMONITOR expects? > or similar > > > > > > > > > thoughts on abstraction. > > > > > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > > > This API is not about providing acync notification when next packet > arrives. > > > > > > > This is about to putting core to sleep till some event (or timeout) > happens. > > > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > > > So we need PMD to tell us what will be the address of the condition > variable > > > > > > > we should sleep on. > > > > > > > > > > > > > > > I agree with Jerin. > > > > > > > > The ethdev API is the blocking problem. > > > > > > > > First problem: it is not well explained in doxygen. > > > > > > > > Second problem: it is probably not generic enough (if we understand > it well) > > > > > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > > > If you guys have something particular in mind - please share. > > > > > > > > > > > > Current PMD callback: > > > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > > > *data_sz); > > > > > > > > > > > > Can we make it as > > > > > > typedef void (*core_sleep_t)(void *rxq) > > > > > > > > > > > > if we do such abstraction and "move the polling on memory by > HW/CPU" > > > > > > to the driver using a helper function then > > > > > > I can think of abstracting in some way in all PMDs. 
> > > > > > > > > > Ok I see, thanks for explanation. > > > > > From my perspective main disadvantage of such approach - > > > > > it can't be extended easily. > > > > > If/when will have an ability for core to sleep/wake-up on multiple events > > > > > (multiple addresses) will have to either rework that API again. > > > > > > > > I think, we can enumerate the policies and pass the associated > > > > structures as input to the driver. > > > > > > What I am trying to say: with that API we will not be able to wait > > > for events from multiple devices (HW queues). > > > I.E. something like that: > > > > > > get_wake_addr(port=X, ..., &addr[0], ...); > > > get_wake_addr(port=Y,..., &addr[1],...); > > > wait_on_multi(addr, 2); > > > > > > wouldn't be possible. > > > > I see. But the current implementation dictates the only queue bound to > > a core. Right? > Current implementation only support 1:1 queue/core mapping is because of > the limitation of umwait/umonitor which can not work with multiple address > range. However, for other scheme like PASUE/Freq Scale have no such > limitation. > The proposed API itself doesn't limit the 1:1 queue/core mapping. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Note: core_sleep_t can take some more arguments such as > enumerated > > > > > > policy if something more needs to be pushed to the driver. > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added > as needed. If there are any other open issue let me know? > > > > > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > > > which is not satisfying. 
> > > > > > > > > > > > > > > > > > > > > > > It would be nice if the low-level definition of the UMWAIT and UMONITOR instructions were split out into their own inline function or macro, so that any PMD could use the intrinsic without being tied to ethdev or any of the other logic associated with this patch set. This would be similar to rte_wmb and so on.
On Wed, Oct 28, 2020 at 9:47 AM Liang, Ma <liang.j.ma@intel.com> wrote: > > On 28 Oct 21:27, Jerin Jacob wrote: > > On Wed, Oct 28, 2020 at 9:19 PM Ananyev, Konstantin > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. From the > > > > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > > > functionally of UMONITOR may not be abstracted. > > > > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > > > such a way that packet notification available through > > > > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > > > > thoughts on abstraction. > > > > > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > > > This API is not about providing acync notification when next packet arrives. > > > > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > > > So we need PMD to tell us what will be the address of the condition variable > > > > > > > we should sleep on. > > > > > > > > > > > > > > > I agree with Jerin. > > > > > > > > The ethdev API is the blocking problem. > > > > > > > > First problem: it is not well explained in doxygen. 
> > > > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > > > If you guys have something particular in mind - please share. > > > > > > > > > > > > Current PMD callback: > > > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > > > *data_sz); > > > > > > > > > > > > Can we make it as > > > > > > typedef void (*core_sleep_t)(void *rxq) > > > > > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > > > > to the driver using a helper function then > > > > > > I can think of abstracting in some way in all PMDs. > > > > > > > > > > Ok I see, thanks for explanation. > > > > > From my perspective main disadvantage of such approach - > > > > > it can't be extended easily. > > > > > If/when will have an ability for core to sleep/wake-up on multiple events > > > > > (multiple addresses) will have to either rework that API again. > > > > > > > > I think, we can enumerate the policies and pass the associated > > > > structures as input to the driver. > > > > > > What I am trying to say: with that API we will not be able to wait > > > for events from multiple devices (HW queues). > > > I.E. something like that: > > > > > > get_wake_addr(port=X, ..., &addr[0], ...); > > > get_wake_addr(port=Y,..., &addr[1],...); > > > wait_on_multi(addr, 2); > > > > > > wouldn't be possible. > > > > I see. But the current implementation dictates the only queue bound to > > a core. Right? > Current implementation only support 1:1 queue/core mapping is because of > the limitation of umwait/umonitor which can not work with multiple address > range. However, for other scheme like PASUE/Freq Scale have no such limitation. 
> The proposed API itself doesn't limit the 1:1 queue/core mapping. The PMD would not know whether it is 1:1 queue/core or any other shared scheme, so the intelligence and decision-making are best left to the application. I think the PMD and the underlying hardware do not need to know what kind of power management scheme is implemented. IMHO the original API, which provides the address, value and mask, should suffice. Any other callback or handshake between the PMD and the application may be overkill. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > > > > policy if something more needs to be pushed to the driver. > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > > > which is not satisfying. > > > > > > > > > > > > > > > > > > > > > > >
> -----Original Message----- > From: Ajit Khaparde <ajit.khaparde@broadcom.com> > Sent: Wednesday, October 28, 2020 5:02 PM > To: Ma, Liang J <liang.j.ma@intel.com> > Cc: Jerin Jacob <jerinjacobk@gmail.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon > <thomas@monjalon.net>; dpdk-dev <dev@dpdk.org>; Ruifeng Wang (Arm Technology China) <ruifeng.wang@arm.com>; Wang, Haiyue > <haiyue.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>; Neil Horman > <nhorman@tuxdriver.com>; McDaniel, Timothy <timothy.mcdaniel@intel.com>; Eads, Gage <gage.eads@intel.com>; Marcin Wojtas > <mw@semihalf.com>; Guy Tzalik <gtzalik@amazon.com>; Harman Kalra <hkalra@marvell.com>; John Daley <johndale@cisco.com>; Wei > Hu (Xavier <xavier.huwei@huawei.com>; Ziyang Xuan <xuanziyang2@huawei.com>; matan@nvidia.com; Yong Wang > <yongwang@vmware.com>; david.marchand@redhat.com > Subject: Re: [PATCH v10 0/9] Add PMD power mgmt > > On Wed, Oct 28, 2020 at 9:47 AM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > On 28 Oct 21:27, Jerin Jacob wrote: > > > On Wed, Oct 28, 2020 at 9:19 PM Ananyev, Konstantin > > > <konstantin.ananyev@intel.com> wrote: > > > > > > > > > 28/10/2020 14:49, Jerin Jacob: > > > > > > > > > > On Wed, Oct 28, 2020 at 7:05 PM Liang, Ma <liang.j.ma@intel.com> wrote: > > > > > > > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > > > I think I addressed all of the questions in relation to V9. I don't think I can solve the issue of a generic API on my own. > From the > > > > > > > > > Community Call last week Jerin also said that a generic was investigated but that a single solution wasn't feasible. > > > > > > > > > > > > > > > > > > > > I think, From the architecture point of view, the specific > > > > > > > > > > functionally of UMONITOR may not be abstracted. 
> > > > > > > > > > But from the ethdev callback point of view, Can it be abstracted in > > > > > > > > > > such a way that packet notification available through > > > > > > > > > > checking interrupt status register or ring descriptor location, etc by > > > > > > > > > > the driver. Use that callback as a notification mechanism rather > > > > > > > > > > than defining a memory-based scheme that UMONITOR expects? or similar > > > > > > > > > > thoughts on abstraction. > > > > > > > > > > > > > > > > I think there is probably some sort of misunderstanding. > > > > > > > > This API is not about providing acync notification when next packet arrives. > > > > > > > > This is about to putting core to sleep till some event (or timeout) happens. > > > > > > > > From my perspective the closest analogy: cond_timedwait(). > > > > > > > > So we need PMD to tell us what will be the address of the condition variable > > > > > > > > we should sleep on. > > > > > > > > > > > > > > > > > I agree with Jerin. > > > > > > > > > The ethdev API is the blocking problem. > > > > > > > > > First problem: it is not well explained in doxygen. > > > > > > > > > Second problem: it is probably not generic enough (if we understand it well) > > > > > > > > > > > > > > > > It is an address to sleep(/wakeup) on, plus expected value. > > > > > > > > Honestly, I can't think-up of anything even more generic then that. > > > > > > > > If you guys have something particular in mind - please share. 
> > > > > > > > > > > > > > Current PMD callback: > > > > > > > typedef int (*eth_get_wake_addr_t)(void *rxq, volatile void > > > > > > > **tail_desc_addr, + uint64_t *expected, uint64_t *mask, uint8_t > > > > > > > *data_sz); > > > > > > > > > > > > > > Can we make it as > > > > > > > typedef void (*core_sleep_t)(void *rxq) > > > > > > > > > > > > > > if we do such abstraction and "move the polling on memory by HW/CPU" > > > > > > > to the driver using a helper function then > > > > > > > I can think of abstracting in some way in all PMDs. > > > > > > > > > > > > Ok I see, thanks for explanation. > > > > > > From my perspective main disadvantage of such approach - > > > > > > it can't be extended easily. > > > > > > If/when will have an ability for core to sleep/wake-up on multiple events > > > > > > (multiple addresses) will have to either rework that API again. > > > > > > > > > > I think, we can enumerate the policies and pass the associated > > > > > structures as input to the driver. > > > > > > > > What I am trying to say: with that API we will not be able to wait > > > > for events from multiple devices (HW queues). > > > > I.E. something like that: > > > > > > > > get_wake_addr(port=X, ..., &addr[0], ...); > > > > get_wake_addr(port=Y,..., &addr[1],...); > > > > wait_on_multi(addr, 2); > > > > > > > > wouldn't be possible. > > > > > > I see. But the current implementation dictates the only queue bound to > > > a core. Right? > > Current implementation only support 1:1 queue/core mapping is because of > > the limitation of umwait/umonitor which can not work with multiple address > > range. However, for other scheme like PASUE/Freq Scale have no such limitation. > > The proposed API itself doesn't limit the 1:1 queue/core mapping. > > The PMD would not know if it is 1:1 queue/core or any other shared scheme. > So the intelligence and decision making is best left to the application. 
> I think PMD and the underlying hardware does not need to know what kind of > power management scheme is implemented. Yep, good point. 100% agree. > IMHO the original API which provides the address, value and mask should suffice. > Any other callback or handshake between PMD and application may be an overkill. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Note: core_sleep_t can take some more arguments such as enumerated > > > > > > > policy if something more needs to be pushed to the driver. > > > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This API is experimental and other vendor support can be added as needed. If there are any other open issue let me know? > > > > > > > > > > > > > > > > > > Being experimental is not an excuse to throw something > > > > > > > > > which is not satisfying. > > > > > > > > > > > > > > > > > > > > > > > > > >
On 28 Oct 09:54, McDaniel, Timothy wrote: <snip> > > > > > > > > > > > > > > > > > > It would be nice if the low level definition of the UMWAIT and UMONOTOR instructions were split out > into their own inline function or macro so that any PMD could use the intrinsic without being tied to ethdev or > any of the other logic associated with this patch set. This would be similar to rte_wmb, and so on The current patches already split the intrinsics out; they are part of the EAL library, so any other PMD can reuse them easily. > >
27/10/2020 15:59, Liang Ma: > Liang Ma (9): > eal: add new x86 cpuid support for WAITPKG > eal: add power management intrinsics > eal: add intrinsics support check infrastructure EAL patches applied, thanks. > ethdev: add simple power management API Waiting for doxygen reword which may help to get acks for ethdev API. > power: add PMD power management API and callback > net/ixgbe: implement power management API > net/i40e: implement power management API > net/ice: implement power management API > examples/l3fwd-power: enable PMD power mgmt
On 29 Oct 18:42, Thomas Monjalon wrote: > 27/10/2020 15:59, Liang Ma: > > Liang Ma (9): > > eal: add new x86 cpuid support for WAITPKG > > eal: add power management intrinsics > > eal: add intrinsics support check infrastructure > > EAL patches applied, thanks. Thanks! > > > ethdev: add simple power management API > > Waiting for doxygen reword which may help to get acks for ethdev API. We are working on this; I will send out v11 today. > > > power: add PMD power management API and callback > > net/ixgbe: implement power management API > > net/i40e: implement power management API > > net/ice: implement power management API > > examples/l3fwd-power: enable PMD power mgmt > > >
30/10/2020 10:36, Liang, Ma: > On 29 Oct 18:42, Thomas Monjalon wrote: > > 27/10/2020 15:59, Liang Ma: > > > Liang Ma (9): > > > eal: add new x86 cpuid support for WAITPKG > > > eal: add power management intrinsics > > > eal: add intrinsics support check infrastructure > > > > EAL patches applied, thanks. > thanks! > > > > > ethdev: add simple power management API > > > > Waiting for doxygen reword which may help to get acks for ethdev API. > we are working on this. I will send out v11 today. David sent some interesting questions about the EAL API. Maybe some explanations are missing, and we could improve the API. Please reply, I would like to solve the EAL questions quickly.