[v20,0/4] Add PMD power management

Message ID cover.1611335511.git.anatoly.burakov@intel.com (mailing list archive)

Message

Anatoly Burakov Jan. 22, 2021, 5:12 p.m. UTC
This patchset proposes a simple API for Ethernet drivers to cause the
CPU to enter a power-optimized state while waiting for packets to
arrive. There are multiple proposed mechanisms to achieve said power
savings: simple frequency scaling, idle loop, and monitoring the Rx
queue for incoming packets. The last of these is achieved through
cooperation with the NIC driver, which tells us the address of the
wake-up event so that we can wait for writes to that address.

To achieve power savings, a very simple mechanism is used: we count
empty polls, and once a certain threshold is reached, we automatically
employ one of the suggested power management schemes from within an Rx
callback inside the PMD. Once there's traffic again, the empty poll
counter is reset.
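
The empty-poll logic described above can be sketched roughly as follows.
This is a self-contained illustration, not the code from this patchset:
the names queue_pm_state, rx_callback_sketch and EMPTY_POLL_THRESHOLD,
as well as the threshold value, are assumptions made for the example,
and the power-saving action is reduced to a counter.

```c
#include <stdint.h>

/* Hypothetical threshold; the value used by the patchset may differ. */
#define EMPTY_POLL_THRESHOLD 512

/* Hypothetical per-queue state, for illustration only. */
struct queue_pm_state {
	uint64_t empty_polls;   /* consecutive polls that returned no packets */
	uint64_t sleep_entries; /* times a power-saving action was triggered */
};

/*
 * Sketch of an Rx callback: invoked after each Rx burst with the number
 * of packets received. Returns nb_rx unchanged.
 */
static uint16_t
rx_callback_sketch(struct queue_pm_state *st, uint16_t nb_rx)
{
	if (nb_rx == 0) {
		st->empty_polls++;
		if (st->empty_polls > EMPTY_POLL_THRESHOLD) {
			/* A real callback would monitor the queue, pause,
			 * or scale the frequency down at this point. */
			st->sleep_entries++;
		}
	} else {
		/* Traffic is back: reset the empty poll counter. */
		st->empty_polls = 0;
	}
	return nb_rx;
}
```

In this sketch the power-saving action keeps being taken on every
further empty poll past the threshold, and a single non-empty poll
returns the queue to normal polling.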

Why are we putting it into ethdev as opposed to leaving this up to the
application? Our customers specifically requested a way to do it with
minimal changes to the application code. The current approach allows the
application to just flip a switch and automatically get power savings.

Things of note:

- Only 1:1 core to queue mapping is supported, meaning that each lcore
  must handle Rx on at most one queue
- Three policy types are supported: monitor, pause, and frequency scaling
- Power management is enabled per-queue
- The API doesn't extend to other device types

v20:
- Moved callback removal before port close

v19:
- Renamed "data_sz" to "size" and clarified struct comments
- Clarified documentation around rte_power_monitor/pause API

v18:
- Rebase on top of latest main
- Address review comments by Thomas

v17:
- Added exception for ethdev driver-only ABI
- Added memory barriers for monitor/wakeup (Konstantin)
- Fixed compile issues on non-x86 platforms (hopefully!)

v16:
- Implemented Konstantin's suggestions and comments
- Added return values to the API

v15:
- Fixed incorrect check in UMWAIT callback
- Fixed accidental whitespace changes

v14:
- Fixed ARM/PPC builds
- Addressed various review comments

v13:
- Reworked the librte_power code to require less locking and handle invalid
  parameters better
- Fix numerous rebase errors present in v12

v12:
- Rebase on top of 21.02
- Rework of power intrinsics code

Anatoly Burakov (2):
  eal: rename power monitor condition member
  eal: improve comments around power monitoring API

Liang Ma (2):
  power: add PMD power management API and callback
  examples/l3fwd-power: enable PMD power mgmt

 doc/guides/prog_guide/power_man.rst           |  41 ++
 doc/guides/rel_notes/release_21_02.rst        |  10 +
 .../sample_app_ug/l3_forward_power_man.rst    |  35 ++
 drivers/event/dlb/dlb.c                       |   2 +-
 drivers/event/dlb2/dlb2.c                     |   2 +-
 drivers/net/i40e/i40e_rxtx.c                  |   2 +-
 drivers/net/ice/ice_rxtx.c                    |   2 +-
 drivers/net/ixgbe/ixgbe_rxtx.c                |   2 +-
 examples/l3fwd-power/main.c                   |  90 ++++-
 .../include/generic/rte_power_intrinsics.h    |  39 +-
 lib/librte_eal/x86/rte_power_intrinsics.c     |   4 +-
 lib/librte_power/meson.build                  |   5 +-
 lib/librte_power/rte_power_pmd_mgmt.c         | 365 ++++++++++++++++++
 lib/librte_power/rte_power_pmd_mgmt.h         |  91 +++++
 lib/librte_power/version.map                  |   5 +
 15 files changed, 669 insertions(+), 26 deletions(-)
 create mode 100644 lib/librte_power/rte_power_pmd_mgmt.c
 create mode 100644 lib/librte_power/rte_power_pmd_mgmt.h

Comments

Thomas Monjalon Jan. 29, 2021, 2:20 p.m. UTC | #1
22/01/2021 18:12, Anatoly Burakov:
> Things of note:
> 
> - Only 1:1 core to queue mapping is supported, meaning that each lcore 
>   must at most handle RX on a single queue

Is there a way to have a more generic solution?
I think it may deserve some comments in the API.

> - Support 3 type policies. Monitor/Pause/Frequency Scaling
> - Power management is enabled per-queue
> - The API doesn't extend to other device types

Could it be extended to more device types?
Otherwise it should be called specifically ethdev power management.

> Anatoly Burakov (2):
>   eal: rename power monitor condition member
>   eal: improve comments around power monitoring API
> 
> Liang Ma (2):
>   power: add PMD power management API and callback
>   examples/l3fwd-power: enable PMD power mgmt

Applied with few formatting improvements, thanks

Anatoly Burakov Jan. 29, 2021, 2:47 p.m. UTC | #2
On 29-Jan-21 2:20 PM, Thomas Monjalon wrote:
> 22/01/2021 18:12, Anatoly Burakov:
>> Things of note:
>>
>> - Only 1:1 core to queue mapping is supported, meaning that each lcore
>>    must at most handle RX on a single queue
> 
> Is there a way to have a more generic solution?
> I think it may deserve some comments in the API.

If you're referring to the possibility of monitoring multiple queues
from one core, we are investigating ways to make that happen, but for
now, this is the limitation.

> 
>> - Support 3 type policies. Monitor/Pause/Frequency Scaling
>> - Power management is enabled per-queue
>> - The API doesn't extend to other device types
> 
> Could it be extended to more device types?
> Otherwise it should be called specifically ethdev power management.

It can theoretically be extended to any device type that has callbacks. 
Current focus is obviously NICs, but in general, it doesn't have to be. 
Anything that polls and has callbacks should work.

> 
>> Anatoly Burakov (2):
>>    eal: rename power monitor condition member
>>    eal: improve comments around power monitoring API
>>
>> Liang Ma (2):
>>    power: add PMD power management API and callback
>>    examples/l3fwd-power: enable PMD power mgmt
> 
> Applied with few formatting improvements, thanks
> 
> 

Thanks!