[v10,3/9] eal: add intrinsics support check infrastructure

Message ID 1603810749-22285-4-git-send-email-liang.j.ma@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Headers
Series Add PMD power mgmt |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Liang, Ma Oct. 27, 2020, 2:59 p.m. UTC
  Currently, it is not possible to check support for intrinsics that
are platform-specific, cannot be abstracted in a generic way, or do not
have support on all architectures. The CPUID flags can be used to some
extent, but they are only defined for their platform, while intrinsics
will be available to all code as they are in generic headers.

This patch introduces infrastructure to check support for certain
platform-specific intrinsics, and adds support for checking support for
IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
--

Notes:
   v8:
    - Rename eal version.map

   v6:
    - Fix the comments
---
 lib/librte_eal/arm/rte_cpuflags.c             |  6 +++++
 lib/librte_eal/include/generic/rte_cpuflags.h | 26 +++++++++++++++++++
 .../include/generic/rte_power_intrinsics.h    | 12 +++++++++
 lib/librte_eal/ppc/rte_cpuflags.c             |  7 +++++
 lib/librte_eal/version.map                    |  1 +
 lib/librte_eal/x86/rte_cpuflags.c             | 12 +++++++++
 6 files changed, 64 insertions(+)
  

Comments

David Marchand Oct. 29, 2020, 9:27 p.m. UTC | #1
On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
>
> Currently, it is not possible to check support for intrinsics that
> are platform-specific, cannot be abstracted in a generic way, or do not
> have support on all architectures. The CPUID flags can be used to some
> extent, but they are only defined for their platform, while intrinsics
> will be available to all code as they are in generic headers.
>
> This patch introduces infrastructure to check support for certain
> platform-specific intrinsics, and adds support for checking support for
> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>

Coming late to the party, it seems crowded...



> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
> index 872f0ebe3e..28a5aecde8 100644
> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
> @@ -13,6 +13,32 @@
>  #include "rte_common.h"
>  #include <errno.h>
>
> +#include <rte_compat.h>
> +
> +/**
> + * Structure used to describe platform-specific intrinsics that may or may not
> + * be supported at runtime.
> + */
> +struct rte_cpu_intrinsics {
> +       uint32_t power_monitor : 1;
> +       /**< indicates support for rte_power_monitor function */
> +       uint32_t power_pause : 1;
> +       /**< indicates support for rte_power_pause function */
> +};

- The rte_power library is supposed to be built on top of cpuflags.
Not the other way around.
Those capabilities should have been kept inside the rte_power_ API and
not pollute the cpuflags API.

- All of this should have come as a single patch as the previously
introduced API is unusable before.


> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Check CPU support for various intrinsics at runtime.
> + *
> + * @param intrinsics
> + *     Pointer to a structure to be filled.
> + */
> +__rte_experimental
> +void
> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
> +
>  /**
>   * Enumeration of all CPU features supported
>   */
> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> index fb897d9060..03a326f076 100644
> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> @@ -32,6 +32,10 @@
>   * checked against the expected value, and if they match, the entering of
>   * optimized power state may be aborted.
>   *
> + * @warning It is responsibility of the user to check if this function is
> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
> + *   so may result in an illegal CPU instruction error.
> + *

- Reading this API description... what am I supposed to do in my
application or driver who wants to use the new
rte_power_monitor/rte_power_pause stuff?

I should call rte_cpu_get_features(TOTO) ?
This comment does not give a hint.

I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
This must be fixed.


- Again, I wonder why we are exposing all this stuff.
This should be hidden in the rte_power API.
  
Anatoly Burakov Oct. 30, 2020, 10:09 a.m. UTC | #2
On 29-Oct-20 9:27 PM, David Marchand wrote:
> On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
>>
>> Currently, it is not possible to check support for intrinsics that
>> are platform-specific, cannot be abstracted in a generic way, or do not
>> have support on all architectures. The CPUID flags can be used to some
>> extent, but they are only defined for their platform, while intrinsics
>> will be available to all code as they are in generic headers.
>>
>> This patch introduces infrastructure to check support for certain
>> platform-specific intrinsics, and adds support for checking support for
>> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
>> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> 
> Coming late to the party, it seems crowded...
> 
> 
> 
>> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
>> index 872f0ebe3e..28a5aecde8 100644
>> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
>> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
>> @@ -13,6 +13,32 @@
>>   #include "rte_common.h"
>>   #include <errno.h>
>>
>> +#include <rte_compat.h>
>> +
>> +/**
>> + * Structure used to describe platform-specific intrinsics that may or may not
>> + * be supported at runtime.
>> + */
>> +struct rte_cpu_intrinsics {
>> +       uint32_t power_monitor : 1;
>> +       /**< indicates support for rte_power_monitor function */
>> +       uint32_t power_pause : 1;
>> +       /**< indicates support for rte_power_pause function */
>> +};
> 
> - The rte_power library is supposed to be built on top of cpuflags.
> Not the other way around.
> Those capabilities should have been kept inside the rte_power_ API and
> not pollute the cpuflags API.
> 
> - All of this should have come as a single patch as the previously
> introduced API is unusable before.
> 
> 
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice
>> + *
>> + * Check CPU support for various intrinsics at runtime.
>> + *
>> + * @param intrinsics
>> + *     Pointer to a structure to be filled.
>> + */
>> +__rte_experimental
>> +void
>> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
>> +
>>   /**
>>    * Enumeration of all CPU features supported
>>    */
>> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>> index fb897d9060..03a326f076 100644
>> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
>> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>> @@ -32,6 +32,10 @@
>>    * checked against the expected value, and if they match, the entering of
>>    * optimized power state may be aborted.
>>    *
>> + * @warning It is responsibility of the user to check if this function is
>> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
>> + *   so may result in an illegal CPU instruction error.
>> + *
> 
> - Reading this API description... what am I supposed to do in my
> application or driver who wants to use the new
> rte_power_monitor/rte_power_pause stuff?
> 
> I should call rte_cpu_get_features(TOTO) ?
> This comment does not give a hint.
> 
> I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
> This must be fixed.
> 
> 
> - Again, I wonder why we are exposing all this stuff.
> This should be hidden in the rte_power API.
> 
> 

We're exposing all of this here because the intrinsics are *not* part of 
the power API but rather are generic headers within EAL. Therefore, any 
infrastructure checking for their support can *not* reside in the power 
library, but instead has to be in EAL.

The intended usage here is to call this function before calling 
rte_power_monitor(), such that:

	struct rte_cpu_intrinsics intrinsics;

	rte_cpu_get_intrinsics_support(&intrinsics);

	if (!intrinsics.power_monitor) {
		// rte_power_monitor not supported and cannot be used
		return;
	}
	// proceed with rte_power_monitor usage

Failing to do that will result in either -ENOTSUP on non-x86, or illegal 
instruction crash on x86 that doesn't have that instruction (because we 
encode raw opcode).

I've *not* added this to the previous patches because i wanted to get 
this part reviewed specifically, and not mix it with other IA-specific 
stuff. It seems that i've succeeded in that goal, as this patch has 4 
likes^W acks :)
  
Thomas Monjalon Oct. 30, 2020, 10:14 a.m. UTC | #3
30/10/2020 11:09, Burakov, Anatoly:
> On 29-Oct-20 9:27 PM, David Marchand wrote:
> > On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
> >>
> >> Currently, it is not possible to check support for intrinsics that
> >> are platform-specific, cannot be abstracted in a generic way, or do not
> >> have support on all architectures. The CPUID flags can be used to some
> >> extent, but they are only defined for their platform, while intrinsics
> >> will be available to all code as they are in generic headers.
> >>
> >> This patch introduces infrastructure to check support for certain
> >> platform-specific intrinsics, and adds support for checking support for
> >> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
> >>
> >> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
> >> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
> >> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> > 
> > Coming late to the party, it seems crowded...
> > 
> > 
> > 
> >> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
> >> index 872f0ebe3e..28a5aecde8 100644
> >> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
> >> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
> >> @@ -13,6 +13,32 @@
> >>   #include "rte_common.h"
> >>   #include <errno.h>
> >>
> >> +#include <rte_compat.h>
> >> +
> >> +/**
> >> + * Structure used to describe platform-specific intrinsics that may or may not
> >> + * be supported at runtime.
> >> + */
> >> +struct rte_cpu_intrinsics {
> >> +       uint32_t power_monitor : 1;
> >> +       /**< indicates support for rte_power_monitor function */
> >> +       uint32_t power_pause : 1;
> >> +       /**< indicates support for rte_power_pause function */
> >> +};
> > 
> > - The rte_power library is supposed to be built on top of cpuflags.
> > Not the other way around.
> > Those capabilities should have been kept inside the rte_power_ API and
> > not pollute the cpuflags API.
> > 
> > - All of this should have come as a single patch as the previously
> > introduced API is unusable before.
> > 
> > 
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice
> >> + *
> >> + * Check CPU support for various intrinsics at runtime.
> >> + *
> >> + * @param intrinsics
> >> + *     Pointer to a structure to be filled.
> >> + */
> >> +__rte_experimental
> >> +void
> >> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
> >> +
> >>   /**
> >>    * Enumeration of all CPU features supported
> >>    */
> >> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >> index fb897d9060..03a326f076 100644
> >> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >> @@ -32,6 +32,10 @@
> >>    * checked against the expected value, and if they match, the entering of
> >>    * optimized power state may be aborted.
> >>    *
> >> + * @warning It is responsibility of the user to check if this function is
> >> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
> >> + *   so may result in an illegal CPU instruction error.
> >> + *
> > 
> > - Reading this API description... what am I supposed to do in my
> > application or driver who wants to use the new
> > rte_power_monitor/rte_power_pause stuff?
> > 
> > I should call rte_cpu_get_features(TOTO) ?
> > This comment does not give a hint.
> > 
> > I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
> > This must be fixed.
> > 
> > 
> > - Again, I wonder why we are exposing all this stuff.
> > This should be hidden in the rte_power API.
> > 
> 
> We're exposing all of this here because the intrinsics are *not* part of 
> the power API but rather are generic headers within EAL. Therefore, any 
> infrastructure checking for their support can *not* reside in the power 
> library, but instead has to be in EAL.
> 
> The intended usage here is to call this function before calling 
> rte_power_monitor(), such that:
> 
> 	struct rte_cpu_intrinsics intrinsics;
> 
> 	rte_cpu_get_intrinsics_support(&intrinsics);
> 
> 	if (!intrinsics.power_monitor) {
> 		// rte_power_monitor not supported and cannot be used
> 		return;
> 	}

This check could be done inside the rte_power API.

> 	// proceed with rte_power_monitor usage
> 
> Failing to do that will result in either -ENOTSUP on non-x86, or illegal 
> instruction crash on x86 that doesn't have that instruction (because we 
> encode raw opcode).
> 
> I've *not* added this to the previous patches because i wanted to get 
> this part reviewed specifically, and not mix it with other IA-specific 
> stuff. It seems that i've succeeded in that goal, as this patch has 4 
> likes^W acks :)

You did not explain the need for rte_cpu_get_features() call.
  
Anatoly Burakov Oct. 30, 2020, 1:37 p.m. UTC | #4
On 30-Oct-20 10:14 AM, Thomas Monjalon wrote:
> 30/10/2020 11:09, Burakov, Anatoly:
>> On 29-Oct-20 9:27 PM, David Marchand wrote:
>>> On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
>>>>
>>>> Currently, it is not possible to check support for intrinsics that
>>>> are platform-specific, cannot be abstracted in a generic way, or do not
>>>> have support on all architectures. The CPUID flags can be used to some
>>>> extent, but they are only defined for their platform, while intrinsics
>>>> will be available to all code as they are in generic headers.
>>>>
>>>> This patch introduces infrastructure to check support for certain
>>>> platform-specific intrinsics, and adds support for checking support for
>>>> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
>>>>
>>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
>>>> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>>
>>> Coming late to the party, it seems crowded...
>>>
>>>
>>>
>>>> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
>>>> index 872f0ebe3e..28a5aecde8 100644
>>>> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
>>>> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
>>>> @@ -13,6 +13,32 @@
>>>>    #include "rte_common.h"
>>>>    #include <errno.h>
>>>>
>>>> +#include <rte_compat.h>
>>>> +
>>>> +/**
>>>> + * Structure used to describe platform-specific intrinsics that may or may not
>>>> + * be supported at runtime.
>>>> + */
>>>> +struct rte_cpu_intrinsics {
>>>> +       uint32_t power_monitor : 1;
>>>> +       /**< indicates support for rte_power_monitor function */
>>>> +       uint32_t power_pause : 1;
>>>> +       /**< indicates support for rte_power_pause function */
>>>> +};
>>>
>>> - The rte_power library is supposed to be built on top of cpuflags.
>>> Not the other way around.
>>> Those capabilities should have been kept inside the rte_power_ API and
>>> not pollute the cpuflags API.
>>>
>>> - All of this should have come as a single patch as the previously
>>> introduced API is unusable before.
>>>
>>>
>>>> +
>>>> +/**
>>>> + * @warning
>>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>>> + *
>>>> + * Check CPU support for various intrinsics at runtime.
>>>> + *
>>>> + * @param intrinsics
>>>> + *     Pointer to a structure to be filled.
>>>> + */
>>>> +__rte_experimental
>>>> +void
>>>> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
>>>> +
>>>>    /**
>>>>     * Enumeration of all CPU features supported
>>>>     */
>>>> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>> index fb897d9060..03a326f076 100644
>>>> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>> @@ -32,6 +32,10 @@
>>>>     * checked against the expected value, and if they match, the entering of
>>>>     * optimized power state may be aborted.
>>>>     *
>>>> + * @warning It is responsibility of the user to check if this function is
>>>> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
>>>> + *   so may result in an illegal CPU instruction error.
>>>> + *
>>>
>>> - Reading this API description... what am I supposed to do in my
>>> application or driver who wants to use the new
>>> rte_power_monitor/rte_power_pause stuff?
>>>
>>> I should call rte_cpu_get_features(TOTO) ?
>>> This comment does not give a hint.
>>>
>>> I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
>>> This must be fixed.
>>>
>>>
>>> - Again, I wonder why we are exposing all this stuff.
>>> This should be hidden in the rte_power API.
>>>
>>
>> We're exposing all of this here because the intrinsics are *not* part of
>> the power API but rather are generic headers within EAL. Therefore, any
>> infrastructure checking for their support can *not* reside in the power
>> library, but instead has to be in EAL.
>>
>> The intended usage here is to call this function before calling
>> rte_power_monitor(), such that:
>>
>> 	struct rte_cpu_intrinsics intrinsics;
>>
>> 	rte_cpu_get_intrinsics_support(&intrinsics);
>>
>> 	if (!intrinsics.power_monitor) {
>> 		// rte_power_monitor not supported and cannot be used
>> 		return;
>> 	}
> 
> This check could be done inside the rte_power API.

I'm not quite clear on exactly what you're asking here.

Do you mean the example code above? If so, code like that is already 
present in the power library, at the callback enable stage.

If you mean to say, i should put this check into the rte_power_monitor 
intrinsic, then no, i don't think it's a good idea to have this 
expensive check every time you call rte_power_monitor.

If you mean put this entire infrastructure into the power API - well, 
that kinda defeats the purpose of both having these intrinsics in 
generic headers and having a generic CPU feature check infrastructure 
that was requested of us during the review. We of course can move the 
intrinsic to the power library and outside of EAL, but then anything 
that requires UMWAIT will have to depend on the librte_power.

Please clarify exactly what changes you would like to see here, and what 
is your objection.

> 
>> 	// proceed with rte_power_monitor usage
>>
>> Failing to do that will result in either -ENOTSUP on non-x86, or illegal
>> instruction crash on x86 that doesn't have that instruction (because we
>> encode raw opcode).
>>
>> I've *not* added this to the previous patches because i wanted to get
>> this part reviewed specifically, and not mix it with other IA-specific
>> stuff. It seems that i've succeeded in that goal, as this patch has 4
>> likes^W acks :)
> 
> You did not explain the need for rte_cpu_get_features() call.
> 

Did not explain *where*? Are you suggesting i put things about 
rte_power_monitor into documentation for rte_cpu_get_intrinsics_support? 
The documentation for rte_power_monitor already states that one should 
use rte_cpu_get_intrinsics_support API to check if the rte_power_monitor 
is supported on current machine. What else is missing?
  
Thomas Monjalon Oct. 30, 2020, 2:09 p.m. UTC | #5
30/10/2020 14:37, Burakov, Anatoly:
> On 30-Oct-20 10:14 AM, Thomas Monjalon wrote:
> > 30/10/2020 11:09, Burakov, Anatoly:
> >> On 29-Oct-20 9:27 PM, David Marchand wrote:
> >>> On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
> >>>>
> >>>> Currently, it is not possible to check support for intrinsics that
> >>>> are platform-specific, cannot be abstracted in a generic way, or do not
> >>>> have support on all architectures. The CPUID flags can be used to some
> >>>> extent, but they are only defined for their platform, while intrinsics
> >>>> will be available to all code as they are in generic headers.
> >>>>
> >>>> This patch introduces infrastructure to check support for certain
> >>>> platform-specific intrinsics, and adds support for checking support for
> >>>> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
> >>>>
> >>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>>> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
> >>>> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
> >>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >>>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> >>>
> >>> Coming late to the party, it seems crowded...
> >>>
> >>>
> >>>
> >>>> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
> >>>> index 872f0ebe3e..28a5aecde8 100644
> >>>> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
> >>>> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
> >>>> @@ -13,6 +13,32 @@
> >>>>    #include "rte_common.h"
> >>>>    #include <errno.h>
> >>>>
> >>>> +#include <rte_compat.h>
> >>>> +
> >>>> +/**
> >>>> + * Structure used to describe platform-specific intrinsics that may or may not
> >>>> + * be supported at runtime.
> >>>> + */
> >>>> +struct rte_cpu_intrinsics {
> >>>> +       uint32_t power_monitor : 1;
> >>>> +       /**< indicates support for rte_power_monitor function */
> >>>> +       uint32_t power_pause : 1;
> >>>> +       /**< indicates support for rte_power_pause function */
> >>>> +};
> >>>
> >>> - The rte_power library is supposed to be built on top of cpuflags.
> >>> Not the other way around.
> >>> Those capabilities should have been kept inside the rte_power_ API and
> >>> not pollute the cpuflags API.
> >>>
> >>> - All of this should have come as a single patch as the previously
> >>> introduced API is unusable before.
> >>>
> >>>
> >>>> +
> >>>> +/**
> >>>> + * @warning
> >>>> + * @b EXPERIMENTAL: this API may change without prior notice
> >>>> + *
> >>>> + * Check CPU support for various intrinsics at runtime.
> >>>> + *
> >>>> + * @param intrinsics
> >>>> + *     Pointer to a structure to be filled.
> >>>> + */
> >>>> +__rte_experimental
> >>>> +void
> >>>> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
> >>>> +
> >>>>    /**
> >>>>     * Enumeration of all CPU features supported
> >>>>     */
> >>>> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >>>> index fb897d9060..03a326f076 100644
> >>>> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >>>> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >>>> @@ -32,6 +32,10 @@
> >>>>     * checked against the expected value, and if they match, the entering of
> >>>>     * optimized power state may be aborted.
> >>>>     *
> >>>> + * @warning It is responsibility of the user to check if this function is
> >>>> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
> >>>> + *   so may result in an illegal CPU instruction error.
> >>>> + *
> >>>
> >>> - Reading this API description... what am I supposed to do in my
> >>> application or driver who wants to use the new
> >>> rte_power_monitor/rte_power_pause stuff?
> >>>
> >>> I should call rte_cpu_get_features(TOTO) ?
> >>> This comment does not give a hint.
> >>>
> >>> I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
> >>> This must be fixed.
> >>>
> >>>
> >>> - Again, I wonder why we are exposing all this stuff.
> >>> This should be hidden in the rte_power API.
> >>>
> >>
> >> We're exposing all of this here because the intrinsics are *not* part of
> >> the power API but rather are generic headers within EAL. Therefore, any
> >> infrastructure checking for their support can *not* reside in the power
> >> library, but instead has to be in EAL.
> >>
> >> The intended usage here is to call this function before calling
> >> rte_power_monitor(), such that:
> >>
> >> 	struct rte_cpu_intrinsics intrinsics;
> >>
> >> 	rte_cpu_get_intrinsics_support(&intrinsics);
> >>
> >> 	if (!intrinsics.power_monitor) {
> >> 		// rte_power_monitor not supported and cannot be used
> >> 		return;
> >> 	}
> > 
> > This check could be done inside the rte_power API.
> 
> I'm not quite clear on exactly what you're asking here.
> 
> Do you mean the example code above? If so, code like that is already 
> present in the power library, at the callback enable stage.
> 
> If you mean to say, i should put this check into the rte_power_monitor 
> intrinsic, then no, i don't think it's a good idea to have this 
> expensive check every time you call rte_power_monitor.

No but it can be done at initialization time.
According to what you say above, it is alread done at callback enable stage.
So the app does not need to do the check?

> If you mean put this entire infrastructure into the power API - well, 
> that kinda defeats the purpose of both having these intrinsics in 
> generic headers and having a generic CPU feature check infrastructure 
> that was requested of us during the review. We of course can move the 
> intrinsic to the power library and outside of EAL, but then anything 
> that requires UMWAIT will have to depend on the librte_power.

Yes the intrinsics can be in EAL if usable.
But it seems DLB author cannot use what is in EAL.

> Please clarify exactly what changes you would like to see here, and what 
> is your objection.
> 
> > 
> >> 	// proceed with rte_power_monitor usage
> >>
> >> Failing to do that will result in either -ENOTSUP on non-x86, or illegal
> >> instruction crash on x86 that doesn't have that instruction (because we
> >> encode raw opcode).
> >>
> >> I've *not* added this to the previous patches because i wanted to get
> >> this part reviewed specifically, and not mix it with other IA-specific
> >> stuff. It seems that i've succeeded in that goal, as this patch has 4
> >> likes^W acks :)
> > 
> > You did not explain the need for rte_cpu_get_features() call.
> > 
> 
> Did not explain *where*? Are you suggesting i put things about 
> rte_power_monitor into documentation for rte_cpu_get_intrinsics_support? 
> The documentation for rte_power_monitor already states that one should 
> use rte_cpu_get_intrinsics_support API to check if the rte_power_monitor 
> is supported on current machine. What else is missing?

In your example above, you do not call rte_cpu_get_features()
which is documented as required in the EAL doc.
  
Anatoly Burakov Oct. 30, 2020, 3:27 p.m. UTC | #6
On 30-Oct-20 2:09 PM, Thomas Monjalon wrote:
> 30/10/2020 14:37, Burakov, Anatoly:
>> On 30-Oct-20 10:14 AM, Thomas Monjalon wrote:
>>> 30/10/2020 11:09, Burakov, Anatoly:
>>>> On 29-Oct-20 9:27 PM, David Marchand wrote:
>>>>> On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
>>>>>>
>>>>>> Currently, it is not possible to check support for intrinsics that
>>>>>> are platform-specific, cannot be abstracted in a generic way, or do not
>>>>>> have support on all architectures. The CPUID flags can be used to some
>>>>>> extent, but they are only defined for their platform, while intrinsics
>>>>>> will be available to all code as they are in generic headers.
>>>>>>
>>>>>> This patch introduces infrastructure to check support for certain
>>>>>> platform-specific intrinsics, and adds support for checking support for
>>>>>> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
>>>>>>
>>>>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
>>>>>> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
>>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>>>>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>>>>
>>>>> Coming late to the party, it seems crowded...
>>>>>
>>>>>
>>>>>
>>>>>> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
>>>>>> index 872f0ebe3e..28a5aecde8 100644
>>>>>> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
>>>>>> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
>>>>>> @@ -13,6 +13,32 @@
>>>>>>     #include "rte_common.h"
>>>>>>     #include <errno.h>
>>>>>>
>>>>>> +#include <rte_compat.h>
>>>>>> +
>>>>>> +/**
>>>>>> + * Structure used to describe platform-specific intrinsics that may or may not
>>>>>> + * be supported at runtime.
>>>>>> + */
>>>>>> +struct rte_cpu_intrinsics {
>>>>>> +       uint32_t power_monitor : 1;
>>>>>> +       /**< indicates support for rte_power_monitor function */
>>>>>> +       uint32_t power_pause : 1;
>>>>>> +       /**< indicates support for rte_power_pause function */
>>>>>> +};
>>>>>
>>>>> - The rte_power library is supposed to be built on top of cpuflags.
>>>>> Not the other way around.
>>>>> Those capabilities should have been kept inside the rte_power_ API and
>>>>> not pollute the cpuflags API.
>>>>>
>>>>> - All of this should have come as a single patch as the previously
>>>>> introduced API is unusable before.
>>>>>
>>>>>
>>>>>> +
>>>>>> +/**
>>>>>> + * @warning
>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>>>>> + *
>>>>>> + * Check CPU support for various intrinsics at runtime.
>>>>>> + *
>>>>>> + * @param intrinsics
>>>>>> + *     Pointer to a structure to be filled.
>>>>>> + */
>>>>>> +__rte_experimental
>>>>>> +void
>>>>>> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
>>>>>> +
>>>>>>     /**
>>>>>>      * Enumeration of all CPU features supported
>>>>>>      */
>>>>>> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>>>> index fb897d9060..03a326f076 100644
>>>>>> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>>>> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>>>> @@ -32,6 +32,10 @@
>>>>>>      * checked against the expected value, and if they match, the entering of
>>>>>>      * optimized power state may be aborted.
>>>>>>      *
>>>>>> + * @warning It is responsibility of the user to check if this function is
>>>>>> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
>>>>>> + *   so may result in an illegal CPU instruction error.
>>>>>> + *
>>>>>
>>>>> - Reading this API description... what am I supposed to do in my
>>>>> application or driver who wants to use the new
>>>>> rte_power_monitor/rte_power_pause stuff?
>>>>>
>>>>> I should call rte_cpu_get_features(TOTO) ?
>>>>> This comment does not give a hint.
>>>>>
>>>>> I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
>>>>> This must be fixed.
>>>>>
>>>>>
>>>>> - Again, I wonder why we are exposing all this stuff.
>>>>> This should be hidden in the rte_power API.
>>>>>
>>>>
>>>> We're exposing all of this here because the intrinsics are *not* part of
>>>> the power API but rather are generic headers within EAL. Therefore, any
>>>> infrastructure checking for their support can *not* reside in the power
>>>> library, but instead has to be in EAL.
>>>>
>>>> The intended usage here is to call this function before calling
>>>> rte_power_monitor(), such that:
>>>>
>>>> 	struct rte_cpu_intrinsics intrinsics;
>>>>
>>>> 	rte_cpu_get_intrinsics_support(&intrinsics);
>>>>
>>>> 	if (!intrinsics.power_monitor) {
>>>> 		// rte_power_monitor not supported and cannot be used
>>>> 		return;
>>>> 	}
>>>
>>> This check could be done inside the rte_power API.
>>
>> I'm not quite clear on exactly what you're asking here.
>>
>> Do you mean the example code above? If so, code like that is already
>> present in the power library, at the callback enable stage.
>>
>> If you mean to say, i should put this check into the rte_power_monitor
>> intrinsic, then no, i don't think it's a good idea to have this
>> expensive check every time you call rte_power_monitor.
> 
> No but it can be done at initialization time.
> According to what you say above, it is alread done at callback enable stage.
> So the app does not need to do the check?

Admittedly it's a bit confusing, but please bear with me.

There are two separate issues at hand: the intrinsic itself, and the 
calling code. We provide both.

That means, the *calling code* should do the check. In our case, *our* 
calling code is the callback. However, nothing stops someone else from 
implementing their own scheme using our intrinsic - in that case, the 
user will be responsible to check if the intrinsic is supported before 
using it in their own code, because they won't be using our callback but 
will be using our intrinsic.

So, we have a check *in our calling code*. But if someone were to use 
the *intrinsic* directly (like DLB), they would have to add their own 
checks around the intrinsic usage.

Our power intrinsic is a static inline function. Are you proposing to 
add some sort of function pointer wrapper and make it an indirect call 
instead of a static inline function? (or indeed a proper function)

> 
>> If you mean put this entire infrastructure into the power API - well,
>> that kinda defeats the purpose of both having these intrinsics in
>> generic headers and having a generic CPU feature check infrastructure
>> that was requested of us during the review. We of course can move the
>> intrinsic to the power library and outside of EAL, but then anything
>> that requires UMWAIT will have to depend on the librte_power.
> 
> Yes the intrinsics can be in EAL if usable.
> But it seems DLB author cannot use what is in EAL.

I'll let the DLB authors clarify that themselves, but as far as i'm 
aware, it seems that this is not the case - while their current code 
wouldn't be able to use these intrinsics by search-and-replace, they 
will be able to use them with a couple of changes to their code that 
basically amounted to reimplementation of our intrinsics.

> 
>> Please clarify exactly what changes you would like to see here, and what
>> is your objection.
>>
>>>
>>>> 	// proceed with rte_power_monitor usage
>>>>
>>>> Failing to do that will result in either -ENOTSUP on non-x86, or illegal
>>>> instruction crash on x86 that doesn't have that instruction (because we
>>>> encode raw opcode).
>>>>
>>>> I've *not* added this to the previous patches because i wanted to get
>>>> this part reviewed specifically, and not mix it with other IA-specific
>>>> stuff. It seems that i've succeeded in that goal, as this patch has 4
>>>> likes^W acks :)
>>>
>>> You did not explain the need for rte_cpu_get_features() call.
>>>
>>
>> Did not explain *where*? Are you suggesting i put things about
>> rte_power_monitor into documentation for rte_cpu_get_intrinsics_support?
>> The documentation for rte_power_monitor already states that one should
>> use rte_cpu_get_intrinsics_support API to check if the rte_power_monitor
>> is supported on current machine. What else is missing?
> 
> In your example above, you do not call rte_cpu_get_features()
> which is documented as required in the EAL doc.
> 

I'm not sure i follow. This is unrelated to rte_cpu_get_features call. 
The rte_cpu_get_features is a CPUID check, and it was decided not to use 
it because the WAITPKG CPUID flag is only defined for x86 and not for 
other archs. This new call (rte_cpu_get_intrinsics_support) is non-arch 
specific, but will have an arch-specific implementation (which happens 
to use rte_cpu_get_features to detect support for WAITPKG). I have given 
the example code of how to detect support for rte_power_monitor using 
this new code, in the code example you just referred to.
  
Thomas Monjalon Oct. 30, 2020, 3:44 p.m. UTC | #7
30/10/2020 16:27, Burakov, Anatoly:
> On 30-Oct-20 2:09 PM, Thomas Monjalon wrote:
> > 30/10/2020 14:37, Burakov, Anatoly:
> >> On 30-Oct-20 10:14 AM, Thomas Monjalon wrote:
> >>> 30/10/2020 11:09, Burakov, Anatoly:
> >>>> On 29-Oct-20 9:27 PM, David Marchand wrote:
> >>>>> On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
> >>>>>>
> >>>>>> Currently, it is not possible to check support for intrinsics that
> >>>>>> are platform-specific, cannot be abstracted in a generic way, or do not
> >>>>>> have support on all architectures. The CPUID flags can be used to some
> >>>>>> extent, but they are only defined for their platform, while intrinsics
> >>>>>> will be available to all code as they are in generic headers.
> >>>>>>
> >>>>>> This patch introduces infrastructure to check support for certain
> >>>>>> platform-specific intrinsics, and adds support for checking support for
> >>>>>> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
> >>>>>>
> >>>>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>>>>> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
> >>>>>> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
> >>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >>>>>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >>>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> >>>>>
> >>>>> Coming late to the party, it seems crowded...
> >>>>>
> >>>>>
> >>>>>
> >>>>>> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
> >>>>>> index 872f0ebe3e..28a5aecde8 100644
> >>>>>> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
> >>>>>> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
> >>>>>> @@ -13,6 +13,32 @@
> >>>>>>     #include "rte_common.h"
> >>>>>>     #include <errno.h>
> >>>>>>
> >>>>>> +#include <rte_compat.h>
> >>>>>> +
> >>>>>> +/**
> >>>>>> + * Structure used to describe platform-specific intrinsics that may or may not
> >>>>>> + * be supported at runtime.
> >>>>>> + */
> >>>>>> +struct rte_cpu_intrinsics {
> >>>>>> +       uint32_t power_monitor : 1;
> >>>>>> +       /**< indicates support for rte_power_monitor function */
> >>>>>> +       uint32_t power_pause : 1;
> >>>>>> +       /**< indicates support for rte_power_pause function */
> >>>>>> +};
> >>>>>
> >>>>> - The rte_power library is supposed to be built on top of cpuflags.
> >>>>> Not the other way around.
> >>>>> Those capabilities should have been kept inside the rte_power_ API and
> >>>>> not pollute the cpuflags API.
> >>>>>
> >>>>> - All of this should have come as a single patch as the previously
> >>>>> introduced API is unusable before.
> >>>>>
> >>>>>
> >>>>>> +
> >>>>>> +/**
> >>>>>> + * @warning
> >>>>>> + * @b EXPERIMENTAL: this API may change without prior notice
> >>>>>> + *
> >>>>>> + * Check CPU support for various intrinsics at runtime.
> >>>>>> + *
> >>>>>> + * @param intrinsics
> >>>>>> + *     Pointer to a structure to be filled.
> >>>>>> + */
> >>>>>> +__rte_experimental
> >>>>>> +void
> >>>>>> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
> >>>>>> +
> >>>>>>     /**
> >>>>>>      * Enumeration of all CPU features supported
> >>>>>>      */
> >>>>>> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >>>>>> index fb897d9060..03a326f076 100644
> >>>>>> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >>>>>> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> >>>>>> @@ -32,6 +32,10 @@
> >>>>>>      * checked against the expected value, and if they match, the entering of
> >>>>>>      * optimized power state may be aborted.
> >>>>>>      *
> >>>>>> + * @warning It is responsibility of the user to check if this function is
> >>>>>> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
> >>>>>> + *   so may result in an illegal CPU instruction error.
> >>>>>> + *
> >>>>>
> >>>>> - Reading this API description... what am I supposed to do in my
> >>>>> application or driver who wants to use the new
> >>>>> rte_power_monitor/rte_power_pause stuff?
> >>>>>
> >>>>> I should call rte_cpu_get_features(TOTO) ?
> >>>>> This comment does not give a hint.
> >>>>>
> >>>>> I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
> >>>>> This must be fixed.
> >>>>>
> >>>>>
> >>>>> - Again, I wonder why we are exposing all this stuff.
> >>>>> This should be hidden in the rte_power API.
> >>>>>
> >>>>
> >>>> We're exposing all of this here because the intrinsics are *not* part of
> >>>> the power API but rather are generic headers within EAL. Therefore, any
> >>>> infrastructure checking for their support can *not* reside in the power
> >>>> library, but instead has to be in EAL.
> >>>>
> >>>> The intended usage here is to call this function before calling
> >>>> rte_power_monitor(), such that:
> >>>>
> >>>> 	struct rte_cpu_intrinsics intrinsics;
> >>>>
> >>>> 	rte_cpu_get_intrinsics_support(&intrinsics);
> >>>>
> >>>> 	if (!intrinsics.power_monitor) {
> >>>> 		// rte_power_monitor not supported and cannot be used
> >>>> 		return;
> >>>> 	}
> >>>
> >>> This check could be done inside the rte_power API.
> >>
> >> I'm not quite clear on exactly what you're asking here.
> >>
> >> Do you mean the example code above? If so, code like that is already
> >> present in the power library, at the callback enable stage.
> >>
> >> If you mean to say, i should put this check into the rte_power_monitor
> >> intrinsic, then no, i don't think it's a good idea to have this
> >> expensive check every time you call rte_power_monitor.
> > 
> > No but it can be done at initialization time.
> > According to what you say above, it is alread done at callback enable stage.
> > So the app does not need to do the check?
> 
> Admittedly it's a bit confusing, but please bear with me.
> 
> There are two separate issues at hand: the intrinsic itself, and the 
> calling code. We provide both.
> 
> That means, the *calling code* should do the check. In our case, *our* 
> calling code is the callback. However, nothing stops someone else from 
> implementing their own scheme using our intrinsic - in that case, the 
> user will be responsible to check if the intrinsic is supported before 
> using it in their own code, because they won't be using our callback but 
> will be using our intrinsic.
> 
> So, we have a check *in our calling code*. But if someone were to use 
> the *intrinsic* directly (like DLB), they would have to add their own 
> checks around the intrinsic usage.
> 
> Our power intrinsic is a static inline function. Are you proposing to 
> add some sort of function pointer wrapper and make it an indirect call 
> instead of a static inline function? (or indeed a proper function)
> 
> > 
> >> If you mean put this entire infrastructure into the power API - well,
> >> that kinda defeats the purpose of both having these intrinsics in
> >> generic headers and having a generic CPU feature check infrastructure
> >> that was requested of us during the review. We of course can move the
> >> intrinsic to the power library and outside of EAL, but then anything
> >> that requires UMWAIT will have to depend on the librte_power.
> > 
> > Yes the intrinsics can be in EAL if usable.
> > But it seems DLB author cannot use what is in EAL.
> 
> I'll let the DLB authors clarify that themselves, but as far as i'm 
> aware, it seems that this is not the case - while their current code 
> wouldn't be able to use these intrinsics by search-and-replace, they 
> will be able to use them with a couple of changes to their code that 
> basically amounted to reimplementation of our intrinsics.
> 
> > 
> >> Please clarify exactly what changes you would like to see here, and what
> >> is your objection.
> >>
> >>>
> >>>> 	// proceed with rte_power_monitor usage
> >>>>
> >>>> Failing to do that will result in either -ENOTSUP on non-x86, or illegal
> >>>> instruction crash on x86 that doesn't have that instruction (because we
> >>>> encode raw opcode).
> >>>>
> >>>> I've *not* added this to the previous patches because i wanted to get
> >>>> this part reviewed specifically, and not mix it with other IA-specific
> >>>> stuff. It seems that i've succeeded in that goal, as this patch has 4
> >>>> likes^W acks :)
> >>>
> >>> You did not explain the need for rte_cpu_get_features() call.
> >>>
> >>
> >> Did not explain *where*? Are you suggesting i put things about
> >> rte_power_monitor into documentation for rte_cpu_get_intrinsics_support?
> >> The documentation for rte_power_monitor already states that one should
> >> use rte_cpu_get_intrinsics_support API to check if the rte_power_monitor
> >> is supported on current machine. What else is missing?
> > 
> > In your example above, you do not call rte_cpu_get_features()
> > which is documented as required in the EAL doc.
> > 
> 
> I'm not sure i follow. This is unrelated to rte_cpu_get_features call. 
> The rte_cpu_get_features is a CPUID check, and it was decided not to use 
> it because the WAITPKG CPUID flag is only defined for x86 and not for 
> other archs. This new call (rte_cpu_get_intrinsics_support) is non-arch 
> specific, but will have an arch-specific implementation (which happens 
> to use rte_cpu_get_features to detect support for WAITPKG). I have given 
> the example code of how to detect support for rte_power_monitor using 
> this new code, in the code example you just referred to.

Please read the API again:
http://git.dpdk.org/dpdk/tree/lib/librte_eal/include/generic/rte_power_intrinsics.h
"
 * @warning It is responsibility of the user to check if this function is
 *   supported at runtime using `rte_cpu_get_features()` API call.
 *   Failing to do so may result in an illegal CPU instruction error.
"
Why is it referring to rte_cpu_get_features?
  
Anatoly Burakov Oct. 30, 2020, 4:36 p.m. UTC | #8
On 30-Oct-20 3:44 PM, Thomas Monjalon wrote:
> 30/10/2020 16:27, Burakov, Anatoly:
>> On 30-Oct-20 2:09 PM, Thomas Monjalon wrote:
>>> 30/10/2020 14:37, Burakov, Anatoly:
>>>> On 30-Oct-20 10:14 AM, Thomas Monjalon wrote:
>>>>> 30/10/2020 11:09, Burakov, Anatoly:
>>>>>> On 29-Oct-20 9:27 PM, David Marchand wrote:
>>>>>>> On Tue, Oct 27, 2020 at 4:00 PM Liang Ma <liang.j.ma@intel.com> wrote:
>>>>>>>>
>>>>>>>> Currently, it is not possible to check support for intrinsics that
>>>>>>>> are platform-specific, cannot be abstracted in a generic way, or do not
>>>>>>>> have support on all architectures. The CPUID flags can be used to some
>>>>>>>> extent, but they are only defined for their platform, while intrinsics
>>>>>>>> will be available to all code as they are in generic headers.
>>>>>>>>
>>>>>>>> This patch introduces infrastructure to check support for certain
>>>>>>>> platform-specific intrinsics, and adds support for checking support for
>>>>>>>> IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
>>>>>>>>
>>>>>>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>>>> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
>>>>>>>> Acked-by: David Christensen <drc@linux.vnet.ibm.com>
>>>>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>>>>>>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>>>>>>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>>>>>>
>>>>>>> Coming late to the party, it seems crowded...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
>>>>>>>> index 872f0ebe3e..28a5aecde8 100644
>>>>>>>> --- a/lib/librte_eal/include/generic/rte_cpuflags.h
>>>>>>>> +++ b/lib/librte_eal/include/generic/rte_cpuflags.h
>>>>>>>> @@ -13,6 +13,32 @@
>>>>>>>>      #include "rte_common.h"
>>>>>>>>      #include <errno.h>
>>>>>>>>
>>>>>>>> +#include <rte_compat.h>
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * Structure used to describe platform-specific intrinsics that may or may not
>>>>>>>> + * be supported at runtime.
>>>>>>>> + */
>>>>>>>> +struct rte_cpu_intrinsics {
>>>>>>>> +       uint32_t power_monitor : 1;
>>>>>>>> +       /**< indicates support for rte_power_monitor function */
>>>>>>>> +       uint32_t power_pause : 1;
>>>>>>>> +       /**< indicates support for rte_power_pause function */
>>>>>>>> +};
>>>>>>>
>>>>>>> - The rte_power library is supposed to be built on top of cpuflags.
>>>>>>> Not the other way around.
>>>>>>> Those capabilities should have been kept inside the rte_power_ API and
>>>>>>> not pollute the cpuflags API.
>>>>>>>
>>>>>>> - All of this should have come as a single patch as the previously
>>>>>>> introduced API is unusable before.
>>>>>>>
>>>>>>>
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * @warning
>>>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice
>>>>>>>> + *
>>>>>>>> + * Check CPU support for various intrinsics at runtime.
>>>>>>>> + *
>>>>>>>> + * @param intrinsics
>>>>>>>> + *     Pointer to a structure to be filled.
>>>>>>>> + */
>>>>>>>> +__rte_experimental
>>>>>>>> +void
>>>>>>>> +rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
>>>>>>>> +
>>>>>>>>      /**
>>>>>>>>       * Enumeration of all CPU features supported
>>>>>>>>       */
>>>>>>>> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>>>>>> index fb897d9060..03a326f076 100644
>>>>>>>> --- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>>>>>> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
>>>>>>>> @@ -32,6 +32,10 @@
>>>>>>>>       * checked against the expected value, and if they match, the entering of
>>>>>>>>       * optimized power state may be aborted.
>>>>>>>>       *
>>>>>>>> + * @warning It is responsibility of the user to check if this function is
>>>>>>>> + *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
>>>>>>>> + *   so may result in an illegal CPU instruction error.
>>>>>>>> + *
>>>>>>>
>>>>>>> - Reading this API description... what am I supposed to do in my
>>>>>>> application or driver who wants to use the new
>>>>>>> rte_power_monitor/rte_power_pause stuff?
>>>>>>>
>>>>>>> I should call rte_cpu_get_features(TOTO) ?
>>>>>>> This comment does not give a hint.
>>>>>>>
>>>>>>> I suppose the intent was to refer to the rte_cpu_get_intrinsics_support() thing.
>>>>>>> This must be fixed.
>>>>>>>
>>>>>>>
>>>>>>> - Again, I wonder why we are exposing all this stuff.
>>>>>>> This should be hidden in the rte_power API.
>>>>>>>
>>>>>>
>>>>>> We're exposing all of this here because the intrinsics are *not* part of
>>>>>> the power API but rather are generic headers within EAL. Therefore, any
>>>>>> infrastructure checking for their support can *not* reside in the power
>>>>>> library, but instead has to be in EAL.
>>>>>>
>>>>>> The intended usage here is to call this function before calling
>>>>>> rte_power_monitor(), such that:
>>>>>>
>>>>>> 	struct rte_cpu_intrinsics intrinsics;
>>>>>>
>>>>>> 	rte_cpu_get_intrinsics_support(&intrinsics);
>>>>>>
>>>>>> 	if (!intrinsics.power_monitor) {
>>>>>> 		// rte_power_monitor not supported and cannot be used
>>>>>> 		return;
>>>>>> 	}
>>>>>
>>>>> This check could be done inside the rte_power API.
>>>>
>>>> I'm not quite clear on exactly what you're asking here.
>>>>
>>>> Do you mean the example code above? If so, code like that is already
>>>> present in the power library, at the callback enable stage.
>>>>
>>>> If you mean to say, i should put this check into the rte_power_monitor
>>>> intrinsic, then no, i don't think it's a good idea to have this
>>>> expensive check every time you call rte_power_monitor.
>>>
>>> No but it can be done at initialization time.
>>> According to what you say above, it is alread done at callback enable stage.
>>> So the app does not need to do the check?
>>
>> Admittedly it's a bit confusing, but please bear with me.
>>
>> There are two separate issues at hand: the intrinsic itself, and the
>> calling code. We provide both.
>>
>> That means, the *calling code* should do the check. In our case, *our*
>> calling code is the callback. However, nothing stops someone else from
>> implementing their own scheme using our intrinsic - in that case, the
>> user will be responsible to check if the intrinsic is supported before
>> using it in their own code, because they won't be using our callback but
>> will be using our intrinsic.
>>
>> So, we have a check *in our calling code*. But if someone were to use
>> the *intrinsic* directly (like DLB), they would have to add their own
>> checks around the intrinsic usage.
>>
>> Our power intrinsic is a static inline function. Are you proposing to
>> add some sort of function pointer wrapper and make it an indirect call
>> instead of a static inline function? (or indeed a proper function)
>>
>>>
>>>> If you mean put this entire infrastructure into the power API - well,
>>>> that kinda defeats the purpose of both having these intrinsics in
>>>> generic headers and having a generic CPU feature check infrastructure
>>>> that was requested of us during the review. We of course can move the
>>>> intrinsic to the power library and outside of EAL, but then anything
>>>> that requires UMWAIT will have to depend on the librte_power.
>>>
>>> Yes the intrinsics can be in EAL if usable.
>>> But it seems DLB author cannot use what is in EAL.
>>
>> I'll let the DLB authors clarify that themselves, but as far as i'm
>> aware, it seems that this is not the case - while their current code
>> wouldn't be able to use these intrinsics by search-and-replace, they
>> will be able to use them with a couple of changes to their code that
>> basically amounted to reimplementation of our intrinsics.
>>
>>>
>>>> Please clarify exactly what changes you would like to see here, and what
>>>> is your objection.
>>>>
>>>>>
>>>>>> 	// proceed with rte_power_monitor usage
>>>>>>
>>>>>> Failing to do that will result in either -ENOTSUP on non-x86, or illegal
>>>>>> instruction crash on x86 that doesn't have that instruction (because we
>>>>>> encode raw opcode).
>>>>>>
>>>>>> I've *not* added this to the previous patches because i wanted to get
>>>>>> this part reviewed specifically, and not mix it with other IA-specific
>>>>>> stuff. It seems that i've succeeded in that goal, as this patch has 4
>>>>>> likes^W acks :)
>>>>>
>>>>> You did not explain the need for rte_cpu_get_features() call.
>>>>>
>>>>
>>>> Did not explain *where*? Are you suggesting i put things about
>>>> rte_power_monitor into documentation for rte_cpu_get_intrinsics_support?
>>>> The documentation for rte_power_monitor already states that one should
>>>> use rte_cpu_get_intrinsics_support API to check if the rte_power_monitor
>>>> is supported on current machine. What else is missing?
>>>
>>> In your example above, you do not call rte_cpu_get_features()
>>> which is documented as required in the EAL doc.
>>>
>>
>> I'm not sure i follow. This is unrelated to rte_cpu_get_features call.
>> The rte_cpu_get_features is a CPUID check, and it was decided not to use
>> it because the WAITPKG CPUID flag is only defined for x86 and not for
>> other archs. This new call (rte_cpu_get_intrinsics_support) is non-arch
>> specific, but will have an arch-specific implementation (which happens
>> to use rte_cpu_get_features to detect support for WAITPKG). I have given
>> the example code of how to detect support for rte_power_monitor using
>> this new code, in the code example you just referred to.
> 
> Please read the API again:
> http://git.dpdk.org/dpdk/tree/lib/librte_eal/include/generic/rte_power_intrinsics.h
> "
>   * @warning It is responsibility of the user to check if this function is
>   *   supported at runtime using `rte_cpu_get_features()` API call.
>   *   Failing to do so may result in an illegal CPU instruction error.
> "
> Why is it referring to rte_cpu_get_features?
> 

Aw, crap. You're right, it's an artifact of earlier implementation. 
We'll fix this. Thanks for letting us know!
  
Thomas Monjalon Oct. 30, 2020, 4:50 p.m. UTC | #9
30/10/2020 17:36, Burakov, Anatoly:
> On 30-Oct-20 3:44 PM, Thomas Monjalon wrote:
> > 30/10/2020 16:27, Burakov, Anatoly:
> >> On 30-Oct-20 2:09 PM, Thomas Monjalon wrote:
> >>> 30/10/2020 14:37, Burakov, Anatoly:
> >>>> On 30-Oct-20 10:14 AM, Thomas Monjalon wrote:
> >>>>> 30/10/2020 11:09, Burakov, Anatoly:
> >>>>>> The intended usage here is to call this function before calling
> >>>>>> rte_power_monitor(), such that:
> >>>>>>
> >>>>>> 	struct rte_cpu_intrinsics intrinsics;
> >>>>>>
> >>>>>> 	rte_cpu_get_intrinsics_support(&intrinsics);
> >>>>>>
> >>>>>> 	if (!intrinsics.power_monitor) {
> >>>>>> 		// rte_power_monitor not supported and cannot be used
> >>>>>> 		return;
> >>>>>> 	}
> >>>>>
[...]
> >>> In your example above, you do not call rte_cpu_get_features()
> >>> which is documented as required in the EAL doc.
> >>>
> >>
> >> I'm not sure i follow. This is unrelated to rte_cpu_get_features call.
> >> The rte_cpu_get_features is a CPUID check, and it was decided not to use
> >> it because the WAITPKG CPUID flag is only defined for x86 and not for
> >> other archs. This new call (rte_cpu_get_intrinsics_support) is non-arch
> >> specific, but will have an arch-specific implementation (which happens
> >> to use rte_cpu_get_features to detect support for WAITPKG). I have given
> >> the example code of how to detect support for rte_power_monitor using
> >> this new code, in the code example you just referred to.
> > 
> > Please read the API again:
> > http://git.dpdk.org/dpdk/tree/lib/librte_eal/include/generic/rte_power_intrinsics.h
> > "
> >   * @warning It is responsibility of the user to check if this function is
> >   *   supported at runtime using `rte_cpu_get_features()` API call.
> >   *   Failing to do so may result in an illegal CPU instruction error.
> > "
> > Why is it referring to rte_cpu_get_features?
> 
> Aw, crap. You're right, it's an artifact of earlier implementation. 
> We'll fix this. Thanks for letting us know!

This is what happens when everybody pushes me to merge a patch
that I believe not ready but with a lot of "acked but not reviewed".

The context around this patch series is not good to allow good quality.
That's why I think we should not merge any more patch on top of it
except DLB PMDs and fixes in this release.
  

Patch

diff --git a/lib/librte_eal/arm/rte_cpuflags.c b/lib/librte_eal/arm/rte_cpuflags.c
index 7b257b7873..e3a53bcece 100644
--- a/lib/librte_eal/arm/rte_cpuflags.c
+++ b/lib/librte_eal/arm/rte_cpuflags.c
@@ -151,3 +151,9 @@  rte_cpu_get_flag_name(enum rte_cpu_flag_t feature)
 		return NULL;
 	return rte_cpu_feature_table[feature].name;
 }
+
+void
+rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
+{
+	memset(intrinsics, 0, sizeof(*intrinsics));
+}
diff --git a/lib/librte_eal/include/generic/rte_cpuflags.h b/lib/librte_eal/include/generic/rte_cpuflags.h
index 872f0ebe3e..28a5aecde8 100644
--- a/lib/librte_eal/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/include/generic/rte_cpuflags.h
@@ -13,6 +13,32 @@ 
 #include "rte_common.h"
 #include <errno.h>
 
+#include <rte_compat.h>
+
+/**
+ * Structure used to describe platform-specific intrinsics that may or may not
+ * be supported at runtime.
+ */
+struct rte_cpu_intrinsics {
+	uint32_t power_monitor : 1;
+	/**< indicates support for rte_power_monitor function */
+	uint32_t power_pause : 1;
+	/**< indicates support for rte_power_pause function */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Check CPU support for various intrinsics at runtime.
+ *
+ * @param intrinsics
+ *     Pointer to a structure to be filled.
+ */
+__rte_experimental
+void
+rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics);
+
 /**
  * Enumeration of all CPU features supported
  */
diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
index fb897d9060..03a326f076 100644
--- a/lib/librte_eal/include/generic/rte_power_intrinsics.h
+++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
@@ -32,6 +32,10 @@ 
  * checked against the expected value, and if they match, the entering of
  * optimized power state may be aborted.
  *
+ * @warning It is responsibility of the user to check if this function is
+ *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
+ *   so may result in an illegal CPU instruction error.
+ *
  * @param p
  *   Address to monitor for changes.
  * @param expected_value
@@ -69,6 +73,10 @@  static inline void rte_power_monitor(const volatile void *p,
  * This call will also lock a spinlock on entering sleep, and release it on
  * waking up the CPU.
  *
+ * @warning It is responsibility of the user to check if this function is
+ *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
+ *   so may result in an illegal CPU instruction error.
+ *
  * @param p
  *   Address to monitor for changes.
  * @param expected_value
@@ -101,6 +109,10 @@  static inline void rte_power_monitor_sync(const volatile void *p,
  * Enter an architecture-defined optimized power state until a certain TSC
  * timestamp is reached.
  *
+ * @warning It is responsibility of the user to check if this function is
+ *   supported at runtime using `rte_cpu_get_features()` API call. Failing to do
+ *   so may result in an illegal CPU instruction error.
+ *
  * @param tsc_timestamp
  *   Maximum TSC timestamp to wait for. Note that the wait behavior is
  *   architecture-dependent.
diff --git a/lib/librte_eal/ppc/rte_cpuflags.c b/lib/librte_eal/ppc/rte_cpuflags.c
index 3bb7563ce9..61db5c216d 100644
--- a/lib/librte_eal/ppc/rte_cpuflags.c
+++ b/lib/librte_eal/ppc/rte_cpuflags.c
@@ -8,6 +8,7 @@ 
 #include <elf.h>
 #include <fcntl.h>
 #include <assert.h>
+#include <string.h>
 #include <unistd.h>
 
 /* Symbolic values for the entries in the auxiliary table */
@@ -108,3 +109,9 @@  rte_cpu_get_flag_name(enum rte_cpu_flag_t feature)
 		return NULL;
 	return rte_cpu_feature_table[feature].name;
 }
+
+void
+rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
+{
+	memset(intrinsics, 0, sizeof(*intrinsics));
+}
diff --git a/lib/librte_eal/version.map b/lib/librte_eal/version.map
index c23ff57ce6..269cdccfd3 100644
--- a/lib/librte_eal/version.map
+++ b/lib/librte_eal/version.map
@@ -402,6 +402,7 @@  EXPERIMENTAL {
 	rte_service_lcore_may_be_active;
 	rte_vect_get_max_simd_bitwidth;
 	rte_vect_set_max_simd_bitwidth;
+	rte_cpu_get_intrinsics_support;
 };
 
 INTERNAL {
diff --git a/lib/librte_eal/x86/rte_cpuflags.c b/lib/librte_eal/x86/rte_cpuflags.c
index 0325c4b93b..a96312ff7f 100644
--- a/lib/librte_eal/x86/rte_cpuflags.c
+++ b/lib/librte_eal/x86/rte_cpuflags.c
@@ -7,6 +7,7 @@ 
 #include <stdio.h>
 #include <errno.h>
 #include <stdint.h>
+#include <string.h>
 
 #include "rte_cpuid.h"
 
@@ -179,3 +180,14 @@  rte_cpu_get_flag_name(enum rte_cpu_flag_t feature)
 		return NULL;
 	return rte_cpu_feature_table[feature].name;
 }
+
+void
+rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
+{
+	memset(intrinsics, 0, sizeof(*intrinsics));
+
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG)) {
+		intrinsics->power_monitor = 1;
+		intrinsics->power_pause = 1;
+	}
+}