librte_ethdev: extend dpdk api led control to query capability

Message ID 20200107145637.8922-1-laurent.hardy@6wind.com (mailing list archive)
State Changes Requested, archived
Delegated to: Ferruh Yigit
Headers
Series librte_ethdev: extend dpdk api led control to query capability |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/travis-robot success Travis build: passed
ci/Intel-compilation success Compilation OK

Commit Message

Laurent Hardy Jan. 7, 2020, 2:56 p.m. UTC
  In current led control API we have no way to know if a device is able
to handle on/off requests coming from the application.
Knowing if the device is led control capable could be useful to avoid
exchanges between application and kernel.
Using the on/off requests to flag if the device is led control capable
(based on the ENOSUP returned error) is not convenient as such request
can change the led state on device.

This patch adds a new function rte_eth_led_ctrl_capable() that will look
for led_off/on dev ops availability on the related pmd, to know if the
device is able to handle such led control requests (on/off).

Signed-off-by: Laurent Hardy <laurent.hardy@6wind.com>
---
 doc/guides/nics/features.rst             |  5 +++++
 lib/librte_ethdev/rte_ethdev.c           | 12 ++++++++++++
 lib/librte_ethdev/rte_ethdev.h           | 15 +++++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h      |  4 ++++
 lib/librte_ethdev/rte_ethdev_version.map |  1 +
 5 files changed, 37 insertions(+)
  

Comments

David Marchand Jan. 8, 2020, 8:56 a.m. UTC | #1
Hello Laurent,

Bonne année.

Cc: maintainers.

On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>
> In current led control API we have no way to know if a device is able
> to handle on/off requests coming from the application.
> Knowing if the device is led control capable could be useful to avoid
> exchanges between application and kernel.
> Using the on/off requests to flag if the device is led control capable
> (based on the ENOSUP returned error) is not convenient as such request
> can change the led state on device.
>
> This patch adds a new function rte_eth_led_ctrl_capable() that will look
> for led_off/on dev ops availability on the related pmd, to know if the
> device is able to handle such led control requests (on/off).

This patch breaks the ABI, which is BAD :-).
This new api only needs to look at the existing ops, so you can remove
the (unused in your patch) dev_led_ctrl_capable ops.

OTOH, would it make sense to expose this capability in dev_flags?
  
Ferruh Yigit Jan. 8, 2020, 9:09 a.m. UTC | #2
On 1/8/2020 8:56 AM, David Marchand wrote:
> Hello Laurent,
> 
> Bonne année.
> 
> Cc: maintainers.
> 
> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>
>> In current led control API we have no way to know if a device is able
>> to handle on/off requests coming from the application.
>> Knowing if the device is led control capable could be useful to avoid
>> exchanges between application and kernel.
>> Using the on/off requests to flag if the device is led control capable
>> (based on the ENOSUP returned error) is not convenient as such request
>> can change the led state on device.
>>
>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>> for led_off/on dev ops availability on the related pmd, to know if the
>> device is able to handle such led control requests (on/off).
> 
> This patch breaks the ABI, which is BAD :-).

Why it is an ABI break, dev_ops should be between library and drivers, so it
should be out of the ABI concern, isn't it.

> This new api only needs to look at the existing ops, so you can remove
> the (unused in your patch) dev_led_ctrl_capable ops.
> 
> OTOH, would it make sense to expose this capability in dev_flags?
> 

'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
supported, can that help application to understand?
  
Olivier Matz Jan. 8, 2020, 9:42 a.m. UTC | #3
On Wed, Jan 08, 2020 at 09:09:29AM +0000, Ferruh Yigit wrote:
> On 1/8/2020 8:56 AM, David Marchand wrote:
> > Hello Laurent,
> > 
> > Bonne année.
> > 
> > Cc: maintainers.
> > 
> > On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
> >>
> >> In current led control API we have no way to know if a device is able
> >> to handle on/off requests coming from the application.
> >> Knowing if the device is led control capable could be useful to avoid
> >> exchanges between application and kernel.
> >> Using the on/off requests to flag if the device is led control capable
> >> (based on the ENOSUP returned error) is not convenient as such request
> >> can change the led state on device.
> >>
> >> This patch adds a new function rte_eth_led_ctrl_capable() that will look
> >> for led_off/on dev ops availability on the related pmd, to know if the
> >> device is able to handle such led control requests (on/off).
> > 
> > This patch breaks the ABI, which is BAD :-).
> 
> Why it is an ABI break, dev_ops should be between library and drivers, so it
> should be out of the ABI concern, isn't it.
> 
> > This new api only needs to look at the existing ops, so you can remove
> > the (unused in your patch) dev_led_ctrl_capable ops.
> > 
> > OTOH, would it make sense to expose this capability in dev_flags?
> > 
> 
> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
> supported, can that help application to understand?

In our case, it is not possible to use rte_eth_led_on/off() to check if
the feature is supported: on success, it would change the value of the
led, and it seems it is not recoverable on some drivers.

Today it is not possible to know if the feature is available without
side effect.
  
David Marchand Jan. 8, 2020, 9:55 a.m. UTC | #4
On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 1/8/2020 8:56 AM, David Marchand wrote:
> > Hello Laurent,
> >
> > Bonne année.
> >
> > Cc: maintainers.
> >
> > On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
> >>
> >> In current led control API we have no way to know if a device is able
> >> to handle on/off requests coming from the application.
> >> Knowing if the device is led control capable could be useful to avoid
> >> exchanges between application and kernel.
> >> Using the on/off requests to flag if the device is led control capable
> >> (based on the ENOSUP returned error) is not convenient as such request
> >> can change the led state on device.
> >>
> >> This patch adds a new function rte_eth_led_ctrl_capable() that will look
> >> for led_off/on dev ops availability on the related pmd, to know if the
> >> device is able to handle such led control requests (on/off).
> >
> > This patch breaks the ABI, which is BAD :-).
>
> Why it is an ABI break, dev_ops should be between library and drivers, so it
> should be out of the ABI concern, isn't it.

You are right.
So in our context, this is not an ABI breakage.
But abidiff still reports it, so maybe some filtering is required to
avoid this false positive.

Note that if we insert an ops before rx_queue_count, we would have a
real ABI breakage, as this ops is accessed via an inline wrapper by
applications.


>
> > This new api only needs to look at the existing ops, so you can remove
> > the (unused in your patch) dev_led_ctrl_capable ops.
> >
> > OTOH, would it make sense to expose this capability in dev_flags?
> >
>
> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
> supported, can that help application to understand?

You might want to know it is supported without changing the state.
Laurent?



--
David Marchand
  
Laurent Hardy Jan. 8, 2020, 10:31 a.m. UTC | #5
Hi all,

On 1/8/20 10:55 AM, David Marchand wrote:
> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>> Hello Laurent,
>>>
>>> Bonne année.
>>>
>>> Cc: maintainers.
>>>
>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>> In current led control API we have no way to know if a device is able
>>>> to handle on/off requests coming from the application.
>>>> Knowing if the device is led control capable could be useful to avoid
>>>> exchanges between application and kernel.
>>>> Using the on/off requests to flag if the device is led control capable
>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>> can change the led state on device.
>>>>
>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>> device is able to handle such led control requests (on/off).
>>> This patch breaks the ABI, which is BAD :-).
>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>> should be out of the ABI concern, isn't it.
> You are right.
> So in our context, this is not an ABI breakage.
> But abidiff still reports it, so maybe some filtering is required to
> avoid this false positive.
>
> Note that if we insert an ops before rx_queue_count, we would have a
> real ABI breakage, as this ops is accessed via an inline wrapper by
> applications.
>
>
>>> This new api only needs to look at the existing ops, so you can remove
>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>
>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>
>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>> supported, can that help application to understand?
> You might want to know it is supported without changing the state.
> Laurent?

First, happy new year :)

Yes exactly, the purpose of this patch is to query if the device is led 
control capable or not without changing the led state.

About exposing the capability through a dev_flags, means to make some 
modification in each pmds. It looks more easy in term of pmds 
maintenance to relying on the rte_eth_led_off()/on() dev ops 
availability at rte_ethdev level, right ?

>
>
> --
> David Marchand
>
>
  
Ferruh Yigit Jan. 8, 2020, 12:12 p.m. UTC | #6
On 1/8/2020 9:42 AM, Olivier Matz wrote:
> On Wed, Jan 08, 2020 at 09:09:29AM +0000, Ferruh Yigit wrote:
>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>> Hello Laurent,
>>>
>>> Bonne année.
>>>
>>> Cc: maintainers.
>>>
>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>
>>>> In current led control API we have no way to know if a device is able
>>>> to handle on/off requests coming from the application.
>>>> Knowing if the device is led control capable could be useful to avoid
>>>> exchanges between application and kernel.
>>>> Using the on/off requests to flag if the device is led control capable
>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>> can change the led state on device.
>>>>
>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>> device is able to handle such led control requests (on/off).
>>>
>>> This patch breaks the ABI, which is BAD :-).
>>
>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>> should be out of the ABI concern, isn't it.
>>
>>> This new api only needs to look at the existing ops, so you can remove
>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>
>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>
>>
>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>> supported, can that help application to understand?
> 
> In our case, it is not possible to use rte_eth_led_on/off() to check if
> the feature is supported: on success, it would change the value of the
> led, and it seems it is not recoverable on some drivers.

What does it mean it is not recoverable, like can you turn on the led but can't
turn off it back? Can't this be fixed in the PMD?

> 
> Today it is not possible to know if the feature is available without
> side effect.
>
  
Olivier Matz Jan. 8, 2020, 12:27 p.m. UTC | #7
On Wed, Jan 08, 2020 at 12:12:11PM +0000, Ferruh Yigit wrote:
> On 1/8/2020 9:42 AM, Olivier Matz wrote:
> > On Wed, Jan 08, 2020 at 09:09:29AM +0000, Ferruh Yigit wrote:
> >> On 1/8/2020 8:56 AM, David Marchand wrote:
> >>> Hello Laurent,
> >>>
> >>> Bonne année.
> >>>
> >>> Cc: maintainers.
> >>>
> >>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
> >>>>
> >>>> In current led control API we have no way to know if a device is able
> >>>> to handle on/off requests coming from the application.
> >>>> Knowing if the device is led control capable could be useful to avoid
> >>>> exchanges between application and kernel.
> >>>> Using the on/off requests to flag if the device is led control capable
> >>>> (based on the ENOSUP returned error) is not convenient as such request
> >>>> can change the led state on device.
> >>>>
> >>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
> >>>> for led_off/on dev ops availability on the related pmd, to know if the
> >>>> device is able to handle such led control requests (on/off).
> >>>
> >>> This patch breaks the ABI, which is BAD :-).
> >>
> >> Why it is an ABI break, dev_ops should be between library and drivers, so it
> >> should be out of the ABI concern, isn't it.
> >>
> >>> This new api only needs to look at the existing ops, so you can remove
> >>> the (unused in your patch) dev_led_ctrl_capable ops.
> >>>
> >>> OTOH, would it make sense to expose this capability in dev_flags?
> >>>
> >>
> >> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
> >> supported, can that help application to understand?
> > 
> > In our case, it is not possible to use rte_eth_led_on/off() to check if
> > the feature is supported: on success, it would change the value of the
> > led, and it seems it is not recoverable on some drivers.
> 
> What does it mean it is not recoverable, like can you turn on the led but can't
> turn off it back? Can't this be fixed in the PMD?

In the case there is only one LED, which is by default used to display
the link status, it can never display the status again without resetting
the device.

Maybe an alternative solution would be to add a function, in addition to
on() and off():

  led_ctrl_status_link() to display the status link on the led.


> 
> > 
> > Today it is not possible to know if the feature is available without
> > side effect.
> > 
>
  
Ferruh Yigit Jan. 8, 2020, 12:30 p.m. UTC | #8
On 1/8/2020 9:55 AM, David Marchand wrote:
> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>
>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>> Hello Laurent,
>>>
>>> Bonne année.
>>>
>>> Cc: maintainers.
>>>
>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>
>>>> In current led control API we have no way to know if a device is able
>>>> to handle on/off requests coming from the application.
>>>> Knowing if the device is led control capable could be useful to avoid
>>>> exchanges between application and kernel.
>>>> Using the on/off requests to flag if the device is led control capable
>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>> can change the led state on device.
>>>>
>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>> device is able to handle such led control requests (on/off).
>>>
>>> This patch breaks the ABI, which is BAD :-).
>>
>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>> should be out of the ABI concern, isn't it.
> 
> You are right.
> So in our context, this is not an ABI breakage.
> But abidiff still reports it, so maybe some filtering is required to
> avoid this false positive.
> 
> Note that if we insert an ops before rx_queue_count, we would have a
> real ABI breakage, as this ops is accessed via an inline wrapper by
> applications.
> 

This is good point, perhaps we should add a comment to that line to highlight it.

> 
>>
>>> This new api only needs to look at the existing ops, so you can remove
>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>
>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>
>>
>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>> supported, can that help application to understand?
> 
> You might want to know it is supported without changing the state.
> Laurent?
> 
> 
> 
> --
> David Marchand
>
  
Ferruh Yigit Jan. 8, 2020, 12:59 p.m. UTC | #9
On 1/8/2020 10:31 AM, Laurent Hardy wrote:
> Hi all,
> 
> On 1/8/20 10:55 AM, David Marchand wrote:
>> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>>> Hello Laurent,
>>>>
>>>> Bonne année.
>>>>
>>>> Cc: maintainers.
>>>>
>>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>> In current led control API we have no way to know if a device is able
>>>>> to handle on/off requests coming from the application.
>>>>> Knowing if the device is led control capable could be useful to avoid
>>>>> exchanges between application and kernel.
>>>>> Using the on/off requests to flag if the device is led control capable
>>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>>> can change the led state on device.
>>>>>
>>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>>> device is able to handle such led control requests (on/off).
>>>> This patch breaks the ABI, which is BAD :-).
>>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>>> should be out of the ABI concern, isn't it.
>> You are right.
>> So in our context, this is not an ABI breakage.
>> But abidiff still reports it, so maybe some filtering is required to
>> avoid this false positive.
>>
>> Note that if we insert an ops before rx_queue_count, we would have a
>> real ABI breakage, as this ops is accessed via an inline wrapper by
>> applications.
>>
>>
>>>> This new api only needs to look at the existing ops, so you can remove
>>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>>
>>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>>
>>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>>> supported, can that help application to understand?
>> You might want to know it is supported without changing the state.
>> Laurent?
> 
> First, happy new year :)
> 
> Yes exactly, the purpose of this patch is to query if the device is led 
> control capable or not without changing the led state.
> 
> About exposing the capability through a dev_flags, means to make some 
> modification in each pmds. It looks more easy in term of pmds 
> maintenance to relying on the rte_eth_led_off()/on() dev ops 
> availability at rte_ethdev level, right ?
> 

'dev_flag' definition is not clear, right now it holds the combination of status
and capability. And we have 'rte_eth_dev_info' struct, which is again
combination of device capability and status.
Perhaps we should have explicit capabilities and status fields, even in the
rte_device level which inherited by net/crypto devices etc..

But for dev_ops, instead of having another capabilities indicator, which
requires PMDs to keep this synchronized, I think it is better if we can self
contain this information within dev_ops, like not implementing dev_ops would
mean it is not supported, this way it is easier to maintain and less error prone.

Only we should have it without side effect,

1- adding an additional 'dry-run' parameter can work, but this means breaking
ABI and updating majority of the ethdev APIs :)
2- Adding 'is_supported' versions of the APIs as we need can be an option, like
'rte_eth_led_on_is_supported()'
3- Olivier's suggestion to add a new API to get the led status, so that this
information can be used select led API which won't cause side affect and let us
learn if it is supported.

Any other alternatives?

I would prefer the 2) in above ones, which is very similar to the original patch.
  
David Marchand Jan. 8, 2020, 1 p.m. UTC | #10
On Wed, Jan 8, 2020 at 1:30 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> On 1/8/2020 9:55 AM, David Marchand wrote:
> > On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >> Why it is an ABI break, dev_ops should be between library and drivers, so it
> >> should be out of the ABI concern, isn't it.
> >
> > You are right.
> > So in our context, this is not an ABI breakage.
> > But abidiff still reports it, so maybe some filtering is required to
> > avoid this false positive.
> >
> > Note that if we insert an ops before rx_queue_count, we would have a
> > real ABI breakage, as this ops is accessed via an inline wrapper by
> > applications.
> >
>
> This is good point, perhaps we should add a comment to that line to highlight it.

The comment won't help in the CI checks.


Not talking about short term, but could we consider separating the
inlined ops from the rest (pushing them to rte_eth_dev ?) ?
We would then hide completely eth_dev_ops at the next ABI break window.


--
David Marchand
  
Thomas Monjalon Jan. 8, 2020, 1:06 p.m. UTC | #11
08/01/2020 13:59, Ferruh Yigit:
> On 1/8/2020 10:31 AM, Laurent Hardy wrote:
> > Hi all,
> > 
> > On 1/8/20 10:55 AM, David Marchand wrote:
> >> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >>> On 1/8/2020 8:56 AM, David Marchand wrote:
> >>>> Hello Laurent,
> >>>>
> >>>> Bonne année.
> >>>>
> >>>> Cc: maintainers.
> >>>>
> >>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
> >>>>> In current led control API we have no way to know if a device is able
> >>>>> to handle on/off requests coming from the application.
> >>>>> Knowing if the device is led control capable could be useful to avoid
> >>>>> exchanges between application and kernel.
> >>>>> Using the on/off requests to flag if the device is led control capable
> >>>>> (based on the ENOSUP returned error) is not convenient as such request
> >>>>> can change the led state on device.
> >>>>>
> >>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
> >>>>> for led_off/on dev ops availability on the related pmd, to know if the
> >>>>> device is able to handle such led control requests (on/off).
> >>>> This patch breaks the ABI, which is BAD :-).
> >>> Why it is an ABI break, dev_ops should be between library and drivers, so it
> >>> should be out of the ABI concern, isn't it.
> >> You are right.
> >> So in our context, this is not an ABI breakage.
> >> But abidiff still reports it, so maybe some filtering is required to
> >> avoid this false positive.
> >>
> >> Note that if we insert an ops before rx_queue_count, we would have a
> >> real ABI breakage, as this ops is accessed via an inline wrapper by
> >> applications.
> >>
> >>
> >>>> This new api only needs to look at the existing ops, so you can remove
> >>>> the (unused in your patch) dev_led_ctrl_capable ops.
> >>>>
> >>>> OTOH, would it make sense to expose this capability in dev_flags?
> >>>>
> >>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
> >>> supported, can that help application to understand?
> >> You might want to know it is supported without changing the state.
> >> Laurent?
> > 
> > First, happy new year :)
> > 
> > Yes exactly, the purpose of this patch is to query if the device is led 
> > control capable or not without changing the led state.
> > 
> > About exposing the capability through a dev_flags, means to make some 
> > modification in each pmds. It looks more easy in term of pmds 
> > maintenance to relying on the rte_eth_led_off()/on() dev ops 
> > availability at rte_ethdev level, right ?
> > 
> 
> 'dev_flag' definition is not clear, right now it holds the combination of status
> and capability. And we have 'rte_eth_dev_info' struct, which is again
> combination of device capability and status.

I agree capabilities in ethdev are a bit of a mess.
I would appreciate someone makes a complete audit of it
so we can discuss how to improve the situation.


> Perhaps we should have explicit capabilities and status fields, even in the
> rte_device level which inherited by net/crypto devices etc..

No, ethdev capabilities should stay in ethdev.


> But for dev_ops, instead of having another capabilities indicator, which
> requires PMDs to keep this synchronized, I think it is better if we can self
> contain this information within dev_ops, like not implementing dev_ops would
> mean it is not supported, this way it is easier to maintain and less error prone.

It means the dev_ops is resetted at init if a device does not support the feature.
It is against having const dev_ops.


> Only we should have it without side effect,
> 
> 1- adding an additional 'dry-run' parameter can work, but this means breaking
> ABI and updating majority of the ethdev APIs :)
> 2- Adding 'is_supported' versions of the APIs as we need can be an option, like
> 'rte_eth_led_on_is_supported()'
> 3- Olivier's suggestion to add a new API to get the led status, so that this
> information can be used select led API which won't cause side affect and let us
> learn if it is supported.
> 
> Any other alternatives?
> 
> I would prefer the 2) in above ones, which is very similar to the original patch.

The other alternatives are in rte_eth_dev_info and dev_flags.
  
Ferruh Yigit Jan. 8, 2020, 1:11 p.m. UTC | #12
On 1/8/2020 1:00 PM, David Marchand wrote:
> On Wed, Jan 8, 2020 at 1:30 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>> On 1/8/2020 9:55 AM, David Marchand wrote:
>>> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>>>> should be out of the ABI concern, isn't it.
>>>
>>> You are right.
>>> So in our context, this is not an ABI breakage.
>>> But abidiff still reports it, so maybe some filtering is required to
>>> avoid this false positive.
>>>
>>> Note that if we insert an ops before rx_queue_count, we would have a
>>> real ABI breakage, as this ops is accessed via an inline wrapper by
>>> applications.
>>>
>>
>> This is good point, perhaps we should add a comment to that line to highlight it.
> 
> The comment won't help in the CI checks.
> 
> 
> Not talking about short term, but could we consider separating the
> inlined ops from the rest (pushing them to rte_eth_dev ?) ?
> We would then hide completely eth_dev_ops at the next ABI break window.

+1

We didn't able to apply the patch that removes the inline functions [1] because
of the performance concerns, but at least separating some ops enables us to hide
the eth_dev_ops, although I guess still have to expose 'rte_eth_dev' &
'rte_eth_dev_data'. Still better than nothing.


[1]
https://patches.dpdk.org/patch/58874/
  
Ferruh Yigit Jan. 8, 2020, 1:20 p.m. UTC | #13
On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
> 08/01/2020 13:59, Ferruh Yigit:
>> On 1/8/2020 10:31 AM, Laurent Hardy wrote:
>>> Hi all,
>>>
>>> On 1/8/20 10:55 AM, David Marchand wrote:
>>>> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>>>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>>>>> Hello Laurent,
>>>>>>
>>>>>> Bonne année.
>>>>>>
>>>>>> Cc: maintainers.
>>>>>>
>>>>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>>>> In current led control API we have no way to know if a device is able
>>>>>>> to handle on/off requests coming from the application.
>>>>>>> Knowing if the device is led control capable could be useful to avoid
>>>>>>> exchanges between application and kernel.
>>>>>>> Using the on/off requests to flag if the device is led control capable
>>>>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>>>>> can change the led state on device.
>>>>>>>
>>>>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>>>>> device is able to handle such led control requests (on/off).
>>>>>> This patch breaks the ABI, which is BAD :-).
>>>>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>>>>> should be out of the ABI concern, isn't it.
>>>> You are right.
>>>> So in our context, this is not an ABI breakage.
>>>> But abidiff still reports it, so maybe some filtering is required to
>>>> avoid this false positive.
>>>>
>>>> Note that if we insert an ops before rx_queue_count, we would have a
>>>> real ABI breakage, as this ops is accessed via an inline wrapper by
>>>> applications.
>>>>
>>>>
>>>>>> This new api only needs to look at the existing ops, so you can remove
>>>>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>>>>
>>>>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>>>>
>>>>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>>>>> supported, can that help application to understand?
>>>> You might want to know it is supported without changing the state.
>>>> Laurent?
>>>
>>> First, happy new year :)
>>>
>>> Yes exactly, the purpose of this patch is to query if the device is led 
>>> control capable or not without changing the led state.
>>>
>>> About exposing the capability through a dev_flags, means to make some 
>>> modification in each pmds. It looks more easy in term of pmds 
>>> maintenance to relying on the rte_eth_led_off()/on() dev ops 
>>> availability at rte_ethdev level, right ?
>>>
>>
>> 'dev_flag' definition is not clear, right now it holds the combination of status
>> and capability. And we have 'rte_eth_dev_info' struct, which is again
>> combination of device capability and status.
> 
> I agree capabilities in ethdev are a bit of a mess.
> I would appreciate someone makes a complete audit of it
> so we can discuss how to improve the situation.
> 
> 
>> Perhaps we should have explicit capabilities and status fields, even in the
>> rte_device level which inherited by net/crypto devices etc..
> 
> No, ethdev capabilities should stay in ethdev.

No strong opinion, I though a standardized way may help other device abstraction
layers too.

> 
> 
>> But for dev_ops, instead of having another capabilities indicator, which
>> requires PMDs to keep this synchronized, I think it is better if we can self
>> contain this information within dev_ops, like not implementing dev_ops would
>> mean it is not supported, this way it is easier to maintain and less error prone.
> 
> It means the dev_ops is resetted at init if a device does not support the feature.
> It is against having const dev_ops.

I didn't get your comment.
For example getting FW version, I am saying instead of keeping another piece of
information to say if it is supported by device/driver, better to grasp this
from if the driver implemented 'fw_version_get' dev_ops or not.

> 
> 
>> Only we should have it without side effect,
>>
>> 1- adding an additional 'dry-run' parameter can work, but this means breaking
>> ABI and updating majority of the ethdev APIs :)
>> 2- Adding 'is_supported' versions of the APIs as we need can be an option, like
>> 'rte_eth_led_on_is_supported()'
>> 3- Olivier's suggestion to add a new API to get the led status, so that this
>> information can be used select led API which won't cause side affect and let us
>> learn if it is supported.
>>
>> Any other alternatives?
>>
>> I would prefer the 2) in above ones, which is very similar to the original patch.
> 
> The other alternatives are in rte_eth_dev_info and dev_flags.
> 
>
  
Thomas Monjalon Jan. 8, 2020, 1:25 p.m. UTC | #14
08/01/2020 14:20, Ferruh Yigit:
> On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
> > 08/01/2020 13:59, Ferruh Yigit:
> >> But for dev_ops, instead of having another capabilities indicator, which
> >> requires PMDs to keep this synchronized, I think it is better if we can self
> >> contain this information within dev_ops, like not implementing dev_ops would
> >> mean it is not supported, this way it is easier to maintain and less error prone.
> > 
> > It means the dev_ops is resetted at init if a device does not support the feature.
> > It is against having const dev_ops.
> 
> I didn't get your comment.
> For example getting FW version, I am saying instead of keeping another piece of
> information to say if it is supported by device/driver, better to grasp this
> from if the driver implemented 'fw_version_get' dev_ops or not.

I like this approach.
Capabilities should be expressed by setting the function pointer or not (NULL).
But a driver may support a feature for a subset of devices.
If a device does not support a feature, the function pointer must be set to NULL.
The only issue is having dev_ops as a const struct.
  
Thomas Monjalon Jan. 8, 2020, 1:34 p.m. UTC | #15
08/01/2020 14:25, Thomas Monjalon:
> 08/01/2020 14:20, Ferruh Yigit:
> > On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
> > > 08/01/2020 13:59, Ferruh Yigit:
> > >> But for dev_ops, instead of having another capabilities indicator, which
> > >> requires PMDs to keep this synchronized, I think it is better if we can self
> > >> contain this information within dev_ops, like not implementing dev_ops would
> > >> mean it is not supported, this way it is easier to maintain and less error prone.
> > > 
> > > It means the dev_ops is resetted at init if a device does not support the feature.
> > > It is against having const dev_ops.
> > 
> > I didn't get your comment.
> > For example getting FW version, I am saying instead of keeping another piece of
> > information to say if it is supported by device/driver, better to grasp this
> > from if the driver implemented 'fw_version_get' dev_ops or not.
> 
> I like this approach.
> Capabilities should be expressed by setting the function pointer or not (NULL).
> But a driver may support a feature for a subset of devices.
> If a device does not support a feature, the function pointer must be set to NULL.
> The only issue is having dev_ops as a const struct.

Anyway the dev_ops is not part of the API.
We still need a way to express the capability to the application.
  
Ferruh Yigit Jan. 8, 2020, 1:52 p.m. UTC | #16
On 1/8/2020 1:25 PM, Thomas Monjalon wrote:
> 08/01/2020 14:20, Ferruh Yigit:
>> On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
>>> 08/01/2020 13:59, Ferruh Yigit:
>>>> But for dev_ops, instead of having another capabilities indicator, which
>>>> requires PMDs to keep this synchronized, I think it is better if we can self
>>>> contain this information within dev_ops, like not implementing dev_ops would
>>>> mean it is not supported, this way it is easier to maintain and less error prone.
>>>
>>> It means the dev_ops is resetted at init if a device does not support the feature.
>>> It is against having const dev_ops.
>>
>> I didn't get your comment.
>> For example getting FW version, I am saying instead of keeping another piece of
>> information to say if it is supported by device/driver, better to grasp this
>> from if the driver implemented 'fw_version_get' dev_ops or not.
> 
> I like this approach.
> Capabilities should be expressed by setting the function pointer or not (NULL).
> But a driver may support a feature for a subset of devices.

In that case dev_ops itself can return the '-ENOTSUP', since application
interaction will be through the ethdev API, either API send '-ENOTSUP' because
the dev_ops is NULL or dev_ops itself send the '-ENOTSUP' because of the
underlying version of the device, for application it will be clear that that
feature is not supported.

> If a device does not support a feature, the function pointer must be set to NULL.
> The only issue is having dev_ops as a const struct.

Not sure about changing the dev_ops on runtime, it can be very hard to debug.
  
Ferruh Yigit Jan. 8, 2020, 1:53 p.m. UTC | #17
On 1/8/2020 1:34 PM, Thomas Monjalon wrote:
> 08/01/2020 14:25, Thomas Monjalon:
>> 08/01/2020 14:20, Ferruh Yigit:
>>> On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
>>>> 08/01/2020 13:59, Ferruh Yigit:
>>>>> But for dev_ops, instead of having another capabilities indicator, which
>>>>> requires PMDs to keep this synchronized, I think it is better if we can self
>>>>> contain this information within dev_ops, like not implementing dev_ops would
>>>>> mean it is not supported, this way it is easier to maintain and less error prone.
>>>>
>>>> It means the dev_ops is resetted at init if a device does not support the feature.
>>>> It is against having const dev_ops.
>>>
>>> I didn't get your comment.
>>> For example getting FW version, I am saying instead of keeping another piece of
>>> information to say if it is supported by device/driver, better to grasp this
>>> from if the driver implemented 'fw_version_get' dev_ops or not.
>>
>> I like this approach.
>> Capabilities should be expressed by setting the function pointer or not (NULL).
>> But a driver may support a feature for a subset of devices.
>> If a device does not support a feature, the function pointer must be set to NULL.
>> The only issue is having dev_ops as a const struct.
> 
> Anyway the dev_ops is not part of the API.
> We still need a way to express the capability to the application.
> 

+1
  
Laurent Hardy Jan. 8, 2020, 1:58 p.m. UTC | #18
On 1/8/20 2:06 PM, Thomas Monjalon wrote:
> 08/01/2020 13:59, Ferruh Yigit:
>> On 1/8/2020 10:31 AM, Laurent Hardy wrote:
>>> Hi all,
>>>
>>> On 1/8/20 10:55 AM, David Marchand wrote:
>>>> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>>>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>>>>> Hello Laurent,
>>>>>>
>>>>>> Bonne année.
>>>>>>
>>>>>> Cc: maintainers.
>>>>>>
>>>>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>>>> In current led control API we have no way to know if a device is able
>>>>>>> to handle on/off requests coming from the application.
>>>>>>> Knowing if the device is led control capable could be useful to avoid
>>>>>>> exchanges between application and kernel.
>>>>>>> Using the on/off requests to flag if the device is led control capable
>>>>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>>>>> can change the led state on device.
>>>>>>>
>>>>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>>>>> device is able to handle such led control requests (on/off).
>>>>>> This patch breaks the ABI, which is BAD :-).
>>>>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>>>>> should be out of the ABI concern, isn't it.
>>>> You are right.
>>>> So in our context, this is not an ABI breakage.
>>>> But abidiff still reports it, so maybe some filtering is required to
>>>> avoid this false positive.
>>>>
>>>> Note that if we insert an ops before rx_queue_count, we would have a
>>>> real ABI breakage, as this ops is accessed via an inline wrapper by
>>>> applications.
>>>>
>>>>
>>>>>> This new api only needs to look at the existing ops, so you can remove
>>>>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>>>>
>>>>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>>>>
>>>>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>>>>> supported, can that help application to understand?
>>>> You might want to know it is supported without changing the state.
>>>> Laurent?
>>> First, happy new year :)
>>>
>>> Yes exactly, the purpose of this patch is to query if the device is led
>>> control capable or not without changing the led state.
>>>
>>> About exposing the capability through a dev_flags, means to make some
>>> modification in each pmds. It looks more easy in term of pmds
>>> maintenance to relying on the rte_eth_led_off()/on() dev ops
>>> availability at rte_ethdev level, right ?
>>>
>> 'dev_flag' definition is not clear, right now it holds the combination of status
>> and capability. And we have 'rte_eth_dev_info' struct, which is again
>> combination of device capability and status.
> I agree capabilities in ethdev are a bit of a mess.
> I would appreciate someone makes a complete audit of it
> so we can discuss how to improve the situation.
>
>
>> Perhaps we should have explicit capabilities and status fields, even in the
>> rte_device level which inherited by net/crypto devices etc..
> No, ethdev capabilities should stay in ethdev.
>
>
>> But for dev_ops, instead of having another capabilities indicator, which
>> requires PMDs to keep this synchronized, I think it is better if we can self
>> contain this information within dev_ops, like not implementing dev_ops would
>> mean it is not supported, this way it is easier to maintain and less error prone.
> It means the dev_ops is resetted at init if a device does not support the feature.
> It is against having const dev_ops.
>
>
>> Only we should have it without side effect,
>>
>> 1- adding an additional 'dry-run' parameter can work, but this means breaking
>> ABI and updating majority of the ethdev APIs :)
>> 2- Adding 'is_supported' versions of the APIs as we need can be an option, like
>> 'rte_eth_led_on_is_supported()'
>> 3- Olivier's suggestion to add a new API to get the led status, so that this
>> information can be used select led API which won't cause side affect and let us
>> learn if it is supported.
>>
>> Any other alternatives?
>>
>> I would prefer the 2) in above ones, which is very similar to the original patch.

I can provide a V2 which will remove the useless dev_led_ctrl_capable ops.

About the 'is_supported()' versions of APIs, in the current patch I 
factorize
the check on dev ops on and off availability in a same function named
"led_ctrl_capable" but I can rename it if required.

Just in this specific case I don't dissociate on and off capability, as 
being
able to set the led off without a way to set it on again sounds a bit 
unusual :)

> The other alternatives are in rte_eth_dev_info and dev_flags.
>
>
>
  
Ferruh Yigit Jan. 8, 2020, 2:01 p.m. UTC | #19
On 1/8/2020 1:52 PM, Ferruh Yigit wrote:
> On 1/8/2020 1:25 PM, Thomas Monjalon wrote:
>> 08/01/2020 14:20, Ferruh Yigit:
>>> On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
>>>> 08/01/2020 13:59, Ferruh Yigit:
>>>>> But for dev_ops, instead of having another capabilities indicator, which
>>>>> requires PMDs to keep this synchronized, I think it is better if we can self
>>>>> contain this information within dev_ops, like not implementing dev_ops would
>>>>> mean it is not supported, this way it is easier to maintain and less error prone.
>>>>
>>>> It means the dev_ops is resetted at init if a device does not support the feature.
>>>> It is against having const dev_ops.
>>>
>>> I didn't get your comment.
>>> For example getting FW version, I am saying instead of keeping another piece of
>>> information to say if it is supported by device/driver, better to grasp this
>>> from if the driver implemented 'fw_version_get' dev_ops or not.
>>
>> I like this approach.
>> Capabilities should be expressed by setting the function pointer or not (NULL).
>> But a driver may support a feature for a subset of devices.
> 
> In that case dev_ops itself can return the '-ENOTSUP', since application
> interaction will be through the ethdev API, either API send '-ENOTSUP' because
> the dev_ops is NULL or dev_ops itself send the '-ENOTSUP' because of the
> underlying version of the device, for application it will be clear that that
> feature is not supported.

For this specific problem,
What do you think having:
rte_eth_led_on_is_supported()
rte_eth_led_off_is_supported()

like:

rte_eth_led_on_is_supported() {
  RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
  RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
  return 0;
}

rte_eth_led_on() {
  ret = rte_eth_led_on_is_supported();
  if (ret)
    return ret;
  return eth_err(port_id, (*dev->dev_ops->dev_led_on)(dev));
}

> 
>> If a device does not support a feature, the function pointer must be set to NULL.
>> The only issue is having dev_ops as a const struct.
> 
> Not sure about changing the dev_ops on runtime, it can be very hard to debug.
>
  
Thomas Monjalon Jan. 8, 2020, 2:07 p.m. UTC | #20
08/01/2020 14:58, Laurent Hardy:
> About the 'is_supported()' versions of APIs, in the current patch I 
> factorize
> the check on dev ops on and off availability in a same function named
> "led_ctrl_capable" but I can rename it if required.
> 
> Just in this specific case I don't dissociate on and off capability, as 
> being
> able to set the led off without a way to set it on again sounds a bit 
> unusual :)
> 
> > The other alternatives are in rte_eth_dev_info and dev_flags.

Basically we just need to decide whether we prefer a new function
or a new flag.

Until now, capabilities were given in flags.
Why a function here?
  
Ferruh Yigit Jan. 8, 2020, 2:08 p.m. UTC | #21
On 1/8/2020 12:27 PM, Olivier Matz wrote:
> On Wed, Jan 08, 2020 at 12:12:11PM +0000, Ferruh Yigit wrote:
>> On 1/8/2020 9:42 AM, Olivier Matz wrote:
>>> On Wed, Jan 08, 2020 at 09:09:29AM +0000, Ferruh Yigit wrote:
>>>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>>>> Hello Laurent,
>>>>>
>>>>> Bonne année.
>>>>>
>>>>> Cc: maintainers.
>>>>>
>>>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>>>
>>>>>> In current led control API we have no way to know if a device is able
>>>>>> to handle on/off requests coming from the application.
>>>>>> Knowing if the device is led control capable could be useful to avoid
>>>>>> exchanges between application and kernel.
>>>>>> Using the on/off requests to flag if the device is led control capable
>>>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>>>> can change the led state on device.
>>>>>>
>>>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>>>> device is able to handle such led control requests (on/off).
>>>>>
>>>>> This patch breaks the ABI, which is BAD :-).
>>>>
>>>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>>>> should be out of the ABI concern, isn't it.
>>>>
>>>>> This new api only needs to look at the existing ops, so you can remove
>>>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>>>
>>>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>>>
>>>>
>>>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>>>> supported, can that help application to understand?
>>>
>>> In our case, it is not possible to use rte_eth_led_on/off() to check if
>>> the feature is supported: on success, it would change the value of the
>>> led, and it seems it is not recoverable on some drivers.
>>
>> What does it mean it is not recoverable, like can you turn on the led but can't
>> turn off it back? Can't this be fixed in the PMD?
> 
> In the case there is only one LED, which is by default used to display
> the link status, it can never display the status again without resetting
> the device.

Is there a specific PMD are we talking about?

> 
> Maybe an alternative solution would be to add a function, in addition to
> on() and off():
> 
>   led_ctrl_status_link() to display the status link on the led.
> 
> 
>>
>>>
>>> Today it is not possible to know if the feature is available without
>>> side effect.
>>>
>>
  
Andrew Rybchenko Jan. 8, 2020, 2:15 p.m. UTC | #22
On 1/8/20 4:52 PM, Ferruh Yigit wrote:
> On 1/8/2020 1:25 PM, Thomas Monjalon wrote:
>> 08/01/2020 14:20, Ferruh Yigit:
>>> On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
>>>> 08/01/2020 13:59, Ferruh Yigit:
>>>>> But for dev_ops, instead of having another capabilities indicator, which
>>>>> requires PMDs to keep this synchronized, I think it is better if we can self
>>>>> contain this information within dev_ops, like not implementing dev_ops would
>>>>> mean it is not supported, this way it is easier to maintain and less error prone.
>>>>
>>>> It means the dev_ops is resetted at init if a device does not support the feature.
>>>> It is against having const dev_ops.
>>>
>>> I didn't get your comment.
>>> For example getting FW version, I am saying instead of keeping another piece of
>>> information to say if it is supported by device/driver, better to grasp this
>>> from if the driver implemented 'fw_version_get' dev_ops or not.
>>
>> I like this approach.
>> Capabilities should be expressed by setting the function pointer or not (NULL).
>> But a driver may support a feature for a subset of devices.
> 
> In that case dev_ops itself can return the '-ENOTSUP', since application
> interaction will be through the ethdev API, either API send '-ENOTSUP' because
> the dev_ops is NULL or dev_ops itself send the '-ENOTSUP' because of the
> underlying version of the device, for application it will be clear that that
> feature is not supported.

I think it is a good illustration why deriving the capability
from dev_ops pointer is not that good idea.

>> If a device does not support a feature, the function pointer must be set to NULL.
>> The only issue is having dev_ops as a const struct.
> 
> Not sure about changing the dev_ops on runtime, it can be very hard to debug.

I hope it was just an idea to copy dev_ops and adjust in
accordance with the device capabilities on register.
I.e. not fully dynamic changes in runtime.
  
Thomas Monjalon Jan. 8, 2020, 2:27 p.m. UTC | #23
08/01/2020 15:15, Andrew Rybchenko:
> On 1/8/20 4:52 PM, Ferruh Yigit wrote:
> > On 1/8/2020 1:25 PM, Thomas Monjalon wrote:
> >> 08/01/2020 14:20, Ferruh Yigit:
> >>> On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
> >>>> 08/01/2020 13:59, Ferruh Yigit:
> >>>>> But for dev_ops, instead of having another capabilities indicator, which
> >>>>> requires PMDs to keep this synchronized, I think it is better if we can self
> >>>>> contain this information within dev_ops, like not implementing dev_ops would
> >>>>> mean it is not supported, this way it is easier to maintain and less error prone.
> >>>>
> >>>> It means the dev_ops is resetted at init if a device does not support the feature.
> >>>> It is against having const dev_ops.
> >>>
> >>> I didn't get your comment.
> >>> For example getting FW version, I am saying instead of keeping another piece of
> >>> information to say if it is supported by device/driver, better to grasp this
> >>> from if the driver implemented 'fw_version_get' dev_ops or not.
> >>
> >> I like this approach.
> >> Capabilities should be expressed by setting the function pointer or not (NULL).
> >> But a driver may support a feature for a subset of devices.
> > 
> > In that case dev_ops itself can return the '-ENOTSUP', since application
> > interaction will be through the ethdev API, either API send '-ENOTSUP' because
> > the dev_ops is NULL or dev_ops itself send the '-ENOTSUP' because of the
> > underlying version of the device, for application it will be clear that that
> > feature is not supported.
> 
> I think it is a good illustration why deriving the capability
> from dev_ops pointer is not that good idea.
> 
> >> If a device does not support a feature, the function pointer must be set to NULL.
> >> The only issue is having dev_ops as a const struct.
> > 
> > Not sure about changing the dev_ops on runtime, it can be very hard to debug.
> 
> I hope it was just an idea to copy dev_ops and adjust in
> accordance with the device capabilities on register.
> I.e. not fully dynamic changes in runtime.

Changing a function pointer in runtime is tough :)
I was thinking about changing it during init but I don't really see a great value.
Probably better to return ENOTSUP.

Anyway it does not address the capability info need.
  
Andrew Rybchenko Jan. 8, 2020, 2:37 p.m. UTC | #24
On 1/8/20 5:27 PM, Thomas Monjalon wrote:
> 08/01/2020 15:15, Andrew Rybchenko:
>> On 1/8/20 4:52 PM, Ferruh Yigit wrote:
>>> On 1/8/2020 1:25 PM, Thomas Monjalon wrote:
>>>> 08/01/2020 14:20, Ferruh Yigit:
>>>>> On 1/8/2020 1:06 PM, Thomas Monjalon wrote:
>>>>>> 08/01/2020 13:59, Ferruh Yigit:
>>>>>>> But for dev_ops, instead of having another capabilities indicator, which
>>>>>>> requires PMDs to keep this synchronized, I think it is better if we can self
>>>>>>> contain this information within dev_ops, like not implementing dev_ops would
>>>>>>> mean it is not supported, this way it is easier to maintain and less error prone.
>>>>>> It means the dev_ops is resetted at init if a device does not support the feature.
>>>>>> It is against having const dev_ops.
>>>>> I didn't get your comment.
>>>>> For example getting FW version, I am saying instead of keeping another piece of
>>>>> information to say if it is supported by device/driver, better to grasp this
>>>>> from if the driver implemented 'fw_version_get' dev_ops or not.
>>>> I like this approach.
>>>> Capabilities should be expressed by setting the function pointer or not (NULL).
>>>> But a driver may support a feature for a subset of devices.
>>> In that case dev_ops itself can return the '-ENOTSUP', since application
>>> interaction will be through the ethdev API, either API send '-ENOTSUP' because
>>> the dev_ops is NULL or dev_ops itself send the '-ENOTSUP' because of the
>>> underlying version of the device, for application it will be clear that that
>>> feature is not supported.
>> I think it is a good illustration why deriving the capability
>> from dev_ops pointer is not that good idea.
>>
>>>> If a device does not support a feature, the function pointer must be set to NULL.
>>>> The only issue is having dev_ops as a const struct.
>>> Not sure about changing the dev_ops on runtime, it can be very hard to debug.
>> I hope it was just an idea to copy dev_ops and adjust in
>> accordance with the device capabilities on register.
>> I.e. not fully dynamic changes in runtime.
> Changing a function pointer in runtime is tough :)
> I was thinking about changing it during init but I don't really see a great value.

Yes exactly, copying just solve the 'const' problem.

> Probably better to return ENOTSUP.
>
> Anyway it does not address the capability info need.

Yes, I agree. Back to other branch of the thread:
dev_info flag vs dedicated dev_ops function.
  
Laurent Hardy Jan. 8, 2020, 2:45 p.m. UTC | #25
On 1/8/20 3:08 PM, Ferruh Yigit wrote:
> On 1/8/2020 12:27 PM, Olivier Matz wrote:
>> On Wed, Jan 08, 2020 at 12:12:11PM +0000, Ferruh Yigit wrote:
>>> On 1/8/2020 9:42 AM, Olivier Matz wrote:
>>>> On Wed, Jan 08, 2020 at 09:09:29AM +0000, Ferruh Yigit wrote:
>>>>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>>>>> Hello Laurent,
>>>>>>
>>>>>> Bonne année.
>>>>>>
>>>>>> Cc: maintainers.
>>>>>>
>>>>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>>>> In current led control API we have no way to know if a device is able
>>>>>>> to handle on/off requests coming from the application.
>>>>>>> Knowing if the device is led control capable could be useful to avoid
>>>>>>> exchanges between application and kernel.
>>>>>>> Using the on/off requests to flag if the device is led control capable
>>>>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>>>>> can change the led state on device.
>>>>>>>
>>>>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>>>>> device is able to handle such led control requests (on/off).
>>>>>> This patch breaks the ABI, which is BAD :-).
>>>>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>>>>> should be out of the ABI concern, isn't it.
>>>>>
>>>>>> This new api only needs to look at the existing ops, so you can remove
>>>>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>>>>
>>>>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>>>>
>>>>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>>>>> supported, can that help application to understand?
>>>> In our case, it is not possible to use rte_eth_led_on/off() to check if
>>>> the feature is supported: on success, it would change the value of the
>>>> led, and it seems it is not recoverable on some drivers.
>>> What does it mean it is not recoverable, like can you turn on the led but can't
>>> turn off it back? Can't this be fixed in the PMD?
>> In the case there is only one LED, which is by default used to display
>> the link status, it can never display the status again without resetting
>> the device.
> Is there a specific PMD are we talking about?

e1000 pmd which seems to have different behavior according to the

nics used (at least it has been reported using i210 and i350 copper).

>
>> Maybe an alternative solution would be to add a function, in addition to
>> on() and off():
>>
>>    led_ctrl_status_link() to display the status link on the led.
>>
>>
>>>> Today it is not possible to know if the feature is available without
>>>> side effect.
>>>>
  
Laurent Hardy Jan. 8, 2020, 3:16 p.m. UTC | #26
On 1/8/20 3:07 PM, Thomas Monjalon wrote:
> 08/01/2020 14:58, Laurent Hardy:
>> About the 'is_supported()' versions of APIs, in the current patch I
>> factorize
>> the check on dev ops on and off availability in a same function named
>> "led_ctrl_capable" but I can rename it if required.
>>
>> Just in this specific case I don't dissociate on and off capability, as
>> being
>> able to set the led off without a way to set it on again sounds a bit
>> unusual :)
>>
>>> The other alternatives are in rte_eth_dev_info and dev_flags.
> Basically we just need to decide whether we prefer a new function
> or a new flag.
>
> Until now, capabilities were given in flags.
> Why a function here?
>
For this case, (led control API) all material is already available at 
rte_ethdev layer.

So you could rely on led_off/on ops availability without the need to 
add/maintain in

all pmds some flags to expose such capabilities.

What do you suggest to set a capability flag for the device at 
rte_ethdev level?
  
Ferruh Yigit May 8, 2020, 12:03 p.m. UTC | #27
On 1/8/2020 2:07 PM, Thomas Monjalon wrote:
> 08/01/2020 14:58, Laurent Hardy:
>> About the 'is_supported()' versions of APIs, in the current patch I 
>> factorize
>> the check on dev ops on and off availability in a same function named
>> "led_ctrl_capable" but I can rename it if required.
>>
>> Just in this specific case I don't dissociate on and off capability, as 
>> being
>> able to set the led off without a way to set it on again sounds a bit 
>> unusual :)
>>
>>> The other alternatives are in rte_eth_dev_info and dev_flags.
> 
> Basically we just need to decide whether we prefer a new function
> or a new flag.
> 
> Until now, capabilities were given in flags.
> Why a function here?
> 

Capabilities provided by dev_ops not given by flags, and I am not sure opening
that door.

Right now offload capabilities are given by flags, which has dedicated flag.
dev_flags is used more like status, it has things like 'DEV_BONDED_SLAVE',
'DEV_REPRESENTOR', etc.
  
Ferruh Yigit May 8, 2020, 12:11 p.m. UTC | #28
On 1/8/2020 1:58 PM, Laurent Hardy wrote:
> 
> On 1/8/20 2:06 PM, Thomas Monjalon wrote:
>> 08/01/2020 13:59, Ferruh Yigit:
>>> On 1/8/2020 10:31 AM, Laurent Hardy wrote:
>>>> Hi all,
>>>>
>>>> On 1/8/20 10:55 AM, David Marchand wrote:
>>>>> On Wed, Jan 8, 2020 at 10:09 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>>>>> On 1/8/2020 8:56 AM, David Marchand wrote:
>>>>>>> Hello Laurent,
>>>>>>>
>>>>>>> Bonne année.
>>>>>>>
>>>>>>> Cc: maintainers.
>>>>>>>
>>>>>>> On Tue, Jan 7, 2020 at 3:57 PM Laurent Hardy <laurent.hardy@6wind.com> wrote:
>>>>>>>> In current led control API we have no way to know if a device is able
>>>>>>>> to handle on/off requests coming from the application.
>>>>>>>> Knowing if the device is led control capable could be useful to avoid
>>>>>>>> exchanges between application and kernel.
>>>>>>>> Using the on/off requests to flag if the device is led control capable
>>>>>>>> (based on the ENOSUP returned error) is not convenient as such request
>>>>>>>> can change the led state on device.
>>>>>>>>
>>>>>>>> This patch adds a new function rte_eth_led_ctrl_capable() that will look
>>>>>>>> for led_off/on dev ops availability on the related pmd, to know if the
>>>>>>>> device is able to handle such led control requests (on/off).
>>>>>>> This patch breaks the ABI, which is BAD :-).
>>>>>> Why it is an ABI break, dev_ops should be between library and drivers, so it
>>>>>> should be out of the ABI concern, isn't it.
>>>>> You are right.
>>>>> So in our context, this is not an ABI breakage.
>>>>> But abidiff still reports it, so maybe some filtering is required to
>>>>> avoid this false positive.
>>>>>
>>>>> Note that if we insert an ops before rx_queue_count, we would have a
>>>>> real ABI breakage, as this ops is accessed via an inline wrapper by
>>>>> applications.
>>>>>
>>>>>
>>>>>>> This new api only needs to look at the existing ops, so you can remove
>>>>>>> the (unused in your patch) dev_led_ctrl_capable ops.
>>>>>>>
>>>>>>> OTOH, would it make sense to expose this capability in dev_flags?
>>>>>>>
>>>>>> 'rte_eth_led_on()' & 'rte_eth_led_off()' APIs returns '-ENOTSUP' when the not
>>>>>> supported, can that help application to understand?
>>>>> You might want to know it is supported without changing the state.
>>>>> Laurent?
>>>> First, happy new year :)
>>>>
>>>> Yes exactly, the purpose of this patch is to query if the device is led
>>>> control capable or not without changing the led state.
>>>>
>>>> About exposing the capability through a dev_flags, means to make some
>>>> modification in each pmds. It looks more easy in term of pmds
>>>> maintenance to relying on the rte_eth_led_off()/on() dev ops
>>>> availability at rte_ethdev level, right ?
>>>>
>>> 'dev_flag' definition is not clear, right now it holds the combination of status
>>> and capability. And we have 'rte_eth_dev_info' struct, which is again
>>> combination of device capability and status.
>> I agree capabilities in ethdev are a bit of a mess.
>> I would appreciate someone makes a complete audit of it
>> so we can discuss how to improve the situation.
>>
>>
>>> Perhaps we should have explicit capabilities and status fields, even in the
>>> rte_device level which inherited by net/crypto devices etc..
>> No, ethdev capabilities should stay in ethdev.
>>
>>
>>> But for dev_ops, instead of having another capabilities indicator, which
>>> requires PMDs to keep this synchronized, I think it is better if we can self
>>> contain this information within dev_ops, like not implementing dev_ops would
>>> mean it is not supported, this way it is easier to maintain and less error prone.
>> It means the dev_ops is resetted at init if a device does not support the feature.
>> It is against having const dev_ops.
>>
>>
>>> Only we should have it without side effect,
>>>
>>> 1- adding an additional 'dry-run' parameter can work, but this means breaking
>>> ABI and updating majority of the ethdev APIs :)
>>> 2- Adding 'is_supported' versions of the APIs as we need can be an option, like
>>> 'rte_eth_led_on_is_supported()'
>>> 3- Olivier's suggestion to add a new API to get the led status, so that this
>>> information can be used select led API which won't cause side affect and let us
>>> learn if it is supported.
>>>
>>> Any other alternatives?
>>>
>>> I would prefer the 2) in above ones, which is very similar to the original patch.
> 
> I can provide a V2 which will remove the useless dev_led_ctrl_capable ops.

+1, dev_led_ctrl_capable is not used.

> 
> About the 'is_supported()' versions of APIs, in the current patch I 
> factorize
> the check on dev ops on and off availability in a same function named
> "led_ctrl_capable" but I can rename it if required.
> 
> Just in this specific case I don't dissociate on and off capability, as 
> being
> able to set the led off without a way to set it on again sounds a bit 
> unusual :)

What about following,
Right now there is not way to get led status, only have on/off
We can store status in ethdev layer add a 'rte_eth_led_status' which can return
status.
If the on/off dev_ops are not set, it can return 'unavailable' which covers your
usecase.
  

Patch

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index 8394a6595..012645dc5 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -732,6 +732,11 @@  registers and register size).
 LED
 ---
 
+Interrogates device to know if it is led control capable.
+
+* **[implements] eth_dev_ops**: ``dev_led_ctrl_capable``.
+* **[related]    API**: ``rte_eth_led_ctrl_capable()``.
+
 Supports turning on/off a software controllable LED on a device.
 
 * **[implements] eth_dev_ops**: ``dev_led_on``, ``dev_led_off``.
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 6e9cb243e..b259b6b19 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -3612,6 +3612,18 @@  rte_eth_led_off(uint16_t port_id)
 	return eth_err(port_id, (*dev->dev_ops->dev_led_off)(dev));
 }
 
+int
+rte_eth_led_ctrl_capable(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_off, -ENOTSUP);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
+	return 0;
+}
+
 /*
  * Returns index into MAC address array of addr. Use 00:00:00:00:00:00 to find
  * an empty spot.
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 18a9defc2..a5bacd643 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -3204,6 +3204,21 @@  int  rte_eth_led_on(uint16_t port_id);
  */
 int  rte_eth_led_off(uint16_t port_id);
 
+/**
+ * Interrogate the Ethernet device to know if it is led control capable.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
+ */
+int  __rte_experimental
+rte_eth_led_ctrl_capable(uint16_t port_id);
+
 /**
  * Get current status of the Ethernet link flow control for Ethernet device
  *
diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 7bf97e24e..6cf2a5242 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -388,6 +388,9 @@  typedef int (*eth_dev_led_on_t)(struct rte_eth_dev *dev);
 typedef int (*eth_dev_led_off_t)(struct rte_eth_dev *dev);
 /**< @internal Turn off SW controllable LED on an Ethernet device */
 
+typedef int (*eth_dev_led_ctrl_capable_t)(struct rte_eth_dev *dev);
+/**< @internal Get led control capability on an Ethernet device */
+
 typedef void (*eth_mac_addr_remove_t)(struct rte_eth_dev *dev, uint32_t index);
 /**< @internal Remove MAC address from receive address register */
 
@@ -675,6 +678,7 @@  struct eth_dev_ops {
 
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
 	eth_dev_led_off_t          dev_led_off;   /**< Turn off LED. */
+	eth_dev_led_ctrl_capable_t dev_led_ctrl_capable;  /**< Get led control capability. */
 
 	flow_ctrl_get_t            flow_ctrl_get; /**< Get flow control. */
 	flow_ctrl_set_t            flow_ctrl_set; /**< Setup flow control. */
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index a7dacf2cf..776f3b5d6 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -227,4 +227,5 @@  EXPERIMENTAL {
 	rte_flow_dynf_metadata_mask;
 	rte_flow_dynf_metadata_register;
 	rte_eth_dev_set_ptypes;
+	rte_eth_led_ctrl_capable;
 };