[dpdk-dev,2/2] ethdev: pre-emptively document rte_eth_dev_reset error code

Message ID 20171019134827.22740-2-luca.boccassi@gmail.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Luca Boccassi Oct. 19, 2017, 1:48 p.m. UTC
  From: Luca Boccassi <bluca@debian.org>

When VF reset will be supported by drivers, the API will most likely
have to return -EAGAIN to avoid blocking when the VF cannot be reset
because the PF is down.
Document it immediately even if it's not yet supported, so that users
and developers can already take into account about this use case, and
thus avoid an API-incompatible change later on.

This is based on real-world production usage and customer escalations,
using earlier patches from Intel.

Signed-off-by: Luca Boccassi <bluca@debian.org>
---
 lib/librte_ether/rte_ethdev.h | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Remy Horton Oct. 19, 2017, 2:53 p.m. UTC | #1
On 19/10/2017 14:48, luca.boccassi@gmail.com wrote:
> Document it immediately even if it's not yet supported, so that users
> and developers can already take into account about this use case, and
> thus avoid an API-incompatible change later on.

I'm not sure about documenting unimplemented features, as API docs ought 
to describe what the code currently does. Then again reason seems OK and 
I don't think there's hard guidelines on this..


> This is based on real-world production usage and customer escalations,
> using earlier patches from Intel.

Can you give the patchwork link for these patches?


..Remy
  
Luca Boccassi Oct. 19, 2017, 4:14 p.m. UTC | #2
On Thu, 2017-10-19 at 15:53 +0100, Remy Horton wrote:
> On 19/10/2017 14:48, luca.boccassi@gmail.com wrote:
> > Document it immediately even if it's not yet supported, so that
> > users
> > and developers can already take into account about this use case,
> > and
> > thus avoid an API-incompatible change later on.
> 
> I'm not sure about documenting unimplemented features, as API docs
> ought 
> to describe what the code currently does. Then again reason seems OK
> and 
> I don't think there's hard guidelines on this..

Yeah I realised that, that's why it's a separate patch :-)

I'm just trying to avoid a foreseeable API breakage given we've been
using this API in production.

OTOH there are a few cases where perhaps EAGAIN is already a possible
error code to return - for example where the driver fails to
initialise? For example there's an error path that just returns -1 in
i40evf_dev_init

> > This is based on real-world production usage and customer
> > escalations,
> > using earlier patches from Intel.
> 
> Can you give the patchwork link for these patches?
> 
> 
> ..Remy

Based on these ones:

http://dpdk.org/dev/patchwork/patch/14009/
http://dpdk.org/dev/patchwork/patch/14011/
http://dpdk.org/dev/patchwork/patch/14010/

I had sent some feedback, as especially the ixgbe one was a blocking
call in case the PF was down, which is a deal breaker in most
situations given the API will be called from the controller thread in
most cases.

We've adapted and used these patches with the early rte_eth_dev_reset
for a year in production now, and we had a customer who requested it
since they were running into the problem it solves (PF flaps).

I have adapted them on the latest 17.11 tree and tested with X540
10gbit cards, and it seems to work as before. Should I send an RFC and
CC all of you?

Incidentally, are there specific reasons why the VF functionality was
dropped since the first patches were sent?
  
Thomas Monjalon Oct. 23, 2017, 10:11 p.m. UTC | #3
19/10/2017 16:53, Remy Horton:
> 
> On 19/10/2017 14:48, luca.boccassi@gmail.com wrote:
> > Document it immediately even if it's not yet supported, so that users
> > and developers can already take into account about this use case, and
> > thus avoid an API-incompatible change later on.
> 
> I'm not sure about documenting unimplemented features, as API docs ought 
> to describe what the code currently does. Then again reason seems OK and 
> I don't think there's hard guidelines on this..

An API does not need to be implemented to describe its behaviour.
We just have to agree about which behaviour we want to expect.

It seems this error code is reasonable.
But I am not sure you need to give the details about the cause
of the error. The most important is to advise app writer to retry
resetting later when the device is ready. Isn't it?
  
Remy Horton Oct. 24, 2017, 6:18 a.m. UTC | #4
On 19/10/2017 17:14, Luca Boccassi wrote:
[..]
> We've adapted and used these patches with the early rte_eth_dev_reset
> for a year in production now, and we had a customer who requested it
> since they were running into the problem it solves (PF flaps).
>
> I have adapted them on the latest 17.11 tree and tested with X540
> 10gbit cards, and it seems to work as before. Should I send an RFC and
> CC all of you?

Since it sounds stable, probably best to post the updated/rebased patch. 
Should get merged as long as nothing breaks.


> Incidentally, are there specific reasons why the VF functionality was
> dropped since the first patches were sent?

I'm personally not sure, but the others should know.
  
Luca Boccassi Oct. 24, 2017, noon UTC | #5
On Tue, 2017-10-24 at 00:11 +0200, Thomas Monjalon wrote:
> 19/10/2017 16:53, Remy Horton:
> > 
> > On 19/10/2017 14:48, luca.boccassi@gmail.com wrote:
> > > Document it immediately even if it's not yet supported, so that
> > > users
> > > and developers can already take into account about this use case,
> > > and
> > > thus avoid an API-incompatible change later on.
> > 
> > I'm not sure about documenting unimplemented features, as API docs
> > ought 
> > to describe what the code currently does. Then again reason seems
> > OK and 
> > I don't think there's hard guidelines on this..
> 
> An API does not need to be implemented to describe its behaviour.
> We just have to agree about which behaviour we want to expect.
> 
> It seems this error code is reasonable.
> But I am not sure you need to give the details about the cause
> of the error. The most important is to advise app writer to retry
> resetting later when the device is ready. Isn't it?

Sure, makes sense, see v2.
  
Luca Boccassi Oct. 24, 2017, 12:01 p.m. UTC | #6
On Tue, 2017-10-24 at 07:18 +0100, Remy Horton wrote:
> On 19/10/2017 17:14, Luca Boccassi wrote:
> [..]
> > We've adapted and used these patches with the early
> > rte_eth_dev_reset
> > for a year in production now, and we had a customer who requested
> > it
> > since they were running into the problem it solves (PF flaps).
> > 
> > I have adapted them on the latest 17.11 tree and tested with X540
> > 10gbit cards, and it seems to work as before. Should I send an RFC
> > and
> > CC all of you?
> 
> Since it sounds stable, probably best to post the updated/rebased
> patch. 
> Should get merged as long as nothing breaks.

I can send an RFC later today.

> 
> > Incidentally, are there specific reasons why the VF functionality
> > was
> > dropped since the first patches were sent?
> 
> I'm personally not sure, but the others should know.
  

Patch

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 9cdb9724a..126f42d3c 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2270,6 +2270,7 @@  void rte_eth_dev_close(uint16_t port_id);
  *   - (-EPERM) if not ran from the primary process.
  *   - (-EIO) if re-initialisation failed.
  *   - (-ENOMEM) if the reset failed due to OOM.
+ *   - (-EAGAIN) if PF is not up and the reset cannot proceed yet.
  */
 int rte_eth_dev_reset(uint16_t port_id);