[RFC,v4,3/3] app/testpmd: handle device recovery event

Message ID 20200930123314.27669-4-kalesh-anakkur.purayil@broadcom.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series librte_ethdev: error recovery support

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Kalesh A P Sept. 30, 2020, 12:33 p.m. UTC
  From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

Add code to handle the device reset and recovery events in testpmd.
These events are an indication from the PMD that the device has been
reset and has recovered from an error condition.

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
---
 app/test-pmd/testpmd.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
  

Comments

Ophir Munk Oct. 6, 2020, 5:25 p.m. UTC | #1
Hi Kalesh,
Please find a few comments.
The name you gave to the event (EVENT_RESET) is very close to an already existing one: "EVENT_INTR_RESET".
But they are different.
EVENT_INTR_RESET originates from a port reset. It requires a reaction from the application. It is widely used. It is documented in *.rst files.
EVENT_RESET originates from a FW error (or maybe any error). It requires no reaction from the application (the PMD manages it by itself). It is not documented.
I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING), and please document it in *.rst files.
More comments below:

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Kalesh A P
> Sent: Wednesday, September 30, 2020 3:33 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery
> event
> 
> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> 
> Added code to handle device reset and recovery event in testpmd.
> This is an indication from the PMD that device has reset and recovered error
> condition.
> 
> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
> ---
>  app/test-pmd/testpmd.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index fe6450c..1c8fb46 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
>  	[RTE_ETH_EVENT_NEW] = "device probed",
>  	[RTE_ETH_EVENT_DESTROY] = "device released",
>  	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> +	[RTE_ETH_EVENT_RESET] = "device reset",

"device reset" is similar to the existing "reset" string. Can you suggest a different one? Maybe "error under recovery" ?

> +	[RTE_ETH_EVENT_RECOVERED] = "device recovery",

Wouldn't you prefer "device recovered" ?

>  	[RTE_ETH_EVENT_MAX] = NULL,
>  };
> 
> @@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
>  			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
>  			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
>  			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> -			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
> +			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
> +			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
>  /*
>   * Decide if all memory are locked for performance.
>   */
> --
> 2.10.1
  
Kalesh A P Oct. 7, 2020, 4:46 a.m. UTC | #2
Hi Ophir,

Thank you for the comments. I will address them in the next version.

I will send these changes as regular patches next time rather than as an
RFC. I hope that is OK.

Regards,
Kalesh

On Tue, Oct 6, 2020 at 10:55 PM Ophir Munk <ophirmu@nvidia.com> wrote:

> Hi Kalesh,
> Please find a few comments.
> The name you gave to the event (EVENT_RESET) is very close to an already
> existing one: "EVENT_INTR_RESET".
> But they are different.
> EVENT_INTR_RESET originates from a port reset. It requires application
> reaction. It is widely used. It is documented in *.rst files.
> EVENT_RESET originates from FW error (or maybe any error). It requires no
> application reaction (PMD manages by itself). It is not documented.
> I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING) and please
> document it in *.rst files.
> More comments below:
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Kalesh A P
> > Sent: Wednesday, September 30, 2020 3:33 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device
> recovery
> > event
> >
> > From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >
> > Added code to handle device reset and recovery event in testpmd.
> > This is an indication from the PMD that device has reset and recovered
> error
> > condition.
> >
> > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
> > ---
> >  app/test-pmd/testpmd.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> > fe6450c..1c8fb46 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
> >       [RTE_ETH_EVENT_NEW] = "device probed",
> >       [RTE_ETH_EVENT_DESTROY] = "device released",
> >       [RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> > +     [RTE_ETH_EVENT_RESET] = "device reset",
>
> "device reset" is similar to the existing "reset" string. Can you suggest
> a different one? Maybe "error under recovery" ?
>
> > +     [RTE_ETH_EVENT_RECOVERED] = "device recovery",
>
> Wouldn't you prefer "device recovered" ?
>
> >       [RTE_ETH_EVENT_MAX] = NULL,
> >  };
> >
> > @@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) <<
> > RTE_ETH_EVENT_UNKNOWN) |
> >                           (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
> >                           (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
> >                           (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> > -                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
> > +                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> > +                         (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
> > +                         (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
> >  /*
> >   * Decide if all memory are locked for performance.
> >   */
> > --
> > 2.10.1
>
>
  
Ophir Munk Oct. 7, 2020, 8:36 a.m. UTC | #3
Adding Ferruh and Ajit

From: Kalesh Anakkur Purayil <kalesh-anakkur.purayil@broadcom.com>
Sent: Wednesday, October 7, 2020 7:47 AM
To: Ophir Munk <ophirmu@nvidia.com>
Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery event

Hi Ophir,

Thank you for the comments. I will address them in the next version.

I will push these changes as Patches next time and not as an RFC. Hope that is OK.

Regards,
Kalesh

On Tue, Oct 6, 2020 at 10:55 PM Ophir Munk <ophirmu@nvidia.com> wrote:
Hi Kalesh,
Please find a few comments.
The name you gave to the event (EVENT_RESET) is very close to an already existing one: "EVENT_INTR_RESET".
But they are different.
EVENT_INTR_RESET originates from a port reset. It requires application reaction. It is widely used. It is documented in *.rst files.
EVENT_RESET originates from FW error (or maybe any error). It requires no application reaction (PMD manages by itself). It is not documented.
I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING) and please document it in *.rst files.
More comments below:

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Kalesh A P
> Sent: Wednesday, September 30, 2020 3:33 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery
> event
>
> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>
> Added code to handle device reset and recovery event in testpmd.
> This is an indication from the PMD that device has reset and recovered error
> condition.
>
> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
> ---
>  app/test-pmd/testpmd.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> fe6450c..1c8fb46 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
>       [RTE_ETH_EVENT_NEW] = "device probed",
>       [RTE_ETH_EVENT_DESTROY] = "device released",
>       [RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> +     [RTE_ETH_EVENT_RESET] = "device reset",

"device reset" is similar to the existing "reset" string. Can you suggest a different one? Maybe "error under recovery" ?

> +     [RTE_ETH_EVENT_RECOVERED] = "device recovery",

Wouldn't you prefer "device recovered" ?

>       [RTE_ETH_EVENT_MAX] = NULL,
>  };
>
> @@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) <<
> RTE_ETH_EVENT_UNKNOWN) |
>                           (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
>                           (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
>                           (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> -                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
> +                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> +                         (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
> +                         (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
>  /*
>   * Decide if all memory are locked for performance.
>   */
> --
> 2.10.1


--
Regards,
Kalesh A P
  
Ferruh Yigit Oct. 7, 2020, 9:37 a.m. UTC | #4
On 10/7/2020 5:46 AM, Kalesh Anakkur Purayil wrote:
> Hi Ophir,
> 
> Thank you for the comments. I will address them in the next version.
> 
> I will push these changes as Patches next time and not as an RFC. Hope that
> is OK.
> 
> Regards,
> Kalesh
> 
> On Tue, Oct 6, 2020 at 10:55 PM Ophir Munk <ophirmu@nvidia.com> wrote:
> 
>> Hi Kalesh,
>> Please find a few comments.
>> The name you gave to the event (EVENT_RESET) is very close to an already
>> existing one: "EVENT_INTR_RESET".
>> But they are different.
>> EVENT_INTR_RESET originates from a port reset. It requires application
>> reaction. It is widely used. It is documented in *.rst files.
>> EVENT_RESET originates from FW error (or maybe any error). It requires no
>> application reaction (PMD manages by itself). It is not documented.
>> I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING) and please
>> document it in *.rst files.

+1 to renaming and documenting the event.

And I agree to proceed with a regular patch instead of an RFC.


>> More comments below:
>>
>>> -----Original Message-----
>>> From: dev <dev-bounces@dpdk.org> On Behalf Of Kalesh A P
>>> Sent: Wednesday, September 30, 2020 3:33 PM
>>> To: dev@dpdk.org
>>> Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device
>> recovery
>>> event
>>>
>>> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>>>
>>> Added code to handle device reset and recovery event in testpmd.
>>> This is an indication from the PMD that device has reset and recovered
>> error
>>> condition.
>>>
>>> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>>> Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
>>> ---
>>>   app/test-pmd/testpmd.c | 6 +++++-
>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>>> fe6450c..1c8fb46 100644
>>> --- a/app/test-pmd/testpmd.c
>>> +++ b/app/test-pmd/testpmd.c
>>> @@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
>>>        [RTE_ETH_EVENT_NEW] = "device probed",
>>>        [RTE_ETH_EVENT_DESTROY] = "device released",
>>>        [RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
>>> +     [RTE_ETH_EVENT_RESET] = "device reset",
>>
>> "device reset" is similar to the existing "reset" string. Can you suggest
>> a different one? Maybe "error under recovery" ?
>>
>>> +     [RTE_ETH_EVENT_RECOVERED] = "device recovery",
>>
>> Wouldn't you prefer "device recovered" ?
>>
>>>        [RTE_ETH_EVENT_MAX] = NULL,
>>>   };
>>>
>>> @@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) <<
>>> RTE_ETH_EVENT_UNKNOWN) |
>>>                            (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
>>>                            (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
>>>                            (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
>>> -                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
>>> +                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
>>> +                         (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
>>> +                         (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
>>>   /*
>>>    * Decide if all memory are locked for performance.
>>>    */
>>> --
>>> 2.10.1
>>
>>
>
  
Ajit Khaparde Oct. 7, 2020, 6:42 p.m. UTC | #5
On Wed, Oct 7, 2020 at 2:37 AM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 10/7/2020 5:46 AM, Kalesh Anakkur Purayil wrote:
> > Hi Ophir,
> >
> > Thank you for the comments. I will address them in the next version.
> >
> > I will push these changes as Patches next time and not as an RFC. Hope that
> > is OK.
> >
> > Regards,
> > Kalesh
> >
> > On Tue, Oct 6, 2020 at 10:55 PM Ophir Munk <ophirmu@nvidia.com> wrote:
> >
> >> Hi Kalesh,
> >> Please find a few comments.
> >> The name you gave to the event (EVENT_RESET) is very close to an already
> >> existing one: "EVENT_INTR_RESET".
> >> But they are different.
> >> EVENT_INTR_RESET originates from a port reset. It requires application
> >> reaction. It is widely used. It is documented in *.rst files.
> >> EVENT_RESET originates from FW error (or maybe any error). It requires no
> >> application reaction (PMD manages by itself). It is not documented.
> >> I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING) and please
> >> document it in *.rst files.
>
> +1 to renaming and documenting the event.
>
> And agree to proceed as regular patch instead of RFC.
Ferruh,
If/when the new version of the patch is good, can you pick up the bnxt PMD
patch along with the ethdev and testpmd patches?
Let me know.


>
>
> >> More comments below:
> >>
> >>> -----Original Message-----
> >>> From: dev <dev-bounces@dpdk.org> On Behalf Of Kalesh A P
> >>> Sent: Wednesday, September 30, 2020 3:33 PM
> >>> To: dev@dpdk.org
> >>> Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device
> >> recovery
> >>> event
> >>>
> >>> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >>>
> >>> Added code to handle device reset and recovery event in testpmd.
> >>> This is an indication from the PMD that device has reset and recovered
> >> error
> >>> condition.
> >>>
> >>> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >>> Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
> >>> ---
> >>>   app/test-pmd/testpmd.c | 6 +++++-
> >>>   1 file changed, 5 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> >>> fe6450c..1c8fb46 100644
> >>> --- a/app/test-pmd/testpmd.c
> >>> +++ b/app/test-pmd/testpmd.c
> >>> @@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
> >>>        [RTE_ETH_EVENT_NEW] = "device probed",
> >>>        [RTE_ETH_EVENT_DESTROY] = "device released",
> >>>        [RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> >>> +     [RTE_ETH_EVENT_RESET] = "device reset",
> >>
> >> "device reset" is similar to the existing "reset" string. Can you suggest
> >> a different one? Maybe "error under recovery" ?
> >>
> >>> +     [RTE_ETH_EVENT_RECOVERED] = "device recovery",
> >>
> >> Wouldn't you prefer "device recovered" ?
> >>
> >>>        [RTE_ETH_EVENT_MAX] = NULL,
> >>>   };
> >>>
> >>> @@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) <<
> >>> RTE_ETH_EVENT_UNKNOWN) |
> >>>                            (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
> >>>                            (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
> >>>                            (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> >>> -                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
> >>> +                         (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> >>> +                         (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
> >>> +                         (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
> >>>   /*
> >>>    * Decide if all memory are locked for performance.
> >>>    */
> >>> --
> >>> 2.10.1
> >>
> >>
> >
>
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index fe6450c..1c8fb46 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -380,6 +380,8 @@  static const char * const eth_event_desc[] = {
 	[RTE_ETH_EVENT_NEW] = "device probed",
 	[RTE_ETH_EVENT_DESTROY] = "device released",
 	[RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
+	[RTE_ETH_EVENT_RESET] = "device reset",
+	[RTE_ETH_EVENT_RECOVERED] = "device recovery",
 	[RTE_ETH_EVENT_MAX] = NULL,
 };
 
@@ -394,7 +396,9 @@  uint32_t event_print_mask = (UINT32_C(1) << RTE_ETH_EVENT_UNKNOWN) |
 			    (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
 			    (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
 			    (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
-			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
+			    (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
+			    (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
 /*
  * Decide if all memory are locked for performance.
  */
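
For readers following the review, here is a minimal, self-contained sketch of the mechanism this patch touches: a description table indexed by event type, plus a bitmask that decides which events get printed. It does not use DPDK headers. The enum and the names `ev_type`, `ev_desc`, `ev_print_mask`, `ev_printable_desc` and `ev_report` are hypothetical stand-ins, and the mask contents are illustrative only. The strings for the two new events use the wording suggested in comment #1, which had not been adopted at this point in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the ethdev event enum; values are illustrative. */
enum ev_type {
	EV_UNKNOWN,
	EV_INTR_RESET,      /* existing event: application must react */
	EV_NEW,
	EV_FLOW_AGED,
	EV_ERR_RECOVERING,  /* name proposed in review for the RFC's RESET event */
	EV_RECOVERED,
	EV_MAX
};

/* Designated initializers keep each string next to its enum value. */
static const char * const ev_desc[] = {
	[EV_UNKNOWN] = "unknown",
	[EV_INTR_RESET] = "reset",
	[EV_NEW] = "device probed",
	[EV_FLOW_AGED] = "flow aged",
	[EV_ERR_RECOVERING] = "error under recovery",
	[EV_RECOVERED] = "device recovered",
	[EV_MAX] = NULL,
};

/* One bit per event type; EV_NEW is left out here to show a masked event. */
static uint32_t ev_print_mask = (UINT32_C(1) << EV_UNKNOWN) |
				(UINT32_C(1) << EV_INTR_RESET) |
				(UINT32_C(1) << EV_FLOW_AGED) |
				(UINT32_C(1) << EV_ERR_RECOVERING) |
				(UINT32_C(1) << EV_RECOVERED);

/* Return the description if the event's bit is set in the mask, else NULL. */
const char *
ev_printable_desc(enum ev_type ev)
{
	assert(ev < EV_MAX);
	if ((ev_print_mask & (UINT32_C(1) << ev)) == 0)
		return NULL;
	return ev_desc[ev];
}

/* Print the event for a port if it is enabled; return 1 if printed, else 0. */
int
ev_report(unsigned int port_id, enum ev_type ev)
{
	const char *desc = ev_printable_desc(ev);

	if (desc == NULL)
		return 0;
	printf("Port %u: %s event\n", port_id, desc);
	return 1;
}
```

In this sketch, `ev_report(0, EV_RECOVERED)` prints one line and returns 1, while `ev_report(0, EV_NEW)` prints nothing and returns 0 because that bit is not set in the mask. Giving the two new events descriptions that differ clearly from the existing "reset" string is what keeps the printed output unambiguous, which is the point of the review comments above.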