[v5,2/5] telemetry: fix repeated display when callback don't set dict

Message ID 20221219090723.29356-3-fengchengwen@huawei.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series support dmadev/ethdev stats reset |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

fengchengwen Dec. 19, 2022, 9:07 a.m. UTC
  When telemetry callback didn't set dict and return a non-negative
number, the telemetry will repeat to display the last result.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 lib/telemetry/telemetry.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Bruce Richardson Dec. 19, 2022, 9:33 a.m. UTC | #1
On Mon, Dec 19, 2022 at 09:07:20AM +0000, Chengwen Feng wrote:
> When telemetry callback didn't set dict and return a non-negative
> number, the telemetry will repeat to display the last result.
> 
> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---

Hi Chengwen,

I'm a little curious about this bug. Can you describe some steps to
reproduce it as I'm curious as to exactly what is happening. The fix seems
a little strange to me so I'd like to investigate a little more to see if
other approaches might work.

Thanks,
/Bruce
  
fengchengwen Dec. 26, 2022, 4:53 a.m. UTC | #2
On 2022/12/19 17:33, Bruce Richardson wrote:
> On Mon, Dec 19, 2022 at 09:07:20AM +0000, Chengwen Feng wrote:
>> When telemetry callback didn't set dict and return a non-negative
>> number, the telemetry will repeat to display the last result.
>>
>> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
> 
> Hi Chengwen,
> 
> I'm a little curious about this bug. Can you describe some steps to
> reproduce it as I'm curious as to exactly what is happening. The fix seems
> a little strange to me so I'd like to investigate a little more to see if
> other approaches might work.

Hi Bruce,

Sorry for late reply.

The steps:
  1. applay "[PATCH v5 1/5] dmadev: support stats reset telemetry command"
  2. compile
  3. start dpdk-dma: dpdk-dma -a DMA.BDF -a NIC.BDF -- -c hw
  4. start telemetry, and execute /dmadev/stats,0, and then /dmadev/stats_reset,0
     the output of /dmadev/stats_reset,0 will be the same of previous cmd "/dmadev/stats,0"
     e.g. my environment:
--> /dmadev/stats,0
{
  "/dmadev/stats": {
    "submitted": 23,
    "completed": 23,
    "errors": 0
  }
}
--> /dmadev/stats_reset,0
{
  "/dmadev/stats_reset": {
    "submitted": 23,
    "completed": 23,
    "errors": 0
  }
}

The rootcause is that the /dmadev/stats_reset don't set the outer parameter "struct rte_tel_data *info"
and return zero.


BTW: although the telemetry mainly used to query, but some reset counter maybe usefull, and it already
exist like: "/eventdev/rxa_stats_reset" and this patchset.


> 
> Thanks,
> /Bruce
> .
>
  
Bruce Richardson Jan. 6, 2023, 4:07 p.m. UTC | #3
On Mon, Dec 26, 2022 at 12:53:57PM +0800, fengchengwen wrote:
> On 2022/12/19 17:33, Bruce Richardson wrote:
> > On Mon, Dec 19, 2022 at 09:07:20AM +0000, Chengwen Feng wrote:
> >> When telemetry callback didn't set dict and return a non-negative
> >> number, the telemetry will repeat to display the last result.
> >>
> >> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >> ---
> > 
> > Hi Chengwen,
> > 
> > I'm a little curious about this bug. Can you describe some steps to
> > reproduce it as I'm curious as to exactly what is happening. The fix seems
> > a little strange to me so I'd like to investigate a little more to see if
> > other approaches might work.
> 
> Hi Bruce,
> 
> Sorry for late reply.
> 
> The steps:
>   1. applay "[PATCH v5 1/5] dmadev: support stats reset telemetry command"
>   2. compile
>   3. start dpdk-dma: dpdk-dma -a DMA.BDF -a NIC.BDF -- -c hw
>   4. start telemetry, and execute /dmadev/stats,0, and then /dmadev/stats_reset,0
>      the output of /dmadev/stats_reset,0 will be the same of previous cmd "/dmadev/stats,0"
>      e.g. my environment:
> --> /dmadev/stats,0
> {
>   "/dmadev/stats": {
>     "submitted": 23,
>     "completed": 23,
>     "errors": 0
>   }
> }
> --> /dmadev/stats_reset,0
> {
>   "/dmadev/stats_reset": {
>     "submitted": 23,
>     "completed": 23,
>     "errors": 0
>   }
> }
> 
> The rootcause is that the /dmadev/stats_reset don't set the outer parameter "struct rte_tel_data *info"
> and return zero.
>
Thanks for the fuller explanation, I'll hopefully test it out myself.

However, in the meantime, looking at the telemetry library code, would the
following change work rather than explicitly always setting the telemetry
data to a dictionary by default? Zeroing the data by default sets it to a
null return which is what you probably want as default rather than an empty
dictionary. (And it's also a smaller diff)

/Bruce

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8fbb4f3060..7b905355cd 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -333,7 +333,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
 static void
 perform_command(telemetry_cb fn, const char *cmd, const char *param, int s)
 {
-       struct rte_tel_data data;
+       struct rte_tel_data data = {0};
 
        int ret = fn(cmd, param, &data);
        if (ret < 0) {
  
Bruce Richardson Jan. 6, 2023, 5:33 p.m. UTC | #4
On Fri, Jan 06, 2023 at 04:07:45PM +0000, Bruce Richardson wrote:
> On Mon, Dec 26, 2022 at 12:53:57PM +0800, fengchengwen wrote:
> > On 2022/12/19 17:33, Bruce Richardson wrote:
> > > On Mon, Dec 19, 2022 at 09:07:20AM +0000, Chengwen Feng wrote:
> > >> When telemetry callback didn't set dict and return a non-negative
> > >> number, the telemetry will repeat to display the last result.
> > >>
> > >> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> > >> Cc: stable@dpdk.org
> > >>
> > >> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > >> ---
> > > 
> > > Hi Chengwen,
> > > 
> > > I'm a little curious about this bug. Can you describe some steps to
> > > reproduce it as I'm curious as to exactly what is happening. The fix seems
> > > a little strange to me so I'd like to investigate a little more to see if
> > > other approaches might work.
> > 
> > Hi Bruce,
> > 
> > Sorry for late reply.
> > 
> > The steps:
> >   1. applay "[PATCH v5 1/5] dmadev: support stats reset telemetry command"
> >   2. compile
> >   3. start dpdk-dma: dpdk-dma -a DMA.BDF -a NIC.BDF -- -c hw
> >   4. start telemetry, and execute /dmadev/stats,0, and then /dmadev/stats_reset,0
> >      the output of /dmadev/stats_reset,0 will be the same of previous cmd "/dmadev/stats,0"
> >      e.g. my environment:
> > --> /dmadev/stats,0
> > {
> >   "/dmadev/stats": {
> >     "submitted": 23,
> >     "completed": 23,
> >     "errors": 0
> >   }
> > }
> > --> /dmadev/stats_reset,0
> > {
> >   "/dmadev/stats_reset": {
> >     "submitted": 23,
> >     "completed": 23,
> >     "errors": 0
> >   }
> > }
> > 
> > The rootcause is that the /dmadev/stats_reset don't set the outer parameter "struct rte_tel_data *info"
> > and return zero.
> >
> Thanks for the fuller explanation, I'll hopefully test it out myself.
> 
> However, in the meantime, looking at the telemetry library code, would the
> following change work rather than explicitly always setting the telemetry
> data to a dictionary by default? Zeroing the data by default sets it to a
> null return which is what you probably want as default rather than an empty
> dictionary. (And it's also a smaller diff)
> 
> /Bruce
> 
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index 8fbb4f3060..7b905355cd 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -333,7 +333,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
>  static void
>  perform_command(telemetry_cb fn, const char *cmd, const char *param, int s)
>  {
> -       struct rte_tel_data data;
> +       struct rte_tel_data data = {0};
>  
>         int ret = fn(cmd, param, &data);
>         if (ret < 0) {
>

I've handily reproduced the issue using the instructions you gave above,
thanks for those.

Based on that, and looking a little deeper:

* I think it is an error with the reset function not to initialize and
  complete the return value. I believe that for each telemetry callback we
  should require that the callback fill in a valid value on success. For
  reset, some options could be just a string "OK", or an array just
  containing "[0]", or similar.

* Given that point above, I do agree though that the "data" parameter
  should be properly initialized on entry to the callbacks. However, I feel
  that the correct init should be the empty/null value, as is given in the
  patch above.

Regards,
/Bruce
  
fengchengwen Jan. 11, 2023, 12:38 p.m. UTC | #5
Hi Bruce,

On 2023/1/7 1:33, Bruce Richardson wrote:
> On Fri, Jan 06, 2023 at 04:07:45PM +0000, Bruce Richardson wrote:
>> On Mon, Dec 26, 2022 at 12:53:57PM +0800, fengchengwen wrote:
>>> On 2022/12/19 17:33, Bruce Richardson wrote:
>>>> On Mon, Dec 19, 2022 at 09:07:20AM +0000, Chengwen Feng wrote:
>>>>> When telemetry callback didn't set dict and return a non-negative
>>>>> number, the telemetry will repeat to display the last result.
>>>>>
>>>>> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
>>>>> Cc: stable@dpdk.org
>>>>>
>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>> ---
>>>>
>>>> Hi Chengwen,
>>>>
>>>> I'm a little curious about this bug. Can you describe some steps to
>>>> reproduce it as I'm curious as to exactly what is happening. The fix seems
>>>> a little strange to me so I'd like to investigate a little more to see if
>>>> other approaches might work.
>>>
>>> Hi Bruce,
>>>
>>> Sorry for late reply.
>>>
>>> The steps:
>>>   1. applay "[PATCH v5 1/5] dmadev: support stats reset telemetry command"
>>>   2. compile
>>>   3. start dpdk-dma: dpdk-dma -a DMA.BDF -a NIC.BDF -- -c hw
>>>   4. start telemetry, and execute /dmadev/stats,0, and then /dmadev/stats_reset,0
>>>      the output of /dmadev/stats_reset,0 will be the same of previous cmd "/dmadev/stats,0"
>>>      e.g. my environment:
>>> --> /dmadev/stats,0
>>> {
>>>   "/dmadev/stats": {
>>>     "submitted": 23,
>>>     "completed": 23,
>>>     "errors": 0
>>>   }
>>> }
>>> --> /dmadev/stats_reset,0
>>> {
>>>   "/dmadev/stats_reset": {
>>>     "submitted": 23,
>>>     "completed": 23,
>>>     "errors": 0
>>>   }
>>> }
>>>
>>> The rootcause is that the /dmadev/stats_reset don't set the outer parameter "struct rte_tel_data *info"
>>> and return zero.
>>>
>> Thanks for the fuller explanation, I'll hopefully test it out myself.
>>
>> However, in the meantime, looking at the telemetry library code, would the
>> following change work rather than explicitly always setting the telemetry
>> data to a dictionary by default? Zeroing the data by default sets it to a
>> null return which is what you probably want as default rather than an empty
>> dictionary. (And it's also a smaller diff)
>>
>> /Bruce
>>
>> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
>> index 8fbb4f3060..7b905355cd 100644
>> --- a/lib/telemetry/telemetry.c
>> +++ b/lib/telemetry/telemetry.c
>> @@ -333,7 +333,7 @@ output_json(const char *cmd, const struct rte_tel_data *d, int s)
>>  static void
>>  perform_command(telemetry_cb fn, const char *cmd, const char *param, int s)
>>  {
>> -       struct rte_tel_data data;
>> +       struct rte_tel_data data = {0};
>>  
>>         int ret = fn(cmd, param, &data);
>>         if (ret < 0) {
>>
> 
> I've handily reproduced the issue using the instructions you gave above,
> thanks for those.
> 
> Based on that, and looking a little deeper:
> 
> * I think it is an error with the reset function not to initialize and
>   complete the return value. I believe that for each telemetry callback we
>   should require that the callback fill in a valid value on success. For
>   reset, some options could be just a string "OK", or an array just
>   containing "[0]", or similar.

good idea, the v2 follow it.

> 
> * Given that point above, I do agree though that the "data" parameter
>   should be properly initialized on entry to the callbacks. However, I feel
>   that the correct init should be the empty/null value, as is given in the
>   patch above.

already fix in v2

Thanks.

> 
> Regards,
> /Bruce
> .
>
  

Patch

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8fbb4f3060..a6c84e3499 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -334,8 +334,10 @@  static void
 perform_command(telemetry_cb fn, const char *cmd, const char *param, int s)
 {
 	struct rte_tel_data data;
+	int ret;
 
-	int ret = fn(cmd, param, &data);
+	rte_tel_data_start_dict(&data);
+	ret = fn(cmd, param, &data);
 	if (ret < 0) {
 		char out_buf[MAX_CMD_LEN + 10];
 		int used = snprintf(out_buf, sizeof(out_buf), "{\"%.*s\":null}",