mbox series

[0/5] add apistats function

Message ID 20201204075109.14694-1-yamashita.hideyuki@ntt-tx.co.jp (mailing list archive)
Headers
Series add apistats function |

Message

Hideyuki Yamashita Dec. 4, 2020, 7:51 a.m. UTC
  In general, DPDK application consumes CPU usage because it polls
incoming packets using rx_burst API in infinite loop.
This makes difficult to estimate how much CPU usage is really
used to send/receive packets by the DPDK application.

For example, even if no incoming packets arriving, CPU usage
looks nearly 100% when observed by top command.

It is beneficial if developers can observe real CPU usage of the
DPDK application.
Such information can be exported to monitoring application like
prometheus/graphana and shows CPU usage graphically.

To achieve above, this patch set provides apistats functionality.
apistats provides the followiing two counters for each lcore.
- rx_burst_counts[RTE_MAX_LCORE]
- tx_burst_counts[RTE_MAX_LCORE]
Those accumulates rx_burst/tx_burst counts since the application starts.

By using those values, developers can roughly estimate CPU usage.
Let us assume a DPDK application is simply forwarding packets.
It calls tx_burst only if it receive packets.
If rx_burst_counts=1000 and tx_burst_count=1000 during certain
period of time, one can assume CPU usage is 100%.
If rx_burst_counts=1000 and tx_burst_count=100 during certain
period of time, one can assume CPU usage is 10%.
Here we assumes that tx_burst_count equals counts which rx_burst function
really receives incoming packets.


This patch set provides the following.
- basic API counting functionality(apistats) into librte_ethdev
- add code to testpmd to accumulate counter information
- add code to proc-info to retrieve above mentioned counter information
- add description in proc-info document about --apistats parameter
- modify MAINTAINERS file for apistats.c and apistats.h

Hideyuki Yamashita (5):
  maintainers: update maintainers file for apistats
  app/proc-info: add to use apistats
  app/test-pmd: add to use apistats
  docs: add description of apistats parameter into proc-info
  librte_ethdev: add to use apistats

 MAINTAINERS                      |  3 ++
 app/proc-info/main.c             | 46 +++++++++++++++++++++++
 app/test-pmd/testpmd.c           |  4 ++
 doc/guides/tools/proc_info.rst   | 10 ++++-
 lib/librte_ethdev/meson.build    |  6 ++-
 lib/librte_ethdev/rte_apistats.c | 64 ++++++++++++++++++++++++++++++++
 lib/librte_ethdev/rte_apistats.h | 64 ++++++++++++++++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h   |  7 ++++
 lib/librte_ethdev/version.map    |  5 +++
 9 files changed, 205 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_ethdev/rte_apistats.c
 create mode 100644 lib/librte_ethdev/rte_apistats.h
  

Comments

Ferruh Yigit Dec. 4, 2020, 9:09 a.m. UTC | #1
On 12/4/2020 7:51 AM, Hideyuki Yamashita wrote:
> In general, DPDK application consumes CPU usage because it polls
> incoming packets using rx_burst API in infinite loop.
> This makes difficult to estimate how much CPU usage is really
> used to send/receive packets by the DPDK application.
> 
> For example, even if no incoming packets arriving, CPU usage
> looks nearly 100% when observed by top command.
> 
> It is beneficial if developers can observe real CPU usage of the
> DPDK application.
> Such information can be exported to monitoring application like
> prometheus/graphana and shows CPU usage graphically.
> 
> To achieve above, this patch set provides apistats functionality.
> apistats provides the followiing two counters for each lcore.
> - rx_burst_counts[RTE_MAX_LCORE]
> - tx_burst_counts[RTE_MAX_LCORE]
> Those accumulates rx_burst/tx_burst counts since the application starts.
> 
> By using those values, developers can roughly estimate CPU usage.
> Let us assume a DPDK application is simply forwarding packets.
> It calls tx_burst only if it receive packets.
> If rx_burst_counts=1000 and tx_burst_count=1000 during certain
> period of time, one can assume CPU usage is 100%.
> If rx_burst_counts=1000 and tx_burst_count=100 during certain
> period of time, one can assume CPU usage is 10%.
> Here we assumes that tx_burst_count equals counts which rx_burst function
> really receives incoming packets.
> 
> 
> This patch set provides the following.
> - basic API counting functionality(apistats) into librte_ethdev
> - add code to testpmd to accumulate counter information
> - add code to proc-info to retrieve above mentioned counter information
> - add description in proc-info document about --apistats parameter
> - modify MAINTAINERS file for apistats.c and apistats.h
> 
> Hideyuki Yamashita (5):
>    maintainers: update maintainers file for apistats
>    app/proc-info: add to use apistats
>    app/test-pmd: add to use apistats
>    docs: add description of apistats parameter into proc-info
>    librte_ethdev: add to use apistats
> 
>   MAINTAINERS                      |  3 ++
>   app/proc-info/main.c             | 46 +++++++++++++++++++++++
>   app/test-pmd/testpmd.c           |  4 ++
>   doc/guides/tools/proc_info.rst   | 10 ++++-
>   lib/librte_ethdev/meson.build    |  6 ++-
>   lib/librte_ethdev/rte_apistats.c | 64 ++++++++++++++++++++++++++++++++
>   lib/librte_ethdev/rte_apistats.h | 64 ++++++++++++++++++++++++++++++++
>   lib/librte_ethdev/rte_ethdev.h   |  7 ++++
>   lib/librte_ethdev/version.map    |  5 +++
>   9 files changed, 205 insertions(+), 4 deletions(-)
>   create mode 100644 lib/librte_ethdev/rte_apistats.c
>   create mode 100644 lib/librte_ethdev/rte_apistats.h
> 

cc'ed Dave Hunt. As far as I remember he did something for same goal in the 
past, but in a different way, he can comment better.
  
Hunt, David Dec. 4, 2020, 10:20 a.m. UTC | #2
On 4/12/2020 7:51 AM, Hideyuki Yamashita wrote:
> In general, DPDK application consumes CPU usage because it polls
> incoming packets using rx_burst API in infinite loop.
> This makes difficult to estimate how much CPU usage is really
> used to send/receive packets by the DPDK application.
>
> For example, even if no incoming packets arriving, CPU usage
> looks nearly 100% when observed by top command.
>
> It is beneficial if developers can observe real CPU usage of the
> DPDK application.
> Such information can be exported to monitoring application like
> prometheus/graphana and shows CPU usage graphically.
>
> To achieve above, this patch set provides apistats functionality.
> apistats provides the followiing two counters for each lcore.
> - rx_burst_counts[RTE_MAX_LCORE]
> - tx_burst_counts[RTE_MAX_LCORE]
> Those accumulates rx_burst/tx_burst counts since the application starts.
>
> By using those values, developers can roughly estimate CPU usage.
> Let us assume a DPDK application is simply forwarding packets.
> It calls tx_burst only if it receive packets.
> If rx_burst_counts=1000 and tx_burst_count=1000 during certain
> period of time, one can assume CPU usage is 100%.
> If rx_burst_counts=1000 and tx_burst_count=100 during certain
> period of time, one can assume CPU usage is 10%.
> Here we assumes that tx_burst_count equals counts which rx_burst function
> really receives incoming packets.
>
>
> This patch set provides the following.
> - basic API counting functionality(apistats) into librte_ethdev
> - add code to testpmd to accumulate counter information
> - add code to proc-info to retrieve above mentioned counter information
> - add description in proc-info document about --apistats parameter
> - modify MAINTAINERS file for apistats.c and apistats.h
>
> Hideyuki Yamashita (5):
>    maintainers: update maintainers file for apistats
>    app/proc-info: add to use apistats
>    app/test-pmd: add to use apistats
>    docs: add description of apistats parameter into proc-info
>    librte_ethdev: add to use apistats
>
>   MAINTAINERS                      |  3 ++
>   app/proc-info/main.c             | 46 +++++++++++++++++++++++
>   app/test-pmd/testpmd.c           |  4 ++
>   doc/guides/tools/proc_info.rst   | 10 ++++-
>   lib/librte_ethdev/meson.build    |  6 ++-
>   lib/librte_ethdev/rte_apistats.c | 64 ++++++++++++++++++++++++++++++++
>   lib/librte_ethdev/rte_apistats.h | 64 ++++++++++++++++++++++++++++++++
>   lib/librte_ethdev/rte_ethdev.h   |  7 ++++
>   lib/librte_ethdev/version.map    |  5 +++
>   9 files changed, 205 insertions(+), 4 deletions(-)
>   create mode 100644 lib/librte_ethdev/rte_apistats.c
>   create mode 100644 lib/librte_ethdev/rte_apistats.h


Hi Hideyuki,

I have a few questions on the patch set.

How does this compare to the mechanism added to l3fwd-power which counts 
the number of empty, partial and full polls, and uses them to calculate 
busyness? We saw pretty good tracking of busyness using those metrics. I 
would be concerned that just looking at the numebr of rx_bursts and 
tx_bursts may be limited to only a few use-cases. The l3fwd-power 
example uses branchless increments to capture empty, non-empty, full, 
and non-full polls.

Why not use the existing telemetry library to store the stats? It would 
be good if whatever metrics were counted were made available in a 
standard way, so that external entities such as collectd could pick them 
up, rather than having to parse the new struct. The l3fwd-power example 
registers the necessary new metrics, and exposes them through the 
telemetry library.

And a comment on the patch set in general: The order of the patch set 
seems reversed. The earlier patch do not compile, because they depend on 
rte_apistats.h, which is introduced in the final patch. Each patch as it 
is applied needs to build successfully.

Also, I would suggest a different name. rte_apistats seems very generic 
and could apply to anything. How about something like rte_ethdev_stats.h 
or rte_burst_stats.h?

Rgds,
Dave.
  
Varghese, Vipin Dec. 5, 2020, 1:23 p.m. UTC | #3
Sharing an alternate approach, if RX-TX callbacks are enabled in DPDK (which is by default). One can register a callback handler to update counters with the following information as `port-queue pair, lcoreid, total rx burst request, total empty rx burst, 1-8 pks, 9-16 pkts, 16-32 pkts`. Callback handlers can be selectively enabled or disabled too.

Can you please help me understand how `rte_apistats` would be different or pros of having it as library (that needs to be added to linking and running in case of using DPDK applications)?

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of David Hunt
> Sent: Friday, December 4, 2020 3:51 PM
> To: Hideyuki Yamashita <yamashita.hideyuki@ntt-tx.co.jp>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/5] add apistats function
> 
> 
> On 4/12/2020 7:51 AM, Hideyuki Yamashita wrote:
> > In general, DPDK application consumes CPU usage because it polls
> > incoming packets using rx_burst API in infinite loop.
> > This makes difficult to estimate how much CPU usage is really used to
> > send/receive packets by the DPDK application.
> >
> > For example, even if no incoming packets arriving, CPU usage looks
> > nearly 100% when observed by top command.
> >
> > It is beneficial if developers can observe real CPU usage of the DPDK
> > application.
> > Such information can be exported to monitoring application like
> > prometheus/graphana and shows CPU usage graphically.
> >
> > To achieve above, this patch set provides apistats functionality.
> > apistats provides the followiing two counters for each lcore.
> > - rx_burst_counts[RTE_MAX_LCORE]
> > - tx_burst_counts[RTE_MAX_LCORE]
> > Those accumulates rx_burst/tx_burst counts since the application starts.
> >
> > By using those values, developers can roughly estimate CPU usage.
> > Let us assume a DPDK application is simply forwarding packets.
> > It calls tx_burst only if it receive packets.
> > If rx_burst_counts=1000 and tx_burst_count=1000 during certain period
> > of time, one can assume CPU usage is 100%.
> > If rx_burst_counts=1000 and tx_burst_count=100 during certain period
> > of time, one can assume CPU usage is 10%.
> > Here we assumes that tx_burst_count equals counts which rx_burst
> > function really receives incoming packets.
> >
> >
> > This patch set provides the following.
> > - basic API counting functionality(apistats) into librte_ethdev
> > - add code to testpmd to accumulate counter information
> > - add code to proc-info to retrieve above mentioned counter
> > information
> > - add description in proc-info document about --apistats parameter
> > - modify MAINTAINERS file for apistats.c and apistats.h
> >
> > Hideyuki Yamashita (5):
> >    maintainers: update maintainers file for apistats
> >    app/proc-info: add to use apistats
> >    app/test-pmd: add to use apistats
> >    docs: add description of apistats parameter into proc-info
> >    librte_ethdev: add to use apistats
> >
> >   MAINTAINERS                      |  3 ++
> >   app/proc-info/main.c             | 46 +++++++++++++++++++++++
> >   app/test-pmd/testpmd.c           |  4 ++
> >   doc/guides/tools/proc_info.rst   | 10 ++++-
> >   lib/librte_ethdev/meson.build    |  6 ++-
> >   lib/librte_ethdev/rte_apistats.c | 64 ++++++++++++++++++++++++++++++++
> >   lib/librte_ethdev/rte_apistats.h | 64 ++++++++++++++++++++++++++++++++
> >   lib/librte_ethdev/rte_ethdev.h   |  7 ++++
> >   lib/librte_ethdev/version.map    |  5 +++
> >   9 files changed, 205 insertions(+), 4 deletions(-)
> >   create mode 100644 lib/librte_ethdev/rte_apistats.c
> >   create mode 100644 lib/librte_ethdev/rte_apistats.h
> 
> 
> Hi Hideyuki,
> 
> I have a few questions on the patch set.
> 
> How does this compare to the mechanism added to l3fwd-power which counts
> the number of empty, partial and full polls, and uses them to calculate
> busyness? We saw pretty good tracking of busyness using those metrics. I
> would be concerned that just looking at the numebr of rx_bursts and tx_bursts
> may be limited to only a few use-cases. The l3fwd-power example uses
> branchless increments to capture empty, non-empty, full, and non-full polls.
> 
> Why not use the existing telemetry library to store the stats? It would be good
> if whatever metrics were counted were made available in a standard way, so
> that external entities such as collectd could pick them up, rather than having to
> parse the new struct. The l3fwd-power example registers the necessary new
> metrics, and exposes them through the telemetry library.
> 
> And a comment on the patch set in general: The order of the patch set seems
> reversed. The earlier patch do not compile, because they depend on
> rte_apistats.h, which is introduced in the final patch. Each patch as it is applied
> needs to build successfully.
> 
> Also, I would suggest a different name. rte_apistats seems very generic and
> could apply to anything. How about something like rte_ethdev_stats.h or
> rte_burst_stats.h?
> 
> Rgds,
> Dave.
> 
> 
> 
> 
> 
>
  
Thomas Monjalon Dec. 7, 2020, 9:46 a.m. UTC | #4
04/12/2020 08:51, Hideyuki Yamashita:
> In general, DPDK application consumes CPU usage because it polls
> incoming packets using rx_burst API in infinite loop.
> This makes difficult to estimate how much CPU usage is really
> used to send/receive packets by the DPDK application.
> 
> For example, even if no incoming packets arriving, CPU usage
> looks nearly 100% when observed by top command.
> 
> It is beneficial if developers can observe real CPU usage of the
> DPDK application.
> Such information can be exported to monitoring application like
> prometheus/graphana and shows CPU usage graphically.
> 
> To achieve above, this patch set provides apistats functionality.
> apistats provides the followiing two counters for each lcore.
> - rx_burst_counts[RTE_MAX_LCORE]
> - tx_burst_counts[RTE_MAX_LCORE]
> Those accumulates rx_burst/tx_burst counts since the application starts.

Please could you compare with what rte_jobstats offers?
http://code.dpdk.org/dpdk/latest/source/lib/librte_jobstats/rte_jobstats.h

I feel all of this shares the same goals as librte_power work.

[...]
> - basic API counting functionality(apistats) into librte_ethdev

Could it be it be accessible via rte_telemetry?
http://code.dpdk.org/dpdk/latest/source/lib/librte_telemetry/rte_telemetry.h
  
Hideyuki Yamashita Dec. 22, 2020, 2:16 a.m. UTC | #5
Hi, David

Thanks for your comments.
Please see my comments inline tagged with [HY].

> 
> On 4/12/2020 7:51 AM, Hideyuki Yamashita wrote:
> > In general, DPDK application consumes CPU usage because it polls
> > incoming packets using rx_burst API in infinite loop.
> > This makes difficult to estimate how much CPU usage is really
> > used to send/receive packets by the DPDK application.
> >
> > For example, even if no incoming packets arriving, CPU usage
> > looks nearly 100% when observed by top command.
> >
> > It is beneficial if developers can observe real CPU usage of the
> > DPDK application.
> > Such information can be exported to monitoring application like
> > prometheus/graphana and shows CPU usage graphically.
> >
> > To achieve above, this patch set provides apistats functionality.
> > apistats provides the followiing two counters for each lcore.
> > - rx_burst_counts[RTE_MAX_LCORE]
> > - tx_burst_counts[RTE_MAX_LCORE]
> > Those accumulates rx_burst/tx_burst counts since the application starts.
> >
> > By using those values, developers can roughly estimate CPU usage.
> > Let us assume a DPDK application is simply forwarding packets.
> > It calls tx_burst only if it receive packets.
> > If rx_burst_counts=1000 and tx_burst_count=1000 during certain
> > period of time, one can assume CPU usage is 100%.
> > If rx_burst_counts=1000 and tx_burst_count=100 during certain
> > period of time, one can assume CPU usage is 10%.
> > Here we assumes that tx_burst_count equals counts which rx_burst function
> > really receives incoming packets.
> >
> >
> > This patch set provides the following.
> > - basic API counting functionality(apistats) into librte_ethdev
> > - add code to testpmd to accumulate counter information
> > - add code to proc-info to retrieve above mentioned counter information
> > - add description in proc-info document about --apistats parameter
> > - modify MAINTAINERS file for apistats.c and apistats.h
> >
> > Hideyuki Yamashita (5):
> >    maintainers: update maintainers file for apistats
> >    app/proc-info: add to use apistats
> >    app/test-pmd: add to use apistats
> >    docs: add description of apistats parameter into proc-info
> >    librte_ethdev: add to use apistats
> >
> >   MAINTAINERS                      |  3 ++
> >   app/proc-info/main.c             | 46 +++++++++++++++++++++++
> >   app/test-pmd/testpmd.c           |  4 ++
> >   doc/guides/tools/proc_info.rst   | 10 ++++-
> >   lib/librte_ethdev/meson.build    |  6 ++-
> >   lib/librte_ethdev/rte_apistats.c | 64 ++++++++++++++++++++++++++++++++
> >   lib/librte_ethdev/rte_apistats.h | 64 ++++++++++++++++++++++++++++++++
> >   lib/librte_ethdev/rte_ethdev.h   |  7 ++++
> >   lib/librte_ethdev/version.map    |  5 +++
> >   9 files changed, 205 insertions(+), 4 deletions(-)
> >   create mode 100644 lib/librte_ethdev/rte_apistats.c
> >   create mode 100644 lib/librte_ethdev/rte_apistats.h
> 
> 
> Hi Hideyuki,
> 
> I have a few questions on the patch set.
> 
> How does this compare to the mechanism added to l3fwd-power which counts the number of empty, partial and full polls, and uses them to calculate busyness? We saw pretty good tracking of busyness using those metrics. I would be concerned that just looking at the numebr of rx_bursts and tx_bursts may be limited to only a few use-cases. The l3fwd-power example uses branchless increments to capture empty, non-empty, full, and non-full polls.
[HY]
Thanks for your commetns.
You are correct. As you well know, l3fwd-power measures "how cpu cores
are busy".
And in that sense, the goal of my proposal is the same with yours .
Moreover l3fwd-power is more precise than my proposal.

Point of my proposal is 
- more easy to use
- less code impact on application code

I think that if application developer wants to need to measure "how cpu
cores are busy" he/she will needs to implement
- logic similar with l3fwd-power
or
- use jobstats API

But it is rather heavy for existing applications.
By using my proposal, it is "much easier" to implement.
(But it is "rough" measurement. I think it is trade-off)

> Why not use the existing telemetry library to store the stats? It would be good if whatever metrics were counted were made available in a standard way, so that external entities such as collectd could pick them up, rather than having to parse the new struct. The l3fwd-power example registers the necessary new metrics, and exposes them through the telemetry library.

[HY]
OK.
Currently, no reason not using telemetry.

I think telemetry is useful for applications which does NOT call DPDK
API(C lang API) directly.
My patchset provide only C API to retrieve apistats.
But if assuming not all applications call C API, then I think it is
reasonable to add telemetry in addition to C API for exposing stats.

> And a comment on the patch set in general: The order of the patch set seems reversed. The earlier patch do not compile, because they depend on rte_apistats.h, which is introduced in the final patch. Each patch as it is applied needs to build successfully.
[HY]
Thanks for your information.
OK. Now I understand your point that the order of the patch is very
important. Thanks.

> Also, I would suggest a different name. rte_apistats seems very generic and could apply to anything. How about something like rte_ethdev_stats.h or rte_burst_stats.h?
[HY]
Thanks. I agree.
"txrx_apicall_stats" maybe?

Thanks!
BR,
Hideyuki Yamashita
NTT TechnoCross
 
> Rgds,
> Dave.
> 
> 
> 
> 
> 
>
  
Hideyuki Yamashita Dec. 22, 2020, 2:22 a.m. UTC | #6
Hello,

Thanks for your comments.
Please see my comments inline tagged with [HY].

> 04/12/2020 08:51, Hideyuki Yamashita:
> > In general, DPDK application consumes CPU usage because it polls
> > incoming packets using rx_burst API in infinite loop.
> > This makes difficult to estimate how much CPU usage is really
> > used to send/receive packets by the DPDK application.
> > 
> > For example, even if no incoming packets arriving, CPU usage
> > looks nearly 100% when observed by top command.
> > 
> > It is beneficial if developers can observe real CPU usage of the
> > DPDK application.
> > Such information can be exported to monitoring application like
> > prometheus/graphana and shows CPU usage graphically.
> > 
> > To achieve above, this patch set provides apistats functionality.
> > apistats provides the followiing two counters for each lcore.
> > - rx_burst_counts[RTE_MAX_LCORE]
> > - tx_burst_counts[RTE_MAX_LCORE]
> > Those accumulates rx_burst/tx_burst counts since the application starts.
> 
> Please could you compare with what rte_jobstats offers?
> http://code.dpdk.org/dpdk/latest/source/lib/librte_jobstats/rte_jobstats.h
> 
> I feel all of this shares the same goals as librte_power work.

[HY]
Thanks for your commetns.
You are correct. As you well know, l3fwd-power measures "how cpu cores
are busy".
And in that sense, the goal of my proposal is the same with yours .
Moreover l3fwd-power is more precise than my proposal.

Point of my proposal is 
- more easy to use
- less code impact on application code

I think that if application developer wants to need to measure "how cpu
cores are busy" he/she will needs to implement
- logic similar with l3fwd-power
or
- use jobstats API

But it is rather heavy for existing applications.
By using my proposal, it is "much easier" to implement.
(But it is "rough" measurement. I think it is trade-off)
 
How do you think about the idea?

> [...]
> > - basic API counting functionality(apistats) into librte_ethdev
> 
> Could it be it be accessible via rte_telemetry?
> http://code.dpdk.org/dpdk/latest/source/lib/librte_telemetry/rte_telemetry.h
> 
[HY]
OK.
Currently, no reason not using telemetry.

I think telemetry is useful for applications which does NOT call DPDK
API(C lang API) directly.
My patchset provide only C API to retrieve apistats.
But if assuming not all applications call C API, then I think it is
reasonable to add telemetry in addition to C API for exposing stats.

Do you think "exposure via C API" is not needed?

Thanks!
BR,
Hideyuki Yamashita
NTT TechnoCross
  
Hideyuki Yamashita Dec. 24, 2020, 6:43 a.m. UTC | #7
Hello 

Thanks for your comments.
I know you kindly provided many valuable comments
though I reply the following first because I think it is 
important that my idea/proposal is acceptable or not
first.

> Sharing an alternate approach, if RX-TX callbacks are enabled in DPDK (which is by default). One can register a callback handler to update counters with the following information as `port-queue pair, lcoreid, total rx burst request, total empty rx burst, 1-8 pks, 9-16 pkts, 16-32 pkts`. Callback handlers can be selectively enabled or disabled too.
> 
> Can you please help me understand how `rte_apistats` would be different or pros of having it as library (that needs to be added to linking and running in case of using DPDK applications)?
You are correct.
Application can do that by using callback of rx_burst/tx_burst.
But needs to modify/add the code for RX/TX process logic.

Point of my patchset is couting is done by library 
so that APP only needs to "retrieve/read" those data
if needed (especially for existing applications).

I think it makes some developers happy becaseu it is relatively easy to
measure "how cpu core is bysy" easily.
(I admit relatively rough measurement though. I think it is trade-off)

What do you think?

Thanks!
BR,
Hideyuki Yamashita
NTT TechnoCross

> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of David Hunt
> > Sent: Friday, December 4, 2020 3:51 PM
> > To: Hideyuki Yamashita <yamashita.hideyuki@ntt-tx.co.jp>
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 0/5] add apistats function
> > 
> > 
> > On 4/12/2020 7:51 AM, Hideyuki Yamashita wrote:
> > > In general, DPDK application consumes CPU usage because it polls
> > > incoming packets using rx_burst API in infinite loop.
> > > This makes difficult to estimate how much CPU usage is really used to
> > > send/receive packets by the DPDK application.
> > >
> > > For example, even if no incoming packets arriving, CPU usage looks
> > > nearly 100% when observed by top command.
> > >
> > > It is beneficial if developers can observe real CPU usage of the DPDK
> > > application.
> > > Such information can be exported to monitoring application like
> > > prometheus/graphana and shows CPU usage graphically.
> > >
> > > To achieve above, this patch set provides apistats functionality.
> > > apistats provides the followiing two counters for each lcore.
> > > - rx_burst_counts[RTE_MAX_LCORE]
> > > - tx_burst_counts[RTE_MAX_LCORE]
> > > Those accumulates rx_burst/tx_burst counts since the application starts.
> > >
> > > By using those values, developers can roughly estimate CPU usage.
> > > Let us assume a DPDK application is simply forwarding packets.
> > > It calls tx_burst only if it receive packets.
> > > If rx_burst_counts=1000 and tx_burst_count=1000 during certain period
> > > of time, one can assume CPU usage is 100%.
> > > If rx_burst_counts=1000 and tx_burst_count=100 during certain period
> > > of time, one can assume CPU usage is 10%.
> > > Here we assumes that tx_burst_count equals counts which rx_burst
> > > function really receives incoming packets.
> > >
> > >
> > > This patch set provides the following.
> > > - basic API counting functionality(apistats) into librte_ethdev
> > > - add code to testpmd to accumulate counter information
> > > - add code to proc-info to retrieve above mentioned counter
> > > information
> > > - add description in proc-info document about --apistats parameter
> > > - modify MAINTAINERS file for apistats.c and apistats.h
> > >
> > > Hideyuki Yamashita (5):
> > >    maintainers: update maintainers file for apistats
> > >    app/proc-info: add to use apistats
> > >    app/test-pmd: add to use apistats
> > >    docs: add description of apistats parameter into proc-info
> > >    librte_ethdev: add to use apistats
> > >
> > >   MAINTAINERS                      |  3 ++
> > >   app/proc-info/main.c             | 46 +++++++++++++++++++++++
> > >   app/test-pmd/testpmd.c           |  4 ++
> > >   doc/guides/tools/proc_info.rst   | 10 ++++-
> > >   lib/librte_ethdev/meson.build    |  6 ++-
> > >   lib/librte_ethdev/rte_apistats.c | 64 ++++++++++++++++++++++++++++++++
> > >   lib/librte_ethdev/rte_apistats.h | 64 ++++++++++++++++++++++++++++++++
> > >   lib/librte_ethdev/rte_ethdev.h   |  7 ++++
> > >   lib/librte_ethdev/version.map    |  5 +++
> > >   9 files changed, 205 insertions(+), 4 deletions(-)
> > >   create mode 100644 lib/librte_ethdev/rte_apistats.c
> > >   create mode 100644 lib/librte_ethdev/rte_apistats.h
> > 
> > 
> > Hi Hideyuki,
> > 
> > I have a few questions on the patch set.
> > 
> > How does this compare to the mechanism added to l3fwd-power which counts
> > the number of empty, partial and full polls, and uses them to calculate
> > busyness? We saw pretty good tracking of busyness using those metrics. I
> > would be concerned that just looking at the numebr of rx_bursts and tx_bursts
> > may be limited to only a few use-cases. The l3fwd-power example uses
> > branchless increments to capture empty, non-empty, full, and non-full polls.
> > 
> > Why not use the existing telemetry library to store the stats? It would be good
> > if whatever metrics were counted were made available in a standard way, so
> > that external entities such as collectd could pick them up, rather than having to
> > parse the new struct. The l3fwd-power example registers the necessary new
> > metrics, and exposes them through the telemetry library.
> > 
> > And a comment on the patch set in general: The order of the patch set seems
> > reversed. The earlier patch do not compile, because they depend on
> > rte_apistats.h, which is introduced in the final patch. Each patch as it is applied
> > needs to build successfully.
> > 
> > Also, I would suggest a different name. rte_apistats seems very generic and
> > could apply to anything. How about something like rte_ethdev_stats.h or
> > rte_burst_stats.h?
> > 
> > Rgds,
> > Dave.
> > 
> > 
> > 
> > 
> > 
> > 
>
  
Varghese, Vipin Dec. 24, 2020, 12:35 p.m. UTC | #8
snipped
> 
> Thanks for your comments.
> I know you kindly provided many valuable comments though I reply the
> following first because I think it is important that my idea/proposal is
> acceptable or not first.
> 
> > Sharing an alternate approach, if RX-TX callbacks are enabled in DPDK (which
> is by default). One can register a callback handler to update counters with the
> following information as `port-queue pair, lcoreid, total rx burst request, total
> empty rx burst, 1-8 pks, 9-16 pkts, 16-32 pkts`. Callback handlers can be
> selectively enabled or disabled too.
> >
> > Can you please help me understand how `rte_apistats` would be different or
> pros of having it as library (that needs to be added to linking and running in
> case of using DPDK applications)?
> You are correct.
> Application can do that by using callback of rx_burst/tx_burst.
> But needs to modify/add the code for RX/TX process logic.

No the RX or TX application logic is not modified as the call back registration is done once right after port_init.

> 
> Point of my patchset is couting is done by library so that APP only needs to
> "retrieve/read" those data if needed (especially for existing applications).

As mentioned in the other patchset the library function is enabled through and not conditionally built. Any application which is built with this patch set will be forced to invoke the calls.

> 
> I think it makes some developers happy becaseu it is relatively easy to measure
> "how cpu core is bysy" easily.

Not sure what do you mean, as there 2 factors which conflicts
1. there are existing uses cases and examples like power, metric, telemetry all uses RX/TX callbacks which does the same.
2. The current logic only helps in RX/TX cores and not other cores. So in case of pipeline model there are only a couple of RX or TX threads.

> (I admit relatively rough measurement though. I think it is trade-off)
> 
> What do you think?

If I consider RX calls there are 3 main metrics
1. How many times RX is invoked.
2. How many times RX returns with `0` packets
3. How many times RX returns with `> 0` packets.

With these values in the current logic you are trying to deduct actual CPU RX usage by `useful work = number of times `> 0` / total RX calls`

As a end user I will always be happy to see telemetry data as `from time t1 to t2, 
1. how many times per second on average RX calls were made.
2. how many times per second on average The calls returned packets in range of 1-4, 5-8, 9-16, 17 and more  `

Current logic is not trying to address this problem. With my current understanding I am not convinced that one needs yet another library when the same can be done either with `RX/TX callback`

> 
snipped
  
Ferruh Yigit Feb. 22, 2021, 3:10 p.m. UTC | #9
On 12/22/2020 2:22 AM, Hideyuki Yamashita wrote:
> Hello,
> 
> Thanks for your comments.
> Please see my comments inline tagged with [HY].
> 
>> 04/12/2020 08:51, Hideyuki Yamashita:
>>> In general, DPDK application consumes CPU usage because it polls
>>> incoming packets using rx_burst API in infinite loop.
>>> This makes difficult to estimate how much CPU usage is really
>>> used to send/receive packets by the DPDK application.
>>>
>>> For example, even if no incoming packets arriving, CPU usage
>>> looks nearly 100% when observed by top command.
>>>
>>> It is beneficial if developers can observe real CPU usage of the
>>> DPDK application.
>>> Such information can be exported to monitoring application like
>>> prometheus/graphana and shows CPU usage graphically.
>>>
>>> To achieve above, this patch set provides apistats functionality.
>>> apistats provides the followiing two counters for each lcore.
>>> - rx_burst_counts[RTE_MAX_LCORE]
>>> - tx_burst_counts[RTE_MAX_LCORE]
>>> Those accumulates rx_burst/tx_burst counts since the application starts.
>>
>> Please could you compare with what rte_jobstats offers?
>> http://code.dpdk.org/dpdk/latest/source/lib/librte_jobstats/rte_jobstats.h
>>
>> I feel all of this shares the same goals as librte_power work.
> 
> [HY]
> Thanks for your commetns.
> You are correct. As you well know, l3fwd-power measures "how cpu cores
> are busy".
> And in that sense, the goal of my proposal is the same with yours .
> Moreover l3fwd-power is more precise than my proposal.
> 
> Point of my proposal is
> - more easy to use
> - less code impact on application code
> 
> I think that if application developer wants to need to measure "how cpu
> cores are busy" he/she will needs to implement
> - logic similar with l3fwd-power
> or
> - use jobstats API
> 
> But it is rather heavy for existing applications.
> By using my proposal, it is "much easier" to implement.
> (But it is "rough" measurement. I think it is trade-off)
>   
> How do you think about the idea?
> 
>> [...]
>>> - basic API counting functionality(apistats) into librte_ethdev
>>
>> Could it be it be accessible via rte_telemetry?
>> http://code.dpdk.org/dpdk/latest/source/lib/librte_telemetry/rte_telemetry.h
>>
> [HY]
> OK.
> Currently, no reason not using telemetry.
> 
> I think telemetry is useful for applications which does NOT call DPDK
> API(C lang API) directly.
> My patchset provide only C API to retrieve apistats.
> But if assuming not all applications call C API, then I think it is
> reasonable to add telemetry in addition to C API for exposing stats.
> 
> Do you think "exposure via C API" is not needed?
> 

Hi Hideyuki,

With current implementation the change will cause a performance degradation for 
all users, even if they use the stats or not.

Even if the statics update wrapped with #ifdef, as suggested, the functionality 
is simple and it can be easily implemented by application using Rx/Tx callbacks, 
again as suggested, without introducing this new complexity.

And for more complex usage, the jobstats already tries to provide a generic 
library for it, or application specific needs can be implemented in application 
level as done is l3fwd-power.


Overall I agree the problem this patch is trying to solve is a valid one, but 
not sure about the current implementation, and I am marking current version as 
'rejected', please feel free to send a new version with a new approach.

Thanks,
ferruh