[v2,5/6] app/regex: support performance measurements per QP
Checks
Commit Message
Up to this commit measuring the parsing elapsed time and Giga bits per
second performance was done on the aggregation of all QPs (per core).
This commit separates the time measurements per individual QP.
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
---
app/test-regex/main.c | 33 ++++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
Comments
20/12/2020 11:41, Ophir Munk:
> Up to this commit measuring the parsing elapsed time and Giga bits per
> second performance was done on the aggregation of all QPs (per core).
> This commit separates the time measurements per individual QP.
>
> Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
> ---
> --- a/app/test-regex/main.c
> +++ b/app/test-regex/main.c
> + for (qp_id = 0; qp_id < nb_qps; qp_id++) {
> + time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
This line triggers an error with PPC compiler:
error: ‘qp’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, January 8, 2021 11:09 AM
> To: Ophir Munk <ophirmu@nvidia.com>
> Cc: dev@dpdk.org; Ori Kam <orika@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH v2 5/6] app/regex: support performance
> measurements per QP
>
> 20/12/2020 11:41, Ophir Munk:
> > Up to this commit measuring the parsing elapsed time and Giga bits per
> > second performance was done on the aggregation of all QPs (per core).
> > This commit separates the time measurements per individual QP.
> >
> > Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
> > ---
> > --- a/app/test-regex/main.c
> > +++ b/app/test-regex/main.c
> > + for (qp_id = 0; qp_id < nb_qps; qp_id++) {
> > + time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
>
> This line triggers an error with PPC compiler:
> error: ‘qp’ may be used uninitialized in this function [-Werror=maybe-
> uninitialized]
> time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
>
Thanks for reporting.
I sent v3 with a fix.
Could I have known this error in advance? Is there a CI report?
10/01/2021 12:16, Ophir Munk:
>
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Friday, January 8, 2021 11:09 AM
> > To: Ophir Munk <ophirmu@nvidia.com>
> > Cc: dev@dpdk.org; Ori Kam <orika@nvidia.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 5/6] app/regex: support performance
> > measurements per QP
> >
> > 20/12/2020 11:41, Ophir Munk:
> > > Up to this commit measuring the parsing elapsed time and Giga bits per
> > > second performance was done on the aggregation of all QPs (per core).
> > > This commit separates the time measurements per individual QP.
> > >
> > > Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
> > > ---
> > > --- a/app/test-regex/main.c
> > > +++ b/app/test-regex/main.c
> > > + for (qp_id = 0; qp_id < nb_qps; qp_id++) {
> > > + time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
> >
> > This line triggers an error with PPC compiler:
> > error: ‘qp’ may be used uninitialized in this function [-Werror=maybe-
> > uninitialized]
> > time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
> >
> Thanks for reporting.
> I sent v3 with a fix.
> Could I have known this error in advance? Is there a CI report?
No I think there is no CI report for PPC.
Actually the PPC support in DPDK is not clear.
>>> -----Original Message-----
>>> From: Thomas Monjalon <thomas@monjalon.net>
>>> Sent: Friday, January 8, 2021 11:09 AM
>>> To: Ophir Munk <ophirmu@nvidia.com>
>>> Cc: dev@dpdk.org; Ori Kam <orika@nvidia.com>
>>> Subject: Re: [dpdk-dev] [PATCH v2 5/6] app/regex: support performance
>>> measurements per QP
>>>
>>> 20/12/2020 11:41, Ophir Munk:
>>>> Up to this commit measuring the parsing elapsed time and Giga bits per
>>>> second performance was done on the aggregation of all QPs (per core).
>>>> This commit separates the time measurements per individual QP.
>>>>
>>>> Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
>>>> ---
>>>> --- a/app/test-regex/main.c
>>>> +++ b/app/test-regex/main.c
>>>> + for (qp_id = 0; qp_id < nb_qps; qp_id++) {
>>>> + time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
>>>
>>> This line triggers an error with PPC compiler:
>>> error: ‘qp’ may be used uninitialized in this function [-Werror=maybe-
>>> uninitialized]
>>> time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
>>>
>> Thanks for reporting.
>> I sent v3 with a fix.
>> Could I have known this error in advance? Is there a CI report?
>
> No I think there is no CI report for PPC.
> Actually the PPC support in DPDK is not clear.
IBM is still supporting DPDK on PPC, though we don't have a proper CI
infrastructure in place. PPC support is available from Travis but was
unreliable when we last attempted to enable it. I'll take Thomas' hint
the hint and revisit the issue in 2021.
Dave
FYI, v2 of the patch built successfully on my PPC test system with gcc
8.3.1. There's a compiler difference between my RH8 environment and the
cross-build environment Thomas references.
@@ -48,6 +48,8 @@ struct qp_params {
struct rte_regex_ops **ops;
struct job_ctx *jobs_ctx;
char *buf;
+ time_t start;
+ time_t end;
};
struct qps_per_lcore {
@@ -324,8 +326,6 @@ run_regex(void *args)
unsigned long d_ind = 0;
struct rte_mbuf_ext_shared_info shinfo;
int res = 0;
- time_t start;
- time_t end;
double time;
struct rte_mempool *mbuf_mp;
struct qp_params *qp;
@@ -418,9 +418,10 @@ run_regex(void *args)
qp->buf = buf;
qp->total_matches = 0;
+ qp->start = 0;
+ qp->end = 0;
}
- start = clock();
for (i = 0; i < nb_iterations; i++) {
for (qp_id = 0; qp_id < nb_qps; qp_id++) {
qp = &qps[qp_id];
@@ -431,6 +432,8 @@ run_regex(void *args)
update = false;
for (qp_id = 0; qp_id < nb_qps; qp_id++) {
qp = &qps[qp_id];
+ if (!qp->start)
+ qp->start = clock();
if (qp->total_dequeue < actual_jobs) {
struct rte_regex_ops **
cur_ops_to_enqueue = qp->ops +
@@ -461,22 +464,30 @@ run_regex(void *args)
qp->total_enqueue -
qp->total_dequeue);
update = true;
+ } else {
+ if (!qp->end)
+ qp->end = clock();
}
+
}
} while (update);
}
- end = clock();
- time = ((double)end - start) / CLOCKS_PER_SEC;
- printf("Job len = %ld Bytes\n", job_len);
- printf("Time = %lf sec\n", time);
- printf("Perf = %lf Gbps\n",
- (((double)actual_jobs * job_len * nb_iterations * 8) / time) /
- 1000000000.0);
+ for (qp_id = 0; qp_id < nb_qps; qp_id++) {
+ time = ((double)qp->end - qp->start) / CLOCKS_PER_SEC;
+ printf("Core=%u QP=%u\n", rte_lcore_id(), qp_id + qp_id_base);
+ printf("Job len = %ld Bytes\n", job_len);
+ printf("Time = %lf sec\n", time);
+ printf("Perf = %lf Gbps\n\n",
+ (((double)actual_jobs * job_len *
+ nb_iterations * 8) / time) /
+ 1000000000.0);
+ }
if (rgxc->perf_mode)
goto end;
for (qp_id = 0; qp_id < nb_qps; qp_id++) {
- printf("\n############ QP id=%u ############\n", qp_id);
+ printf("\n############ Core=%u QP=%u ############\n",
+ rte_lcore_id(), qp_id + qp_id_base);
qp = &qps[qp_id];
/* Log results per job. */
for (d_ind = 0; d_ind < qp->total_dequeue; d_ind++) {