[v2,1/5] app/testpmd: clock gettime call in throughput calculation

Message ID 20200626220943.22526-1-honnappa.nagarahalli@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series [v2,1/5] app/testpmd: clock gettime call in throughput calculation |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-testing success Testing PASS

Commit Message

Honnappa Nagarahalli June 26, 2020, 10:09 p.m. UTC
  The throughput calculation requires a counter that measures
passing of time. However, the kernel saves and restores the PMU
state when a thread is unscheduled and scheduled. This ensures
that the PMU cycles are not counted towards a thread that is
not scheduled. Hence, when RTE_ARM_EAL_RDTSC_USE_PMU is enabled,
the PMU cycles do not represent the passing of time.
This results in incorrect calculation of throughput numbers.
Use clock_gettime system call to calculate the time passed since
last call.

Bugzilla ID: 450
Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
Cc: zhihong.wang@intel.com
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: Phil Yang <phil.yang@arm.com>
Tested-by: Ali Alnubani <alialnu@mellanox.com>
---
v2: Updated commit log in 1/5 (Jerin)

 app/test-pmd/config.c | 44 +++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)
  

Comments

Ferruh Yigit July 6, 2020, 3:36 p.m. UTC | #1
On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
> The throughput calculation requires a counter that measures
> passing of time. However, the kernel saves and restores the PMU
> state when a thread is unscheduled and scheduled. This ensures
> that the PMU cycles are not counted towards a thread that is
> not scheduled. Hence, when RTE_ARM_EAL_RDTSC_USE_PMU is enabled,
> the PMU cycles do not represent the passing of time.

Does this mean 'rte_rdtsc()' is broken for Arm?
Wouldn't it cause more serious issue than testpmd throughput stats?

> This results in incorrect calculation of throughput numbers.
> Use clock_gettime system call to calculate the time passed since
> last call.
> 
> Bugzilla ID: 450
> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> Cc: zhihong.wang@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Tested-by: Phil Yang <phil.yang@arm.com>
> Tested-by: Ali Alnubani <alialnu@mellanox.com>

<...>

> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
>  		}
>  	}
>  
> -	diff_cycles = prev_cycles[port_id];
> -	prev_cycles[port_id] = rte_rdtsc();
> -	if (diff_cycles > 0)
> -		diff_cycles = prev_cycles[port_id] - diff_cycles;
> +	diff_ns = 0;
> +	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {

I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think
it is OK.

<...>

> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
>  		(stats.obytes - prev_bytes_tx[port_id]) : 0;
>  	prev_bytes_rx[port_id] = stats.ibytes;
>  	prev_bytes_tx[port_id] = stats.obytes;
> -	mbps_rx = diff_cycles > 0 ?
> -		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> -	mbps_tx = diff_cycles > 0 ?
> -		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> +	mbps_rx = diff_ns > 0 ?
> +		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> +	mbps_tx = diff_ns > 0 ?
> +		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;

The calculation also fixes other issue in the stats.
With previous method, if the sampling between two stats is a little long,
"diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.


Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
  
Honnappa Nagarahalli July 6, 2020, 4:53 p.m. UTC | #2
Hi Ferruh,
	Thanks for your comments.

<snip>

> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput
> calculation
> 
> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
> > The throughput calculation requires a counter that measures passing of
> > time. However, the kernel saves and restores the PMU state when a
> > thread is unscheduled and scheduled. This ensures that the PMU cycles
> > are not counted towards a thread that is not scheduled. Hence, when
> > RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not
> represent
> > the passing of time.
> 
> Does this mean 'rte_rdtsc()' is broken for Arm?
It depends on the kernel I think. IMO, for isolated CPUs it should be fine. It is currently getting fixed through a kernel patch. Once the kernel patch is up streamed, we will make the required changes in DPDK.

> Wouldn't it cause more serious issue than testpmd throughput stats?
Within DPDK, it does not seem to be used for anything critical (I have fixed one).

> 
> > This results in incorrect calculation of throughput numbers.
> > Use clock_gettime system call to calculate the time passed since last
> > call.
> >
> > Bugzilla ID: 450
> > Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> > Cc: zhihong.wang@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Tested-by: Phil Yang <phil.yang@arm.com>
> > Tested-by: Ali Alnubani <alialnu@mellanox.com>
> 
> <...>
> 
> > @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
> >  		}
> >  	}
> >
> > -	diff_cycles = prev_cycles[port_id];
> > -	prev_cycles[port_id] = rte_rdtsc();
> > -	if (diff_cycles > 0)
> > -		diff_cycles = prev_cycles[port_id] - diff_cycles;
> > +	diff_ns = 0;
> > +	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
> 
> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think it is
> OK.
> 
> <...>
> 
> > @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
> >  		(stats.obytes - prev_bytes_tx[port_id]) : 0;
> >  	prev_bytes_rx[port_id] = stats.ibytes;
> >  	prev_bytes_tx[port_id] = stats.obytes;
> > -	mbps_rx = diff_cycles > 0 ?
> > -		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> > -	mbps_tx = diff_cycles > 0 ?
> > -		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> > +	mbps_rx = diff_ns > 0 ?
> > +		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> > +	mbps_tx = diff_ns > 0 ?
> > +		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
> 
> The calculation also fixes other issue in the stats.
> With previous method, if the sampling between two stats is a little long,
> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.
> 
> 
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
  
Ferruh Yigit July 6, 2020, 5:19 p.m. UTC | #3
On 7/6/2020 5:53 PM, Honnappa Nagarahalli wrote:
> Hi Ferruh,
> 	Thanks for your comments.
> 
> <snip>
> 
>> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput
>> calculation
>>
>> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
>>> The throughput calculation requires a counter that measures passing of
>>> time. However, the kernel saves and restores the PMU state when a
>>> thread is unscheduled and scheduled. This ensures that the PMU cycles
>>> are not counted towards a thread that is not scheduled. Hence, when
>>> RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not
>> represent
>>> the passing of time.
>>
>> Does this mean 'rte_rdtsc()' is broken for Arm?
> It depends on the kernel I think. IMO, for isolated CPUs it should be fine. It is currently getting fixed through a kernel patch. Once the kernel patch is up streamed, we will make the required changes in DPDK.
> 
>> Wouldn't it cause more serious issue than testpmd throughput stats?
> Within DPDK, it does not seem to be used for anything critical (I have fixed one).

OK, thanks for clarification.

> 
>>
>>> This results in incorrect calculation of throughput numbers.
>>> Use clock_gettime system call to calculate the time passed since last
>>> call.
>>>
>>> Bugzilla ID: 450
>>> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
>>> Cc: zhihong.wang@intel.com
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>>> Reviewed-by: Phil Yang <phil.yang@arm.com>
>>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>> Tested-by: Phil Yang <phil.yang@arm.com>
>>> Tested-by: Ali Alnubani <alialnu@mellanox.com>
>>
>> <...>
>>
>>> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
>>>  		}
>>>  	}
>>>
>>> -	diff_cycles = prev_cycles[port_id];
>>> -	prev_cycles[port_id] = rte_rdtsc();
>>> -	if (diff_cycles > 0)
>>> -		diff_cycles = prev_cycles[port_id] - diff_cycles;
>>> +	diff_ns = 0;
>>> +	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
>>
>> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think it is
>> OK.
>>
>> <...>
>>
>>> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
>>>  		(stats.obytes - prev_bytes_tx[port_id]) : 0;
>>>  	prev_bytes_rx[port_id] = stats.ibytes;
>>>  	prev_bytes_tx[port_id] = stats.obytes;
>>> -	mbps_rx = diff_cycles > 0 ?
>>> -		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
>>> -	mbps_tx = diff_cycles > 0 ?
>>> -		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
>>> +	mbps_rx = diff_ns > 0 ?
>>> +		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
>>> +	mbps_tx = diff_ns > 0 ?
>>> +		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
>>
>> The calculation also fixes other issue in the stats.
>> With previous method, if the sampling between two stats is a little long,
>> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.
>>
>>
>> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
  

Patch

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 016bcb09c..91fbf99f8 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -54,6 +54,14 @@ 
 
 #define ETHDEV_FWVERS_LEN 32
 
+#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
+#else
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC
+#endif
+
+#define NS_PER_SEC 1E9
+
 static char *flowtype_to_str(uint16_t flow_type);
 
 static const struct {
@@ -136,9 +144,10 @@  nic_stats_display(portid_t port_id)
 	static uint64_t prev_pkts_tx[RTE_MAX_ETHPORTS];
 	static uint64_t prev_bytes_rx[RTE_MAX_ETHPORTS];
 	static uint64_t prev_bytes_tx[RTE_MAX_ETHPORTS];
-	static uint64_t prev_cycles[RTE_MAX_ETHPORTS];
+	static uint64_t prev_ns[RTE_MAX_ETHPORTS];
+	struct timespec cur_time;
 	uint64_t diff_pkts_rx, diff_pkts_tx, diff_bytes_rx, diff_bytes_tx,
-								diff_cycles;
+								diff_ns;
 	uint64_t mpps_rx, mpps_tx, mbps_rx, mbps_tx;
 	struct rte_eth_stats stats;
 	struct rte_port *port = &ports[port_id];
@@ -195,10 +204,17 @@  nic_stats_display(portid_t port_id)
 		}
 	}
 
-	diff_cycles = prev_cycles[port_id];
-	prev_cycles[port_id] = rte_rdtsc();
-	if (diff_cycles > 0)
-		diff_cycles = prev_cycles[port_id] - diff_cycles;
+	diff_ns = 0;
+	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
+		uint64_t ns;
+
+		ns = cur_time.tv_sec * NS_PER_SEC;
+		ns += cur_time.tv_nsec;
+
+		if (prev_ns[port_id] != 0)
+			diff_ns = ns - prev_ns[port_id];
+		prev_ns[port_id] = ns;
+	}
 
 	diff_pkts_rx = (stats.ipackets > prev_pkts_rx[port_id]) ?
 		(stats.ipackets - prev_pkts_rx[port_id]) : 0;
@@ -206,10 +222,10 @@  nic_stats_display(portid_t port_id)
 		(stats.opackets - prev_pkts_tx[port_id]) : 0;
 	prev_pkts_rx[port_id] = stats.ipackets;
 	prev_pkts_tx[port_id] = stats.opackets;
-	mpps_rx = diff_cycles > 0 ?
-		diff_pkts_rx * rte_get_tsc_hz() / diff_cycles : 0;
-	mpps_tx = diff_cycles > 0 ?
-		diff_pkts_tx * rte_get_tsc_hz() / diff_cycles : 0;
+	mpps_rx = diff_ns > 0 ?
+		(double)diff_pkts_rx / diff_ns * NS_PER_SEC : 0;
+	mpps_tx = diff_ns > 0 ?
+		(double)diff_pkts_tx / diff_ns * NS_PER_SEC : 0;
 
 	diff_bytes_rx = (stats.ibytes > prev_bytes_rx[port_id]) ?
 		(stats.ibytes - prev_bytes_rx[port_id]) : 0;
@@ -217,10 +233,10 @@  nic_stats_display(portid_t port_id)
 		(stats.obytes - prev_bytes_tx[port_id]) : 0;
 	prev_bytes_rx[port_id] = stats.ibytes;
 	prev_bytes_tx[port_id] = stats.obytes;
-	mbps_rx = diff_cycles > 0 ?
-		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
-	mbps_tx = diff_cycles > 0 ?
-		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
+	mbps_rx = diff_ns > 0 ?
+		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
+	mbps_tx = diff_ns > 0 ?
+		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
 
 	printf("\n  Throughput (since last show)\n");
 	printf("  Rx-pps: %12"PRIu64"          Rx-bps: %12"PRIu64"\n  Tx-pps: %12"