[v2,1/5] app/testpmd: clock gettime call in throughput calculation
Checks
Commit Message
The throughput calculation requires a counter that measures
passing of time. However, the kernel saves and restores the PMU
state when a thread is unscheduled and scheduled. This ensures
that the PMU cycles are not counted towards a thread that is
not scheduled. Hence, when RTE_ARM_EAL_RDTSC_USE_PMU is enabled,
the PMU cycles do not represent the passing of time.
This results in incorrect calculation of throughput numbers.
Use clock_gettime system call to calculate the time passed since
last call.
Bugzilla ID: 450
Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
Cc: zhihong.wang@intel.com
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: Phil Yang <phil.yang@arm.com>
Tested-by: Ali Alnubani <alialnu@mellanox.com>
---
v2: Updated commit log in 1/5 (Jerin)
app/test-pmd/config.c | 44 +++++++++++++++++++++++++++++--------------
1 file changed, 30 insertions(+), 14 deletions(-)
Comments
On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
> The throughput calculation requires a counter that measures
> passing of time. However, the kernel saves and restores the PMU
> state when a thread is unscheduled and scheduled. This ensures
> that the PMU cycles are not counted towards a thread that is
> not scheduled. Hence, when RTE_ARM_EAL_RDTSC_USE_PMU is enabled,
> the PMU cycles do not represent the passing of time.
Does this mean 'rte_rdtsc()' is broken for Arm?
Wouldn't it cause more serious issue than testpmd throughput stats?
> This results in incorrect calculation of throughput numbers.
> Use clock_gettime system call to calculate the time passed since
> last call.
>
> Bugzilla ID: 450
> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> Cc: zhihong.wang@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Tested-by: Phil Yang <phil.yang@arm.com>
> Tested-by: Ali Alnubani <alialnu@mellanox.com>
<...>
> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
> }
> }
>
> - diff_cycles = prev_cycles[port_id];
> - prev_cycles[port_id] = rte_rdtsc();
> - if (diff_cycles > 0)
> - diff_cycles = prev_cycles[port_id] - diff_cycles;
> + diff_ns = 0;
> + if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think
it is OK.
<...>
> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
> (stats.obytes - prev_bytes_tx[port_id]) : 0;
> prev_bytes_rx[port_id] = stats.ibytes;
> prev_bytes_tx[port_id] = stats.obytes;
> - mbps_rx = diff_cycles > 0 ?
> - diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> - mbps_tx = diff_cycles > 0 ?
> - diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> + mbps_rx = diff_ns > 0 ?
> + (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> + mbps_tx = diff_ns > 0 ?
> + (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
The calculation also fixes other issue in the stats.
With previous method, if the sampling between two stats is a little long,
"diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Hi Ferruh,
Thanks for your comments.
<snip>
> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput
> calculation
>
> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
> > The throughput calculation requires a counter that measures passing of
> > time. However, the kernel saves and restores the PMU state when a
> > thread is unscheduled and scheduled. This ensures that the PMU cycles
> > are not counted towards a thread that is not scheduled. Hence, when
> > RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not
> represent
> > the passing of time.
>
> Does this mean 'rte_rdtsc()' is broken for Arm?
It depends on the kernel I think. IMO, for isolated CPUs it should be fine. It is currently getting fixed through a kernel patch. Once the kernel patch is up streamed, we will make the required changes in DPDK.
> Wouldn't it cause more serious issue than testpmd throughput stats?
Within DPDK, it does not seem to be used for anything critical (I have fixed one).
>
> > This results in incorrect calculation of throughput numbers.
> > Use clock_gettime system call to calculate the time passed since last
> > call.
> >
> > Bugzilla ID: 450
> > Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> > Cc: zhihong.wang@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Tested-by: Phil Yang <phil.yang@arm.com>
> > Tested-by: Ali Alnubani <alialnu@mellanox.com>
>
> <...>
>
> > @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
> > }
> > }
> >
> > - diff_cycles = prev_cycles[port_id];
> > - prev_cycles[port_id] = rte_rdtsc();
> > - if (diff_cycles > 0)
> > - diff_cycles = prev_cycles[port_id] - diff_cycles;
> > + diff_ns = 0;
> > + if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
>
> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think it is
> OK.
>
> <...>
>
> > @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
> > (stats.obytes - prev_bytes_tx[port_id]) : 0;
> > prev_bytes_rx[port_id] = stats.ibytes;
> > prev_bytes_tx[port_id] = stats.obytes;
> > - mbps_rx = diff_cycles > 0 ?
> > - diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> > - mbps_tx = diff_cycles > 0 ?
> > - diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> > + mbps_rx = diff_ns > 0 ?
> > + (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> > + mbps_tx = diff_ns > 0 ?
> > + (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
>
> The calculation also fixes other issue in the stats.
> With previous method, if the sampling between two stats is a little long,
> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.
>
>
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
On 7/6/2020 5:53 PM, Honnappa Nagarahalli wrote:
> Hi Ferruh,
> Thanks for your comments.
>
> <snip>
>
>> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput
>> calculation
>>
>> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
>>> The throughput calculation requires a counter that measures passing of
>>> time. However, the kernel saves and restores the PMU state when a
>>> thread is unscheduled and scheduled. This ensures that the PMU cycles
>>> are not counted towards a thread that is not scheduled. Hence, when
>>> RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not
>> represent
>>> the passing of time.
>>
>> Does this mean 'rte_rdtsc()' is broken for Arm?
> It depends on the kernel I think. IMO, for isolated CPUs it should be fine. It is currently getting fixed through a kernel patch. Once the kernel patch is up streamed, we will make the required changes in DPDK.
>
>> Wouldn't it cause more serious issue than testpmd throughput stats?
> Within DPDK, it does not seem to be used for anything critical (I have fixed one).
OK, thanks for clarification.
>
>>
>>> This results in incorrect calculation of throughput numbers.
>>> Use clock_gettime system call to calculate the time passed since last
>>> call.
>>>
>>> Bugzilla ID: 450
>>> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
>>> Cc: zhihong.wang@intel.com
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>>> Reviewed-by: Phil Yang <phil.yang@arm.com>
>>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>> Tested-by: Phil Yang <phil.yang@arm.com>
>>> Tested-by: Ali Alnubani <alialnu@mellanox.com>
>>
>> <...>
>>
>>> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
>>> }
>>> }
>>>
>>> - diff_cycles = prev_cycles[port_id];
>>> - prev_cycles[port_id] = rte_rdtsc();
>>> - if (diff_cycles > 0)
>>> - diff_cycles = prev_cycles[port_id] - diff_cycles;
>>> + diff_ns = 0;
>>> + if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
>>
>> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think it is
>> OK.
>>
>> <...>
>>
>>> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
>>> (stats.obytes - prev_bytes_tx[port_id]) : 0;
>>> prev_bytes_rx[port_id] = stats.ibytes;
>>> prev_bytes_tx[port_id] = stats.obytes;
>>> - mbps_rx = diff_cycles > 0 ?
>>> - diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
>>> - mbps_tx = diff_cycles > 0 ?
>>> - diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
>>> + mbps_rx = diff_ns > 0 ?
>>> + (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
>>> + mbps_tx = diff_ns > 0 ?
>>> + (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
>>
>> The calculation also fixes other issue in the stats.
>> With previous method, if the sampling between two stats is a little long,
>> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.
>>
>>
>> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
@@ -54,6 +54,14 @@
#define ETHDEV_FWVERS_LEN 32
+#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
+#else
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC
+#endif
+
+#define NS_PER_SEC 1E9
+
static char *flowtype_to_str(uint16_t flow_type);
static const struct {
@@ -136,9 +144,10 @@ nic_stats_display(portid_t port_id)
static uint64_t prev_pkts_tx[RTE_MAX_ETHPORTS];
static uint64_t prev_bytes_rx[RTE_MAX_ETHPORTS];
static uint64_t prev_bytes_tx[RTE_MAX_ETHPORTS];
- static uint64_t prev_cycles[RTE_MAX_ETHPORTS];
+ static uint64_t prev_ns[RTE_MAX_ETHPORTS];
+ struct timespec cur_time;
uint64_t diff_pkts_rx, diff_pkts_tx, diff_bytes_rx, diff_bytes_tx,
- diff_cycles;
+ diff_ns;
uint64_t mpps_rx, mpps_tx, mbps_rx, mbps_tx;
struct rte_eth_stats stats;
struct rte_port *port = &ports[port_id];
@@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
}
}
- diff_cycles = prev_cycles[port_id];
- prev_cycles[port_id] = rte_rdtsc();
- if (diff_cycles > 0)
- diff_cycles = prev_cycles[port_id] - diff_cycles;
+ diff_ns = 0;
+ if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
+ uint64_t ns;
+
+ ns = cur_time.tv_sec * NS_PER_SEC;
+ ns += cur_time.tv_nsec;
+
+ if (prev_ns[port_id] != 0)
+ diff_ns = ns - prev_ns[port_id];
+ prev_ns[port_id] = ns;
+ }
diff_pkts_rx = (stats.ipackets > prev_pkts_rx[port_id]) ?
(stats.ipackets - prev_pkts_rx[port_id]) : 0;
@@ -206,10 +222,10 @@ nic_stats_display(portid_t port_id)
(stats.opackets - prev_pkts_tx[port_id]) : 0;
prev_pkts_rx[port_id] = stats.ipackets;
prev_pkts_tx[port_id] = stats.opackets;
- mpps_rx = diff_cycles > 0 ?
- diff_pkts_rx * rte_get_tsc_hz() / diff_cycles : 0;
- mpps_tx = diff_cycles > 0 ?
- diff_pkts_tx * rte_get_tsc_hz() / diff_cycles : 0;
+ mpps_rx = diff_ns > 0 ?
+ (double)diff_pkts_rx / diff_ns * NS_PER_SEC : 0;
+ mpps_tx = diff_ns > 0 ?
+ (double)diff_pkts_tx / diff_ns * NS_PER_SEC : 0;
diff_bytes_rx = (stats.ibytes > prev_bytes_rx[port_id]) ?
(stats.ibytes - prev_bytes_rx[port_id]) : 0;
@@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
(stats.obytes - prev_bytes_tx[port_id]) : 0;
prev_bytes_rx[port_id] = stats.ibytes;
prev_bytes_tx[port_id] = stats.obytes;
- mbps_rx = diff_cycles > 0 ?
- diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
- mbps_tx = diff_cycles > 0 ?
- diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
+ mbps_rx = diff_ns > 0 ?
+ (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
+ mbps_tx = diff_ns > 0 ?
+ (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
printf("\n Throughput (since last show)\n");
printf(" Rx-pps: %12"PRIu64" Rx-bps: %12"PRIu64"\n Tx-pps: %12"