cycles: add isb before read cntvct_el0

Message ID 4099DE2E54AFAD489356C6C9161D53339729EB9A@DGGEML502-MBX.china.huawei.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Series: cycles: add isb before read cntvct_el0

Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  fail     apply issues

Commit Message

Linhaifeng March 9, 2020, 9:22 a.m. UTC
  We should use isb rather than dsb to sync the system counter to cntvct_el0.

Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
---
lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 3 +++
lib/librte_eal/common/include/arch/arm/rte_cycles_64.h | 2 ++
2 files changed, 5 insertions(+)
  

Comments

Gavin Hu March 10, 2020, 7:11 a.m. UTC | #1
Hi Haifeng,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Linhaifeng
> Sent: Monday, March 9, 2020 5:23 PM
> To: dev@dpdk.org; thomas@monjalon.net
> Cc: chenchanghu <chenchanghu@huawei.com>; xudingke
> <xudingke@huawei.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>
> Subject: [dpdk-dev] [PATCH] cycles: add isb before read cntvct_el0
> 
> We should use isb rather than dsb to sync system counter to cntvct_el0.
> 
> Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
> ---
> lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 3 +++
> lib/librte_eal/common/include/arch/arm/rte_cycles_64.h | 2 ++
> 2 files changed, 5 insertions(+)
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> index 859ae129d..7e8049725 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> @@ -21,6 +21,7 @@ extern "C" {
>  #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
> #define dmb(opt) asm volatile("dmb " #opt : : : "memory")
> +#define isb()    asm volatile("isb" : : : "memory")
>  #define rte_mb() dsb(sy)
> @@ -44,6 +45,8 @@ extern "C" {
>  #define rte_cio_rmb() dmb(oshld)
> +#define rte_isb() isb()
> +
> /*------------------------ 128 bit atomic operations -------------------------*/
>  #if defined(__ARM_FEATURE_ATOMICS) ||
> defined(RTE_ARM_FEATURE_ATOMICS)
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> index 68e7c7338..29f524901 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> @@ -18,6 +18,7 @@ extern "C" {
>   *   The time base for this lcore.
>   */
> #ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> +
> /**
>   * This call is portable to any ARMv8 architecture, however, typically
>   * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> @@ -27,6 +28,7 @@ rte_rdtsc(void)
> {
>        uint64_t tsc;
> +       rte_isb();
Good catch, could you add a link to the commit log as a reference.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/arch_timer.h?h=v5.5#n220

>        asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
In the kernel, there is a call to arch_counter_enforce_ordering(cnt); maybe it is also necessary here.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/arch_timer.h?h=v5.5#n168
>        return tsc;
> }
> --
  
Linhaifeng March 10, 2020, 7:22 a.m. UTC | #2
> -----Original Message-----
> From: Gavin Hu [mailto:Gavin.Hu@arm.com]
> Sent: Tuesday, March 10, 2020 3:11 PM
> To: Linhaifeng <haifeng.lin@huawei.com>; dev@dpdk.org;
> thomas@monjalon.net
> Cc: chenchanghu <chenchanghu@huawei.com>; xudingke
> <xudingke@huawei.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; nd <nd@arm.com>
> Subject: RE: [PATCH] cycles: add isb before read cntvct_el0
> 
> Hi Haifeng,
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Linhaifeng
> > Sent: Monday, March 9, 2020 5:23 PM
> > To: dev@dpdk.org; thomas@monjalon.net
> > Cc: chenchanghu <chenchanghu@huawei.com>; xudingke
> > <xudingke@huawei.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>
> > Subject: [dpdk-dev] [PATCH] cycles: add isb before read cntvct_el0
> >
> > We should use isb rather than dsb to sync system counter to cntvct_el0.
> >
> > Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
> > ---
> > lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 3 +++
> > lib/librte_eal/common/include/arch/arm/rte_cycles_64.h | 2 ++
> > 2 files changed, 5 insertions(+)
> >
> > diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > index 859ae129d..7e8049725 100644
> > --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > @@ -21,6 +21,7 @@ extern "C" {
> >  #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
> > #define dmb(opt) asm volatile("dmb " #opt : : : "memory")
> > +#define isb()    asm volatile("isb" : : : "memory")
> >  #define rte_mb() dsb(sy)
> > @@ -44,6 +45,8 @@ extern "C" {
> >  #define rte_cio_rmb() dmb(oshld)
> > +#define rte_isb() isb()
> > +
> > /*------------------------ 128 bit atomic operations -------------------------*/
> >  #if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
> > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > index 68e7c7338..29f524901 100644
> > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > @@ -18,6 +18,7 @@ extern "C" {
> >   *   The time base for this lcore.
> >   */
> > #ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> > +
> > /**
> >   * This call is portable to any ARMv8 architecture, however, typically
> >   * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> > @@ -27,6 +28,7 @@ rte_rdtsc(void)
> > {
> >        uint64_t tsc;
> > +       rte_isb();
> Good catch, could you add a link to the commit log as a reference.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/arch_timer.h?h=v5.5#n220
> 

Ok.

> >        asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> In kernel, there is a call to arch_counter_enforce_ordering(cnt), maybe it is
> also necessary.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/arch_timer.h?h=v5.5#n168

Should we add isb and arch_counter_enforce_ordering in rte_rdtsc or rte_rdtsc_precise?

> >        return tsc;
> > }
> > --
  
Jerin Jacob March 10, 2020, 7:50 a.m. UTC | #3
On Tue, Mar 10, 2020 at 12:52 PM Linhaifeng <haifeng.lin@huawei.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Gavin Hu [mailto:Gavin.Hu@arm.com]
> > Sent: Tuesday, March 10, 2020 3:11 PM
> > To: Linhaifeng <haifeng.lin@huawei.com>; dev@dpdk.org;
> > thomas@monjalon.net
> > Cc: chenchanghu <chenchanghu@huawei.com>; xudingke
> > <xudingke@huawei.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; Honnappa
> > Nagarahalli <Honnappa.Nagarahalli@arm.com>; Steve Capper
> > <Steve.Capper@arm.com>; nd <nd@arm.com>
> > Subject: RE: [PATCH] cycles: add isb before read cntvct_el0
> >
> > Hi Haifeng,
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Linhaifeng
> > > Sent: Monday, March 9, 2020 5:23 PM
> > > To: dev@dpdk.org; thomas@monjalon.net
> > > Cc: chenchanghu <chenchanghu@huawei.com>; xudingke
> > > <xudingke@huawei.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>
> > > Subject: [dpdk-dev] [PATCH] cycles: add isb before read cntvct_el0
> > >
> > > We should use isb rather than dsb to sync system counter to cntvct_el0.
> > >
> > > Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
> > > ---
> > > lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 3 +++
> > > lib/librte_eal/common/include/arch/arm/rte_cycles_64.h | 2 ++
> > > 2 files changed, 5 insertions(+)
> > >
> > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > > b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > > index 859ae129d..7e8049725 100644
> > > --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > > +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
> > > @@ -21,6 +21,7 @@ extern "C" {
> > >  #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
> > > #define dmb(opt) asm volatile("dmb " #opt : : : "memory")
> > > +#define isb()    asm volatile("isb" : : : "memory")
> > >  #define rte_mb() dsb(sy)
> > > @@ -44,6 +45,8 @@ extern "C" {
> > >  #define rte_cio_rmb() dmb(oshld)
> > > +#define rte_isb() isb()
> > > +
> > > /*------------------------ 128 bit atomic operations -------------------------*/
> > >  #if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
> > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > index 68e7c7338..29f524901 100644
> > > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > > @@ -18,6 +18,7 @@ extern "C" {
> > >   *   The time base for this lcore.
> > >   */
> > > #ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> > > +
> > > /**
> > >   * This call is portable to any ARMv8 architecture, however, typically
> > >   * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> > > @@ -27,6 +28,7 @@ rte_rdtsc(void)
> > > {
> > >        uint64_t tsc;
> > > +       rte_isb();
> > Good catch, could you add a link to the commit log as a reference.
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/arch_timer.h?h=v5.5#n220
> >
>
> Ok.
>
> > >        asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> > In kernel, there is a call to arch_counter_enforce_ordering(cnt), maybe it is
> > also necessary.
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/arch_timer.h?h=v5.5#n168
>
> Should we add isb and arch_counter_enforce_ordering in rte_rdtsc or rte_rdtsc_precise?

Only for rte_rdtsc_precise(), as some cases would not need a barrier.
rte_rdtsc_precise() was created for this specific purpose.

>
> > >        return tsc;
> > > }
> > > --
  

Patch

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
index 859ae129d..7e8049725 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
@@ -21,6 +21,7 @@  extern "C" {
 #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
#define dmb(opt) asm volatile("dmb " #opt : : : "memory")
+#define isb()    asm volatile("isb" : : : "memory")
 #define rte_mb() dsb(sy)
@@ -44,6 +45,8 @@  extern "C" {
 #define rte_cio_rmb() dmb(oshld)
+#define rte_isb() isb()
+
/*------------------------ 128 bit atomic operations -------------------------*/
 #if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
index 68e7c7338..29f524901 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
@@ -18,6 +18,7 @@  extern "C" {
  *   The time base for this lcore.
  */
#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
+
/**
  * This call is portable to any ARMv8 architecture, however, typically
  * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
@@ -27,6 +28,7 @@  rte_rdtsc(void)
{
       uint64_t tsc;
+       rte_isb();
       asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
       return tsc;
}
--