[v2] eal/arm: remove CASP constraints for GCC
Commit Message
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
GCC now assigns even register pairs for CASP. The fix has also been
backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
v2 Changes:
- Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).
lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
--
2.17.1
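
As background for the commit message above: CASP operates on two register pairs, and each pair must start at an even-numbered register with the next register as its second half. The sketch below only illustrates that rule. The helper names are hypothetical and not part of DPDK or GCC, and the caller is assumed to pass valid general-purpose register numbers.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if (first, second) forms a legal
 * CASP register pair, i.e. an even-numbered register followed by the
 * next consecutive register. Range checking is left to the caller.
 */
static bool
casp_pair_is_valid(unsigned int first, unsigned int second)
{
	return first % 2 == 0 && second == first + 1;
}

/*
 * Hypothetical self-check using the pairs that the Clang path of the
 * patch pins by hand: x0/x1 for the old value, x2/x3 for the update.
 */
static void
casp_pair_self_check(void)
{
	assert(casp_pair_is_valid(0, 1));  /* x0, x1 */
	assert(casp_pair_is_valid(2, 3));  /* x2, x3 */
	assert(!casp_pair_is_valid(1, 2)); /* pair starts on an odd register */
}
```

With the GCC fix in place the compiler picks such a pair itself, which is why the manual pinning can be dropped for GCC.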
Comments
> -----Original Message-----
> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> Sent: Friday, November 5, 2021 4:57 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; david.marchand@redhat.com;
> jerinj@marvell.com
> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [PATCH v2] eal/arm: remove CASP constraints for GCC
>
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>
> GCC now assigns even register pairs for CASP, the fix has also been
> backported to all stable releases of older GCC versions.
> Removing the manual register allocation allows GCC to inline the functions
> and pick optimal registers for performing CASP.
>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> v2 Changes:
> - Remove unnecessary LSE_PREAMBLE for GCC (Ruifeng).
>
> lib/eal/arm/include/rte_atomic_64.h | 21 ++++++++++++++-------
> 1 file changed, 14 insertions(+), 7 deletions(-)
>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
>
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Patch lgtm but it is late for merging in 21.11.
It is in EAL, and is an optimisation of the 128-bit CAS operation on ARM.
This is used by the stack library and mempool.
There might be other impacts I did not think of.
Do you have links to bugs or commits for the mentioned fix on the GCC side?
This will help when we get reports from users with compilers without the fix.
Thanks.
>On Mon, Nov 8, 2021 at 8:15 AM Ruifeng Wang
><Ruifeng.Wang@arm.com> wrote:
>>
>> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
>Patch lgtm but it is late for merging in 21.11.
>
>It is in EAL, and is an optimisation of the 128-bit CAS operation on ARM.
>This is used by the stack library and mempool.
>There might be other impacts I did not think of.
>
>
>Do you have links to bugs or commits for the mentioned fix on the GCC
>side?
Here is the gcc git commit that fixes this.
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf11d707c422e5f4e9e5cdacb818c3
>This will help when we get reports from users with compilers without
>the fix.
>
>
>Thanks.
>
>--
>David Marchand
On Tue, Nov 16, 2021 at 3:56 PM David Marchand
<david.marchand@redhat.com> wrote:
> > > GCC now assigns even register pairs for CASP, the fix has also been
> > > backported to all stable releases of older GCC versions.
> > > Removing the manual register allocation allows GCC to inline the functions
> > > and pick optimal registers for performing CASP.
> > >
> > > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
I added a reference to the GCC commit and applied, thanks.
@@ -46,12 +46,8 @@ rte_atomic_thread_fence(int memorder)
/*------------------------ 128 bit atomic operations -------------------------*/
#if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
-#if defined(RTE_CC_CLANG)
-#define __LSE_PREAMBLE ".arch armv8-a+lse\n"
-#else
-#define __LSE_PREAMBLE ""
-#endif
+#if defined(RTE_CC_CLANG)
#define __ATOMIC128_CAS_OP(cas_op_name, op_string) \
static __rte_noinline void \
cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated) \
@@ -65,7 +61,7 @@ cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated) \
register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0]; \
register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1]; \
asm volatile( \
- __LSE_PREAMBLE \
+ ".arch armv8-a+lse\n" \
op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]" \
: [old0] "+r" (x0), \
[old1] "+r" (x1) \
@@ -76,13 +72,24 @@ cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated) \
old->val[0] = x0; \
old->val[1] = x1; \
}
+#else
+#define __ATOMIC128_CAS_OP(cas_op_name, op_string) \
+static __rte_always_inline void \
+cas_op_name(rte_int128_t *dst, rte_int128_t *old, rte_int128_t updated) \
+{ \
+ asm volatile( \
+ op_string " %[old], %H[old], %[upd], %H[upd], [%[dst]]" \
+ : [old] "+r"(old->int128) \
+ : [upd] "r"(updated.int128), [dst] "r"(dst) \
+ : "memory"); \
+}
+#endif
__ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
__ATOMIC128_CAS_OP(__cas_128_acquire, "caspa")
__ATOMIC128_CAS_OP(__cas_128_release, "caspl")
__ATOMIC128_CAS_OP(__cas_128_acq_rel, "caspal")
-#undef __LSE_PREAMBLE
#undef __ATOMIC128_CAS_OP
#endif
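
For readers less familiar with CASP, the semantics that both macro variants in the diff rely on can be modelled in plain C. This is a minimal, non-atomic sketch for illustration only: `model_int128` and `model_cas_128` are hypothetical names, not DPDK APIs, and the real instruction performs the compare and the store as one atomic step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_int128_t: two 64-bit halves. */
struct model_int128 {
	uint64_t val[2];
};

/*
 * Non-atomic reference model of a 128-bit compare-and-swap.
 * If *dst equals *old, *dst is overwritten with updated.
 * In both cases *old receives the value that was in *dst beforehand,
 * so the caller can compare it with its expectation to see whether
 * the swap took place.
 */
static void
model_cas_128(struct model_int128 *dst, struct model_int128 *old,
	      struct model_int128 updated)
{
	assert(dst != NULL && old != NULL);

	struct model_int128 seen = *dst;

	if (seen.val[0] == old->val[0] && seen.val[1] == old->val[1])
		*dst = updated;
	*old = seen;
}
```

This mirrors why `old` is a read-write operand (`"+r"`) in both variants: it carries the expected value in and the observed value out.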