event/cnxk: add asm to support CASP for clang
Commit Message
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
Clang fails to allocate a register pair for the CASP instruction. Use
inline asm with explicit registers to fix the register pair.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
drivers/event/cnxk/cn10k_worker.h | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
Comments
On Mon, Jul 24, 2023 at 2:12 PM <pbhagavatula@marvell.com> wrote:
>
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>
> Clang fails to use register pairs for CASP instruction, use
> inline asm to fix register pairs.
>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Updated the git commit as follows and applied to
dpdk-next-net-eventdev/for-main. Thanks
Author: Pavan Nikhilesh <pbhagavatula@marvell.com>
Date: Mon Jul 24 14:11:56 2023 +0530
event/cnxk: fix CASP usage for clang
Clang fails to allocate a register pair for the CASP instruction. Use
inline asm with explicit registers to fix the register pair.
Fixes: e239e0d3faf7 ("event/cnxk: add SSO HW device operations")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
@@ -239,19 +239,32 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct rte_event *ev,
 	} gw;
 	gw.get_work = ws->gw_wdata;
-#if defined(RTE_ARCH_ARM64) && !defined(__clang__)
+#if defined(RTE_ARCH_ARM64)
+#if !defined(__clang__)
 	asm volatile(
 		PLT_CPU_FEATURE_PREAMBLE
 		"caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n"
 		: [wdata] "+r"(gw.get_work)
 		: [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
 		: "memory");
+#else
+	register uint64_t x0 __asm("x0") = (uint64_t)gw.u64[0];
+	register uint64_t x1 __asm("x1") = (uint64_t)gw.u64[1];
+	asm volatile(".arch armv8-a+lse\n"
+		     "caspal %[x0], %[x1], %[x0], %[x1], [%[dst]]\n"
+		     : [x0] "+r"(x0), [x1] "+r"(x1)
+		     : [dst] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
+		     : "memory");
+	gw.u64[0] = x0;
+	gw.u64[1] = x1;
+#endif
 #else
 	plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
 	do {
 		roc_load_pair(gw.u64[0], gw.u64[1],
 			      ws->base + SSOW_LF_GWS_WQE0);
 	} while (gw.u64[0] & BIT_ULL(63));
+	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
 #endif
 	ws->gw_rdata = gw.u64[0];
 	if (gw.u64[1])