[v4,5/6] spinlock: use wfe to reduce contention on aarch64

Message ID 1566454356-37277-6-git-send-email-gavin.hu@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series use WFE for locks and ring on aarch64

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Gavin Hu Aug. 22, 2019, 6:12 a.m. UTC
When acquiring a spinlock, cores repeatedly poll the lock variable.
This polling is replaced by the rte_wait_until_equal API.

The micro-benchmarks and the testpmd and l3fwd traffic tests were run
on ThunderX2, Ampere eMAG80 and Arm N1SDP. All ran without problems, and
no notable performance gain or degradation was measured.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 .../common/include/arch/arm/rte_spinlock.h         | 25 ++++++++++++++++++++++
 1 file changed, 25 insertions(+)
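
For readers unfamiliar with the wait-until-equal idea referenced in the commit message, the following is a minimal, portable C11 sketch of the polling loop it describes. The name wait_until_equal_32_sketch is hypothetical and is not the DPDK API. The sketch only busy-polls and does not attempt the WFE-based waiting that the series introduces on aarch64.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the DPDK API): return once *addr == expected.
 * The first load doubles as an early exit: if the value already matches,
 * the loop body is never entered.
 */
static inline void
wait_until_equal_32_sketch(_Atomic uint32_t *addr, uint32_t expected)
{
	assert(addr != NULL);

	while (atomic_load_explicit(addr, memory_order_acquire) != expected) {
		/* Portable code can only keep polling here. The aarch64
		 * series instead lets the core wait in WFE until the
		 * watched location is written.
		 */
	}
}
```

A caller would pass the address of the shared variable and the value to wait for; the acquire ordering is an assumption chosen to match the lock-acquire use case.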
  

Comments

Gavin Hu Sept. 14, 2019, 3:21 p.m. UTC | #1
Hi Jerin,

Adding the off-list discussion with Pavan to facilitate review of the spinlock patch (currently at v6). Thanks to Pavan and Jerin for the review.

Best Regards,
Gavin

> -----Original Message-----
> From: Gavin Hu (Arm Technology China)
> Sent: Thursday, September 12, 2019 5:22 PM
> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Subject: RE: [EXT] [PATCH v4 5/6] spinlock: use wfe to reduce contention on
> aarch64
>
> Hi Pavan,
>
> Thanks for pointing this out; the early exit is already implemented in the API.
> The spinlock does not use the API, in order to save a comparison branch
> (loading 0 into a register and comparing against it).
>
> Anyway, it is also a good idea to add it to this asm code.
>
> Best Regards,
> Gavin
>
> > -----Original Message-----
> > From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> > Sent: Thursday, September 12, 2019 4:45 PM
> > To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>
> > Subject: RE: [EXT] [PATCH v4 5/6] spinlock: use wfe to reduce contention on
> > aarch64
> >
> > Hi Gavin, (Offlist)
> >
> > Is there a reason why the asm below doesn't use the early exit discussed in
> > http://patches.dpdk.org/patch/55669/
> >
> > Regards,
> > Pavan.
> >
> > >+#ifndef RTE_FORCE_INTRINSICS
> > >+static inline void
> > >+rte_spinlock_lock(rte_spinlock_t *sl)
> > >+{
> > >+  unsigned int tmp;
> > >+  /* http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.
> > >+   * faqs/ka16809.html
> > >+   */
> > >+  asm volatile(
> > >+          "sevl\n"
> > >+          "1:     wfe\n"
> > >+          "2:     ldaxr %w[tmp], %w[locked]\n"
> > >+          "cbnz   %w[tmp], 1b\n"
> > >+          "stxr   %w[tmp], %w[one], %w[locked]\n"
> > >+          "cbnz   %w[tmp], 2b\n"
> > >+          : [tmp] "=&r" (tmp), [locked] "+Q"(sl->locked)
> > >+          : [one] "r" (1)
> > >+          : "cc", "memory");
> > >+}
> > >+#endif
> > >+
> > > static inline int rte_tm_supported(void)
> > > {
> > >   return 0;
> > >--
> > >2.7.4

  

Patch

diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
index 1a6916b..7b8328e 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -16,6 +16,31 @@  extern "C" {
 #include <rte_common.h>
 #include "generic/rte_spinlock.h"
 
+/* armv7a does support WFE, but an explicit wake-up signal using SEV is
+ * required (must be preceded by DSB to drain the store buffer) and
+ * this is less performant, so keep armv7a implementation unchanged.
+ */
+#ifndef RTE_FORCE_INTRINSICS
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	unsigned int tmp;
+	/* http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.
+	 * faqs/ka16809.html
+	 */
+	asm volatile(
+		"sevl\n"
+		"1:	wfe\n"
+		"2:	ldaxr %w[tmp], %w[locked]\n"
+		"cbnz   %w[tmp], 1b\n"
+		"stxr   %w[tmp], %w[one], %w[locked]\n"
+		"cbnz   %w[tmp], 2b\n"
+		: [tmp] "=&r" (tmp), [locked] "+Q"(sl->locked)
+		: [one] "r" (1)
+		: "cc", "memory");
+}
+#endif
+
 static inline int rte_tm_supported(void)
 {
 	return 0;
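
As a reading aid for the inline asm in the hunk above, here is a rough C11 sketch of the same acquire loop. It is an approximation under stated assumptions, not the patch's implementation: the sketch_spinlock_* names are hypothetical, a compare-exchange stands in for the ldaxr/stxr pair, and the inner loop busy-polls where the asm waits in WFE (the initial sevl makes the first wfe fall through, so the asm also checks the lock once before sleeping).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for rte_spinlock_t: 0 = free, 1 = held. */
typedef struct {
	atomic_int locked;
} sketch_spinlock_t;

static inline void
sketch_spinlock_lock(sketch_spinlock_t *sl)
{
	int expected;

	for (;;) {
		/* Roughly "ldaxr; cbnz 1b": wait while the lock is held.
		 * The asm waits in WFE here instead of spinning.
		 */
		while (atomic_load_explicit(&sl->locked,
					    memory_order_acquire) != 0)
			;

		/* Roughly "stxr; cbnz 2b": try to claim the lock and
		 * retry if another core got there first.
		 */
		expected = 0;
		if (atomic_compare_exchange_weak_explicit(&sl->locked,
				&expected, 1,
				memory_order_acquire, memory_order_relaxed))
			return;
	}
}

static inline void
sketch_spinlock_unlock(sketch_spinlock_t *sl)
{
	/* Debug check: only a held lock may be released. */
	assert(atomic_load_explicit(&sl->locked, memory_order_relaxed) == 1);
	atomic_store_explicit(&sl->locked, 0, memory_order_release);
}
```

On AArch64, the releasing store to the lock word clears the waiting core's exclusive monitor, which generates the event that wakes WFE. That is why the asm version needs no explicit SEV on unlock, in contrast to the armv7a case described in the patch comment.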