[v1,0/3] MCS queued lock implementation

Message ID 1559750328-22377-1-git-send-email-phil.yang@arm.com (mailing list archive)

Phil Yang June 5, 2019, 3:58 p.m. UTC
This patch set adds the MCS lock library and its unit test.

The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L. SCOTT) provides
scalability by spinning on a CPU/thread-local variable, which avoids expensive
cache bouncing. It provides fairness by maintaining a list of acquirers and
passing the lock to each CPU/thread in the order in which it was requested
(a minimal sketch follows the references below).

References:
1. http://web.mit.edu/6.173/www/currentsemester/readings/R06-scalable-synchronization-1991.pdf
2. https://lwn.net/Articles/590243/
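
For illustration, a minimal C11-atomics sketch of the MCS algorithm
(hypothetical names, not the rte_mcslock.h API added by this series):
each acquirer appends its node to a queue and spins on its own node's
flag, and the releaser passes the lock to its successor in FIFO order.

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-thread queue node; each waiter spins only on its own 'locked'
 * flag, so the spinning traffic stays in a local cache line. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node_t;

/* The lock itself is just a pointer to the tail of the waiter queue. */
typedef _Atomic(mcs_node_t *) mcs_lock_t;

static void mcs_lock(mcs_lock_t *lock, mcs_node_t *me)
{
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&me->locked, true, memory_order_relaxed);
    /* Swap ourselves in as the new tail; the old tail is our predecessor. */
    mcs_node_t *prev = atomic_exchange_explicit(lock, me,
                                                memory_order_acq_rel);
    if (prev == NULL)
        return; /* queue was empty: lock acquired immediately */
    /* Publish ourselves to the predecessor, then spin locally. */
    atomic_store_explicit(&prev->next, me, memory_order_release);
    while (atomic_load_explicit(&me->locked, memory_order_acquire))
        ; /* spin on our own node, not on the shared lock word */
}

static void mcs_unlock(mcs_lock_t *lock, mcs_node_t *me)
{
    mcs_node_t *succ = atomic_load_explicit(&me->next, memory_order_acquire);
    if (succ == NULL) {
        /* No visible successor: try to swing the tail back to empty. */
        mcs_node_t *expected = me;
        if (atomic_compare_exchange_strong_explicit(lock, &expected, NULL,
                memory_order_acq_rel, memory_order_acquire))
            return;
        /* A new waiter won the race; wait until it links itself in. */
        while ((succ = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL)
            ;
    }
    /* Pass the lock to the next waiter in FIFO order. */
    atomic_store_explicit(&succ->locked, false, memory_order_release);
}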

Micro-benchmarking result:
------------------------------------------------------------------------------------------------
MCS lock                      | spinlock                       | ticket lock
------------------------------+--------------------------------+--------------------------------
Test with lock on 13 cores... |  Test with lock on 14 cores... |  Test with lock on 14 cores...
Core [15] Cost Time = 22426 us|  Core [14] Cost Time = 47974 us|  Core [14] cost time = 66761 us
Core [16] Cost Time = 22382 us|  Core [15] Cost Time = 46979 us|  Core [15] cost time = 66766 us
Core [17] Cost Time = 22294 us|  Core [16] Cost Time = 46044 us|  Core [16] cost time = 66761 us
Core [18] Cost Time = 22412 us|  Core [17] Cost Time = 28793 us|  Core [17] cost time = 66767 us
Core [19] Cost Time = 22407 us|  Core [18] Cost Time = 48349 us|  Core [18] cost time = 66758 us
Core [20] Cost Time = 22436 us|  Core [19] Cost Time = 19381 us|  Core [19] cost time = 66766 us
Core [21] Cost Time = 22414 us|  Core [20] Cost Time = 47914 us|  Core [20] cost time = 66763 us
Core [22] Cost Time = 22405 us|  Core [21] Cost Time = 48333 us|  Core [21] cost time = 66766 us
Core [23] Cost Time = 22435 us|  Core [22] Cost Time = 38900 us|  Core [22] cost time = 66749 us
Core [24] Cost Time = 22401 us|  Core [23] Cost Time = 45374 us|  Core [23] cost time = 66765 us
Core [25] Cost Time = 22408 us|  Core [24] Cost Time = 16121 us|  Core [24] cost time = 66762 us
Core [26] Cost Time = 22380 us|  Core [25] Cost Time = 42731 us|  Core [25] cost time = 66768 us
Core [27] Cost Time = 22395 us|  Core [26] Cost Time = 29439 us|  Core [26] cost time = 66768 us
                              |  Core [27] Cost Time = 38071 us|  Core [27] cost time = 66767 us
------------------------------+--------------------------------+--------------------------------
Total Cost Time = 291195 us   |  Total Cost Time = 544403 us   |  Total cost time = 934687 us
------------------------------------------------------------------------------------------------
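
How such per-core numbers are typically produced (a hedged sketch, not
the actual test_mcslock.c: the loop structure, iteration count, and
POSIX timing are my assumptions, and it reuses the hypothetical
mcs_lock types sketched above): every worker thread acquires and
releases the lock around a trivial critical section a fixed number of
times and reports its own elapsed time.

#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define ITERATIONS 1000000L

static mcs_lock_t bench_lock;          /* shared lock (sketch type above) */
static volatile unsigned long counter; /* trivial critical section */

static void *lock_worker(void *arg)
{
    long core = (long)(size_t)arg;
    mcs_node_t node; /* per-thread queue node lives on this stack */
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < ITERATIONS; i++) {
        mcs_lock(&bench_lock, &node);
        counter++; /* the contended work */
        mcs_unlock(&bench_lock, &node);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    long us = (t1.tv_sec - t0.tv_sec) * 1000000L +
              (t1.tv_nsec - t0.tv_nsec) / 1000L;
    printf("Core [%ld] Cost Time = %ld us\n", core, us);
    return NULL;
}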


Phil Yang (3):
  eal/mcslock: add mcs queued lock implementation
  eal/mcslock: use generic mcs queued lock on all arch
  test/mcslock: add mcs queued lock unit test

 MAINTAINERS                                        |   5 +
 app/test/Makefile                                  |   1 +
 app/test/autotest_data.py                          |   6 +
 app/test/autotest_test_funcs.py                    |  32 +++
 app/test/meson.build                               |   2 +
 app/test/test_mcslock.c                            | 248 +++++++++++++++++++++
 doc/api/doxy-api-index.md                          |   1 +
 doc/guides/rel_notes/release_19_08.rst             |   6 +
 lib/librte_eal/common/Makefile                     |   2 +-
 .../common/include/arch/arm/rte_mcslock.h          |  23 ++
 .../common/include/arch/ppc_64/rte_mcslock.h       |  19 ++
 .../common/include/arch/x86/rte_mcslock.h          |  19 ++
 .../common/include/generic/rte_mcslock.h           | 169 ++++++++++++++
 lib/librte_eal/common/meson.build                  |   1 +
 14 files changed, 533 insertions(+), 1 deletion(-)
 create mode 100644 app/test/test_mcslock.c
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_mcslock.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_mcslock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_mcslock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_mcslock.h
  

Comments

David Marchand June 5, 2019, 4:29 p.m. UTC | #1
On Wed, Jun 5, 2019 at 6:00 PM Phil Yang <phil.yang@arm.com> wrote:

> This patch set added MCS lock library and its unit test.
>
> The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L. SCOTT)
> provides
> scalability by spinning on a CPU/thread local variable which avoids
> expensive
> cache bouncings. It provides fairness by maintaining a list of acquirers
> and
> passing the lock to each CPU/thread in the order they acquired the lock.
>
> References:
> 1.
> http://web.mit.edu/6.173/www/currentsemester/readings/R06-scalable-synchronization-1991.pdf
> 2. https://lwn.net/Articles/590243/
>
> [benchmark table snipped]

Had a quick look, interesting.

Quick comments:
- your numbers are for 13 cores, while the others are for 14; what is the
reason?
- do we need per-architecture headers? All I can see is generic code; we
might as well put rte_mcslock.h directly in the common/include directory.
- could we replace the current spinlock with this approach? Is this more
expensive than a spinlock on lightly contended locks? Is there a reason we
want to keep all these approaches? We would then have 3 lock implementations.
- do we need to write the authors' names fully capitalized? It seems like
you are shouting :-)
  
Stephen Hemminger June 5, 2019, 4:47 p.m. UTC | #2
On Wed,  5 Jun 2019 23:58:45 +0800
Phil Yang <phil.yang@arm.com> wrote:

> This patch set added MCS lock library and its unit test.
> 
> The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L. SCOTT) provides
> scalability by spinning on a CPU/thread local variable which avoids expensive
> cache bouncings. It provides fairness by maintaining a list of acquirers and
> passing the lock to each CPU/thread in the order they acquired the lock.
> 
> References:
> 1. http://web.mit.edu/6.173/www/currentsemester/readings/R06-scalable-synchronization-1991.pdf
> 2. https://lwn.net/Articles/590243/
> 
> [benchmark table snipped]

From the user's point of view, there need to be clear recommendations
on which lock type to use.

And if one of the lock types is always slower, it should be deprecated in
the long term.
  
Thomas Monjalon June 5, 2019, 5:35 p.m. UTC | #3
05/06/2019 17:58, Phil Yang:
> This patch set added MCS lock library and its unit test.

It seems this patch is targeting 19.08;
however, it missed the proposal deadline by 2 days.

+Cc techboard for a decision.
  
Honnappa Nagarahalli June 5, 2019, 7:59 p.m. UTC | #4
On Wed, Jun 5, 2019 at 6:00 PM Phil Yang <phil.yang@arm.com> wrote:
This patch set added MCS lock library and its unit test.

The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L. SCOTT) provides
scalability by spinning on a CPU/thread local variable which avoids expensive
cache bouncings. It provides fairness by maintaining a list of acquirers and
passing the lock to each CPU/thread in the order they acquired the lock.

References:
1. http://web.mit.edu/6.173/www/currentsemester/readings/R06-scalable-synchronization-1991.pdf
2. https://lwn.net/Articles/590243/

[benchmark table snipped]

Had a quick look, interesting.

Quick comments:
- your numbers are for 13 cores, while the others are for 14; what is the reason?
- do we need per-architecture headers? All I can see is generic code; we might as well put rte_mcslock.h directly in the common/include directory.
- could we replace the current spinlock with this approach? Is this more expensive than a spinlock on lightly contended locks? Is there a reason we want to keep all these approaches? We would then have 3 lock implementations.
- do we need to write the authors' names fully capitalized? It seems like you are shouting :-)
[Honnappa] IMO, writing full names is a cultural thing; I do not see any harm. But I do think we do not need to capitalize everything.


--
David Marchand
[Honnappa] This mail is appearing as HTML; it would be good to change to text mode, as it is much easier to add inline comments.
  
Honnappa Nagarahalli June 5, 2019, 8:48 p.m. UTC | #5
+David
(had similar questions)

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, June 5, 2019 11:48 AM
> To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>
> Cc: dev@dpdk.org; thomas@monjalon.net; jerinj@marvell.com;
> hemant.agrawal@nxp.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] MCS queued lock implementation
> 
> On Wed,  5 Jun 2019 23:58:45 +0800
> Phil Yang <phil.yang@arm.com> wrote:
> 
> > This patch set added MCS lock library and its unit test.
> >
> > The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L.
> SCOTT)
> > provides scalability by spinning on a CPU/thread local variable which
> > avoids expensive cache bouncings. It provides fairness by maintaining
> > a list of acquirers and passing the lock to each CPU/thread in the order they
> acquired the lock.
> >
> > References:
> > 1.
> > http://web.mit.edu/6.173/www/currentsemester/readings/R06-scalable-syn
> > chronization-1991.pdf
> > 2. https://lwn.net/Articles/590243/
> >
> > [benchmark table snipped]
> 
> From the user point of view there needs to be clear recommendations on
> which lock type to use.
I think the data about fairness needs to be added to this, especially for the ticket lock. IMO, we should consider fairness and space (cache lines) along with cycles.

> 
> And if one of the lock types is always slower it should be deprecated long term.
The ticket lock can be a drop-in replacement for the spinlock. Gavin is working on a patch which will make the ticket lock the backend for the spinlock through a configuration option. But the performance needs to be considered.
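
For contrast, a minimal ticket-lock sketch in C11 atomics (hypothetical
code, not the rte_ticketlock API): it gives the same FIFO fairness as
MCS in two counters' worth of space, but all waiters spin on the same
shared word, which is where the cache-line bouncing comes from.

#include <stdatomic.h>
#include <stdint.h>

/* Minimal ticket lock: each acquirer takes a ticket and waits until
 * 'current' reaches it, yielding FIFO fairness like MCS, but every
 * waiter polls the same shared word. */
typedef struct {
    atomic_uint_fast16_t next;    /* next ticket to hand out */
    atomic_uint_fast16_t current; /* ticket now holding the lock */
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *l)
{
    uint_fast16_t my = atomic_fetch_add_explicit(&l->next, 1,
                                                 memory_order_relaxed);
    while (atomic_load_explicit(&l->current, memory_order_acquire) != my)
        ; /* all waiters spin here, on one shared cache line */
}

static void ticket_unlock(ticket_lock_t *l)
{
    /* Only the holder writes 'current', so a plain increment suffices. */
    atomic_fetch_add_explicit(&l->current, 1, memory_order_release);
}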
  
Phil Yang June 6, 2019, 10:17 a.m. UTC | #6
From: David Marchand <david.marchand@redhat.com>
Sent: Thursday, June 6, 2019 12:30 AM
To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>
Cc: dev <dev@dpdk.org>; thomas@monjalon.net; jerinj@marvell.com; hemant.agrawal@nxp.com; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH v1 0/3] MCS queued lock implementation



On Wed, Jun 5, 2019 at 6:00 PM Phil Yang <phil.yang@arm.com> wrote:
This patch set added MCS lock library and its unit test.

The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L. SCOTT) provides
scalability by spinning on a CPU/thread local variable which avoids expensive
cache bouncings. It provides fairness by maintaining a list of acquirers and
passing the lock to each CPU/thread in the order they acquired the lock.

References:
1. http://web.mit.edu/6.173/www/currentsemester/readings/R06-scalable-synchronization-1991.pdf
2. https://lwn.net/Articles/590243/

[benchmark table snipped]

Had a quick look, interesting.

Hi David,

Thanks for your comments.

Quick comments:
- your numbers are for 13 cores, while the others are for 14; what is the reason?
[Phil] The test case skips the master thread during the load test; the master thread just controls the trigger, so all the other threads acquire the lock and run the same workload at the same time.
Actually, there is no difference in per-core performance when the master thread is included in the load test.

- do we need per-architecture headers? All I can see is generic code; we might as well put rte_mcslock.h directly in the common/include directory.
[Phil] I was just trying to leave room for architecture-specific optimizations.

- could we replace the current spinlock with this approach? Is this more expensive than a spinlock on lightly contended locks? Is there a reason we want to keep all these approaches? We would then have 3 lock implementations.
[Phil] Under high lock contention, the MCS lock is much better than the spinlock. However, the MCS lock is more complicated than the spinlock and more expensive in the single-thread scenario. E.g.:
Test with lock on a single core...
MCS lock:
Core [14] Cost Time = 327 us

Spinlock:
Core [14] Cost Time = 258 us

ticket lock:
Core [14] cost time = 195 us
I think in low-contention scenarios where you still need mutual exclusion, you can use the spinlock; it is lighter. It all depends on the application.
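
To make the trade-off concrete, a hypothetical usage sketch (reusing
the mcs_lock/mcs_node_t names from the cover letter sketch, with
placeholder variables sl, ml, and counter; rte_spinlock is the existing
DPDK API): an MCS caller must supply a per-thread queue node, so the
two locks are not call-for-call interchangeable.

#include <rte_spinlock.h> /* existing DPDK spinlock, for contrast */

static rte_spinlock_t sl = RTE_SPINLOCK_INITIALIZER;
static mcs_lock_t ml;
static unsigned long counter;

static void spin_style(void)
{
    /* Spinlock: only the lock word itself is shared state. */
    rte_spinlock_lock(&sl);
    counter++;
    rte_spinlock_unlock(&sl);
}

static void mcs_style(void)
{
    /* MCS: the caller must also provide a queue node whose lifetime
     * spans the critical section; a stack node works fine. */
    mcs_node_t me;
    mcs_lock(&ml, &me);
    counter++;
    mcs_unlock(&ml, &me);
}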

- do we need to write the authors' names fully capitalized? It seems like you are shouting :-)
[Phil] :-) I will modify it in the next version. Thanks.


--
David Marchand


Thanks,
Phil Yang
  
Thomas Monjalon July 4, 2019, 8:12 p.m. UTC | #7
05/06/2019 17:58, Phil Yang:
> This patch set added MCS lock library and its unit test.
> 
> The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L. SCOTT) provides
> scalability by spinning on a CPU/thread local variable which avoids expensive
> cache bouncings. It provides fairness by maintaining a list of acquirers and
> passing the lock to each CPU/thread in the order they acquired the lock.

What is the plan for this patch?
Do we try to get more reviews for a merge in 19.08-rc2?
Or do we target 19.11?
  
Phil Yang July 5, 2019, 10:33 a.m. UTC | #8
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, July 5, 2019 4:12 AM
> To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Cc: dev@dpdk.org; jerinj@marvell.com; hemant.agrawal@nxp.com; Gavin
> Hu (Arm Technology China) <Gavin.Hu@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] MCS queued lock implementation
> 
> 05/06/2019 17:58, Phil Yang:
> > This patch set added MCS lock library and its unit test.
> >
> > The MCS lock (proposed by JOHN M. MELLOR-CRUMMEY and MICHAEL L.
> SCOTT) provides
> > scalability by spinning on a CPU/thread local variable which avoids
> expensive
> > cache bouncings. It provides fairness by maintaining a list of acquirers and
> > passing the lock to each CPU/thread in the order they acquired the lock.
> 
> What is the plan for this patch?
I have reworked this patch and addressed all the comments on the previous version. Please review v3.

> Do we try to get more reviews for a merge in 19.08-rc2?
> Or we target 19.11?
> 
Thanks,
Phil