[RFC] eal: add seqlock

Message ID 20220325202428.94628-1-mattias.ronnblom@ericsson.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [RFC] eal: add seqlock

Checks

Context              Check    Description
ci/checkpatch        warning  coding style issues
ci/Intel-compilation success  Compilation OK
ci/intel-Testing     success  Testing PASS

Commit Message

Mattias Rönnblom March 25, 2022, 8:24 p.m. UTC
  A sequence lock (seqlock) is a synchronization primitive which allows
for data-race-free, low-overhead, high-frequency reads, especially of
data structures that are shared across many cores and updated with
relatively low frequency.

A seqlock permits multiple parallel readers. The variant of seqlock
implemented in this patch supports multiple writers as well. A
spinlock is used for writer-writer serialization.

To avoid resource reclamation and other issues, the data protected by
a seqlock is best off being self-contained (i.e., no pointers [except
to constant data]).

One way to think about seqlocks is that they provide a means to
perform atomic operations on data objects larger than what the native
atomic machine instructions allow for.

DPDK seqlocks are not preemption-safe on the writer side. Preemption
of a writer thread affects performance, not correctness.

A seqlock contains a sequence number, which can be thought of as the
generation of the data it protects.

A reader will
  1. Load the sequence number (sn).
  2. Load, in arbitrary order, the seqlock-protected data.
  3. Load the sn again.
  4. Check that the first and second sn are equal and even-numbered.
     If not, discard the loaded data and restart from step 1.

The first three steps need to be ordered using suitable memory fences.
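
An illustrative reader-side usage sketch (the "struct config" type and
its fields are made up for this example; the rte_seqlock_*() calls are
the ones added by this patch):

  struct config {
          rte_seqlock_t lock;
          uint64_t param_a;
          uint64_t param_b;
  };

  static void
  read_config(const struct config *cfg, uint64_t *a, uint64_t *b)
  {
          uint64_t sn;

          do {
                  sn = rte_seqlock_read_begin(&cfg->lock);

                  /* Plain, non-atomic loads of the protected data. */
                  *a = cfg->param_a;
                  *b = cfg->param_b;
          } while (rte_seqlock_read_retry(&cfg->lock, sn));
  }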

A writer will
  1. Take the spinlock, to serialize writer access.
  2. Load the sn.
  3. Store the original sn + 1 as the new sn.
  4. Perform load and stores to the seqlock-protected data.
  5. Store the original sn + 2 as the new sn.
  6. Release the spinlock.

Proper memory fencing is required to make sure the first sn store, the
data stores, and the second sn store appear to the reader in the
mentioned order.

The sn loads and stores must be atomic, but the data loads and stores
need not be.
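
A corresponding writer-side sketch (again purely illustrative, reusing
the example "struct config" from the reader sketch above):

  static void
  update_config(struct config *cfg, uint64_t a, uint64_t b)
  {
          /* Serializes against other writers and orders the sn update
           * before the subsequent data stores.
           */
          rte_seqlock_write_begin(&cfg->lock);

          cfg->param_a = a;
          cfg->param_b = b;

          /* Publishes the new data to readers. */
          rte_seqlock_write_end(&cfg->lock);
  }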

The original seqlock design and implementation were done by Stephen
Hemminger. This is an independent implementation, using C11 atomics.

This RFC version lacks API documentation.

Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
---
 app/test/meson.build          |   2 +
 app/test/test_seqlock.c       | 197 ++++++++++++++++++++++++++++++++++
 lib/eal/common/meson.build    |   1 +
 lib/eal/common/rte_seqlock.c  |  12 +++
 lib/eal/include/meson.build   |   1 +
 lib/eal/include/rte_seqlock.h |  84 +++++++++++++++
 lib/eal/version.map           |   3 +
 7 files changed, 300 insertions(+)
 create mode 100644 app/test/test_seqlock.c
 create mode 100644 lib/eal/common/rte_seqlock.c
 create mode 100644 lib/eal/include/rte_seqlock.h
  

Comments

Stephen Hemminger March 25, 2022, 9:10 p.m. UTC | #1
On Fri, 25 Mar 2022 21:24:28 +0100
Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:

> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
> new file mode 100644
> index 0000000000..b975ca848a
> --- /dev/null
> +++ b/lib/eal/include/rte_seqlock.h
> @@ -0,0 +1,84 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Ericsson AB
> + */
> +
> +#ifndef _RTE_SEQLOCK_H_
> +#define _RTE_SEQLOCK_H_
> +
> +#include <stdbool.h>
> +#include <stdint.h>
> +
> +#include <rte_atomic.h>
> +#include <rte_branch_prediction.h>
> +#include <rte_spinlock.h>
> +
> +struct rte_seqlock {
> +	uint64_t sn;
> +	rte_spinlock_t lock;
> +};
> +
> +typedef struct rte_seqlock rte_seqlock_t;
> +


Add a reference to Wikipedia and/or Linux since not every DPDK
user may be familiar with this.

> +
> +	sn = seqlock->sn + 1;
> +
> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
> +
> +	/* __ATOMIC_RELEASE to prevent stores after (in program order)
> +	 * from happening before the sn store.
> +	 */
> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);

Could this just be __atomic_fetch_add() with __ATOMIC_RELEASE?
  
Mattias Rönnblom March 26, 2022, 2:57 p.m. UTC | #2
On 2022-03-25 22:10, Stephen Hemminger wrote:
> On Fri, 25 Mar 2022 21:24:28 +0100
> Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:
>
>> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
>> new file mode 100644
>> index 0000000000..b975ca848a
>> --- /dev/null
>> +++ b/lib/eal/include/rte_seqlock.h
>> @@ -0,0 +1,84 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2022 Ericsson AB
>> + */
>> +
>> +#ifndef _RTE_SEQLOCK_H_
>> +#define _RTE_SEQLOCK_H_
>> +
>> +#include <stdbool.h>
>> +#include <stdint.h>
>> +
>> +#include <rte_atomic.h>
>> +#include <rte_branch_prediction.h>
>> +#include <rte_spinlock.h>
>> +
>> +struct rte_seqlock {
>> +	uint64_t sn;
>> +	rte_spinlock_t lock;
>> +};
>> +
>> +typedef struct rte_seqlock rte_seqlock_t;
>> +
>
> Add a reference to Wikipedia and/or Linux since not every DPDK
> user maybe familar with this.

OK, will do.

>> +
>> +	sn = seqlock->sn + 1;
>> +
>> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
>> +
>> +	/* __ATOMIC_RELEASE to prevent stores after (in program order)
>> +	 * from happening before the sn store.
>> +	 */
>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
> Could this just be __atomic_fetch_add() with __ATOMIC_RELEASE?

If I understood C11 correctly, an __atomic_fetch_add() with
__ATOMIC_RELEASE only prevents stores that precede it (in program
order) from being moved ahead of it. Thus, stores that follow it may be
reordered across the __atomic_fetch_add(), and seen by a reader before
the sn change.

Also, __atomic_fetch_add() would generate an atomic add machine
instruction, which, at least in my experience (on x86_64), is slower
than the mov+add+mov the above code will generate (and it also prevents
certain compiler optimizations). That's with TSO. What would happen on
weakly ordered machines, I don't know in detail.
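
To illustrate the two alternatives (a sketch only, not part of the
patch; the writer-writer spinlock step is left out):

  /* Variant used in the patch: a relaxed sn store followed by a
   * release fence. The fence orders the sn store before the stores
   * to the protected data that the caller performs afterwards.
   */
  static inline void
  write_begin_fence(rte_seqlock_t *seqlock)
  {
          uint64_t sn = seqlock->sn + 1;

          __atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
          rte_atomic_thread_fence(__ATOMIC_RELEASE);
  }

  /* Alternative discussed above: a single release RMW on the sn. The
   * release ordering only constrains accesses *preceding* the add, so
   * the protected-data stores issued after this call may still be
   * observed before the sn update, defeating the reader's retry check.
   */
  static inline void
  write_begin_fetch_add(rte_seqlock_t *seqlock)
  {
          __atomic_fetch_add(&seqlock->sn, 1, __ATOMIC_RELEASE);
  }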
  
Ananyev, Konstantin March 27, 2022, 2:49 p.m. UTC | #3
> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
> index 9700494816..48df5f1a21 100644
> --- a/lib/eal/include/meson.build
> +++ b/lib/eal/include/meson.build
> @@ -36,6 +36,7 @@ headers += files(
>          'rte_per_lcore.h',
>          'rte_random.h',
>          'rte_reciprocal.h',
> +        'rte_seqlock.h',
>          'rte_service.h',
>          'rte_service_component.h',
>          'rte_string_fns.h',
> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
> new file mode 100644
> index 0000000000..b975ca848a
> --- /dev/null
> +++ b/lib/eal/include/rte_seqlock.h
> @@ -0,0 +1,84 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Ericsson AB
> + */
> +
> +#ifndef _RTE_SEQLOCK_H_
> +#define _RTE_SEQLOCK_H_
> +
> +#include <stdbool.h>
> +#include <stdint.h>
> +
> +#include <rte_atomic.h>
> +#include <rte_branch_prediction.h>
> +#include <rte_spinlock.h>
> +
> +struct rte_seqlock {
> +	uint64_t sn;
> +	rte_spinlock_t lock;
> +};
> +
> +typedef struct rte_seqlock rte_seqlock_t;
> +
> +__rte_experimental
> +void
> +rte_seqlock_init(rte_seqlock_t *seqlock);

Probably worth to have static initializer too.
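
For example, something along these lines (the macro name below is just
an illustration, not part of this patch; RTE_SPINLOCK_INITIALIZER is
the existing DPDK spinlock initializer):

  #define RTE_SEQLOCK_INITIALIZER \
          { .sn = 0, .lock = RTE_SPINLOCK_INITIALIZER }

  static rte_seqlock_t config_lock = RTE_SEQLOCK_INITIALIZER;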


> +
> +__rte_experimental
> +static inline uint64_t
> +rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
> +{
> +	/* __ATOMIC_ACQUIRE to prevent loads after (in program order)
> +	 * from happening before the sn load. Syncronizes-with the
> +	 * store release in rte_seqlock_end().
> +	 */
> +	return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
> +}
> +
> +__rte_experimental
> +static inline bool
> +rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t begin_sn)
> +{
> +	uint64_t end_sn;
> +
> +	/* make sure the data loads happens before the sn load */
> +	rte_atomic_thread_fence(__ATOMIC_ACQUIRE);

That's sort of 'read_end' correct?
If so, shouldn't it be '__ATOMIC_RELEASE' instead here,
and
end_sn = __atomic_load_n(..., (__ATOMIC_ACQUIRE)
on the line below? 

> +
> +	end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
> +
> +	return unlikely(begin_sn & 1 || begin_sn != end_sn);
> +}
> +
> +__rte_experimental
> +static inline void
> +rte_seqlock_write_begin(rte_seqlock_t *seqlock)
> +{
> +	uint64_t sn;
> +
> +	/* to synchronize with other writers */
> +	rte_spinlock_lock(&seqlock->lock);
> +
> +	sn = seqlock->sn + 1;
> +
> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
> +
> +	/* __ATOMIC_RELEASE to prevent stores after (in program order)
> +	 * from happening before the sn store.
> +	 */
> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);

I think it needs to be '__ATOMIC_ACQUIRE' here instead of '__ATOMIC_RELEASE'.

> +}
> +
> +__rte_experimental
> +static inline void
> +rte_seqlock_write_end(rte_seqlock_t *seqlock)
> +{
> +	uint64_t sn;
> +
> +	sn = seqlock->sn + 1;
> +
> +	/* synchronizes-with the load acquire in rte_seqlock_begin() */
> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
> +
> +	rte_spinlock_unlock(&seqlock->lock);
> +}
> +
  
Mattias Rönnblom March 27, 2022, 5:42 p.m. UTC | #4
On 2022-03-27 16:49, Ananyev, Konstantin wrote:
>> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
>> index 9700494816..48df5f1a21 100644
>> --- a/lib/eal/include/meson.build
>> +++ b/lib/eal/include/meson.build
>> @@ -36,6 +36,7 @@ headers += files(
>>           'rte_per_lcore.h',
>>           'rte_random.h',
>>           'rte_reciprocal.h',
>> +        'rte_seqlock.h',
>>           'rte_service.h',
>>           'rte_service_component.h',
>>           'rte_string_fns.h',
>> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
>> new file mode 100644
>> index 0000000000..b975ca848a
>> --- /dev/null
>> +++ b/lib/eal/include/rte_seqlock.h
>> @@ -0,0 +1,84 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2022 Ericsson AB
>> + */
>> +
>> +#ifndef _RTE_SEQLOCK_H_
>> +#define _RTE_SEQLOCK_H_
>> +
>> +#include <stdbool.h>
>> +#include <stdint.h>
>> +
>> +#include <rte_atomic.h>
>> +#include <rte_branch_prediction.h>
>> +#include <rte_spinlock.h>
>> +
>> +struct rte_seqlock {
>> +	uint64_t sn;
>> +	rte_spinlock_t lock;
>> +};
>> +
>> +typedef struct rte_seqlock rte_seqlock_t;
>> +
>> +__rte_experimental
>> +void
>> +rte_seqlock_init(rte_seqlock_t *seqlock);
> Probably worth to have static initializer too.
>

I will add that in the next version, thanks.

>> +
>> +__rte_experimental
>> +static inline uint64_t
>> +rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
>> +{
>> +	/* __ATOMIC_ACQUIRE to prevent loads after (in program order)
>> +	 * from happening before the sn load. Syncronizes-with the
>> +	 * store release in rte_seqlock_end().
>> +	 */
>> +	return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
>> +}
>> +
>> +__rte_experimental
>> +static inline bool
>> +rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t begin_sn)
>> +{
>> +	uint64_t end_sn;
>> +
>> +	/* make sure the data loads happens before the sn load */
>> +	rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> That's sort of 'read_end' correct?
> If so, shouldn't it be '__ATOMIC_RELEASE' instead here,
> and
> end_sn = __atomic_load_n(..., (__ATOMIC_ACQUIRE)
> on the line below?

A release fence prevents reordering of stores. The reader doesn't do any 
stores, so I don't understand why you would use a release fence here. 
Could you elaborate?

>> +
>> +	end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
>> +
>> +	return unlikely(begin_sn & 1 || begin_sn != end_sn);
>> +}
>> +
>> +__rte_experimental
>> +static inline void
>> +rte_seqlock_write_begin(rte_seqlock_t *seqlock)
>> +{
>> +	uint64_t sn;
>> +
>> +	/* to synchronize with other writers */
>> +	rte_spinlock_lock(&seqlock->lock);
>> +
>> +	sn = seqlock->sn + 1;
>> +
>> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
>> +
>> +	/* __ATOMIC_RELEASE to prevent stores after (in program order)
>> +	 * from happening before the sn store.
>> +	 */
>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
> I think it needs to be '__ATOMIC_ACQUIRE' here instead of '__ATOMIC_RELEASE'.

Please elaborate on why.

>> +}
>> +
>> +__rte_experimental
>> +static inline void
>> +rte_seqlock_write_end(rte_seqlock_t *seqlock)
>> +{
>> +	uint64_t sn;
>> +
>> +	sn = seqlock->sn + 1;
>> +
>> +	/* synchronizes-with the load acquire in rte_seqlock_begin() */
>> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
>> +
>> +	rte_spinlock_unlock(&seqlock->lock);
>> +}
>> +
  
Ananyev, Konstantin March 28, 2022, 10:53 a.m. UTC | #5
> >> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
> >> index 9700494816..48df5f1a21 100644
> >> --- a/lib/eal/include/meson.build
> >> +++ b/lib/eal/include/meson.build
> >> @@ -36,6 +36,7 @@ headers += files(
> >>           'rte_per_lcore.h',
> >>           'rte_random.h',
> >>           'rte_reciprocal.h',
> >> +        'rte_seqlock.h',
> >>           'rte_service.h',
> >>           'rte_service_component.h',
> >>           'rte_string_fns.h',
> >> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
> >> new file mode 100644
> >> index 0000000000..b975ca848a
> >> --- /dev/null
> >> +++ b/lib/eal/include/rte_seqlock.h
> >> @@ -0,0 +1,84 @@
> >> +/* SPDX-License-Identifier: BSD-3-Clause
> >> + * Copyright(c) 2022 Ericsson AB
> >> + */
> >> +
> >> +#ifndef _RTE_SEQLOCK_H_
> >> +#define _RTE_SEQLOCK_H_
> >> +
> >> +#include <stdbool.h>
> >> +#include <stdint.h>
> >> +
> >> +#include <rte_atomic.h>
> >> +#include <rte_branch_prediction.h>
> >> +#include <rte_spinlock.h>
> >> +
> >> +struct rte_seqlock {
> >> +	uint64_t sn;
> >> +	rte_spinlock_t lock;
> >> +};
> >> +
> >> +typedef struct rte_seqlock rte_seqlock_t;
> >> +
> >> +__rte_experimental
> >> +void
> >> +rte_seqlock_init(rte_seqlock_t *seqlock);
> > Probably worth to have static initializer too.
> >
> 
> I will add that in the next version, thanks.
> 
> >> +
> >> +__rte_experimental
> >> +static inline uint64_t
> >> +rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
> >> +{
> >> +	/* __ATOMIC_ACQUIRE to prevent loads after (in program order)
> >> +	 * from happening before the sn load. Syncronizes-with the
> >> +	 * store release in rte_seqlock_end().
> >> +	 */
> >> +	return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
> >> +}
> >> +
> >> +__rte_experimental
> >> +static inline bool
> >> +rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t begin_sn)
> >> +{
> >> +	uint64_t end_sn;
> >> +
> >> +	/* make sure the data loads happens before the sn load */
> >> +	rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> > That's sort of 'read_end' correct?
> > If so, shouldn't it be '__ATOMIC_RELEASE' instead here,
> > and
> > end_sn = __atomic_load_n(..., (__ATOMIC_ACQUIRE)
> > on the line below?
> 
> A release fence prevents reordering of stores. The reader doesn't do any
> stores, so I don't understand why you would use a release fence here.
> Could you elaborate?

From my understanding:  
rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
serves as a hoist barrier here, so it would only prevent later instructions
from being executed before that point.
But it wouldn't prevent earlier instructions from being executed after that
point, while we do need to guarantee that the cpu will finish all previous
reads before progressing further.

Suppose we have something like that:

struct {
	uint64_t shared;
	rte_seqlock_t lock;
} data;

...
sn = ...
uint64_t x = data.shared; 
/* inside rte_seqlock_read_retry(): */
...
rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
end_sn = __atomic_load_n(&data.lock.sn, __ATOMIC_RELAXED);

Here we need to make sure that the read of data.shared will always happen
before the read of data.lock.sn.
It is not a problem on IA (as reads are not reordered), but on machines with
relaxed memory ordering (ARM, etc.) it can happen.
So to prevent it we do need a sink barrier here first (ATOMIC_RELEASE).

Honnappa and other ARM & atomics experts, please correct me if I am wrong here.    

> >> +
> >> +	end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
> >> +
> >> +	return unlikely(begin_sn & 1 || begin_sn != end_sn);
> >> +}
> >> +
> >> +__rte_experimental
> >> +static inline void
> >> +rte_seqlock_write_begin(rte_seqlock_t *seqlock)
> >> +{
> >> +	uint64_t sn;
> >> +
> >> +	/* to synchronize with other writers */
> >> +	rte_spinlock_lock(&seqlock->lock);
> >> +
> >> +	sn = seqlock->sn + 1;
> >> +
> >> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
> >> +
> >> +	/* __ATOMIC_RELEASE to prevent stores after (in program order)
> >> +	 * from happening before the sn store.
> >> +	 */
> >> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
> > I think it needs to be '__ATOMIC_ACQUIRE' here instead of '__ATOMIC_RELEASE'.
> 
> Please elaborate on why.

As you said in the comments above, we need to prevent later stores
from being executed before that point. So we do need a hoist barrier here.
AFAIK to guarantee a hoist barrier '__ATOMIC_ACQUIRE' is required.

> 
> >> +}
> >> +
> >> +__rte_experimental
> >> +static inline void
> >> +rte_seqlock_write_end(rte_seqlock_t *seqlock)
> >> +{
> >> +	uint64_t sn;
> >> +
> >> +	sn = seqlock->sn + 1;
> >> +
> >> +	/* synchronizes-with the load acquire in rte_seqlock_begin() */
> >> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
> >> +
> >> +	rte_spinlock_unlock(&seqlock->lock);
> >> +}
> >> +
  
Ola Liljedahl March 28, 2022, 2:06 p.m. UTC | #6
On 3/28/22 12:53, Ananyev, Konstantin wrote:
> 
>>>> diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
>>>> index 9700494816..48df5f1a21 100644
>>>> --- a/lib/eal/include/meson.build
>>>> +++ b/lib/eal/include/meson.build
>>>> @@ -36,6 +36,7 @@ headers += files(
>>>>            'rte_per_lcore.h',
>>>>            'rte_random.h',
>>>>            'rte_reciprocal.h',
>>>> +        'rte_seqlock.h',
>>>>            'rte_service.h',
>>>>            'rte_service_component.h',
>>>>            'rte_string_fns.h',
>>>> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
>>>> new file mode 100644
>>>> index 0000000000..b975ca848a
>>>> --- /dev/null
>>>> +++ b/lib/eal/include/rte_seqlock.h
>>>> @@ -0,0 +1,84 @@
>>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>>> + * Copyright(c) 2022 Ericsson AB
>>>> + */
>>>> +
>>>> +#ifndef _RTE_SEQLOCK_H_
>>>> +#define _RTE_SEQLOCK_H_
>>>> +
>>>> +#include <stdbool.h>
>>>> +#include <stdint.h>
>>>> +
>>>> +#include <rte_atomic.h>
>>>> +#include <rte_branch_prediction.h>
>>>> +#include <rte_spinlock.h>
>>>> +
>>>> +struct rte_seqlock {
>>>> +	uint64_t sn;
>>>> +	rte_spinlock_t lock;
>>>> +};
>>>> +
>>>> +typedef struct rte_seqlock rte_seqlock_t;
>>>> +
>>>> +__rte_experimental
>>>> +void
>>>> +rte_seqlock_init(rte_seqlock_t *seqlock);
>>> Probably worth to have static initializer too.
>>>
>>
>> I will add that in the next version, thanks.
>>
>>>> +
>>>> +__rte_experimental
>>>> +static inline uint64_t
>>>> +rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
>>>> +{
>>>> +	/* __ATOMIC_ACQUIRE to prevent loads after (in program order)
>>>> +	 * from happening before the sn load. Syncronizes-with the
>>>> +	 * store release in rte_seqlock_end().
>>>> +	 */
>>>> +	return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
>>>> +}
>>>> +
>>>> +__rte_experimental
>>>> +static inline bool
>>>> +rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t begin_sn)
>>>> +{
>>>> +	uint64_t end_sn;
>>>> +
>>>> +	/* make sure the data loads happens before the sn load */
>>>> +	rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
>>> That's sort of 'read_end' correct?
>>> If so, shouldn't it be '__ATOMIC_RELEASE' instead here,
>>> and
>>> end_sn = __atomic_load_n(..., (__ATOMIC_ACQUIRE)
>>> on the line below?
>>
>> A release fence prevents reordering of stores. The reader doesn't do any
>> stores, so I don't understand why you would use a release fence here.
>> Could you elaborate?
> 
>  From my understanding:
> rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> serves as a hoist barrier here, so it would only prevent later instructions
> to be executed before that point.
> But it wouldn't prevent earlier instructions to be executed after that point.
> While we do need to guarantee that cpu will finish all previous reads before
> progressing further.
> 
> Suppose we have something like that:
> 
> struct {
> 	uint64_t shared;
> 	rte_seqlock_t lock;
> } data;
> 
> ...
> sn = ...
> uint64_t x = data.shared;
> /* inside rte_seqlock_read_retry(): */
> ...
> rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> end_sn = __atomic_load_n(&data.lock.sn, __ATOMIC_RELAXED);
> 
> Here we need to make sure that read of data.shared will always happen
> before reading of data.lock.sn.
> It is not a problem on IA (as reads are not reordered), but on machines with
> relaxed memory ordering (ARM, etc.)  it can happen.
> So to prevent it we do need a sink barrier here first (ATOMIC_RELEASE)
We can't use store-release since there is no write on the reader-side.
And fence-release orders against later stores, not later loads.

> 
> Honnappa and other ARM & atomics experts, please correct me if I am wrong here.
The C standard (chapter 7.17.4 in the C11 (draft)) isn't so easy to 
digest. If we trust Preshing, he has a more accessible description here: 
https://preshing.com/20130922/acquire-and-release-fences/
"An acquire fence prevents the memory reordering of any read which 
precedes it in program order with any read or write which follows it in 
program order."
and here: 
https://preshing.com/20131125/acquire-and-release-fences-dont-work-the-way-youd-expect/ 
(for C++ but the definition seems to be identical to that of C11).
Essentially a LoadLoad+LoadStore barrier which is what we want to achieve.

GCC 10.3 for AArch64/A64 ISA generates a "DMB ISHLD" instruction. This 
waits for all loads preceding (in program order) the memory barrier to 
be observed before any memory accesses after (in program order) the 
memory barrier.

I think the key to understanding atomic thread fences is that they are 
not associated with a specific memory access (unlike load-acquire and 
store-release) so they can't order earlier or later memory accesses 
against some specific memory access. Instead the fence orders any/all 
earlier loads and/or stores against any/all later loads or stores 
(depending on acquire or release).

> 
>>>> +
>>>> +	end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
>>>> +
>>>> +	return unlikely(begin_sn & 1 || begin_sn != end_sn);
>>>> +}
>>>> +
>>>> +__rte_experimental
>>>> +static inline void
>>>> +rte_seqlock_write_begin(rte_seqlock_t *seqlock)
>>>> +{
>>>> +	uint64_t sn;
>>>> +
>>>> +	/* to synchronize with other writers */
>>>> +	rte_spinlock_lock(&seqlock->lock);
>>>> +
>>>> +	sn = seqlock->sn + 1;
>>>> +
>>>> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
>>>> +
>>>> +	/* __ATOMIC_RELEASE to prevent stores after (in program order)
>>>> +	 * from happening before the sn store.
>>>> +	 */
>>>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>>> I think it needs to be '__ATOMIC_ACQUIRE' here instead of '__ATOMIC_RELEASE'.
>>
>> Please elaborate on why.
> 
> As you said in the comments above, we need to prevent later stores
> to be executed before that point. So we do need a hoist barrier here.
> AFAIK to guarantee a hoist barrier '__ATOMIC_ACQUIRE' is required.
An acquire fence wouldn't prevent an earlier store (the write to
seqlock->sn) from being reordered with some later store (e.g. writes to 
the protected data), thus it would allow readers to see updated data 
(possibly torn) with a pre-update sequence number. We need a StoreStore 
barrier for ordering the SN store and data stores => fence(release).

Acquire and release fences can (also) be used to create
synchronizes-with relationships (this is how the C standard defines
them). Preshing has a good example of this. Basically:
Thread 1:
data = 242;
atomic_thread_fence(atomic_release);
atomic_store_n(&guard, 1, atomic_relaxed);

Thread 2:
while (atomic_load_n(&guard, atomic_relaxed) != 1) ;
atomic_thread_fence(atomic_acquire);
do_something(data);

These are obvious analogues to store-release and load-acquire, thus the 
acquire & release names of the fences.

- Ola

> 
>>
>>>> +}
>>>> +
>>>> +__rte_experimental
>>>> +static inline void
>>>> +rte_seqlock_write_end(rte_seqlock_t *seqlock)
>>>> +{
>>>> +	uint64_t sn;
>>>> +
>>>> +	sn = seqlock->sn + 1;
>>>> +
>>>> +	/* synchronizes-with the load acquire in rte_seqlock_begin() */
>>>> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
>>>> +
>>>> +	rte_spinlock_unlock(&seqlock->lock);
>>>> +}
>>>> +
>
  
Mattias Rönnblom March 29, 2022, 8:32 a.m. UTC | #7
On 2022-03-28 16:06, Ola Liljedahl wrote:
>
>
> On 3/28/22 12:53, Ananyev, Konstantin wrote:
>>
>>>>> diff --git a/lib/eal/include/meson.build 
>>>>> b/lib/eal/include/meson.build
>>>>> index 9700494816..48df5f1a21 100644
>>>>> --- a/lib/eal/include/meson.build
>>>>> +++ b/lib/eal/include/meson.build
>>>>> @@ -36,6 +36,7 @@ headers += files(
>>>>>            'rte_per_lcore.h',
>>>>>            'rte_random.h',
>>>>>            'rte_reciprocal.h',
>>>>> +        'rte_seqlock.h',
>>>>>            'rte_service.h',
>>>>>            'rte_service_component.h',
>>>>>            'rte_string_fns.h',
>>>>> diff --git a/lib/eal/include/rte_seqlock.h 
>>>>> b/lib/eal/include/rte_seqlock.h
>>>>> new file mode 100644
>>>>> index 0000000000..b975ca848a
>>>>> --- /dev/null
>>>>> +++ b/lib/eal/include/rte_seqlock.h
>>>>> @@ -0,0 +1,84 @@
>>>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>>>> + * Copyright(c) 2022 Ericsson AB
>>>>> + */
>>>>> +
>>>>> +#ifndef _RTE_SEQLOCK_H_
>>>>> +#define _RTE_SEQLOCK_H_
>>>>> +
>>>>> +#include <stdbool.h>
>>>>> +#include <stdint.h>
>>>>> +
>>>>> +#include <rte_atomic.h>
>>>>> +#include <rte_branch_prediction.h>
>>>>> +#include <rte_spinlock.h>
>>>>> +
>>>>> +struct rte_seqlock {
>>>>> +    uint64_t sn;
>>>>> +    rte_spinlock_t lock;
>>>>> +};
>>>>> +
>>>>> +typedef struct rte_seqlock rte_seqlock_t;
>>>>> +
>>>>> +__rte_experimental
>>>>> +void
>>>>> +rte_seqlock_init(rte_seqlock_t *seqlock);
>>>> Probably worth to have static initializer too.
>>>>
>>>
>>> I will add that in the next version, thanks.
>>>
>>>>> +
>>>>> +__rte_experimental
>>>>> +static inline uint64_t
>>>>> +rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
>>>>> +{
>>>>> +    /* __ATOMIC_ACQUIRE to prevent loads after (in program order)
>>>>> +     * from happening before the sn load. Syncronizes-with the
>>>>> +     * store release in rte_seqlock_end().
>>>>> +     */
>>>>> +    return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
>>>>> +}
>>>>> +
>>>>> +__rte_experimental
>>>>> +static inline bool
>>>>> +rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t 
>>>>> begin_sn)
>>>>> +{
>>>>> +    uint64_t end_sn;
>>>>> +
>>>>> +    /* make sure the data loads happens before the sn load */
>>>>> +    rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
>>>> That's sort of 'read_end' correct?
>>>> If so, shouldn't it be '__ATOMIC_RELEASE' instead here,
>>>> and
>>>> end_sn = __atomic_load_n(..., (__ATOMIC_ACQUIRE)
>>>> on the line below?
>>>
>>> A release fence prevents reordering of stores. The reader doesn't do 
>>> any
>>> stores, so I don't understand why you would use a release fence here.
>>> Could you elaborate?
>>
>>  From my understanding:
>> rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
>> serves as a hoist barrier here, so it would only prevent later 
>> instructions
>> to be executed before that point.
>> But it wouldn't prevent earlier instructions to be executed after 
>> that point.
>> While we do need to guarantee that cpu will finish all previous reads 
>> before
>> progressing further.
>>
>> Suppose we have something like that:
>>
>> struct {
>>     uint64_t shared;
>>     rte_seqlock_t lock;
>> } data;
>>
>> ...
>> sn = ...
>> uint64_t x = data.shared;
>> /* inside rte_seqlock_read_retry(): */
>> ...
>> rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
>> end_sn = __atomic_load_n(&data.lock.sn, __ATOMIC_RELAXED);
>>
>> Here we need to make sure that read of data.shared will always happen
>> before reading of data.lock.sn.
>> It is not a problem on IA (as reads are not reordered), but on 
>> machines with
>> relaxed memory ordering (ARM, etc.)  it can happen.
>> So to prevent it we do need a sink barrier here first (ATOMIC_RELEASE)
> We can't use store-release since there is no write on the reader-side.
> And fence-release orders against later stores, not later loads.
>
>>
>> Honnappa and other ARM & atomics experts, please correct me if I am 
>> wrong here.
> The C standard (chapter 7.17.4 in the C11 (draft)) isn't so easy to 
> digest. If we trust Preshing, he has a more accessible description 
> here: 
> https://preshing.com/20130922/acquire-and-release-fences/
> "An acquire fence prevents the memory reordering of any read which 
> precedes it in program order with any read or write which follows it 
> in program order."
> and here: 
> https://preshing.com/20131125/acquire-and-release-fences-dont-work-the-way-youd-expect/
> (for C++ but the definition seems to be identical to that of C11).
> Essentially a LoadLoad+LoadStore barrier which is what we want to 
> achieve.
>
> GCC 10.3 for AArch64/A64 ISA generates a "DMB ISHLD" instruction. This 
> waits for all loads preceding (in program order) the memory barrier to 
> be observed before any memory accesses after (in program order) the 
> memory barrier.
>
> I think the key to understanding atomic thread fences is that they are 
> not associated with a specific memory access (unlike load-acquire and 
> store-release) so they can't order earlier or later memory accesses 
> against some specific memory access. Instead the fence orders any/all 
> earlier loads and/or stores against any/all later loads or stores 
> (depending on acquire or release).
>
>>
>>>>> +
>>>>> +    end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
>>>>> +
>>>>> +    return unlikely(begin_sn & 1 || begin_sn != end_sn);
>>>>> +}
>>>>> +
>>>>> +__rte_experimental
>>>>> +static inline void
>>>>> +rte_seqlock_write_begin(rte_seqlock_t *seqlock)
>>>>> +{
>>>>> +    uint64_t sn;
>>>>> +
>>>>> +    /* to synchronize with other writers */
>>>>> +    rte_spinlock_lock(&seqlock->lock);
>>>>> +
>>>>> +    sn = seqlock->sn + 1;
>>>>> +
>>>>> +    __atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
>>>>> +
>>>>> +    /* __ATOMIC_RELEASE to prevent stores after (in program order)
>>>>> +     * from happening before the sn store.
>>>>> +     */
>>>>> +    rte_atomic_thread_fence(__ATOMIC_RELEASE);
>>>> I think it needs to be '__ATOMIC_ACQUIRE' here instead of 
>>>> '__ATOMIC_RELEASE'.
>>>
>>> Please elaborate on why.
>>
>> As you said in the comments above, we need to prevent later stores
>> to be executed before that point. So we do need a hoist barrier here.
>> AFAIK to guarantee a hoist barrier '__ATOMIC_ACQUIRE' is required.
> An acquire fence wouldn't order an earlier store (the write to 
> seqlock->sn) from being reordered with some later store (e.g. writes 
> to the protected data), thus it would allow readers to see updated 
> data (possibly torn) with a pre-update sequence number. We need a 
> StoreStore barrier for ordering the SN store and data stores => 
> fence(release).
>
> Acquire and releases fences can (also) be used to create 
> synchronize-with relationships (this is how the C standard defines 
> them). Preshing has a good example on this. Basically
> Thread 1:
> data = 242;
> atomic_thread_fence(atomic_release);
> atomic_store_n(&guard, 1, atomic_relaxed);
>
> Thread 2:
> while (atomic_load_n(&guard, atomic_relaxed) != 1) ;
> atomic_thread_fence(atomic_acquire);
> do_something(data);
>
> These are obvious analogues to store-release and load-acquire, thus 
> the acquire & release names of the fences.
>
> - Ola
>
>>
>>>
>>>>> +}
>>>>> +
>>>>> +__rte_experimental
>>>>> +static inline void
>>>>> +rte_seqlock_write_end(rte_seqlock_t *seqlock)
>>>>> +{
>>>>> +    uint64_t sn;
>>>>> +
>>>>> +    sn = seqlock->sn + 1;
>>>>> +
>>>>> +    /* synchronizes-with the load acquire in rte_seqlock_begin() */
>>>>> +    __atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
>>>>> +
>>>>> +    rte_spinlock_unlock(&seqlock->lock);
>>>>> +}
>>>>> +
>>

I have nothing to add, but Ola's mail seems to have been blocked from 
the dev list, so I'm posting this again.
  
Ananyev, Konstantin March 29, 2022, 1:20 p.m. UTC | #8
> >>>>> diff --git a/lib/eal/include/meson.build
> >>>>> b/lib/eal/include/meson.build
> >>>>> index 9700494816..48df5f1a21 100644
> >>>>> --- a/lib/eal/include/meson.build
> >>>>> +++ b/lib/eal/include/meson.build
> >>>>> @@ -36,6 +36,7 @@ headers += files(
> >>>>>            'rte_per_lcore.h',
> >>>>>            'rte_random.h',
> >>>>>            'rte_reciprocal.h',
> >>>>> +        'rte_seqlock.h',
> >>>>>            'rte_service.h',
> >>>>>            'rte_service_component.h',
> >>>>>            'rte_string_fns.h',
> >>>>> diff --git a/lib/eal/include/rte_seqlock.h
> >>>>> b/lib/eal/include/rte_seqlock.h
> >>>>> new file mode 100644
> >>>>> index 0000000000..b975ca848a
> >>>>> --- /dev/null
> >>>>> +++ b/lib/eal/include/rte_seqlock.h
> >>>>> @@ -0,0 +1,84 @@
> >>>>> +/* SPDX-License-Identifier: BSD-3-Clause
> >>>>> + * Copyright(c) 2022 Ericsson AB
> >>>>> + */
> >>>>> +
> >>>>> +#ifndef _RTE_SEQLOCK_H_
> >>>>> +#define _RTE_SEQLOCK_H_
> >>>>> +
> >>>>> +#include <stdbool.h>
> >>>>> +#include <stdint.h>
> >>>>> +
> >>>>> +#include <rte_atomic.h>
> >>>>> +#include <rte_branch_prediction.h>
> >>>>> +#include <rte_spinlock.h>
> >>>>> +
> >>>>> +struct rte_seqlock {
> >>>>> +    uint64_t sn;
> >>>>> +    rte_spinlock_t lock;
> >>>>> +};
> >>>>> +
> >>>>> +typedef struct rte_seqlock rte_seqlock_t;
> >>>>> +
> >>>>> +__rte_experimental
> >>>>> +void
> >>>>> +rte_seqlock_init(rte_seqlock_t *seqlock);
> >>>> Probably worth to have static initializer too.
> >>>>
> >>>
> >>> I will add that in the next version, thanks.
> >>>
> >>>>> +
> >>>>> +__rte_experimental
> >>>>> +static inline uint64_t
> >>>>> +rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
> >>>>> +{
> >>>>> +    /* __ATOMIC_ACQUIRE to prevent loads after (in program order)
> >>>>> +     * from happening before the sn load. Syncronizes-with the
> >>>>> +     * store release in rte_seqlock_end().
> >>>>> +     */
> >>>>> +    return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
> >>>>> +}
> >>>>> +
> >>>>> +__rte_experimental
> >>>>> +static inline bool
> >>>>> +rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t
> >>>>> begin_sn)
> >>>>> +{
> >>>>> +    uint64_t end_sn;
> >>>>> +
> >>>>> +    /* make sure the data loads happens before the sn load */
> >>>>> +    rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> >>>> That's sort of 'read_end' correct?
> >>>> If so, shouldn't it be '__ATOMIC_RELEASE' instead here,
> >>>> and
> >>>> end_sn = __atomic_load_n(..., (__ATOMIC_ACQUIRE)
> >>>> on the line below?
> >>>
> >>> A release fence prevents reordering of stores. The reader doesn't do
> >>> any
> >>> stores, so I don't understand why you would use a release fence here.
> >>> Could you elaborate?
> >>
> >>  From my understanding:
> >> rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> >> serves as a hoist barrier here, so it would only prevent later
> >> instructions
> >> to be executed before that point.
> >> But it wouldn't prevent earlier instructions to be executed after
> >> that point.
> >> While we do need to guarantee that cpu will finish all previous reads
> >> before
> >> progressing further.
> >>
> >> Suppose we have something like that:
> >>
> >> struct {
> >>     uint64_t shared;
> >>     rte_seqlock_t lock;
> >> } data;
> >>
> >> ...
> >> sn = ...
> >> uint64_t x = data.shared;
> >> /* inside rte_seqlock_read_retry(): */
> >> ...
> >> rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> >> end_sn = __atomic_load_n(&data.lock.sn, __ATOMIC_RELAXED);
> >>
> >> Here we need to make sure that read of data.shared will always happen
> >> before reading of data.lock.sn.
> >> It is not a problem on IA (as reads are not reordered), but on
> >> machines with
> >> relaxed memory ordering (ARM, etc.)  it can happen.
> >> So to prevent it we do need a sink barrier here first (ATOMIC_RELEASE)
> > We can't use store-release since there is no write on the reader-side.
> > And fence-release orders against later stores, not later loads.
> >
> >>
> >> Honnappa and other ARM & atomics experts, please correct me if I am
> >> wrong here.
> > The C standard (chapter 7.17.4 in the C11 (draft)) isn't so easy to
> > digest. If we trust Preshing, he has a more accessible description
> > here:
> > https://preshing.com/20130922/acquire-and-release-fences/
> > "An acquire fence prevents the memory reordering of any read which
> > precedes it in program order with any read or write which follows it
> > in program order."
> > and here:
> > https://preshing.com/20131125/acquire-and-release-fences-dont-work-the-way-youd-expect/
> > (for C++ but the definition seems to be identical to that of C11).
> > Essentially a LoadLoad+LoadStore barrier which is what we want to
> > achieve.
> >
> > GCC 10.3 for AArch64/A64 ISA generates a "DMB ISHLD" instruction. This
> > waits for all loads preceding (in program order) the memory barrier to
> > be observed before any memory accesses after (in program order) the
> > memory barrier.
> >
> > I think the key to understanding atomic thread fences is that they are
> > not associated with a specific memory access (unlike load-acquire and
> > store-release) so they can't order earlier or later memory accesses
> > against some specific memory access. Instead the fence orders any/all
> > earlier loads and/or stores against any/all later loads or stores
> > (depending on acquire or release).
> >
> >>
> >>>>> +
> >>>>> +    end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
> >>>>> +
> >>>>> +    return unlikely(begin_sn & 1 || begin_sn != end_sn);
> >>>>> +}
> >>>>> +
> >>>>> +__rte_experimental
> >>>>> +static inline void
> >>>>> +rte_seqlock_write_begin(rte_seqlock_t *seqlock)
> >>>>> +{
> >>>>> +    uint64_t sn;
> >>>>> +
> >>>>> +    /* to synchronize with other writers */
> >>>>> +    rte_spinlock_lock(&seqlock->lock);
> >>>>> +
> >>>>> +    sn = seqlock->sn + 1;
> >>>>> +
> >>>>> +    __atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
> >>>>> +
> >>>>> +    /* __ATOMIC_RELEASE to prevent stores after (in program order)
> >>>>> +     * from happening before the sn store.
> >>>>> +     */
> >>>>> +    rte_atomic_thread_fence(__ATOMIC_RELEASE);
> >>>> I think it needs to be '__ATOMIC_ACQUIRE' here instead of
> >>>> '__ATOMIC_RELEASE'.
> >>>
> >>> Please elaborate on why.
> >>
> >> As you said in the comments above, we need to prevent later stores
> >> to be executed before that point. So we do need a hoist barrier here.
> >> AFAIK to guarantee a hoist barrier '__ATOMIC_ACQUIRE' is required.
> > An acquire fence wouldn't order an earlier store (the write to
> > seqlock->sn) from being reordered with some later store (e.g. writes
> > to the protected data), thus it would allow readers to see updated
> > data (possibly torn) with a pre-update sequence number. We need a
> > StoreStore barrier for ordering the SN store and data stores =>
> > fence(release).
> >
> > Acquire and releases fences can (also) be used to create
> > synchronize-with relationships (this is how the C standard defines
> > them). Preshing has a good example on this. Basically
> > Thread 1:
> > data = 242;
> > atomic_thread_fence(atomic_release);
> > atomic_store_n(&guard, 1, atomic_relaxed);
> >
> > Thread 2:
> > while (atomic_load_n(&guard, atomic_relaxed) != 1) ;
> > atomic_thread_fence(atomic_acquire);
> > do_something(data);
> >
> > These are obvious analogues to store-release and load-acquire, thus
> > the acquire & release names of the fences.
> >
> > - Ola
> >
> >>
> >>>
> >>>>> +}
> >>>>> +
> >>>>> +__rte_experimental
> >>>>> +static inline void
> >>>>> +rte_seqlock_write_end(rte_seqlock_t *seqlock)
> >>>>> +{
> >>>>> +    uint64_t sn;
> >>>>> +
> >>>>> +    sn = seqlock->sn + 1;
> >>>>> +
> >>>>> +    /* synchronizes-with the load acquire in rte_seqlock_begin() */
> >>>>> +    __atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
> >>>>> +
> >>>>> +    rte_spinlock_unlock(&seqlock->lock);
> >>>>> +}
> >>>>> +
> >>
> 
> I have nothing to add, but Ola's mail seems to have been blocked from
> the dev list, so I'm posting this again.

Ok, thanks Ola for detailed explanation.
Have to admit then that my understanding of atomic_fence() behaviour was incorrect.
Please disregard my comments above about rte_seqlock_read_retry()
and  rte_seqlock_write_begin().

Konstantin
  

Patch

diff --git a/app/test/meson.build b/app/test/meson.build
index 5fc1dd1b7b..5e418e8766 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -125,6 +125,7 @@  test_sources = files(
         'test_rwlock.c',
         'test_sched.c',
         'test_security.c',
+        'test_seqlock.c',
         'test_service_cores.c',
         'test_spinlock.c',
         'test_stack.c',
@@ -214,6 +215,7 @@  fast_tests = [
         ['rwlock_rde_wro_autotest', true],
         ['sched_autotest', true],
         ['security_autotest', false],
+        ['seqlock_autotest', true],
         ['spinlock_autotest', true],
         ['stack_autotest', false],
         ['stack_lf_autotest', false],
diff --git a/app/test/test_seqlock.c b/app/test/test_seqlock.c
new file mode 100644
index 0000000000..a727e16caf
--- /dev/null
+++ b/app/test/test_seqlock.c
@@ -0,0 +1,197 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Ericsson AB
+ */
+
+#include <rte_seqlock.h>
+
+#include <rte_cycles.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+
+#include <inttypes.h>
+
+#include "test.h"
+
+struct data {
+	rte_seqlock_t lock;
+
+	uint64_t a;
+	uint64_t b __rte_cache_aligned;
+	uint64_t c __rte_cache_aligned;
+} __rte_cache_aligned;
+
+struct reader {
+	struct data *data;
+	uint8_t stop;
+};
+
+#define WRITER_RUNTIME (2.0) /* s */
+
+#define WRITER_MAX_DELAY (100) /* us */
+
+#define INTERRUPTED_WRITER_FREQUENCY (1000)
+#define WRITER_INTERRUPT_TIME (1) /* us */
+
+static int
+writer_start(void *arg)
+{
+	struct data *data = arg;
+	uint64_t deadline;
+
+	deadline = rte_get_timer_cycles() +
+		WRITER_RUNTIME * rte_get_timer_hz();
+
+	while (rte_get_timer_cycles() < deadline) {
+		bool interrupted;
+		uint64_t new_value;
+		unsigned int delay;
+
+		new_value = rte_rand();
+
+		interrupted = rte_rand_max(INTERRUPTED_WRITER_FREQUENCY) == 0;
+
+		rte_seqlock_write_begin(&data->lock);
+
+		data->c = new_value;
+
+		/* These compiler barriers (both on the test reader
+		 * and the test writer side) are here to ensure that
+		 * loads/stores *usually* happen in test program order
+		 * (always on a TSO machine). They are arranged in such
+		 * a way that the writer stores in a different order
+		 * than the reader loads, to emulate an arbitrary
+		 * order. A real application using a seqlock does not
+		 * require any compiler barriers.
+		 */
+		rte_compiler_barrier();
+		data->b = new_value;
+
+		if (interrupted)
+			rte_delay_us_block(WRITER_INTERRUPT_TIME);
+
+		rte_compiler_barrier();
+		data->a = new_value;
+
+		rte_seqlock_write_end(&data->lock);
+
+		delay = rte_rand_max(WRITER_MAX_DELAY);
+
+		rte_delay_us_block(delay);
+	}
+
+	return 0;
+}
+
+#define INTERRUPTED_READER_FREQUENCY (1000)
+#define READER_INTERRUPT_TIME (1000) /* us */
+
+static int
+reader_start(void *arg)
+{
+	struct reader *r = arg;
+	int rc = 0;
+
+	while (__atomic_load_n(&r->stop, __ATOMIC_RELAXED) == 0 && rc == 0) {
+		struct data *data = r->data;
+		bool interrupted;
+		uint64_t a;
+		uint64_t b;
+		uint64_t c;
+		uint64_t sn;
+
+		interrupted = rte_rand_max(INTERRUPTED_READER_FREQUENCY) == 0;
+
+		do {
+			sn = rte_seqlock_read_begin(&data->lock);
+
+			a = data->a;
+			/* See writer_start() for an explanation why
+			 * these barriers are here.
+			 */
+			rte_compiler_barrier();
+
+			if (interrupted)
+				rte_delay_us_block(READER_INTERRUPT_TIME);
+
+			c = data->c;
+
+			rte_compiler_barrier();
+			b = data->b;
+
+		} while (rte_seqlock_read_retry(&data->lock, sn));
+
+		if (a != b || b != c) {
+			printf("Reader observed inconsistent data values "
+			       "%" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
+			       a, b, c);
+			rc = -1;
+		}
+	}
+
+	return rc;
+}
+
+static void
+reader_stop(struct reader *reader)
+{
+	__atomic_store_n(&reader->stop, 1, __ATOMIC_RELAXED);
+}
+
+#define NUM_WRITERS (2)
+#define MIN_NUM_READERS (2)
+#define MAX_READERS (RTE_MAX_LCORE - NUM_WRITERS - 1)
+#define MIN_LCORE_COUNT (NUM_WRITERS + MIN_NUM_READERS + 1)
+
+static int
+test_seqlock(void)
+{
+	struct reader readers[MAX_READERS];
+	unsigned int num_readers;
+	unsigned int num_lcores;
+	unsigned int i;
+	unsigned int lcore_id;
+	unsigned int writer_lcore_ids[NUM_WRITERS] = { 0 };
+	unsigned int reader_lcore_ids[MAX_READERS];
+	int rc = 0;
+
+	num_lcores = rte_lcore_count();
+
+	if (num_lcores < MIN_LCORE_COUNT)
+		return -1;
+
+	num_readers = num_lcores - NUM_WRITERS - 1;
+
+	struct data *data = rte_zmalloc(NULL, sizeof(struct data), 0);
+
+	i = 0;
+	RTE_LCORE_FOREACH_WORKER(lcore_id) {
+		if (i < NUM_WRITERS) {
+			rte_eal_remote_launch(writer_start, data, lcore_id);
+			writer_lcore_ids[i] = lcore_id;
+		} else {
+			unsigned int reader_idx = i - NUM_WRITERS;
+			struct reader *reader = &readers[reader_idx];
+
+			reader->data = data;
+			reader->stop = 0;
+
+			rte_eal_remote_launch(reader_start, reader, lcore_id);
+			reader_lcore_ids[reader_idx] = lcore_id;
+		}
+		i++;
+	}
+
+	for (i = 0; i < NUM_WRITERS; i++)
+		if (rte_eal_wait_lcore(writer_lcore_ids[i]) != 0)
+			rc = -1;
+
+	for (i = 0; i < num_readers; i++) {
+		reader_stop(&readers[i]);
+		if (rte_eal_wait_lcore(reader_lcore_ids[i]) != 0)
+			rc = -1;
+	}
+
+	return rc;
+}
+
+REGISTER_TEST_COMMAND(seqlock_autotest, test_seqlock);
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 917758cc65..a41343bfed 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -35,6 +35,7 @@  sources += files(
         'rte_malloc.c',
         'rte_random.c',
         'rte_reciprocal.c',
+        'rte_seqlock.c',
         'rte_service.c',
         'rte_version.c',
 )
diff --git a/lib/eal/common/rte_seqlock.c b/lib/eal/common/rte_seqlock.c
new file mode 100644
index 0000000000..d4fe648799
--- /dev/null
+++ b/lib/eal/common/rte_seqlock.c
@@ -0,0 +1,12 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Ericsson AB
+ */
+
+#include <rte_seqlock.h>
+
+void
+rte_seqlock_init(rte_seqlock_t *seqlock)
+{
+	seqlock->sn = 0;
+	rte_spinlock_init(&seqlock->lock);
+}
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index 9700494816..48df5f1a21 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -36,6 +36,7 @@  headers += files(
         'rte_per_lcore.h',
         'rte_random.h',
         'rte_reciprocal.h',
+        'rte_seqlock.h',
         'rte_service.h',
         'rte_service_component.h',
         'rte_string_fns.h',
diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
new file mode 100644
index 0000000000..b975ca848a
--- /dev/null
+++ b/lib/eal/include/rte_seqlock.h
@@ -0,0 +1,84 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Ericsson AB
+ */
+
+#ifndef _RTE_SEQLOCK_H_
+#define _RTE_SEQLOCK_H_
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_atomic.h>
+#include <rte_branch_prediction.h>
+#include <rte_spinlock.h>
+
+struct rte_seqlock {
+	uint64_t sn;
+	rte_spinlock_t lock;
+};
+
+typedef struct rte_seqlock rte_seqlock_t;
+
+__rte_experimental
+void
+rte_seqlock_init(rte_seqlock_t *seqlock);
+
+__rte_experimental
+static inline uint64_t
+rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
+{
+	/* __ATOMIC_ACQUIRE to prevent loads after (in program order)
+	 * from happening before the sn load. Synchronizes-with the
+	 * store release in rte_seqlock_write_end().
+	 */
+	return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
+}
+
+__rte_experimental
+static inline bool
+rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t begin_sn)
+{
+	uint64_t end_sn;
+
+	/* make sure the data loads happen before the sn load */
+	rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
+	end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
+
+	return unlikely(begin_sn & 1 || begin_sn != end_sn);
+}
+
+__rte_experimental
+static inline void
+rte_seqlock_write_begin(rte_seqlock_t *seqlock)
+{
+	uint64_t sn;
+
+	/* to synchronize with other writers */
+	rte_spinlock_lock(&seqlock->lock);
+
+	sn = seqlock->sn + 1;
+
+	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
+
+	/* __ATOMIC_RELEASE to prevent stores after (in program order)
+	 * from happening before the sn store.
+	 */
+	rte_atomic_thread_fence(__ATOMIC_RELEASE);
+}
+
+__rte_experimental
+static inline void
+rte_seqlock_write_end(rte_seqlock_t *seqlock)
+{
+	uint64_t sn;
+
+	sn = seqlock->sn + 1;
+
+	/* synchronizes-with the load acquire in rte_seqlock_read_begin() */
+	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
+
+	rte_spinlock_unlock(&seqlock->lock);
+}
+
+#endif  /* _RTE_SEQLOCK_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index b53eeb30d7..4a9d0ed899 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -420,6 +420,9 @@  EXPERIMENTAL {
 	rte_intr_instance_free;
 	rte_intr_type_get;
 	rte_intr_type_set;
+
+	# added in 22.07
+	rte_seqlock_init;
 };
 
 INTERNAL {