From patchwork Wed Apr 3 20:09:09 2019
From: Gage Eads
To: dev@dpdk.org
Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com,
 nd@arm.com, thomas@monjalon.net
Date: Wed, 3 Apr 2019 15:09:09 -0500
Message-Id: <20190403200916.16349-2-gage.eads@intel.com>
Subject: [dpdk-dev] [PATCH v7 1/8] stack: introduce rte stack library

The rte_stack library provides an API for configuration and use of a bounded
stack of pointers. Push and pop operations are MT-safe, allowing concurrent
access, and the interface supports pushing and popping multiple pointers at a
time. The library's interface is modeled after another DPDK data structure,
rte_ring, and its lock-based implementation is derived from the stack mempool
handler. An upcoming commit will migrate the stack mempool handler to
rte_stack.
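For orientation, a minimal usage sketch of the new API (assuming an
already-initialized EAL; the stack name and sizes here are arbitrary):

    #include <rte_errno.h>
    #include <rte_lcore.h>
    #include <rte_stack.h>

    /* Sketch: create a bounded stack, push and pop a burst of pointers.
     * Assumes rte_eal_init() has already succeeded in this process.
     */
    static int
    stack_usage_sketch(void)
    {
            void *objs[32] = { NULL };
            struct rte_stack *s;
            unsigned int n;

            s = rte_stack_create("example", 1024, rte_socket_id(), 0);
            if (s == NULL)
                    return -rte_errno;

            /* Push and pop are all-or-nothing: each returns either the
             * requested count or 0 (stack full, resp. too few entries).
             */
            n = rte_stack_push(s, objs, 32);
            if (n == 32)
                    n = rte_stack_pop(s, objs, 32);

            rte_stack_free(s);
            return n == 32 ? 0 : -1;
    }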
Signed-off-by: Gage Eads Reviewed-by: Olivier Matz Reviewed-by: Honnappa Nagarahalli --- MAINTAINERS | 6 + config/common_base | 5 + doc/api/doxy-api-index.md | 1 + doc/api/doxy-api.conf.in | 1 + doc/guides/prog_guide/index.rst | 1 + doc/guides/prog_guide/stack_lib.rst | 28 +++++ doc/guides/rel_notes/release_19_05.rst | 5 + lib/Makefile | 2 + lib/librte_stack/Makefile | 25 ++++ lib/librte_stack/meson.build | 8 ++ lib/librte_stack/rte_stack.c | 182 +++++++++++++++++++++++++++++ lib/librte_stack/rte_stack.h | 208 +++++++++++++++++++++++++++++++++ lib/librte_stack/rte_stack_pvt.h | 34 ++++++ lib/librte_stack/rte_stack_std.c | 26 +++++ lib/librte_stack/rte_stack_std.h | 121 +++++++++++++++++++ lib/librte_stack/rte_stack_version.map | 9 ++ lib/meson.build | 2 +- mk/rte.app.mk | 1 + 18 files changed, 664 insertions(+), 1 deletion(-) create mode 100644 doc/guides/prog_guide/stack_lib.rst create mode 100644 lib/librte_stack/Makefile create mode 100644 lib/librte_stack/meson.build create mode 100644 lib/librte_stack/rte_stack.c create mode 100644 lib/librte_stack/rte_stack.h create mode 100644 lib/librte_stack/rte_stack_pvt.h create mode 100644 lib/librte_stack/rte_stack_std.c create mode 100644 lib/librte_stack/rte_stack_std.h create mode 100644 lib/librte_stack/rte_stack_version.map diff --git a/MAINTAINERS b/MAINTAINERS index 71ac8cd4b..f30fc4aa6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -426,6 +426,12 @@ F: drivers/raw/skeleton_rawdev/ F: app/test/test_rawdev.c F: doc/guides/prog_guide/rawdev.rst +Stack API - EXPERIMENTAL +M: Gage Eads +M: Olivier Matz +F: lib/librte_stack/ +F: doc/guides/prog_guide/stack_lib.rst + Memory Pool Drivers ------------------- diff --git a/config/common_base b/config/common_base index 6292bc4af..fc8dba69d 100644 --- a/config/common_base +++ b/config/common_base @@ -994,3 +994,8 @@ CONFIG_RTE_APP_CRYPTO_PERF=y # Compile the eventdev application # CONFIG_RTE_APP_EVENTDEV=y + +# +# Compile librte_stack +# +CONFIG_RTE_LIBRTE_STACK=y diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index aacc66bd8..de1e215dd 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -125,6 +125,7 @@ The public API headers are grouped by topics: [mbuf] (@ref rte_mbuf.h), [mbuf pool ops] (@ref rte_mbuf_pool_ops.h), [ring] (@ref rte_ring.h), + [stack] (@ref rte_stack.h), [tailq] (@ref rte_tailq.h), [bitmap] (@ref rte_bitmap.h) diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index a365e669b..7722fc3e9 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -55,6 +55,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/librte_ring \ @TOPDIR@/lib/librte_sched \ @TOPDIR@/lib/librte_security \ + @TOPDIR@/lib/librte_stack \ @TOPDIR@/lib/librte_table \ @TOPDIR@/lib/librte_telemetry \ @TOPDIR@/lib/librte_timer \ diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst index 6726b1e8d..f4f60862f 100644 --- a/doc/guides/prog_guide/index.rst +++ b/doc/guides/prog_guide/index.rst @@ -55,6 +55,7 @@ Programmer's Guide metrics_lib bpf_lib ipsec_lib + stack_lib source_org dev_kit_build_system dev_kit_root_make_help diff --git a/doc/guides/prog_guide/stack_lib.rst b/doc/guides/prog_guide/stack_lib.rst new file mode 100644 index 000000000..25a8cc38a --- /dev/null +++ b/doc/guides/prog_guide/stack_lib.rst @@ -0,0 +1,28 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2019 Intel Corporation. 
+
+Stack Library
+=============
+
+DPDK's stack library provides an API for configuration and use of a bounded
+stack of pointers.
+
+The stack library provides the following basic operations:
+
+* Create a uniquely named stack of a user-specified size and using a
+  user-specified socket.
+
+* Push and pop a burst of one or more stack objects (pointers). These
+  functions are multi-thread safe.
+
+* Free a previously created stack.
+
+* Lookup a pointer to a stack by its name.
+
+* Query a stack's current depth and number of free entries.
+
+Implementation
+~~~~~~~~~~~~~~
+
+The stack consists of a contiguous array of pointers, a current index, and a
+spinlock. Accesses to the stack are made multi-thread safe by the spinlock.
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index bdad1ddbe..ebfbe36e5 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -121,6 +121,11 @@ New Features
   Improved testpmd application performance on ARM platform. For ``macswap``
   forwarding mode, NEON intrinsics were used to do swap to save CPU cycles.

+* **Added Stack API.**
+
+  Added a new stack API for configuration and use of a bounded stack of
+  pointers. The API provides MT-safe push and pop operations that can operate
+  on one or more pointers per operation.

 Removed Items
 -------------
diff --git a/lib/Makefile b/lib/Makefile
index a358f1c19..9f90e80ad 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -109,6 +109,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_IPSEC) += librte_ipsec
 DEPDIRS-librte_ipsec := librte_eal librte_mbuf librte_cryptodev librte_security
 DIRS-$(CONFIG_RTE_LIBRTE_TELEMETRY) += librte_telemetry
 DEPDIRS-librte_telemetry := librte_eal librte_metrics librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_STACK) += librte_stack
+DEPDIRS-librte_stack := librte_eal

 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUX),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_stack/Makefile b/lib/librte_stack/Makefile
new file mode 100644
index 000000000..6db540073
--- /dev/null
+++ b/lib/librte_stack/Makefile
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_stack.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+LDLIBS += -lrte_eal
+
+EXPORT_MAP := rte_stack_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_STACK) := rte_stack.c \
+	rte_stack_std.c
+
+# install includes
+SYMLINK-$(CONFIG_RTE_LIBRTE_STACK)-include := rte_stack.h \
+	rte_stack_std.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_stack/meson.build b/lib/librte_stack/meson.build
new file mode 100644
index 000000000..d2e60ce9b
--- /dev/null
+++ b/lib/librte_stack/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Intel Corporation
+
+allow_experimental_apis = true
+
+version = 1
+sources = files('rte_stack.c', 'rte_stack_std.c')
+headers = files('rte_stack.h', 'rte_stack_std.h')
diff --git a/lib/librte_stack/rte_stack.c b/lib/librte_stack/rte_stack.c
new file mode 100644
index 000000000..610014b6c
--- /dev/null
+++ b/lib/librte_stack/rte_stack.c
@@ -0,0 +1,182 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "rte_stack.h"
+#include "rte_stack_pvt.h"
+
+int stack_logtype;
+TAILQ_HEAD(rte_stack_list, rte_tailq_entry); + +static struct rte_tailq_elem rte_stack_tailq = { + .name = RTE_TAILQ_STACK_NAME, +}; +EAL_REGISTER_TAILQ(rte_stack_tailq) + +static void +rte_stack_init(struct rte_stack *s) +{ + memset(s, 0, sizeof(*s)); + + rte_stack_std_init(s); +} + +static ssize_t +rte_stack_get_memsize(unsigned int count) +{ + return rte_stack_std_get_memsize(count); +} + +struct rte_stack * +rte_stack_create(const char *name, unsigned int count, int socket_id, + uint32_t flags) +{ + char mz_name[RTE_MEMZONE_NAMESIZE]; + struct rte_stack_list *stack_list; + const struct rte_memzone *mz; + struct rte_tailq_entry *te; + struct rte_stack *s; + unsigned int sz; + int ret; + + RTE_SET_USED(flags); + + sz = rte_stack_get_memsize(count); + + ret = snprintf(mz_name, sizeof(mz_name), "%s%s", + RTE_STACK_MZ_PREFIX, name); + if (ret < 0 || ret >= (int)sizeof(mz_name)) { + rte_errno = ENAMETOOLONG; + return NULL; + } + + te = rte_zmalloc("STACK_TAILQ_ENTRY", sizeof(*te), 0); + if (te == NULL) { + STACK_LOG_ERR("Cannot reserve memory for tailq\n"); + rte_errno = ENOMEM; + return NULL; + } + + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); + + mz = rte_memzone_reserve_aligned(mz_name, sz, socket_id, + 0, __alignof__(*s)); + if (mz == NULL) { + STACK_LOG_ERR("Cannot reserve stack memzone!\n"); + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + rte_free(te); + return NULL; + } + + s = mz->addr; + + rte_stack_init(s); + + /* Store the name for later lookups */ + ret = snprintf(s->name, sizeof(s->name), "%s", name); + if (ret < 0 || ret >= (int)sizeof(s->name)) { + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + + rte_errno = ENAMETOOLONG; + rte_free(te); + rte_memzone_free(mz); + return NULL; + } + + s->memzone = mz; + s->capacity = count; + s->flags = flags; + + te->data = s; + + stack_list = RTE_TAILQ_CAST(rte_stack_tailq.head, rte_stack_list); + + TAILQ_INSERT_TAIL(stack_list, te, next); + + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + + return s; +} + +void +rte_stack_free(struct rte_stack *s) +{ + struct rte_stack_list *stack_list; + struct rte_tailq_entry *te; + + if (s == NULL) + return; + + stack_list = RTE_TAILQ_CAST(rte_stack_tailq.head, rte_stack_list); + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); + + /* find out tailq entry */ + TAILQ_FOREACH(te, stack_list, next) { + if (te->data == s) + break; + } + + if (te == NULL) { + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + return; + } + + TAILQ_REMOVE(stack_list, te, next); + + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + + rte_free(te); + + rte_memzone_free(s->memzone); +} + +struct rte_stack * +rte_stack_lookup(const char *name) +{ + struct rte_stack_list *stack_list; + struct rte_tailq_entry *te; + struct rte_stack *r = NULL; + + if (name == NULL) { + rte_errno = EINVAL; + return NULL; + } + + stack_list = RTE_TAILQ_CAST(rte_stack_tailq.head, rte_stack_list); + + rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK); + + TAILQ_FOREACH(te, stack_list, next) { + r = (struct rte_stack *) te->data; + if (strncmp(name, r->name, RTE_STACK_NAMESIZE) == 0) + break; + } + + rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK); + + if (te == NULL) { + rte_errno = ENOENT; + return NULL; + } + + return r; +} + +RTE_INIT(librte_stack_init_log) +{ + stack_logtype = rte_log_register("lib.stack"); + if (stack_logtype >= 0) + rte_log_set_level(stack_logtype, RTE_LOG_NOTICE); +} diff --git a/lib/librte_stack/rte_stack.h b/lib/librte_stack/rte_stack.h new file mode 100644 index 000000000..cebb5be13 --- /dev/null +++ b/lib/librte_stack/rte_stack.h @@ 
-0,0 +1,208 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +/** + * @file rte_stack.h + * @b EXPERIMENTAL: this API may change without prior notice + * + * RTE Stack + * + * librte_stack provides an API for configuration and use of a bounded stack of + * pointers. Push and pop operations are MT-safe, allowing concurrent access, + * and the interface supports pushing and popping multiple pointers at a time. + */ + +#ifndef _RTE_STACK_H_ +#define _RTE_STACK_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include +#include +#include + +#define RTE_TAILQ_STACK_NAME "RTE_STACK" +#define RTE_STACK_MZ_PREFIX "STK_" +/** The maximum length of a stack name. */ +#define RTE_STACK_NAMESIZE (RTE_MEMZONE_NAMESIZE - \ + sizeof(RTE_STACK_MZ_PREFIX) + 1) + +/* Structure containing the LIFO, its current length, and a lock for mutual + * exclusion. + */ +struct rte_stack_std { + rte_spinlock_t lock; /**< LIFO lock */ + uint32_t len; /**< LIFO len */ + void *objs[]; /**< LIFO pointer table */ +}; + +/* The RTE stack structure contains the LIFO structure itself, plus metadata + * such as its name and memzone pointer. + */ +struct rte_stack { + /** Name of the stack. */ + char name[RTE_STACK_NAMESIZE] __rte_cache_aligned; + /** Memzone containing the rte_stack structure. */ + const struct rte_memzone *memzone; + uint32_t capacity; /**< Usable size of the stack. */ + uint32_t flags; /**< Flags supplied at creation. */ + struct rte_stack_std stack_std; /**< LIFO structure. */ +} __rte_cache_aligned; + +#include "rte_stack_std.h" + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Push several objects on the stack (MT-safe). + * + * @param s + * A pointer to the stack structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param n + * The number of objects to push on the stack from the obj_table. + * @return + * Actual number of objects pushed (either 0 or *n*). + */ +static __rte_always_inline unsigned int __rte_experimental +rte_stack_push(struct rte_stack *s, void * const *obj_table, unsigned int n) +{ + RTE_ASSERT(s != NULL); + RTE_ASSERT(obj_table != NULL); + + return __rte_stack_std_push(s, obj_table, n); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Pop several objects from the stack (MT-safe). + * + * @param s + * A pointer to the stack structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param n + * The number of objects to pull from the stack. + * @return + * Actual number of objects popped (either 0 or *n*). + */ +static __rte_always_inline unsigned int __rte_experimental +rte_stack_pop(struct rte_stack *s, void **obj_table, unsigned int n) +{ + RTE_ASSERT(s != NULL); + RTE_ASSERT(obj_table != NULL); + + return __rte_stack_std_pop(s, obj_table, n); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Return the number of used entries in a stack. + * + * @param s + * A pointer to the stack structure. + * @return + * The number of used entries in the stack. + */ +static __rte_always_inline unsigned int __rte_experimental +rte_stack_count(struct rte_stack *s) +{ + RTE_ASSERT(s != NULL); + + return __rte_stack_std_count(s); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Return the number of free entries in a stack. + * + * @param s + * A pointer to the stack structure. 
+ * @return + * The number of free entries in the stack. + */ +static __rte_always_inline unsigned int __rte_experimental +rte_stack_free_count(struct rte_stack *s) +{ + RTE_ASSERT(s != NULL); + + return s->capacity - rte_stack_count(s); +} + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Create a new stack named *name* in memory. + * + * This function uses ``memzone_reserve()`` to allocate memory for a stack of + * size *count*. The behavior of the stack is controlled by the *flags*. + * + * @param name + * The name of the stack. + * @param count + * The size of the stack. + * @param socket_id + * The *socket_id* argument is the socket identifier in case of + * NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA + * constraint for the reserved zone. + * @param flags + * Reserved for future use. + * @return + * On success, the pointer to the new allocated stack. NULL on error with + * rte_errno set appropriately. Possible errno values include: + * - ENOSPC - the maximum number of memzones has already been allocated + * - EEXIST - a stack with the same name already exists + * - ENOMEM - insufficient memory to create the stack + * - ENAMETOOLONG - name size exceeds RTE_STACK_NAMESIZE + */ +struct rte_stack *__rte_experimental +rte_stack_create(const char *name, unsigned int count, int socket_id, + uint32_t flags); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Free all memory used by the stack. + * + * @param s + * Stack to free + */ +void __rte_experimental +rte_stack_free(struct rte_stack *s); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Lookup a stack by its name. + * + * @param name + * The name of the stack. + * @return + * The pointer to the stack matching the name, or NULL if not found, + * with rte_errno set appropriately. Possible rte_errno values include: + * - ENOENT - Stack with name *name* not found. + * - EINVAL - *name* pointer is NULL. + */ +struct rte_stack * __rte_experimental +rte_stack_lookup(const char *name); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_STACK_H_ */ diff --git a/lib/librte_stack/rte_stack_pvt.h b/lib/librte_stack/rte_stack_pvt.h new file mode 100644 index 000000000..4a6a7bdb3 --- /dev/null +++ b/lib/librte_stack/rte_stack_pvt.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _RTE_STACK_PVT_H_ +#define _RTE_STACK_PVT_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include + +extern int stack_logtype; + +#define STACK_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ##level, stack_logtype, "%s(): "fmt "\n", \ + __func__, ##args) + +#define STACK_LOG_ERR(fmt, args...) \ + STACK_LOG(ERR, fmt, ## args) + +#define STACK_LOG_WARN(fmt, args...) \ + STACK_LOG(WARNING, fmt, ## args) + +#define STACK_LOG_INFO(fmt, args...) 
\ + STACK_LOG(INFO, fmt, ## args) + + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_STACK_PVT_H_ */ diff --git a/lib/librte_stack/rte_stack_std.c b/lib/librte_stack/rte_stack_std.c new file mode 100644 index 000000000..0a310d7c6 --- /dev/null +++ b/lib/librte_stack/rte_stack_std.c @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#include "rte_stack.h" + +void +rte_stack_std_init(struct rte_stack *s) +{ + rte_spinlock_init(&s->stack_std.lock); +} + +ssize_t +rte_stack_std_get_memsize(unsigned int count) +{ + ssize_t sz = sizeof(struct rte_stack); + + sz += RTE_CACHE_LINE_ROUNDUP(count * sizeof(void *)); + + /* Add padding to avoid false sharing conflicts caused by + * next-line hardware prefetchers. + */ + sz += 2 * RTE_CACHE_LINE_SIZE; + + return sz; +} diff --git a/lib/librte_stack/rte_stack_std.h b/lib/librte_stack/rte_stack_std.h new file mode 100644 index 000000000..5dc940932 --- /dev/null +++ b/lib/librte_stack/rte_stack_std.h @@ -0,0 +1,121 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _RTE_STACK_STD_H_ +#define _RTE_STACK_STD_H_ + +#include + +/** + * @internal Push several objects on the stack (MT-safe). + * + * @param s + * A pointer to the stack structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param n + * The number of objects to push on the stack from the obj_table. + * @return + * Actual number of objects pushed (either 0 or *n*). + */ +static __rte_always_inline unsigned int __rte_experimental +__rte_stack_std_push(struct rte_stack *s, void * const *obj_table, + unsigned int n) +{ + struct rte_stack_std *stack = &s->stack_std; + unsigned int index; + void **cache_objs; + + rte_spinlock_lock(&stack->lock); + cache_objs = &stack->objs[stack->len]; + + /* Is there sufficient space in the stack? */ + if ((stack->len + n) > s->capacity) { + rte_spinlock_unlock(&stack->lock); + return 0; + } + + /* Add elements back into the cache */ + for (index = 0; index < n; ++index, obj_table++) + cache_objs[index] = *obj_table; + + stack->len += n; + + rte_spinlock_unlock(&stack->lock); + return n; +} + +/** + * @internal Pop several objects from the stack (MT-safe). + * + * @param s + * A pointer to the stack structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param n + * The number of objects to pull from the stack. + * @return + * Actual number of objects popped (either 0 or *n*). + */ +static __rte_always_inline unsigned int __rte_experimental +__rte_stack_std_pop(struct rte_stack *s, void **obj_table, unsigned int n) +{ + struct rte_stack_std *stack = &s->stack_std; + unsigned int index, len; + void **cache_objs; + + rte_spinlock_lock(&stack->lock); + + if (unlikely(n > stack->len)) { + rte_spinlock_unlock(&stack->lock); + return 0; + } + + cache_objs = stack->objs; + + for (index = 0, len = stack->len - 1; index < n; + ++index, len--, obj_table++) + *obj_table = cache_objs[len]; + + stack->len -= n; + rte_spinlock_unlock(&stack->lock); + + return n; +} + +/** + * @internal Return the number of used entries in a stack. + * + * @param s + * A pointer to the stack structure. + * @return + * The number of used entries in the stack. + */ +static __rte_always_inline unsigned int __rte_experimental +__rte_stack_std_count(struct rte_stack *s) +{ + return (unsigned int)s->stack_std.len; +} + +/** + * @internal Initialize a standard stack. + * + * @param s + * A pointer to the stack structure. 
+ */ +void +rte_stack_std_init(struct rte_stack *s); + +/** + * @internal Return the memory required for a standard stack. + * + * @param count + * The size of the stack. + * @return + * The bytes to allocate for a standard stack. + */ +ssize_t +rte_stack_std_get_memsize(unsigned int count); + +#endif /* _RTE_STACK_STD_H_ */ diff --git a/lib/librte_stack/rte_stack_version.map b/lib/librte_stack/rte_stack_version.map new file mode 100644 index 000000000..6662679c3 --- /dev/null +++ b/lib/librte_stack/rte_stack_version.map @@ -0,0 +1,9 @@ +EXPERIMENTAL { + global: + + rte_stack_create; + rte_stack_free; + rte_stack_lookup; + + local: *; +}; diff --git a/lib/meson.build b/lib/meson.build index c3289f885..595314d7d 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -22,7 +22,7 @@ libraries = [ 'gro', 'gso', 'ip_frag', 'jobstats', 'kni', 'latencystats', 'lpm', 'member', 'power', 'pdump', 'rawdev', - 'reorder', 'sched', 'security', 'vhost', + 'reorder', 'sched', 'security', 'stack', 'vhost', #ipsec lib depends on crypto and security 'ipsec', # add pkt framework libs which use other libs from above diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 262132fc6..7e033e78c 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -87,6 +87,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_SECURITY) += -lrte_security _LDLIBS-$(CONFIG_RTE_LIBRTE_COMPRESSDEV) += -lrte_compressdev _LDLIBS-$(CONFIG_RTE_LIBRTE_EVENTDEV) += -lrte_eventdev _LDLIBS-$(CONFIG_RTE_LIBRTE_RAWDEV) += -lrte_rawdev +_LDLIBS-$(CONFIG_RTE_LIBRTE_STACK) += -lrte_stack _LDLIBS-$(CONFIG_RTE_LIBRTE_TIMER) += -lrte_timer _LDLIBS-$(CONFIG_RTE_LIBRTE_MEMPOOL) += -lrte_mempool _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_RING) += -lrte_mempool_ring From patchwork Wed Apr 3 20:09:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eads, Gage" X-Patchwork-Id: 52228 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5A8101B1F7; Wed, 3 Apr 2019 22:10:11 +0200 (CEST) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 494F36CC1 for ; Wed, 3 Apr 2019 22:10:01 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 13:10:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,306,1549958400"; d="scan'208";a="128403697" Received: from txasoft-yocto.an.intel.com ([10.123.72.192]) by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 13:09:59 -0700 From: Gage Eads To: dev@dpdk.org Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com, nd@arm.com, thomas@monjalon.net Date: Wed, 3 Apr 2019 15:09:10 -0500 Message-Id: <20190403200916.16349-3-gage.eads@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20190403200916.16349-1-gage.eads@intel.com> References: <20190401211429.20282-1-gage.eads@intel.com> <20190403200916.16349-1-gage.eads@intel.com> Subject: [dpdk-dev] [PATCH v7 2/8] mempool/stack: convert mempool to use rte stack X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The new rte_stack library is derived 
from the mempool handler, so this commit removes duplicated code and simplifies the handler by migrating it to this new API. Signed-off-by: Gage Eads Reviewed-by: Olivier Matz --- MAINTAINERS | 2 +- drivers/mempool/stack/Makefile | 3 +- drivers/mempool/stack/meson.build | 6 +- drivers/mempool/stack/rte_mempool_stack.c | 93 +++++++++---------------------- 4 files changed, 33 insertions(+), 71 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index f30fc4aa6..e09e7d93f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -303,7 +303,6 @@ M: Andrew Rybchenko F: lib/librte_mempool/ F: drivers/mempool/Makefile F: drivers/mempool/ring/ -F: drivers/mempool/stack/ F: doc/guides/prog_guide/mempool_lib.rst F: app/test/test_mempool* F: app/test/test_func_reentrancy.c @@ -431,6 +430,7 @@ M: Gage Eads M: Olivier Matz F: lib/librte_stack/ F: doc/guides/prog_guide/stack_lib.rst +F: drivers/mempool/stack/ Memory Pool Drivers diff --git a/drivers/mempool/stack/Makefile b/drivers/mempool/stack/Makefile index 0444aedad..1681a62bc 100644 --- a/drivers/mempool/stack/Makefile +++ b/drivers/mempool/stack/Makefile @@ -10,10 +10,11 @@ LIB = librte_mempool_stack.a CFLAGS += -O3 CFLAGS += $(WERROR_FLAGS) +CFLAGS += -DALLOW_EXPERIMENTAL_API # Headers CFLAGS += -I$(RTE_SDK)/lib/librte_mempool -LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_eal -lrte_mempool -lrte_stack EXPORT_MAP := rte_mempool_stack_version.map diff --git a/drivers/mempool/stack/meson.build b/drivers/mempool/stack/meson.build index b75a3bb56..03e369a41 100644 --- a/drivers/mempool/stack/meson.build +++ b/drivers/mempool/stack/meson.build @@ -1,4 +1,8 @@ # SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2017 Intel Corporation +# Copyright(c) 2017-2019 Intel Corporation + +allow_experimental_apis = true sources = files('rte_mempool_stack.c') + +deps += ['stack'] diff --git a/drivers/mempool/stack/rte_mempool_stack.c b/drivers/mempool/stack/rte_mempool_stack.c index e6d504af5..25ccdb9af 100644 --- a/drivers/mempool/stack/rte_mempool_stack.c +++ b/drivers/mempool/stack/rte_mempool_stack.c @@ -1,39 +1,29 @@ /* SPDX-License-Identifier: BSD-3-Clause - * Copyright(c) 2016 Intel Corporation + * Copyright(c) 2016-2019 Intel Corporation */ #include #include -#include - -struct rte_mempool_stack { - rte_spinlock_t sl; - - uint32_t size; - uint32_t len; - void *objs[]; -}; +#include static int stack_alloc(struct rte_mempool *mp) { - struct rte_mempool_stack *s; - unsigned n = mp->size; - int size = sizeof(*s) + (n+16)*sizeof(void *); - - /* Allocate our local memory structure */ - s = rte_zmalloc_socket("mempool-stack", - size, - RTE_CACHE_LINE_SIZE, - mp->socket_id); - if (s == NULL) { - RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n"); - return -ENOMEM; + char name[RTE_STACK_NAMESIZE]; + struct rte_stack *s; + int ret; + + ret = snprintf(name, sizeof(name), + RTE_MEMPOOL_MZ_FORMAT, mp->name); + if (ret < 0 || ret >= (int)sizeof(name)) { + rte_errno = ENAMETOOLONG; + return -rte_errno; } - rte_spinlock_init(&s->sl); + s = rte_stack_create(name, mp->size, mp->socket_id, 0); + if (s == NULL) + return -rte_errno; - s->size = n; mp->pool_data = s; return 0; @@ -41,69 +31,36 @@ stack_alloc(struct rte_mempool *mp) static int stack_enqueue(struct rte_mempool *mp, void * const *obj_table, - unsigned n) + unsigned int n) { - struct rte_mempool_stack *s = mp->pool_data; - void **cache_objs; - unsigned index; - - rte_spinlock_lock(&s->sl); - cache_objs = &s->objs[s->len]; - - /* Is there sufficient space in the stack ? 
*/ - if ((s->len + n) > s->size) { - rte_spinlock_unlock(&s->sl); - return -ENOBUFS; - } - - /* Add elements back into the cache */ - for (index = 0; index < n; ++index, obj_table++) - cache_objs[index] = *obj_table; - - s->len += n; + struct rte_stack *s = mp->pool_data; - rte_spinlock_unlock(&s->sl); - return 0; + return rte_stack_push(s, obj_table, n) == 0 ? -ENOBUFS : 0; } static int stack_dequeue(struct rte_mempool *mp, void **obj_table, - unsigned n) + unsigned int n) { - struct rte_mempool_stack *s = mp->pool_data; - void **cache_objs; - unsigned index, len; - - rte_spinlock_lock(&s->sl); - - if (unlikely(n > s->len)) { - rte_spinlock_unlock(&s->sl); - return -ENOENT; - } + struct rte_stack *s = mp->pool_data; - cache_objs = s->objs; - - for (index = 0, len = s->len - 1; index < n; - ++index, len--, obj_table++) - *obj_table = cache_objs[len]; - - s->len -= n; - rte_spinlock_unlock(&s->sl); - return 0; + return rte_stack_pop(s, obj_table, n) == 0 ? -ENOBUFS : 0; } static unsigned stack_get_count(const struct rte_mempool *mp) { - struct rte_mempool_stack *s = mp->pool_data; + struct rte_stack *s = mp->pool_data; - return s->len; + return rte_stack_count(s); } static void stack_free(struct rte_mempool *mp) { - rte_free((void *)(mp->pool_data)); + struct rte_stack *s = mp->pool_data; + + rte_stack_free(s); } static struct rte_mempool_ops ops_stack = { From patchwork Wed Apr 3 20:09:11 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eads, Gage" X-Patchwork-Id: 52229 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0CDF31B273; Wed, 3 Apr 2019 22:10:15 +0200 (CEST) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id E75B11B0F7 for ; Wed, 3 Apr 2019 22:10:01 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 13:10:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,306,1549958400"; d="scan'208";a="128403702" Received: from txasoft-yocto.an.intel.com ([10.123.72.192]) by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 13:10:00 -0700 From: Gage Eads To: dev@dpdk.org Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com, nd@arm.com, thomas@monjalon.net Date: Wed, 3 Apr 2019 15:09:11 -0500 Message-Id: <20190403200916.16349-4-gage.eads@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20190403200916.16349-1-gage.eads@intel.com> References: <20190401211429.20282-1-gage.eads@intel.com> <20190403200916.16349-1-gage.eads@intel.com> Subject: [dpdk-dev] [PATCH v7 3/8] test/stack: add stack test X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" stack_autotest performs positive and negative testing of the stack API, and exercises the push and pop datapath functions with all available lcores. 
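For reference, the test registers the "stack_autotest" command (see
REGISTER_TEST_COMMAND below), so it can be run from the test binary's
interactive prompt; the binary path shown assumes a Make-based build and may
differ on your setup:

    $ ./build/app/test
    RTE>> stack_autotest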
Signed-off-by: Gage Eads Reviewed-by: Olivier Matz --- MAINTAINERS | 1 + app/test/Makefile | 2 + app/test/meson.build | 3 + app/test/test_stack.c | 410 ++++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 416 insertions(+) create mode 100644 app/test/test_stack.c diff --git a/MAINTAINERS b/MAINTAINERS index e09e7d93f..e4e6d1b15 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -431,6 +431,7 @@ M: Olivier Matz F: lib/librte_stack/ F: doc/guides/prog_guide/stack_lib.rst F: drivers/mempool/stack/ +F: test/test/*stack* Memory Pool Drivers diff --git a/app/test/Makefile b/app/test/Makefile index d6aa28bad..e5bde81af 100644 --- a/app/test/Makefile +++ b/app/test/Makefile @@ -90,6 +90,8 @@ endif SRCS-y += test_rwlock.c +SRCS-$(CONFIG_RTE_LIBRTE_STACK) += test_stack.c + SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer.c SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_perf.c SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_racecond.c diff --git a/app/test/meson.build b/app/test/meson.build index c5e65fe66..56ea13f53 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -95,6 +95,7 @@ test_sources = files('commands.c', 'test_sched.c', 'test_service_cores.c', 'test_spinlock.c', + 'test_stack.c', 'test_string_fns.c', 'test_table.c', 'test_table_acl.c', @@ -133,6 +134,7 @@ test_deps = ['acl', 'port', 'reorder', 'ring', + 'stack', 'timer' ] @@ -174,6 +176,7 @@ fast_parallel_test_names = [ 'rwlock_autotest', 'sched_autotest', 'spinlock_autotest', + 'stack_autotest', 'string_autotest', 'table_autotest', 'tailq_autotest', diff --git a/app/test/test_stack.c b/app/test/test_stack.c new file mode 100644 index 000000000..8392e4e4d --- /dev/null +++ b/app/test/test_stack.c @@ -0,0 +1,410 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#include + +#include +#include +#include +#include + +#include "test.h" + +#define STACK_SIZE 4096 +#define MAX_BULK 32 + +static int +test_stack_push_pop(struct rte_stack *s, void **obj_table, unsigned int bulk_sz) +{ + unsigned int i, ret; + void **popped_objs; + + popped_objs = rte_calloc(NULL, STACK_SIZE, sizeof(void *), 0); + if (popped_objs == NULL) { + printf("[%s():%u] failed to calloc %zu bytes\n", + __func__, __LINE__, STACK_SIZE * sizeof(void *)); + return -1; + } + + for (i = 0; i < STACK_SIZE; i += bulk_sz) { + ret = rte_stack_push(s, &obj_table[i], bulk_sz); + + if (ret != bulk_sz) { + printf("[%s():%u] push returned: %d (expected %u)\n", + __func__, __LINE__, ret, bulk_sz); + rte_free(popped_objs); + return -1; + } + + if (rte_stack_count(s) != i + bulk_sz) { + printf("[%s():%u] stack count: %u (expected %u)\n", + __func__, __LINE__, rte_stack_count(s), + i + bulk_sz); + rte_free(popped_objs); + return -1; + } + + if (rte_stack_free_count(s) != STACK_SIZE - i - bulk_sz) { + printf("[%s():%u] stack free count: %u (expected %u)\n", + __func__, __LINE__, rte_stack_count(s), + STACK_SIZE - i - bulk_sz); + rte_free(popped_objs); + return -1; + } + } + + for (i = 0; i < STACK_SIZE; i += bulk_sz) { + ret = rte_stack_pop(s, &popped_objs[i], bulk_sz); + + if (ret != bulk_sz) { + printf("[%s():%u] pop returned: %d (expected %u)\n", + __func__, __LINE__, ret, bulk_sz); + rte_free(popped_objs); + return -1; + } + + if (rte_stack_count(s) != STACK_SIZE - i - bulk_sz) { + printf("[%s():%u] stack count: %u (expected %u)\n", + __func__, __LINE__, rte_stack_count(s), + STACK_SIZE - i - bulk_sz); + rte_free(popped_objs); + return -1; + } + + if (rte_stack_free_count(s) != i + bulk_sz) { + printf("[%s():%u] stack free count: %u 
(expected %u)\n", + __func__, __LINE__, rte_stack_count(s), + i + bulk_sz); + rte_free(popped_objs); + return -1; + } + } + + for (i = 0; i < STACK_SIZE; i++) { + if (obj_table[i] != popped_objs[STACK_SIZE - i - 1]) { + printf("[%s():%u] Incorrect value %p at index 0x%x\n", + __func__, __LINE__, + popped_objs[STACK_SIZE - i - 1], i); + rte_free(popped_objs); + return -1; + } + } + + rte_free(popped_objs); + + return 0; +} + +static int +test_stack_basic(void) +{ + struct rte_stack *s = NULL; + void **obj_table = NULL; + int i, ret = -1; + + obj_table = rte_calloc(NULL, STACK_SIZE, sizeof(void *), 0); + if (obj_table == NULL) { + printf("[%s():%u] failed to calloc %zu bytes\n", + __func__, __LINE__, STACK_SIZE * sizeof(void *)); + goto fail_test; + } + + for (i = 0; i < STACK_SIZE; i++) + obj_table[i] = (void *)(uintptr_t)i; + + s = rte_stack_create(__func__, STACK_SIZE, rte_socket_id(), 0); + if (s == NULL) { + printf("[%s():%u] failed to create a stack\n", + __func__, __LINE__); + goto fail_test; + } + + if (rte_stack_lookup(__func__) != s) { + printf("[%s():%u] failed to lookup a stack\n", + __func__, __LINE__); + goto fail_test; + } + + if (rte_stack_count(s) != 0) { + printf("[%s():%u] stack count: %u (expected 0)\n", + __func__, __LINE__, rte_stack_count(s)); + goto fail_test; + } + + if (rte_stack_free_count(s) != STACK_SIZE) { + printf("[%s():%u] stack free count: %u (expected %u)\n", + __func__, __LINE__, rte_stack_count(s), STACK_SIZE); + goto fail_test; + } + + ret = test_stack_push_pop(s, obj_table, 1); + if (ret) { + printf("[%s():%u] Single object push/pop failed\n", + __func__, __LINE__); + goto fail_test; + } + + ret = test_stack_push_pop(s, obj_table, MAX_BULK); + if (ret) { + printf("[%s():%u] Bulk object push/pop failed\n", + __func__, __LINE__); + goto fail_test; + } + + ret = rte_stack_push(s, obj_table, 2 * STACK_SIZE); + if (ret != 0) { + printf("[%s():%u] Excess objects push succeeded\n", + __func__, __LINE__); + goto fail_test; + } + + ret = rte_stack_pop(s, obj_table, 1); + if (ret != 0) { + printf("[%s():%u] Empty stack pop succeeded\n", + __func__, __LINE__); + goto fail_test; + } + + ret = 0; + +fail_test: + rte_stack_free(s); + + rte_free(obj_table); + + return ret; +} + +static int +test_stack_name_reuse(void) +{ + struct rte_stack *s[2]; + + s[0] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0); + if (s[0] == NULL) { + printf("[%s():%u] Failed to create a stack\n", + __func__, __LINE__); + return -1; + } + + s[1] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0); + if (s[1] != NULL) { + printf("[%s():%u] Failed to detect re-used name\n", + __func__, __LINE__); + return -1; + } + + rte_stack_free(s[0]); + + return 0; +} + +static int +test_stack_name_length(void) +{ + char name[RTE_STACK_NAMESIZE + 1]; + struct rte_stack *s; + + memset(name, 's', sizeof(name)); + name[RTE_STACK_NAMESIZE] = '\0'; + + s = rte_stack_create(name, STACK_SIZE, rte_socket_id(), 0); + if (s != NULL) { + printf("[%s():%u] Failed to prevent long name\n", + __func__, __LINE__); + return -1; + } + + if (rte_errno != ENAMETOOLONG) { + printf("[%s():%u] rte_stack failed to set correct errno on failed lookup\n", + __func__, __LINE__); + return -1; + } + + return 0; +} + +static int +test_lookup_null(void) +{ + struct rte_stack *s = rte_stack_lookup("stack_not_found"); + + if (s != NULL) { + printf("[%s():%u] rte_stack found a non-existent stack\n", + __func__, __LINE__); + return -1; + } + + if (rte_errno != ENOENT) { + printf("[%s():%u] rte_stack failed to set correct 
errno on failed lookup\n", + __func__, __LINE__); + return -1; + } + + s = rte_stack_lookup(NULL); + + if (s != NULL) { + printf("[%s():%u] rte_stack found a non-existent stack\n", + __func__, __LINE__); + return -1; + } + + if (rte_errno != EINVAL) { + printf("[%s():%u] rte_stack failed to set correct errno on failed lookup\n", + __func__, __LINE__); + return -1; + } + + return 0; +} + +static int +test_free_null(void) +{ + /* Check whether the library proper handles a NULL pointer */ + rte_stack_free(NULL); + + return 0; +} + +#define NUM_ITERS_PER_THREAD 100000 + +struct test_args { + struct rte_stack *s; + rte_atomic64_t *sz; +}; + +static int +stack_thread_push_pop(void *args) +{ + struct test_args *t = args; + void **obj_table; + int i; + + obj_table = rte_calloc(NULL, STACK_SIZE, sizeof(void *), 0); + if (obj_table == NULL) { + printf("[%s():%u] failed to calloc %zu bytes\n", + __func__, __LINE__, STACK_SIZE * sizeof(void *)); + return -1; + } + + for (i = 0; i < NUM_ITERS_PER_THREAD; i++) { + unsigned int success, num; + + /* Reserve up to min(MAX_BULK, available slots) stack entries, + * then push and pop those stack entries. + */ + do { + uint64_t sz = rte_atomic64_read(t->sz); + volatile uint64_t *sz_addr; + + sz_addr = (volatile uint64_t *)t->sz; + + num = RTE_MIN(rte_rand() % MAX_BULK, STACK_SIZE - sz); + + success = rte_atomic64_cmpset(sz_addr, sz, sz + num); + } while (success == 0); + + if (rte_stack_push(t->s, obj_table, num) != num) { + printf("[%s():%u] Failed to push %u pointers\n", + __func__, __LINE__, num); + rte_free(obj_table); + return -1; + } + + if (rte_stack_pop(t->s, obj_table, num) != num) { + printf("[%s():%u] Failed to pop %u pointers\n", + __func__, __LINE__, num); + rte_free(obj_table); + return -1; + } + + rte_atomic64_sub(t->sz, num); + } + + rte_free(obj_table); + return 0; +} + +static int +test_stack_multithreaded(void) +{ + struct test_args *args; + unsigned int lcore_id; + struct rte_stack *s; + rte_atomic64_t size; + + printf("[%s():%u] Running with %u lcores\n", + __func__, __LINE__, rte_lcore_count()); + + if (rte_lcore_count() < 2) + return 0; + + args = rte_malloc(NULL, sizeof(struct test_args) * RTE_MAX_LCORE, 0); + if (args == NULL) { + printf("[%s():%u] failed to malloc %zu bytes\n", + __func__, __LINE__, + sizeof(struct test_args) * RTE_MAX_LCORE); + return -1; + } + + s = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0); + if (s == NULL) { + printf("[%s():%u] Failed to create a stack\n", + __func__, __LINE__); + rte_free(args); + return -1; + } + + rte_atomic64_init(&size); + + RTE_LCORE_FOREACH_SLAVE(lcore_id) { + args[lcore_id].s = s; + args[lcore_id].sz = &size; + + if (rte_eal_remote_launch(stack_thread_push_pop, + &args[lcore_id], lcore_id)) + rte_panic("Failed to launch lcore %d\n", lcore_id); + } + + lcore_id = rte_lcore_id(); + + args[lcore_id].s = s; + args[lcore_id].sz = &size; + + stack_thread_push_pop(&args[lcore_id]); + + rte_eal_mp_wait_lcore(); + + rte_stack_free(s); + rte_free(args); + + return 0; +} + +static int +test_stack(void) +{ + if (test_stack_basic() < 0) + return -1; + + if (test_lookup_null() < 0) + return -1; + + if (test_free_null() < 0) + return -1; + + if (test_stack_name_reuse() < 0) + return -1; + + if (test_stack_name_length() < 0) + return -1; + + if (test_stack_multithreaded() < 0) + return -1; + + return 0; +} + +REGISTER_TEST_COMMAND(stack_autotest, test_stack); From patchwork Wed Apr 3 20:09:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: "Eads, Gage" X-Patchwork-Id: 52230 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id ECA4A1B3DA; Wed, 3 Apr 2019 22:10:18 +0200 (CEST) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id EB13C6CC1 for ; Wed, 3 Apr 2019 22:10:02 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 13:10:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,306,1549958400"; d="scan'208";a="128403719" Received: from txasoft-yocto.an.intel.com ([10.123.72.192]) by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 13:10:01 -0700 From: Gage Eads To: dev@dpdk.org Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com, nd@arm.com, thomas@monjalon.net Date: Wed, 3 Apr 2019 15:09:12 -0500 Message-Id: <20190403200916.16349-5-gage.eads@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20190403200916.16349-1-gage.eads@intel.com> References: <20190401211429.20282-1-gage.eads@intel.com> <20190403200916.16349-1-gage.eads@intel.com> Subject: [dpdk-dev] [PATCH v7 4/8] test/stack: add stack perf test X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" stack_perf_autotest tests the following with one lcore: - Cycles to attempt to pop an empty stack - Cycles to push then pop a single object - Cycles to push then pop a burst of 32 objects It also tests the cycles to push then pop a burst of 8 and 32 objects with the following lcore combinations (if possible): - Two hyperthreads - Two physical cores - Two physical cores on separate NUMA nodes - All available lcores Signed-off-by: Gage Eads Reviewed-by: Olivier Matz --- app/test/Makefile | 1 + app/test/meson.build | 2 + app/test/test_stack_perf.c | 343 +++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 346 insertions(+) create mode 100644 app/test/test_stack_perf.c diff --git a/app/test/Makefile b/app/test/Makefile index e5bde81af..b28bed2d4 100644 --- a/app/test/Makefile +++ b/app/test/Makefile @@ -91,6 +91,7 @@ endif SRCS-y += test_rwlock.c SRCS-$(CONFIG_RTE_LIBRTE_STACK) += test_stack.c +SRCS-$(CONFIG_RTE_LIBRTE_STACK) += test_stack_perf.c SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer.c SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_perf.c diff --git a/app/test/meson.build b/app/test/meson.build index 56ea13f53..02eb788a4 100644 --- a/app/test/meson.build +++ b/app/test/meson.build @@ -96,6 +96,7 @@ test_sources = files('commands.c', 'test_service_cores.c', 'test_spinlock.c', 'test_stack.c', + 'test_stack_perf.c', 'test_string_fns.c', 'test_table.c', 'test_table_acl.c', @@ -241,6 +242,7 @@ perf_test_names = [ 'distributor_perf_autotest', 'ring_pmd_perf_autotest', 'pmd_perf_autotest', + 'stack_perf_autotest', ] # All test cases in driver_test_names list are non-parallel diff --git a/app/test/test_stack_perf.c b/app/test/test_stack_perf.c new file mode 100644 index 000000000..484370d30 --- /dev/null +++ b/app/test/test_stack_perf.c @@ -0,0 +1,343 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ 
+ + +#include +#include +#include +#include +#include +#include + +#include "test.h" + +#define STACK_NAME "STACK_PERF" +#define MAX_BURST 32 +#define STACK_SIZE (RTE_MAX_LCORE * MAX_BURST) + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + +/* + * Push/pop bulk sizes, marked volatile so they aren't treated as compile-time + * constants. + */ +static volatile unsigned int bulk_sizes[] = {8, MAX_BURST}; + +static rte_atomic32_t lcore_barrier; + +struct lcore_pair { + unsigned int c1; + unsigned int c2; +}; + +static int +get_two_hyperthreads(struct lcore_pair *lcp) +{ + unsigned int socket[2]; + unsigned int core[2]; + unsigned int id[2]; + + RTE_LCORE_FOREACH(id[0]) { + RTE_LCORE_FOREACH(id[1]) { + if (id[0] == id[1]) + continue; + core[0] = lcore_config[id[0]].core_id; + core[1] = lcore_config[id[1]].core_id; + socket[0] = lcore_config[id[0]].socket_id; + socket[1] = lcore_config[id[1]].socket_id; + if ((core[0] == core[1]) && (socket[0] == socket[1])) { + lcp->c1 = id[0]; + lcp->c2 = id[1]; + return 0; + } + } + } + + return 1; +} + +static int +get_two_cores(struct lcore_pair *lcp) +{ + unsigned int socket[2]; + unsigned int core[2]; + unsigned int id[2]; + + RTE_LCORE_FOREACH(id[0]) { + RTE_LCORE_FOREACH(id[1]) { + if (id[0] == id[1]) + continue; + core[0] = lcore_config[id[0]].core_id; + core[1] = lcore_config[id[1]].core_id; + socket[0] = lcore_config[id[0]].socket_id; + socket[1] = lcore_config[id[1]].socket_id; + if ((core[0] != core[1]) && (socket[0] == socket[1])) { + lcp->c1 = id[0]; + lcp->c2 = id[1]; + return 0; + } + } + } + + return 1; +} + +static int +get_two_sockets(struct lcore_pair *lcp) +{ + unsigned int socket[2]; + unsigned int id[2]; + + RTE_LCORE_FOREACH(id[0]) { + RTE_LCORE_FOREACH(id[1]) { + if (id[0] == id[1]) + continue; + socket[0] = lcore_config[id[0]].socket_id; + socket[1] = lcore_config[id[1]].socket_id; + if (socket[0] != socket[1]) { + lcp->c1 = id[0]; + lcp->c2 = id[1]; + return 0; + } + } + } + + return 1; +} + +/* Measure the cycle cost of popping an empty stack. */ +static void +test_empty_pop(struct rte_stack *s) +{ + unsigned int iterations = 100000000; + void *objs[MAX_BURST]; + unsigned int i; + + uint64_t start = rte_rdtsc(); + + for (i = 0; i < iterations; i++) + rte_stack_pop(s, objs, bulk_sizes[0]); + + uint64_t end = rte_rdtsc(); + + printf("Stack empty pop: %.2F\n", + (double)(end - start) / iterations); +} + +struct thread_args { + struct rte_stack *s; + unsigned int sz; + double avg; +}; + +/* Measure the average per-pointer cycle cost of stack push and pop */ +static int +bulk_push_pop(void *p) +{ + unsigned int iterations = 1000000; + struct thread_args *args = p; + void *objs[MAX_BURST] = {0}; + unsigned int size, i; + struct rte_stack *s; + + s = args->s; + size = args->sz; + + rte_atomic32_sub(&lcore_barrier, 1); + while (rte_atomic32_read(&lcore_barrier) != 0) + rte_pause(); + + uint64_t start = rte_rdtsc(); + + for (i = 0; i < iterations; i++) { + rte_stack_push(s, objs, size); + rte_stack_pop(s, objs, size); + } + + uint64_t end = rte_rdtsc(); + + args->avg = ((double)(end - start))/(iterations * size); + + return 0; +} + +/* + * Run bulk_push_pop() simultaneously on pairs of cores, to measure stack + * perf when between hyperthread siblings, cores on the same socket, and cores + * on different sockets. 
+ */ +static void +run_on_core_pair(struct lcore_pair *cores, struct rte_stack *s, + lcore_function_t fn) +{ + struct thread_args args[2]; + unsigned int i; + + for (i = 0; i < ARRAY_SIZE(bulk_sizes); i++) { + rte_atomic32_set(&lcore_barrier, 2); + + args[0].sz = args[1].sz = bulk_sizes[i]; + args[0].s = args[1].s = s; + + if (cores->c1 == rte_get_master_lcore()) { + rte_eal_remote_launch(fn, &args[1], cores->c2); + fn(&args[0]); + rte_eal_wait_lcore(cores->c2); + } else { + rte_eal_remote_launch(fn, &args[0], cores->c1); + rte_eal_remote_launch(fn, &args[1], cores->c2); + rte_eal_wait_lcore(cores->c1); + rte_eal_wait_lcore(cores->c2); + } + + printf("Average cycles per object push/pop (bulk size: %u): %.2F\n", + bulk_sizes[i], (args[0].avg + args[1].avg) / 2); + } +} + +/* Run bulk_push_pop() simultaneously on 1+ cores. */ +static void +run_on_n_cores(struct rte_stack *s, lcore_function_t fn, int n) +{ + struct thread_args args[RTE_MAX_LCORE]; + unsigned int i; + + for (i = 0; i < ARRAY_SIZE(bulk_sizes); i++) { + unsigned int lcore_id; + int cnt = 0; + double avg; + + rte_atomic32_set(&lcore_barrier, n); + + RTE_LCORE_FOREACH_SLAVE(lcore_id) { + if (++cnt >= n) + break; + + args[lcore_id].s = s; + args[lcore_id].sz = bulk_sizes[i]; + + if (rte_eal_remote_launch(fn, &args[lcore_id], + lcore_id)) + rte_panic("Failed to launch lcore %d\n", + lcore_id); + } + + lcore_id = rte_lcore_id(); + + args[lcore_id].s = s; + args[lcore_id].sz = bulk_sizes[i]; + + fn(&args[lcore_id]); + + rte_eal_mp_wait_lcore(); + + avg = args[rte_lcore_id()].avg; + + cnt = 0; + RTE_LCORE_FOREACH_SLAVE(lcore_id) { + if (++cnt >= n) + break; + avg += args[lcore_id].avg; + } + + printf("Average cycles per object push/pop (bulk size: %u): %.2F\n", + bulk_sizes[i], avg / n); + } +} + +/* + * Measure the cycle cost of pushing and popping a single pointer on a single + * lcore. + */ +static void +test_single_push_pop(struct rte_stack *s) +{ + unsigned int iterations = 16000000; + void *obj = NULL; + unsigned int i; + + uint64_t start = rte_rdtsc(); + + for (i = 0; i < iterations; i++) { + rte_stack_push(s, &obj, 1); + rte_stack_pop(s, &obj, 1); + } + + uint64_t end = rte_rdtsc(); + + printf("Average cycles per single object push/pop: %.2F\n", + ((double)(end - start)) / iterations); +} + +/* Measure the cycle cost of bulk pushing and popping on a single lcore. 
*/ +static void +test_bulk_push_pop(struct rte_stack *s) +{ + unsigned int iterations = 8000000; + void *objs[MAX_BURST]; + unsigned int sz, i; + + for (sz = 0; sz < ARRAY_SIZE(bulk_sizes); sz++) { + uint64_t start = rte_rdtsc(); + + for (i = 0; i < iterations; i++) { + rte_stack_push(s, objs, bulk_sizes[sz]); + rte_stack_pop(s, objs, bulk_sizes[sz]); + } + + uint64_t end = rte_rdtsc(); + + double avg = ((double)(end - start) / + (iterations * bulk_sizes[sz])); + + printf("Average cycles per object push/pop (bulk size: %u): %.2F\n", + bulk_sizes[sz], avg); + } +} + +static int +test_stack_perf(void) +{ + struct lcore_pair cores; + struct rte_stack *s; + + rte_atomic32_init(&lcore_barrier); + + s = rte_stack_create(STACK_NAME, STACK_SIZE, rte_socket_id(), 0); + if (s == NULL) { + printf("[%s():%u] failed to create a stack\n", + __func__, __LINE__); + return -1; + } + + printf("### Testing single element push/pop ###\n"); + test_single_push_pop(s); + + printf("\n### Testing empty pop ###\n"); + test_empty_pop(s); + + printf("\n### Testing using a single lcore ###\n"); + test_bulk_push_pop(s); + + if (get_two_hyperthreads(&cores) == 0) { + printf("\n### Testing using two hyperthreads ###\n"); + run_on_core_pair(&cores, s, bulk_push_pop); + } + if (get_two_cores(&cores) == 0) { + printf("\n### Testing using two physical cores ###\n"); + run_on_core_pair(&cores, s, bulk_push_pop); + } + if (get_two_sockets(&cores) == 0) { + printf("\n### Testing using two NUMA nodes ###\n"); + run_on_core_pair(&cores, s, bulk_push_pop); + } + + printf("\n### Testing on all %u lcores ###\n", rte_lcore_count()); + run_on_n_cores(s, bulk_push_pop, rte_lcore_count()); + + rte_stack_free(s); + return 0; +} + +REGISTER_TEST_COMMAND(stack_perf_autotest, test_stack_perf); From patchwork Wed Apr 3 20:09:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eads, Gage" X-Patchwork-Id: 52231 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 488A11B44E; Wed, 3 Apr 2019 22:10:23 +0200 (CEST) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 2A6606CC1 for ; Wed, 3 Apr 2019 22:10:04 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 13:10:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,306,1549958400"; d="scan'208";a="128403728" Received: from txasoft-yocto.an.intel.com ([10.123.72.192]) by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 13:10:02 -0700 From: Gage Eads To: dev@dpdk.org Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com, nd@arm.com, thomas@monjalon.net Date: Wed, 3 Apr 2019 15:09:13 -0500 Message-Id: <20190403200916.16349-6-gage.eads@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20190403200916.16349-1-gage.eads@intel.com> References: <20190401211429.20282-1-gage.eads@intel.com> <20190403200916.16349-1-gage.eads@intel.com> Subject: [dpdk-dev] [PATCH v7 5/8] stack: add lock-free stack implementation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This commit adds support for a lock-free (linked list based) stack to the
stack API. This behavior is selected through a new rte_stack_create() flag,
RTE_STACK_F_LF.

The stack consists of a linked list of elements, each containing a data
pointer and a next pointer, and an atomic stack depth counter.

The lock-free push operation enqueues a linked list of pointers by pointing
the tail of the list to the current stack head, and using a CAS to swing the
stack head pointer to the head of the list. The operation retries if it is
unsuccessful (i.e. the list changed between reading the head and modifying
it), else it adjusts the stack length and returns.

The lock-free pop operation first reserves num elements by adjusting the
stack length, to ensure the dequeue operation will succeed without blocking.
It then dequeues pointers by walking the list -- starting from the head --
then swinging the head pointer (using a CAS as well). While walking the list,
the data pointers are recorded in an object table.

This stack uses a 128-bit compare-and-swap instruction, which atomically
updates the stack top pointer and a modification counter, to protect against
the ABA problem.

The linked list elements themselves are maintained in a lock-free LIFO list,
and are allocated before stack pushes and freed after stack pops. Since the
stack has a fixed maximum depth, these elements do not need to be dynamically
created.

Signed-off-by: Gage Eads
Reviewed-by: Olivier Matz
Reviewed-by: Honnappa Nagarahalli
---
 doc/guides/prog_guide/stack_lib.rst     |  61 +++++++++++-
 doc/guides/rel_notes/release_19_05.rst  |   3 +
 lib/librte_stack/Makefile               |   7 +-
 lib/librte_stack/meson.build            |   7 +-
 lib/librte_stack/rte_stack.c            |  28 ++++--
 lib/librte_stack/rte_stack.h            |  62 +++++++++++-
 lib/librte_stack/rte_stack_lf.c         |  31 ++++++
 lib/librte_stack/rte_stack_lf.h         | 102 ++++++++++++++++++++
 lib/librte_stack/rte_stack_lf_generic.h | 164 ++++++++++++++++++++++++++++++++
 9 files changed, 446 insertions(+), 19 deletions(-)
 create mode 100644 lib/librte_stack/rte_stack_lf.c
 create mode 100644 lib/librte_stack/rte_stack_lf.h
 create mode 100644 lib/librte_stack/rte_stack_lf_generic.h

diff --git a/doc/guides/prog_guide/stack_lib.rst b/doc/guides/prog_guide/stack_lib.rst
index 25a8cc38a..8fe8804e3 100644
--- a/doc/guides/prog_guide/stack_lib.rst
+++ b/doc/guides/prog_guide/stack_lib.rst
@@ -10,7 +10,8 @@ stack of pointers.
 The stack library provides the following basic operations:

 * Create a uniquely named stack of a user-specified size and using a
-  user-specified socket.
+  user-specified socket, with either standard (lock-based) or lock-free
+  behavior.

 * Push and pop a burst of one or more stack objects (pointers). These
   functions are multi-thread safe.
@@ -24,5 +25,59 @@ The stack library provides the following basic operations:
 Implementation
 ~~~~~~~~~~~~~~

-The stack consists of a contiguous array of pointers, a current index, and a
-spinlock. Accesses to the stack are made multi-thread safe by the spinlock.
+The library supports two types of stacks: standard (lock-based) and lock-free.
+Both types use the same set of interfaces, but their implementations differ.
+
+Lock-based Stack
+----------------
+
+The lock-based stack consists of a contiguous array of pointers, a current
+index, and a spinlock. Accesses to the stack are made multi-thread safe by the
+spinlock.
+
+Lock-free Stack
+---------------
+
+The lock-free stack consists of a linked list of elements, each containing a
+data pointer and a next pointer, and an atomic stack depth counter. The
+lock-free property means that multiple threads can push and pop simultaneously,
+and one thread being preempted/delayed in a push or pop operation will not
+impede the forward progress of any other thread.
+
+The lock-free push operation enqueues a linked list of pointers by pointing the
+list's tail to the current stack head, and using a CAS to swing the stack head
+pointer to the head of the list. The operation retries if it is unsuccessful
+(i.e. the list changed between reading the head and modifying it), else it
+adjusts the stack length and returns.
+
+The lock-free pop operation first reserves one or more list elements by
+adjusting the stack length, to ensure the dequeue operation will succeed
+without blocking. It then dequeues pointers by walking the list -- starting
+from the head -- then swinging the head pointer (using a CAS as well). While
+walking the list, the data pointers are recorded in an object table.
+
+The linked list elements themselves are maintained in a lock-free LIFO, and are
+allocated before stack pushes and freed after stack pops. Since the stack has a
+fixed maximum depth, these elements do not need to be dynamically created.
+
+The lock-free behavior is selected by passing the *RTE_STACK_F_LF* flag to
+rte_stack_create().
+
+Preventing the ABA Problem
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To prevent the ABA problem, this stack algorithm uses a 128-bit
+compare-and-swap instruction to atomically update both the stack top pointer
+and a modification counter. The ABA problem can occur without a modification
+counter if, for example:
+
+1. Thread A reads head pointer X and stores the pointed-to list element.
+2. Other threads modify the list such that the head pointer is once again X,
+   but its pointed-to data is different from what thread A read.
+3. Thread A changes the head pointer with a compare-and-swap and succeeds.
+
+In this case thread A would not detect that the list had changed, and would
+both pop stale data and incorrectly change the head pointer. By adding a
+modification counter that is updated on every push and pop as part of the
+compare-and-swap, the algorithm can detect when the list changes even if the
+head pointer remains the same.
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index ebfbe36e5..3b115b5f6 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -127,6 +127,9 @@ New Features
   pointers. The API provides MT-safe push and pop operations that can operate
   on one or more pointers per operation.
+  The library supports two stack implementations: standard (lock-based) and lock-free.
+  The lock-free implementation is currently limited to x86-64 platforms.
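To make the modification-counter scheme above concrete, here is a hedged
single-element push sketch built on the same rte_atomic128_cmp_exchange()
call the patch uses (and therefore, like the patch, x86-64 only). The type
and function names are illustrative, not the library's internals:

#include <stdint.h>
#include <rte_atomic.h>
#include <rte_common.h>

struct node {
	struct node *next;
};

/* A 16-byte head: the top pointer plus a counter that is bumped on every
 * successful update, so a recycled 'top' value still fails the compare.
 * Must be 16-byte aligned for the 128-bit CAS.
 */
struct aba_safe_head {
	struct node *top;
	uint64_t cnt;
} __rte_aligned(16);

static void
push_one(struct aba_safe_head *head, struct node *n)
{
	struct aba_safe_head old_head = *head, new_head;

	do {
		/* Link the new node above the current top. */
		n->next = old_head.top;
		new_head.top = n;
		/* The counter differs even if 'top' repeats, so the ABA
		 * scenario in steps 1-3 above makes the CAS fail.
		 */
		new_head.cnt = old_head.cnt + 1;
		/* On failure, old_head is refreshed with the current value,
		 * and the loop re-links and retries.
		 */
	} while (rte_atomic128_cmp_exchange((rte_int128_t *)head,
					    (rte_int128_t *)&old_head,
					    (rte_int128_t *)&new_head,
					    1, __ATOMIC_RELEASE,
					    __ATOMIC_RELAXED) == 0);
}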
+ Removed Items ------------- diff --git a/lib/librte_stack/Makefile b/lib/librte_stack/Makefile index 6db540073..311edd997 100644 --- a/lib/librte_stack/Makefile +++ b/lib/librte_stack/Makefile @@ -16,10 +16,13 @@ LIBABIVER := 1 # all source are stored in SRCS-y SRCS-$(CONFIG_RTE_LIBRTE_STACK) := rte_stack.c \ - rte_stack_std.c + rte_stack_std.c \ + rte_stack_lf.c # install includes SYMLINK-$(CONFIG_RTE_LIBRTE_STACK)-include := rte_stack.h \ - rte_stack_std.h + rte_stack_std.h \ + rte_stack_lf.h \ + rte_stack_lf_generic.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_stack/meson.build b/lib/librte_stack/meson.build index d2e60ce9b..7a09a5d66 100644 --- a/lib/librte_stack/meson.build +++ b/lib/librte_stack/meson.build @@ -4,5 +4,8 @@ allow_experimental_apis = true version = 1 -sources = files('rte_stack.c', 'rte_stack_std.c') -headers = files('rte_stack.h', 'rte_stack_std.h') +sources = files('rte_stack.c', 'rte_stack_std.c', 'rte_stack_lf.c') +headers = files('rte_stack.h', + 'rte_stack_std.h', + 'rte_stack_lf.h', + 'rte_stack_lf_generic.h') diff --git a/lib/librte_stack/rte_stack.c b/lib/librte_stack/rte_stack.c index 610014b6c..1a4d9bd1e 100644 --- a/lib/librte_stack/rte_stack.c +++ b/lib/librte_stack/rte_stack.c @@ -25,18 +25,25 @@ static struct rte_tailq_elem rte_stack_tailq = { }; EAL_REGISTER_TAILQ(rte_stack_tailq) + static void -rte_stack_init(struct rte_stack *s) +rte_stack_init(struct rte_stack *s, unsigned int count, uint32_t flags) { memset(s, 0, sizeof(*s)); - rte_stack_std_init(s); + if (flags & RTE_STACK_F_LF) + rte_stack_lf_init(s, count); + else + rte_stack_std_init(s); } static ssize_t -rte_stack_get_memsize(unsigned int count) +rte_stack_get_memsize(unsigned int count, uint32_t flags) { - return rte_stack_std_get_memsize(count); + if (flags & RTE_STACK_F_LF) + return rte_stack_lf_get_memsize(count); + else + return rte_stack_std_get_memsize(count); } struct rte_stack * @@ -51,9 +58,16 @@ rte_stack_create(const char *name, unsigned int count, int socket_id, unsigned int sz; int ret; - RTE_SET_USED(flags); +#ifdef RTE_ARCH_64 + RTE_BUILD_BUG_ON(sizeof(struct rte_stack_lf_head) != 16); +#else + if (flags & RTE_STACK_F_LF) { + STACK_LOG_ERR("Lock-free stack is not supported on your platform\n"); + return NULL; + } +#endif - sz = rte_stack_get_memsize(count); + sz = rte_stack_get_memsize(count, flags); ret = snprintf(mz_name, sizeof(mz_name), "%s%s", RTE_STACK_MZ_PREFIX, name); @@ -82,7 +96,7 @@ rte_stack_create(const char *name, unsigned int count, int socket_id, s = mz->addr; - rte_stack_init(s); + rte_stack_init(s, count, flags); /* Store the name for later lookups */ ret = snprintf(s->name, sizeof(s->name), "%s", name); diff --git a/lib/librte_stack/rte_stack.h b/lib/librte_stack/rte_stack.h index cebb5be13..54e795682 100644 --- a/lib/librte_stack/rte_stack.h +++ b/lib/librte_stack/rte_stack.h @@ -31,6 +31,35 @@ extern "C" { #define RTE_STACK_NAMESIZE (RTE_MEMZONE_NAMESIZE - \ sizeof(RTE_STACK_MZ_PREFIX) + 1) +struct rte_stack_lf_elem { + void *data; /**< Data pointer */ + struct rte_stack_lf_elem *next; /**< Next pointer */ +}; + +struct rte_stack_lf_head { + struct rte_stack_lf_elem *top; /**< Stack top */ + uint64_t cnt; /**< Modification counter for avoiding ABA problem */ +}; + +struct rte_stack_lf_list { + /** List head */ + struct rte_stack_lf_head head __rte_aligned(16); + /** List len */ + rte_atomic64_t len; +}; + +/* Structure containing two lock-free LIFO lists: the stack itself and a list + * of free linked-list elements. 
+ */ +struct rte_stack_lf { + /** LIFO list of elements */ + struct rte_stack_lf_list used __rte_cache_aligned; + /** LIFO list of free elements */ + struct rte_stack_lf_list free __rte_cache_aligned; + /** LIFO elements */ + struct rte_stack_lf_elem elems[] __rte_cache_aligned; +}; + /* Structure containing the LIFO, its current length, and a lock for mutual * exclusion. */ @@ -50,10 +79,21 @@ struct rte_stack { const struct rte_memzone *memzone; uint32_t capacity; /**< Usable size of the stack. */ uint32_t flags; /**< Flags supplied at creation. */ - struct rte_stack_std stack_std; /**< LIFO structure. */ + RTE_STD_C11 + union { + struct rte_stack_lf stack_lf; /**< Lock-free LIFO structure. */ + struct rte_stack_std stack_std; /**< LIFO structure. */ + }; } __rte_cache_aligned; +/** + * The stack uses lock-free push and pop functions. This flag is only + * supported on x86_64 platforms, currently. + */ +#define RTE_STACK_F_LF 0x0001 + #include "rte_stack_std.h" +#include "rte_stack_lf.h" /** * @warning @@ -76,7 +116,10 @@ rte_stack_push(struct rte_stack *s, void * const *obj_table, unsigned int n) RTE_ASSERT(s != NULL); RTE_ASSERT(obj_table != NULL); - return __rte_stack_std_push(s, obj_table, n); + if (s->flags & RTE_STACK_F_LF) + return __rte_stack_lf_push(s, obj_table, n); + else + return __rte_stack_std_push(s, obj_table, n); } /** @@ -100,7 +143,10 @@ rte_stack_pop(struct rte_stack *s, void **obj_table, unsigned int n) RTE_ASSERT(s != NULL); RTE_ASSERT(obj_table != NULL); - return __rte_stack_std_pop(s, obj_table, n); + if (s->flags & RTE_STACK_F_LF) + return __rte_stack_lf_pop(s, obj_table, n); + else + return __rte_stack_std_pop(s, obj_table, n); } /** @@ -119,7 +165,10 @@ rte_stack_count(struct rte_stack *s) { RTE_ASSERT(s != NULL); - return __rte_stack_std_count(s); + if (s->flags & RTE_STACK_F_LF) + return __rte_stack_lf_count(s); + else + return __rte_stack_std_count(s); } /** @@ -159,7 +208,10 @@ rte_stack_free_count(struct rte_stack *s) * NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA * constraint for the reserved zone. * @param flags - * Reserved for future use. + * An OR of the following: + * - RTE_STACK_F_LF: If this flag is set, the stack uses lock-free + * variants of the push and pop functions. Otherwise, it achieves + * thread-safety using a lock. * @return * On success, the pointer to the new allocated stack. NULL on error with * rte_errno set appropriately. Possible errno values include: diff --git a/lib/librte_stack/rte_stack_lf.c b/lib/librte_stack/rte_stack_lf.c new file mode 100644 index 000000000..0adcc263e --- /dev/null +++ b/lib/librte_stack/rte_stack_lf.c @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#include "rte_stack.h" + +void +rte_stack_lf_init(struct rte_stack *s, unsigned int count) +{ + struct rte_stack_lf_elem *elems = s->stack_lf.elems; + unsigned int i; + + for (i = 0; i < count; i++) + __rte_stack_lf_push_elems(&s->stack_lf.free, + &elems[i], &elems[i], 1); +} + +ssize_t +rte_stack_lf_get_memsize(unsigned int count) +{ + ssize_t sz = sizeof(struct rte_stack); + + sz += RTE_CACHE_LINE_ROUNDUP(count * sizeof(struct rte_stack_lf_elem)); + + /* Add padding to avoid false sharing conflicts caused by + * next-line hardware prefetchers. 
+ */ + sz += 2 * RTE_CACHE_LINE_SIZE; + + return sz; +} diff --git a/lib/librte_stack/rte_stack_lf.h b/lib/librte_stack/rte_stack_lf.h new file mode 100644 index 000000000..bfd680133 --- /dev/null +++ b/lib/librte_stack/rte_stack_lf.h @@ -0,0 +1,102 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _RTE_STACK_LF_H_ +#define _RTE_STACK_LF_H_ + +#include "rte_stack_lf_generic.h" + +/** + * @internal Push several objects on the lock-free stack (MT-safe). + * + * @param s + * A pointer to the stack structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param n + * The number of objects to push on the stack from the obj_table. + * @return + * Actual number of objects enqueued. + */ +static __rte_always_inline unsigned int __rte_experimental +__rte_stack_lf_push(struct rte_stack *s, + void * const *obj_table, + unsigned int n) +{ + struct rte_stack_lf_elem *tmp, *first, *last = NULL; + unsigned int i; + + if (unlikely(n == 0)) + return 0; + + /* Pop n free elements */ + first = __rte_stack_lf_pop_elems(&s->stack_lf.free, n, NULL, &last); + if (unlikely(first == NULL)) + return 0; + + /* Construct the list elements */ + for (tmp = first, i = 0; i < n; i++, tmp = tmp->next) + tmp->data = obj_table[n - i - 1]; + + /* Push them to the used list */ + __rte_stack_lf_push_elems(&s->stack_lf.used, first, last, n); + + return n; +} + +/** + * @internal Pop several objects from the lock-free stack (MT-safe). + * + * @param s + * A pointer to the stack structure. + * @param obj_table + * A pointer to a table of void * pointers (objects). + * @param n + * The number of objects to pull from the stack. + * @return + * - Actual number of objects popped. + */ +static __rte_always_inline unsigned int __rte_experimental +__rte_stack_lf_pop(struct rte_stack *s, void **obj_table, unsigned int n) +{ + struct rte_stack_lf_elem *first, *last = NULL; + + if (unlikely(n == 0)) + return 0; + + /* Pop n used elements */ + first = __rte_stack_lf_pop_elems(&s->stack_lf.used, + n, obj_table, &last); + if (unlikely(first == NULL)) + return 0; + + /* Push the list elements to the free list */ + __rte_stack_lf_push_elems(&s->stack_lf.free, first, last, n); + + return n; +} + +/** + * @internal Initialize a lock-free stack. + * + * @param s + * A pointer to the stack structure. + * @param count + * The size of the stack. + */ +void +rte_stack_lf_init(struct rte_stack *s, unsigned int count); + +/** + * @internal Return the memory required for a lock-free stack. + * + * @param count + * The size of the stack. + * @return + * The bytes to allocate for a lock-free stack. + */ +ssize_t +rte_stack_lf_get_memsize(unsigned int count); + +#endif /* _RTE_STACK_LF_H_ */ diff --git a/lib/librte_stack/rte_stack_lf_generic.h b/lib/librte_stack/rte_stack_lf_generic.h new file mode 100644 index 000000000..1191406d3 --- /dev/null +++ b/lib/librte_stack/rte_stack_lf_generic.h @@ -0,0 +1,164 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _RTE_STACK_LF_GENERIC_H_ +#define _RTE_STACK_LF_GENERIC_H_ + +#include +#include + +static __rte_always_inline unsigned int +__rte_stack_lf_count(struct rte_stack *s) +{ + /* stack_lf_push() and stack_lf_pop() do not update the list's contents + * and stack_lf->len atomically, which can cause the list to appear + * shorter than it actually is if this function is called while other + * threads are modifying the list. 
+ * + * However, given the inherently approximate nature of the get_count + * callback -- even if the list and its size were updated atomically, + * the size could change between when get_count executes and when the + * value is returned to the caller -- this is acceptable. + * + * The stack_lf->len updates are placed such that the list may appear to + * have fewer elements than it does, but will never appear to have more + * elements. If the mempool is near-empty to the point that this is a + * concern, the user should consider increasing the mempool size. + */ + return (unsigned int)rte_atomic64_read(&s->stack_lf.used.len); +} + +static __rte_always_inline void +__rte_stack_lf_push_elems(struct rte_stack_lf_list *list, + struct rte_stack_lf_elem *first, + struct rte_stack_lf_elem *last, + unsigned int num) +{ +#ifndef RTE_ARCH_X86_64 + RTE_SET_USED(first); + RTE_SET_USED(last); + RTE_SET_USED(list); + RTE_SET_USED(num); +#else + struct rte_stack_lf_head old_head; + int success; + + old_head = list->head; + + do { + struct rte_stack_lf_head new_head; + + /* An acquire fence (or stronger) is needed for weak memory + * models to establish a synchronized-with relationship between + * the list->head load and store-release operations (as part of + * the rte_atomic128_cmp_exchange()). + */ + rte_smp_mb(); + + /* Swing the top pointer to the first element in the list and + * make the last element point to the old top. + */ + new_head.top = first; + new_head.cnt = old_head.cnt + 1; + + last->next = old_head.top; + + /* old_head is updated on failure */ + success = rte_atomic128_cmp_exchange( + (rte_int128_t *)&list->head, + (rte_int128_t *)&old_head, + (rte_int128_t *)&new_head, + 1, __ATOMIC_RELEASE, + __ATOMIC_RELAXED); + } while (success == 0); + + rte_atomic64_add(&list->len, num); +#endif +} + +static __rte_always_inline struct rte_stack_lf_elem * +__rte_stack_lf_pop_elems(struct rte_stack_lf_list *list, + unsigned int num, + void **obj_table, + struct rte_stack_lf_elem **last) +{ +#ifndef RTE_ARCH_X86_64 + RTE_SET_USED(obj_table); + RTE_SET_USED(last); + RTE_SET_USED(list); + RTE_SET_USED(num); + + return NULL; +#else + struct rte_stack_lf_head old_head; + int success; + + /* Reserve num elements, if available */ + while (1) { + uint64_t len = rte_atomic64_read(&list->len); + + /* Does the list contain enough elements? */ + if (unlikely(len < num)) + return NULL; + + if (rte_atomic64_cmpset((volatile uint64_t *)&list->len, + len, len - num)) + break; + } + + old_head = list->head; + + /* Pop num elements */ + do { + struct rte_stack_lf_head new_head; + struct rte_stack_lf_elem *tmp; + unsigned int i; + + /* An acquire fence (or stronger) is needed for weak memory + * models to ensure the LF LIFO element reads are properly + * ordered with respect to the head pointer read. + */ + rte_smp_mb(); + + rte_prefetch0(old_head.top); + + tmp = old_head.top; + + /* Traverse the list to find the new head. A next pointer will + * either point to another element or NULL; if a thread + * encounters a pointer that has already been popped, the CAS + * will fail. + */ + for (i = 0; i < num && tmp != NULL; i++) { + rte_prefetch0(tmp->next); + if (obj_table) + obj_table[i] = tmp->data; + if (last) + *last = tmp; + tmp = tmp->next; + } + + /* If NULL was encountered, the list was modified while + * traversing it. Retry. 
+ */ + if (i != num) + continue; + + new_head.top = tmp; + new_head.cnt = old_head.cnt + 1; + + /* old_head is updated on failure */ + success = rte_atomic128_cmp_exchange( + (rte_int128_t *)&list->head, + (rte_int128_t *)&old_head, + (rte_int128_t *)&new_head, + 1, __ATOMIC_RELEASE, + __ATOMIC_RELAXED); + } while (success == 0); + + return old_head.top; +#endif +} + +#endif /* _RTE_STACK_LF_GENERIC_H_ */ From patchwork Wed Apr 3 20:09:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eads, Gage" X-Patchwork-Id: 52232 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 77FB61B44D; Wed, 3 Apr 2019 22:10:26 +0200 (CEST) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 21D131B150 for ; Wed, 3 Apr 2019 22:10:04 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 13:10:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,306,1549958400"; d="scan'208";a="128403737" Received: from txasoft-yocto.an.intel.com ([10.123.72.192]) by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 13:10:03 -0700 From: Gage Eads To: dev@dpdk.org Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com, nd@arm.com, thomas@monjalon.net Date: Wed, 3 Apr 2019 15:09:14 -0500 Message-Id: <20190403200916.16349-7-gage.eads@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20190403200916.16349-1-gage.eads@intel.com> References: <20190401211429.20282-1-gage.eads@intel.com> <20190403200916.16349-1-gage.eads@intel.com> Subject: [dpdk-dev] [PATCH v7 6/8] stack: add C11 atomic implementation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This commit adds an implementation of the lock-free stack push, pop, and length functions that use __atomic builtins, for systems that benefit from the finer-grained memory ordering control. 
Signed-off-by: Gage Eads Reviewed-by: Olivier Matz Reviewed-by: Honnappa Nagarahalli --- lib/librte_stack/Makefile | 3 +- lib/librte_stack/meson.build | 3 +- lib/librte_stack/rte_stack_lf.h | 4 + lib/librte_stack/rte_stack_lf_c11.h | 175 ++++++++++++++++++++++++++++++++++++ 4 files changed, 183 insertions(+), 2 deletions(-) create mode 100644 lib/librte_stack/rte_stack_lf_c11.h diff --git a/lib/librte_stack/Makefile b/lib/librte_stack/Makefile index 311edd997..8d18ce520 100644 --- a/lib/librte_stack/Makefile +++ b/lib/librte_stack/Makefile @@ -23,6 +23,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_STACK) := rte_stack.c \ SYMLINK-$(CONFIG_RTE_LIBRTE_STACK)-include := rte_stack.h \ rte_stack_std.h \ rte_stack_lf.h \ - rte_stack_lf_generic.h + rte_stack_lf_generic.h \ + rte_stack_lf_c11.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_stack/meson.build b/lib/librte_stack/meson.build index 7a09a5d66..46fce0c20 100644 --- a/lib/librte_stack/meson.build +++ b/lib/librte_stack/meson.build @@ -8,4 +8,5 @@ sources = files('rte_stack.c', 'rte_stack_std.c', 'rte_stack_lf.c') headers = files('rte_stack.h', 'rte_stack_std.h', 'rte_stack_lf.h', - 'rte_stack_lf_generic.h') + 'rte_stack_lf_generic.h', + 'rte_stack_lf_c11.h') diff --git a/lib/librte_stack/rte_stack_lf.h b/lib/librte_stack/rte_stack_lf.h index bfd680133..518889a05 100644 --- a/lib/librte_stack/rte_stack_lf.h +++ b/lib/librte_stack/rte_stack_lf.h @@ -5,7 +5,11 @@ #ifndef _RTE_STACK_LF_H_ #define _RTE_STACK_LF_H_ +#ifdef RTE_USE_C11_MEM_MODEL +#include "rte_stack_lf_c11.h" +#else #include "rte_stack_lf_generic.h" +#endif /** * @internal Push several objects on the lock-free stack (MT-safe). diff --git a/lib/librte_stack/rte_stack_lf_c11.h b/lib/librte_stack/rte_stack_lf_c11.h new file mode 100644 index 000000000..a316e9af5 --- /dev/null +++ b/lib/librte_stack/rte_stack_lf_c11.h @@ -0,0 +1,175 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _RTE_STACK_LF_C11_H_ +#define _RTE_STACK_LF_C11_H_ + +#include +#include + +static __rte_always_inline unsigned int +__rte_stack_lf_count(struct rte_stack *s) +{ + /* stack_lf_push() and stack_lf_pop() do not update the list's contents + * and stack_lf->len atomically, which can cause the list to appear + * shorter than it actually is if this function is called while other + * threads are modifying the list. + * + * However, given the inherently approximate nature of the get_count + * callback -- even if the list and its size were updated atomically, + * the size could change between when get_count executes and when the + * value is returned to the caller -- this is acceptable. + * + * The stack_lf->len updates are placed such that the list may appear to + * have fewer elements than it does, but will never appear to have more + * elements. If the mempool is near-empty to the point that this is a + * concern, the user should consider increasing the mempool size. 
+ */ + return (unsigned int)__atomic_load_n(&s->stack_lf.used.len.cnt, + __ATOMIC_RELAXED); +} + +static __rte_always_inline void +__rte_stack_lf_push_elems(struct rte_stack_lf_list *list, + struct rte_stack_lf_elem *first, + struct rte_stack_lf_elem *last, + unsigned int num) +{ +#ifndef RTE_ARCH_X86_64 + RTE_SET_USED(first); + RTE_SET_USED(last); + RTE_SET_USED(list); + RTE_SET_USED(num); +#else + struct rte_stack_lf_head old_head; + int success; + + old_head = list->head; + + do { + struct rte_stack_lf_head new_head; + + /* Use an acquire fence to establish a synchronized-with + * relationship between the list->head load and store-release + * operations (as part of the rte_atomic128_cmp_exchange()). + */ + __atomic_thread_fence(__ATOMIC_ACQUIRE); + + /* Swing the top pointer to the first element in the list and + * make the last element point to the old top. + */ + new_head.top = first; + new_head.cnt = old_head.cnt + 1; + + last->next = old_head.top; + + /* Use the release memmodel to ensure the writes to the LF LIFO + * elements are visible before the head pointer write. + */ + success = rte_atomic128_cmp_exchange( + (rte_int128_t *)&list->head, + (rte_int128_t *)&old_head, + (rte_int128_t *)&new_head, + 1, __ATOMIC_RELEASE, + __ATOMIC_RELAXED); + } while (success == 0); + + /* Ensure the stack modifications are not reordered with respect + * to the LIFO len update. + */ + __atomic_add_fetch(&list->len.cnt, num, __ATOMIC_RELEASE); +#endif +} + +static __rte_always_inline struct rte_stack_lf_elem * +__rte_stack_lf_pop_elems(struct rte_stack_lf_list *list, + unsigned int num, + void **obj_table, + struct rte_stack_lf_elem **last) +{ +#ifndef RTE_ARCH_X86_64 + RTE_SET_USED(obj_table); + RTE_SET_USED(last); + RTE_SET_USED(list); + RTE_SET_USED(num); + + return NULL; +#else + struct rte_stack_lf_head old_head; + uint64_t len; + int success; + + /* Reserve num elements, if available */ + len = __atomic_load_n(&list->len.cnt, __ATOMIC_ACQUIRE); + + while (1) { + /* Does the list contain enough elements? */ + if (unlikely(len < num)) + return NULL; + + /* len is updated on failure */ + if (__atomic_compare_exchange_n(&list->len.cnt, + &len, len - num, + 0, __ATOMIC_ACQUIRE, + __ATOMIC_ACQUIRE)) + break; + } + + /* If a torn read occurs, the CAS will fail and set old_head to the + * correct/latest value. + */ + old_head = list->head; + + /* Pop num elements */ + do { + struct rte_stack_lf_head new_head; + struct rte_stack_lf_elem *tmp; + unsigned int i; + + /* Use the acquire memmodel to ensure the reads to the LF LIFO + * elements are properly ordered with respect to the head + * pointer read. + */ + __atomic_thread_fence(__ATOMIC_ACQUIRE); + + rte_prefetch0(old_head.top); + + tmp = old_head.top; + + /* Traverse the list to find the new head. A next pointer will + * either point to another element or NULL; if a thread + * encounters a pointer that has already been popped, the CAS + * will fail. + */ + for (i = 0; i < num && tmp != NULL; i++) { + rte_prefetch0(tmp->next); + if (obj_table) + obj_table[i] = tmp->data; + if (last) + *last = tmp; + tmp = tmp->next; + } + + /* If NULL was encountered, the list was modified while + * traversing it. Retry. 
+ */ + if (i != num) + continue; + + new_head.top = tmp; + new_head.cnt = old_head.cnt + 1; + + success = rte_atomic128_cmp_exchange( + (rte_int128_t *)&list->head, + (rte_int128_t *)&old_head, + (rte_int128_t *)&new_head, + 1, __ATOMIC_RELEASE, + __ATOMIC_RELAXED); + } while (success == 0); + + return old_head.top; +#endif +} + +#endif /* _RTE_STACK_LF_C11_H_ */ From patchwork Wed Apr 3 20:09:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eads, Gage" X-Patchwork-Id: 52233 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 04FDD1B474; Wed, 3 Apr 2019 22:10:29 +0200 (CEST) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 185B51B150 for ; Wed, 3 Apr 2019 22:10:05 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 13:10:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,306,1549958400"; d="scan'208";a="128403747" Received: from txasoft-yocto.an.intel.com ([10.123.72.192]) by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 13:10:04 -0700 From: Gage Eads To: dev@dpdk.org Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com, nd@arm.com, thomas@monjalon.net Date: Wed, 3 Apr 2019 15:09:15 -0500 Message-Id: <20190403200916.16349-8-gage.eads@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20190403200916.16349-1-gage.eads@intel.com> References: <20190401211429.20282-1-gage.eads@intel.com> <20190403200916.16349-1-gage.eads@intel.com> Subject: [dpdk-dev] [PATCH v7 7/8] test/stack: add lock-free stack tests X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This commit adds lock-free stack variants of stack_autotest (stack_lf_autotest) and stack_perf_autotest (stack_lf_perf_autotest), which differ only in that the lock-free versions pass the RTE_STACK_F_LF flag to all rte_stack_create() calls. 
Signed-off-by: Gage Eads
Reviewed-by: Olivier Matz
---
 app/test/meson.build       |  2 ++
 app/test/test_stack.c      | 41 +++++++++++++++++++++++++++--------------
 app/test/test_stack_perf.c | 17 +++++++++++++++--
 3 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 02eb788a4..867cc5863 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -178,6 +178,7 @@ fast_parallel_test_names = [
        'sched_autotest',
        'spinlock_autotest',
        'stack_autotest',
+       'stack_lf_autotest',
        'string_autotest',
        'table_autotest',
        'tailq_autotest',
@@ -243,6 +244,7 @@ perf_test_names = [
        'ring_pmd_perf_autotest',
        'pmd_perf_autotest',
        'stack_perf_autotest',
+       'stack_lf_perf_autotest',
 ]

 # All test cases in driver_test_names list are non-parallel
diff --git a/app/test/test_stack.c b/app/test/test_stack.c
index 8392e4e4d..f199136aa 100644
--- a/app/test/test_stack.c
+++ b/app/test/test_stack.c
@@ -97,7 +97,7 @@ test_stack_push_pop(struct rte_stack *s, void **obj_table, unsigned int bulk_sz)
 }

 static int
-test_stack_basic(void)
+test_stack_basic(uint32_t flags)
 {
 	struct rte_stack *s = NULL;
 	void **obj_table = NULL;
@@ -113,7 +113,7 @@ test_stack_basic(void)
 	for (i = 0; i < STACK_SIZE; i++)
 		obj_table[i] = (void *)(uintptr_t)i;

-	s = rte_stack_create(__func__, STACK_SIZE, rte_socket_id(), 0);
+	s = rte_stack_create(__func__, STACK_SIZE, rte_socket_id(), flags);
 	if (s == NULL) {
 		printf("[%s():%u] failed to create a stack\n",
 		       __func__, __LINE__);
@@ -177,18 +177,18 @@ test_stack_basic(void)
 }

 static int
-test_stack_name_reuse(void)
+test_stack_name_reuse(uint32_t flags)
 {
 	struct rte_stack *s[2];

-	s[0] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0);
+	s[0] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), flags);
 	if (s[0] == NULL) {
 		printf("[%s():%u] Failed to create a stack\n",
 		       __func__, __LINE__);
 		return -1;
 	}

-	s[1] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0);
+	s[1] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), flags);
 	if (s[1] != NULL) {
 		printf("[%s():%u] Failed to detect re-used name\n",
 		       __func__, __LINE__);
@@ -201,7 +201,7 @@
 }

 static int
-test_stack_name_length(void)
+test_stack_name_length(uint32_t flags)
 {
 	char name[RTE_STACK_NAMESIZE + 1];
 	struct rte_stack *s;
@@ -209,7 +209,7 @@
 	memset(name, 's', sizeof(name));
 	name[RTE_STACK_NAMESIZE] = '\0';

-	s = rte_stack_create(name, STACK_SIZE, rte_socket_id(), 0);
+	s = rte_stack_create(name, STACK_SIZE, rte_socket_id(), flags);
 	if (s != NULL) {
 		printf("[%s():%u] Failed to prevent long name\n",
 		       __func__, __LINE__);
@@ -328,7 +328,7 @@ stack_thread_push_pop(void *args)
 }

 static int
-test_stack_multithreaded(void)
+test_stack_multithreaded(uint32_t flags)
 {
 	struct test_args *args;
 	unsigned int lcore_id;
@@ -349,7 +349,7 @@ test_stack_multithreaded(void)
 		return -1;
 	}

-	s = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0);
+	s = rte_stack_create("test", STACK_SIZE, rte_socket_id(), flags);
 	if (s == NULL) {
 		printf("[%s():%u] Failed to create a stack\n",
 		       __func__, __LINE__);
@@ -384,9 +384,9 @@
 }

 static int
-test_stack(void)
+__test_stack(uint32_t flags)
 {
-	if (test_stack_basic() < 0)
+	if (test_stack_basic(flags) < 0)
 		return -1;

 	if (test_lookup_null() < 0)
@@ -395,16 +395,29 @@
 	if (test_free_null() < 0)
 		return -1;

-	if (test_stack_name_reuse() < 0)
+	if (test_stack_name_reuse(flags) < 0)
 		return -1;

-	if (test_stack_name_length() < 0)
+	if (test_stack_name_length(flags) < 0)
 		return -1;

-	if (test_stack_multithreaded() < 0)
+	if (test_stack_multithreaded(flags) < 0)
 		return -1;

 	return 0;
 }
+
+static int
+test_stack(void)
+{
+	return __test_stack(0);
+}
+
+static int
+test_lf_stack(void)
+{
+	return __test_stack(RTE_STACK_F_LF);
+}
+
 REGISTER_TEST_COMMAND(stack_autotest, test_stack);
+REGISTER_TEST_COMMAND(stack_lf_autotest, test_lf_stack);
diff --git a/app/test/test_stack_perf.c b/app/test/test_stack_perf.c
index 484370d30..e09d5384c 100644
--- a/app/test/test_stack_perf.c
+++ b/app/test/test_stack_perf.c
@@ -297,14 +297,14 @@ test_bulk_push_pop(struct rte_stack *s)
 }

 static int
-test_stack_perf(void)
+__test_stack_perf(uint32_t flags)
 {
 	struct lcore_pair cores;
 	struct rte_stack *s;

 	rte_atomic32_init(&lcore_barrier);

-	s = rte_stack_create(STACK_NAME, STACK_SIZE, rte_socket_id(), 0);
+	s = rte_stack_create(STACK_NAME, STACK_SIZE, rte_socket_id(), flags);
 	if (s == NULL) {
 		printf("[%s():%u] failed to create a stack\n",
 		       __func__, __LINE__);
@@ -340,4 +340,17 @@
 	return 0;
 }

+static int
+test_stack_perf(void)
+{
+	return __test_stack_perf(0);
+}
+
+static int
+test_lf_stack_perf(void)
+{
+	return __test_stack_perf(RTE_STACK_F_LF);
+}
+
 REGISTER_TEST_COMMAND(stack_perf_autotest, test_stack_perf);
+REGISTER_TEST_COMMAND(stack_lf_perf_autotest, test_lf_stack_perf);
From patchwork Wed Apr 3 20:09:16 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Eads, Gage"
X-Patchwork-Id: 52234
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5768E1B483; Wed, 3 Apr 2019 22:10:31 +0200 (CEST)
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 1BA881B150 for ; Wed, 3 Apr 2019 22:10:06 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Apr 2019 13:10:06 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.60,306,1549958400"; d="scan'208";a="128403754"
Received: from txasoft-yocto.an.intel.com ([10.123.72.192]) by orsmga007.jf.intel.com with ESMTP; 03 Apr 2019 13:10:05 -0700
From: Gage Eads
To: dev@dpdk.org
Cc: olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com, Honnappa.Nagarahalli@arm.com, nd@arm.com, thomas@monjalon.net
Date: Wed, 3 Apr 2019 15:09:16 -0500
Message-Id: <20190403200916.16349-9-gage.eads@intel.com>
X-Mailer: git-send-email 2.13.6
In-Reply-To: <20190403200916.16349-1-gage.eads@intel.com>
References: <20190401211429.20282-1-gage.eads@intel.com> <20190403200916.16349-1-gage.eads@intel.com>
Subject: [dpdk-dev] [PATCH v7 8/8] mempool/stack: add lock-free stack mempool handler
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org Sender: "dev"

This commit adds support for a lock-free (linked list based) stack mempool
handler.

In mempool_perf_autotest the lock-based stack outperforms the lock-free
handler for certain lcore/alloc count/free count combinations*, however:

- For applications with preemptible pthreads, a standard (lock-based)
  stack's worst-case performance (i.e.
one thread being preempted while holding the spinlock) is much worse than the lock-free stack's. - Using per-thread mempool caches will largely mitigate the performance difference. *Test setup: x86_64 build with default config, dual-socket Xeon E5-2699 v4, running on isolcpus cores with a tickless scheduler. The lock-based stack's rate_persec was 0.6x-3.5x the lock-free stack's. Signed-off-by: Gage Eads Reviewed-by: Olivier Matz --- doc/guides/prog_guide/env_abstraction_layer.rst | 10 ++++++++++ doc/guides/rel_notes/release_19_05.rst | 5 +++++ drivers/mempool/stack/rte_mempool_stack.c | 26 +++++++++++++++++++++++-- 3 files changed, 39 insertions(+), 2 deletions(-) diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst index 6a04c3c33..fa8afdb3a 100644 --- a/doc/guides/prog_guide/env_abstraction_layer.rst +++ b/doc/guides/prog_guide/env_abstraction_layer.rst @@ -581,6 +581,16 @@ Known Issues 5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR. + Alternatively, applications can use the lock-free stack mempool handler. When + considering this handler, note that: + + - It is currently limited to the x86_64 platform, because it uses an + instruction (16-byte compare-and-swap) that is not yet available on other + platforms. + - It has worse average-case performance than the non-preemptive rte_ring, but + software caching (e.g. the mempool cache) can mitigate this by reducing the + number of stack accesses. + + rte_timer Running ``rte_timer_manage()`` on a non-EAL pthread is not allowed. However, resetting/stopping the timer from a non-EAL pthread is allowed. diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst index 3b115b5f6..f873984ad 100644 --- a/doc/guides/rel_notes/release_19_05.rst +++ b/doc/guides/rel_notes/release_19_05.rst @@ -130,6 +130,11 @@ New Features The library supports two stack implementations: standard (lock-based) and lock-free. The lock-free implementation is currently limited to x86-64 platforms. +* **Added Lock-Free Stack Mempool Handler.** + + Added a new lock-free stack handler, which uses the newly added stack + library. 
+ Removed Items ------------- diff --git a/drivers/mempool/stack/rte_mempool_stack.c b/drivers/mempool/stack/rte_mempool_stack.c index 25ccdb9af..7e85c8d6b 100644 --- a/drivers/mempool/stack/rte_mempool_stack.c +++ b/drivers/mempool/stack/rte_mempool_stack.c @@ -7,7 +7,7 @@ #include static int -stack_alloc(struct rte_mempool *mp) +__stack_alloc(struct rte_mempool *mp, uint32_t flags) { char name[RTE_STACK_NAMESIZE]; struct rte_stack *s; @@ -20,7 +20,7 @@ stack_alloc(struct rte_mempool *mp) return -rte_errno; } - s = rte_stack_create(name, mp->size, mp->socket_id, 0); + s = rte_stack_create(name, mp->size, mp->socket_id, flags); if (s == NULL) return -rte_errno; @@ -30,6 +30,18 @@ stack_alloc(struct rte_mempool *mp) } static int +stack_alloc(struct rte_mempool *mp) +{ + return __stack_alloc(mp, 0); +} + +static int +lf_stack_alloc(struct rte_mempool *mp) +{ + return __stack_alloc(mp, RTE_STACK_F_LF); +} + +static int stack_enqueue(struct rte_mempool *mp, void * const *obj_table, unsigned int n) { @@ -72,4 +84,14 @@ static struct rte_mempool_ops ops_stack = { .get_count = stack_get_count }; +static struct rte_mempool_ops ops_lf_stack = { + .name = "lf_stack", + .alloc = lf_stack_alloc, + .free = stack_free, + .enqueue = stack_enqueue, + .dequeue = stack_dequeue, + .get_count = stack_get_count +}; + MEMPOOL_REGISTER_OPS(ops_stack); +MEMPOOL_REGISTER_OPS(ops_lf_stack);
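For reference, an application selects the new handler by name before
populating the mempool. A hedged usage sketch follows; the pool name and
sizes are arbitrary example values:

#include <rte_lcore.h>
#include <rte_mempool.h>

static struct rte_mempool *
create_lf_stack_pool(void)
{
	struct rte_mempool *mp;

	/* 4096 objects of 2048B each; no per-lcore cache in this example. */
	mp = rte_mempool_create_empty("example_pool", 4096, 2048, 0, 0,
				      rte_socket_id(), 0);
	if (mp == NULL)
		return NULL;

	/* Select the lock-free stack handler registered by this patch. */
	if (rte_mempool_set_ops_byname(mp, "lf_stack", NULL) < 0) {
		rte_mempool_free(mp);
		return NULL;
	}

	if (rte_mempool_populate_default(mp) < 0) {
		rte_mempool_free(mp);
		return NULL;
	}

	return mp;
}

In practice a nonzero per-lcore cache size means most allocations never reach
the stack at all, which mitigates the average-case cost noted above.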