[v8,7/9] memarea: support backup memory mechanism

Message ID 20221011121720.2657-8-fengchengwen@huawei.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series: introduce memarea library

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

fengchengwen Oct. 11, 2022, 12:17 p.m. UTC
  This patch adds a memarea backup mechanism, where an allocation request
which cannot be met by the current memarea is deferred to its backup
memarea.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/prog_guide/memarea_lib.rst |  4 ++
 lib/memarea/memarea_private.h         |  3 ++
 lib/memarea/rte_memarea.c             | 67 ++++++++++++++++++++++++++-
 lib/memarea/rte_memarea.h             | 10 ++++
 4 files changed, 83 insertions(+), 1 deletion(-)
  

Comments

Dmitry Kozlyuk Oct. 11, 2022, 3:58 p.m. UTC | #1
2022-10-11 12:17 (UTC+0000), Chengwen Feng:
> This patch adds a memarea backup mechanism, where an allocation request
> which cannot be met by the current memarea is deferred to its backup
> memarea.

This is a controversial feature.

1. It violates memarea property of freeing all allocated objects
   at once when the memarea itself is destroyed. Objects allocated
   in the backup memarea through the destroyed one will remain.

2. If there was an API to check that the object belongs to a memarea
   (the check from rte_memarea_update_refcnt() in this patch),
   it would be trivial to implement this feature on top of memarea API.

Nit: "Deferred" is about time -> "forwarded", "delegated", or "handed over".

A general note about this series.
IMO, libraries should have limited scope and allow composition
rather than accumulate features and control their use via options.
The core idea of memarea is an allocator within a memory region,
a fast one and with low overhead, usable to free all objects at once.

This is orthogonal to the question from where the memory comes from.
HEAP and LIBC sources could be built on top of USER source,
which means that the concept of source is less relevant.
Backup mechanism could instead be a way to add memory to the area,
in which case HEAP and LIBC memarea would also be expandable.
Memarea API could be defined as a structure with callbacks,
and different types of memarea could be combined,
for example, interlocked memarea on top of expandable memarea on top of
memarea with a particular memory management algorithm.

I'm not saying we should immediately build all this complexity.
On the contrary, I would merge the basic things first,
then try to _stack_ new features on top,
then look if interfaces emerge that can be used for composition.
  
Mattias Rönnblom Oct. 11, 2022, 8:26 p.m. UTC | #2
On 2022-10-11 17:58, Dmitry Kozlyuk wrote:
> 2022-10-11 12:17 (UTC+0000), Chengwen Feng:
>> This patch adds a memarea backup mechanism, where an allocation request
>> which cannot be met by the current memarea is deferred to its backup
>> memarea.
> 
> This is a controversial feature.
> 
> 1. It violates memarea property of freeing all allocated objects
>     at once when the memarea itself is destroyed. Objects allocated
>     in the backup memarea through the destroyed one will remain.
> 
> 2. If there was an API to check that the object belongs to a memarea
>     (the check from rte_memarea_update_refcnt() in this patch),
>     it would be trivial to implement this feature on top of memarea API.
> 
> Nit: "Deferred" is about time -> "forwarded", "delegated", or "handed over".
> 
> A general note about this series.
> IMO, libraries should have limited scope and allow composition
> rather than accumulate features and control their use via options.
> The core idea of memarea is an allocator within a memory region,
> a fast one and with low overhead, usable to free all objects at once.
> 

What's a typical use case for a memory region? In a packet processing 
context.

The ability to instantiate a variable number of heaps/regions seems 
useful, although it's not clear to me if the application should order 
that to happen on a per-lcore basis, on a per-NUMA node basis, a 
per-<some domain object>-basis, or something else entirely.

It seems to me that DPDK is lacking a variable-size memory allocator 
which is efficient and safe to use from lcore threads. My impression is 
that glibc malloc() and rte_malloc() are too slow for the packet 
processing threads, and involves code paths taking locks shared with 
non-EAL threads.

> This is orthogonal to the question from where the memory comes from.
> HEAP and LIBC sources could be built on top of USER source,
> which means that the concept of source is less relevant.
> Backup mechanism could instead be a way to add memory to the area,
> in which case HEAP and LIBC memarea would also be expandable.
> Memarea API could be defined as a structure with callbacks,
> and different types of memarea could be combined,
> for example, interlocked memarea on top of expandable memarea on top of
> memarea with a particular memory management algorithm.
> 
> I'm not saying we should immediately build all this complexity.

The part with implementing runtime polymorphism using a struct with 
function pointers, instead of the enum+switch-based-type-test approach, 
doesn't sound like something that would add complexity. Rather the opposite.

Also, having a clear-cut separation of concern between 
the-thing-that-allocates-and-frees-the-region and the region-internal 
memory manager (what's called an algorithm in this patchset) also seems 
like something that would simplify the code.

> On the contrary, I would merge the basic things first,
> then try to _stack_ new features on top,
> then look if interfaces emerge that can be used for composition.
  
fengchengwen Oct. 12, 2022, 7:57 a.m. UTC | #3
Hi Dmitry,

On 2022/10/11 23:58, Dmitry Kozlyuk wrote:
> 2022-10-11 12:17 (UTC+0000), Chengwen Feng:
>> This patch adds a memarea backup mechanism, where an allocation request
>> which cannot be met by the current memarea is deferred to its backup
>> memarea.
> 
> This is a controversial feature.
> 
> 1. It violates memarea property of freeing all allocated objects
>    at once when the memarea itself is destroyed. Objects allocated
>    in the backup memarea through the destroyed one will remain.
> 
> 2. If there was an API to check that the object belongs to a memarea
>    (the check from rte_memarea_update_refcnt() in this patch),
>    it would be trivial to implement this feature on top of memarea API.

This patch adds a 'struct memarea *owner' to each object's metadata, which is
used to free objects that were allocated from the backup memarea. So this
problem does not exist.

> 
> Nit: "Deferred" is about time -> "forwarded", "delegated", or "handed over".

ok

> 
> A general note about this series.
> IMO, libraries should have limited scope and allow composition
> rather than accumulate features and control their use via options.
> The core idea of memarea is an allocator within a memory region,
> a fast one and with low overhead, usable to free all objects at once.
> 
> This is orthogonal to the question from where the memory comes from.
> HEAP and LIBC sources could be built on top of USER source,
> which means that the concept of source is less relevant.
> Backup mechanism could instead be a way to add memory to the area,
> in which case HEAP and LIBC memarea would also be expandable.
> Memarea API could be defined as a structure with callbacks,
> and different types of memarea could be combined,
> for example, interlocked memarea on top of expandable memarea on top of
> memarea with a particular memory management algorithm.
> 
> I'm not saying we should immediately build all this complexity.
> On the contrary, I would merge the basic things first,
> then try to _stack_ new features on top,
> then look if interfaces emerge that can be used for composition.

Agree, I will drop this feature in next version.

> .
>
  
fengchengwen Oct. 12, 2022, 8:23 a.m. UTC | #4
Hi Mattias,

On 2022/10/12 4:26, Mattias Rönnblom wrote:
> On 2022-10-11 17:58, Dmitry Kozlyuk wrote:
>> 2022-10-11 12:17 (UTC+0000), Chengwen Feng:
>>> This patch adds a memarea backup mechanism, where an allocation request
>>> which cannot be met by the current memarea is deferred to its backup
>>> memarea.
>>
>> This is a controversial feature.
>>
>> 1. It violates memarea property of freeing all allocated objects
>>     at once when the memarea itself is destroyed. Objects allocated
>>     in the backup memarea through the destroyed one will remain.
>>
>> 2. If there was an API to check that the object belongs to a memarea
>>     (the check from rte_memarea_update_refcnt() in this patch),
>>     it would be trivial to implement this feature on top of memarea API.
>>
>> Nit: "Deferred" is about time -> "forwarded", "delegated", or "handed over".
>>
>> A general note about this series.
>> IMO, libraries should have limited scope and allow composition
>> rather than accumulate features and control their use via options.
>> The core idea of memarea is an allocator within a memory region,
>> a fast one and with low overhead, usable to free all objects at once.
>>
> 
> What's a typical use case for a memory region? In a packet processing context.

We used it in a video system:

   udp-packets  ->  splitter ->  stream-1-reorder  -> stream-1-decoder --
                             |                                          |
                             |                                          |--> compose-picture  -> picture-encoder -> udp-packets
                             |                                          |
                             ->  stream-2-reorder  -> stream-2-decoder --

Each stream decoder uses a dedicated memarea, which has the following advantages:
1. There is no global lock, unlike rte_malloc.
2. A memory object leak only impacts that decoder; it does not impact the
   global system.
3. It simplifies programming: just destroy the memarea after the session
   finishes.

As you can see, this is a different approach to memory optimization: pktmbuf_pool
uses preallocation plus a per-lcore cache, which reduces lock conflicts but does
not contain the impact of a memory leak.

> 
> The ability to instantiate a variable number of heaps/regions seems useful, although it's not clear to me if the application should order that to happen on a per-lcore basis, on a per-NUMA node basis, a per-<some domain object>-basis, or something else entirely.
> 
> It seems to me that DPDK is lacking a variable-size memory allocator which is efficient and safe to use from lcore threads. My impression is that glibc malloc() and rte_malloc() are too slow for the packet processing threads, and involves code paths taking locks shared with non-EAL threads.
> 
>> This is orthogonal to the question from where the memory comes from.
>> HEAP and LIBC sources could be built on top of USER source,
>> which means that the concept of source is less relevant.
>> Backup mechanism could instead be a way to add memory to the area,
>> in which case HEAP and LIBC memarea would also be expandable.
>> Memarea API could be defined as a structure with callbacks,
>> and different types of memarea could be combined,
>> for example, interlocked memarea on top of expandable memarea on top of
>> memarea with a particular memory management algorithm.
>>
>> I'm not saying we should immediately build all this complexity.
> 
> The part with implementing runtime polymorphism using a struct with function pointers, instead of the enum+switch-based-type-test approach, doesn't sound like something that would add complexity. Rather the opposite.
> 
> Also, having a clear-cut separation of concern between the-thing-that-allocates-and-frees-the-region and the region-internal memory manager (what's called an algorithm in this patchset) also seems like something that would simplify the code.
> 
>> On the contrary, I would merge the basic things first,
>> then try to _stack_ new features on top,
>> then look if interfaces emerge that can be used for composition.
> 
> .
  

Patch

diff --git a/doc/guides/prog_guide/memarea_lib.rst b/doc/guides/prog_guide/memarea_lib.rst
index 8b616b57e6..6b180bb0e4 100644
--- a/doc/guides/prog_guide/memarea_lib.rst
+++ b/doc/guides/prog_guide/memarea_lib.rst
@@ -25,6 +25,10 @@  The main features are as follows:
 
 * It supports MT-safe as long as it's specified at creation time.
 
+* It provides a backup memory mechanism: the memarea can use another memarea
+  as a backup. It will attempt to allocate an object from the backup memarea
+  when the current memarea fails to allocate.
+
 Library API Overview
 --------------------
 
diff --git a/lib/memarea/memarea_private.h b/lib/memarea/memarea_private.h
index f5accf2987..062e967250 100644
--- a/lib/memarea/memarea_private.h
+++ b/lib/memarea/memarea_private.h
@@ -14,6 +14,7 @@ 
 struct memarea_elem {
 	TAILQ_ENTRY(memarea_elem) elem_node;
 	TAILQ_ENTRY(memarea_elem) free_node;
+	struct rte_memarea       *owner;
 	size_t   size;
 	uint32_t magic;
 	uint32_t cookie;
@@ -26,11 +27,13 @@  struct rte_memarea {
 	struct rte_memarea_param init;
 	rte_spinlock_t           lock;
 	void                    *area_addr;
+	void                    *top_addr;
 	struct memarea_elem_list elem_list;
 	struct memarea_elem_list free_list;
 
 	uint64_t alloc_fails;
 	uint64_t refcnt_check_fails;
+	uint64_t bak_alloc_fails;
 } __rte_cache_aligned;
 
 #endif /* MEMAREA_PRIVATE_H */
diff --git a/lib/memarea/rte_memarea.c b/lib/memarea/rte_memarea.c
index 163ebfa792..3de4297cf7 100644
--- a/lib/memarea/rte_memarea.c
+++ b/lib/memarea/rte_memarea.c
@@ -132,9 +132,11 @@  rte_memarea_create(const struct rte_memarea_param *init)
 	TAILQ_INIT(&ma->elem_list);
 	TAILQ_INIT(&ma->free_list);
 	ma->area_addr = addr;
+	ma->top_addr = RTE_PTR_ADD(addr, init->total_sz - 1);
 	elem = addr;
 	TAILQ_INSERT_TAIL(&ma->elem_list, elem, elem_node);
 	TAILQ_INSERT_TAIL(&ma->free_list, elem, free_node);
+	elem->owner = NULL;
 	elem->size = init->total_sz - sizeof(struct memarea_elem);
 	elem->magic = MEMAREA_AVAILABLE_ELEM_MAGIC;
 	elem->cookie = MEMAREA_AVAILABLE_ELEM_COOKIE;
@@ -154,11 +156,41 @@  memarea_free_area(struct rte_memarea *ma)
 		rte_memarea_free(ma->init.src_memarea, ma->area_addr);
 }
 
+static inline void memarea_lock(struct rte_memarea *ma);
+static inline void memarea_unlock(struct rte_memarea *ma);
+static inline void memarea_free_elem(struct rte_memarea *ma, struct memarea_elem *elem);
+
+static void
+memarea_free_owner_objs(struct rte_memarea *ma, struct rte_memarea *owner)
+{
+	struct memarea_elem *elem, *tmp_elem;
+
+	memarea_lock(ma);
+	/* The TAILQ_FOREACH_SAFE is undefined in sys/queue.h, so extend it here. */
+	for (elem = TAILQ_FIRST(&ma->elem_list);
+	     elem && (tmp_elem = TAILQ_NEXT(elem, elem_node), 1);
+	     elem = tmp_elem) {
+		if (elem->owner != owner)
+			continue;
+		elem->refcnt = 0;
+		memarea_free_elem(ma, elem);
+	}
+	if (ma->init.bak_memarea != NULL)
+		memarea_free_owner_objs(ma->init.bak_memarea, owner);
+	memarea_unlock(ma);
+}
+
 void
 rte_memarea_destroy(struct rte_memarea *ma)
 {
 	if (ma == NULL)
 		return;
+	if (ma->init.bak_memarea != NULL) {
+		/* Some objects are allocated from backup memarea, these objects need to be
+		 * freed when the memarea is destroyed.
+		 */
+		memarea_free_owner_objs(ma->init.bak_memarea, ma);
+	}
 	memarea_free_area(ma);
 	rte_free(ma);
 }
@@ -191,6 +223,7 @@  memarea_add_node(struct rte_memarea *ma, struct memarea_elem *elem, size_t need_
 	struct memarea_elem *new_elem;
 	new_elem = (struct memarea_elem *)RTE_PTR_ADD(elem, sizeof(struct memarea_elem) +
 							    align_size);
+	new_elem->owner = NULL;
 	new_elem->size = elem->size - align_size - sizeof(struct memarea_elem);
 	new_elem->magic = MEMAREA_AVAILABLE_ELEM_MAGIC;
 	new_elem->cookie = MEMAREA_AVAILABLE_ELEM_COOKIE;
@@ -200,6 +233,23 @@  memarea_add_node(struct rte_memarea *ma, struct memarea_elem *elem, size_t need_
 	elem->size = align_size;
 }
 
+static inline void
+memarea_mark_owner(struct rte_memarea *ma, void *ptr)
+{
+	struct memarea_elem *elem;
+	elem = (struct memarea_elem *)RTE_PTR_SUB(ptr, sizeof(struct memarea_elem));
+	elem->owner = ma;
+}
+
+static inline void *
+memarea_alloc_backup(struct rte_memarea *ma, size_t size, uint32_t cookie)
+{
+	void *ptr = rte_memarea_alloc(ma->init.bak_memarea, size, cookie);
+	if (unlikely(ptr == NULL))
+		ma->bak_alloc_fails++;
+	return ptr;
+}
+
 void *
 rte_memarea_alloc(struct rte_memarea *ma, size_t size, uint32_t cookie)
 {
@@ -224,7 +274,11 @@  rte_memarea_alloc(struct rte_memarea *ma, size_t size, uint32_t cookie)
 		ptr = RTE_PTR_ADD(elem, sizeof(struct memarea_elem));
 		break;
 	}
-	if (unlikely(ptr == NULL))
+	if (unlikely(ptr == NULL && ma->init.bak_memarea != NULL))
+		ptr = memarea_alloc_backup(ma, size, cookie);
+	if (likely(ptr != NULL))
+		memarea_mark_owner(ma, ptr);
+	else
 		ma->alloc_fails++;
 	memarea_unlock(ma);
 
@@ -261,6 +315,7 @@  memarea_free_elem(struct rte_memarea *ma, struct memarea_elem *elem)
 {
 	struct memarea_elem *prev, *next;
 	bool merged = false;
+	elem->owner = NULL;
 	prev = TAILQ_PREV(elem, memarea_elem_list, elem_node);
 	next = TAILQ_NEXT(elem, elem_node);
 	if (prev != NULL && prev->refcnt == 0) {
@@ -296,6 +351,12 @@  rte_memarea_update_refcnt(struct rte_memarea *ma, void *ptr, int16_t value)
 		return;
 	}
 
+	if (unlikely(ptr < ma->area_addr || ptr > ma->top_addr)) {
+		rte_memarea_update_refcnt(ma->init.bak_memarea, ptr, value);
+		memarea_unlock(ma);
+		return;
+	}
+
 	if (unlikely(elem->refcnt <= 0 || elem->refcnt + value < 0)) {
 		RTE_LOG(ERR, MEMAREA,
 			"memarea: %s cookie: 0x%x curr refcnt: %d update refcnt: %d check fail!\n",
@@ -399,10 +460,14 @@  rte_memarea_dump(struct rte_memarea *ma, FILE *f, bool dump_all)
 	fprintf(f, "  algorithm: %s\n", memarea_alg_name(ma->init.alg));
 	fprintf(f, "  total-size: 0x%zx\n", ma->init.total_sz);
 	fprintf(f, "  mt-safe: %s\n", ma->init.mt_safe ? "yes" : "no");
+	if (ma->init.bak_memarea)
+		fprintf(f, "  backup-memarea-name: %s\n", ma->init.bak_memarea->init.name);
 	fprintf(f, "  total-regions: %u\n", memarea_elem_list_num(ma));
 	fprintf(f, "  total-free-regions: %u\n", memarea_free_list_num(ma));
 	fprintf(f, "  alloc_fails: %" PRIu64 "\n", ma->alloc_fails);
 	fprintf(f, "  refcnt_check_fails: %" PRIu64 "\n", ma->refcnt_check_fails);
+	if (ma->init.bak_memarea)
+		fprintf(f, "  backup_alloc_fails: %" PRIu64 "\n", ma->bak_alloc_fails);
 	if (dump_all)
 		memarea_dump_all(ma, f);
 	memarea_unlock(ma);
diff --git a/lib/memarea/rte_memarea.h b/lib/memarea/rte_memarea.h
index 5e3f23700b..abfff3d996 100644
--- a/lib/memarea/rte_memarea.h
+++ b/lib/memarea/rte_memarea.h
@@ -28,6 +28,12 @@ 
  *   specified, all the functions of the memarea API are lock-free, and assume
  *   to not be invoked in parallel on different logical cores to work on the
  *   same memarea.
+ *
+ * - It provides a backup memory mechanism: the memarea can use another
+ *   memarea as a backup. It will attempt to allocate an object from the
+ *   backup memarea when the current memarea fails to allocate.
+ *   @note If the backup of memarea-A is memarea-B, then memarea-A must be
+ *   destroyed before memarea-B.
  */
 
 #include <stdbool.h>
@@ -96,6 +102,10 @@  struct rte_memarea_param {
 		 */
 		struct rte_memarea *src_memarea;
 	};
+	/** Backup memarea, from which objects are allocated when allocation
+	 * from the current memarea fails.
+	 */
+	struct rte_memarea *bak_memarea;
 };
 
 /**