[v8,7/9] memarea: support backup memory mechanism
Commit Message
This patch adds a memarea backup mechanism, where an allocation request
which cannot be met by the current memarea is deferred to its backup
memarea.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
doc/guides/prog_guide/memarea_lib.rst | 4 ++
lib/memarea/memarea_private.h | 3 ++
lib/memarea/rte_memarea.c | 67 ++++++++++++++++++++++++++-
lib/memarea/rte_memarea.h | 10 ++++
4 files changed, 83 insertions(+), 1 deletion(-)
Comments
2022-10-11 12:17 (UTC+0000), Chengwen Feng:
> This patch adds a memarea backup mechanism, where an allocation request
> which cannot be met by the current memarea is deferred to its backup
> memarea.
This is a controversial feature.
1. It violates memarea property of freeing all allocated objects
at once when the memarea itself is destroyed. Objects allocated
in the backup memarea through the destroyed one will remain.
2. If there was an API to check that the object belongs to a memarea
(the check from rte_memarea_update_refcnt() in this patch),
it would be trivial to implement this feature on top of memarea API.
Nit: "Deferred" is about time -> "forwarded", "delegated", or "handed over".
A general note about this series.
IMO, libraries should have limited scope and allow composition
rather than accumulate features and control their use via options.
The core idea of memarea is an allocator within a memory region,
a fast one and with low overhead, usable to free all objects at once.
This is orthogonal to the question from where the memory comes from.
HEAP and LIBC sources could be built on top of USER source,
which means that the concept of source is less relevant.
Backup mechanism could instead be a way to add memory to the area,
in which case HEAP and LIBC memarea would also be expandable.
Memarea API could be defined as a structure with callbacks,
and different types of memarea could be combined,
for example, interlocked memarea on top of expandable memarea on top of
memarea with a particular memory management algorithm.
I'm not saying we should immediately build all this complexity.
On the contrary, I would merge the basic things first,
then try to _stack_ new features on top,
then look if interfaces emerge that can be used for composition.
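Dmitry's second point — that a backup mechanism becomes trivial once there is an ownership check — can be sketched outside the library. The following is a minimal mock, not the rte_memarea API: `struct area`, `area_alloc`, and `area_contains` are stand-ins invented here to show the composition, with a bump allocator in place of the real algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a memarea: a fixed region with a bump allocator. */
struct area {
    uint8_t *base;
    size_t   size;
    size_t   used;
};

static void *area_alloc(struct area *a, size_t sz)
{
    if (a->used + sz > a->size)
        return NULL;            /* region exhausted */
    void *p = a->base + a->used;
    a->used += sz;
    return p;
}

/* The ownership check: does ptr belong to this area's region? */
static int area_contains(const struct area *a, const void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    return p >= (uintptr_t)a->base && p < (uintptr_t)a->base + a->size;
}

/* Backup as composition on top of the base API: try the primary area,
 * fall back to the backup. On free, the ownership check routes the
 * pointer to the area that actually owns it.
 */
static void *alloc_with_backup(struct area *primary, struct area *backup,
                               size_t sz)
{
    void *p = area_alloc(primary, sz);
    if (p == NULL && backup != NULL)
        p = area_alloc(backup, sz);
    return p;
}
```

With such a wrapper, the backup behavior lives in application code, and the library only needs to expose the containment check.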
On 2022-10-11 17:58, Dmitry Kozlyuk wrote:
> 2022-10-11 12:17 (UTC+0000), Chengwen Feng:
>> This patch adds a memarea backup mechanism, where an allocation request
>> which cannot be met by the current memarea is deferred to its backup
>> memarea.
>
> This is a controversial feature.
>
> 1. It violates memarea property of freeing all allocated objects
> at once when the memarea itself is destroyed. Objects allocated
> in the backup memarea through the destroyed one will remain.
>
> 2. If there was an API to check that the object belongs to a memarea
> (the check from rte_memarea_update_refcnt() in this patch),
> it would be trivial to implement this feature on top of memarea API.
>
> Nit: "Deferred" is about time -> "forwarded", "delegated", or "handled over".
>
> A general note about this series.
> IMO, libraries should have limited scope and allow composition
> rather than accumulate features and control their use via options.
> The core idea of memarea is an allocator within a memory region,
> a fast one and with low overhead, usable to free all objects at once.
>
What's a typical use case for a memory region in a packet processing
context?
The ability to instantiate a variable number of heaps/regions seems
useful, although it's not clear to me if the application should order
that to happen on a per-lcore basis, on a per-NUMA node basis, a
per-<some domain object>-basis, or something else entirely.
It seems to me that DPDK is lacking a variable-size memory allocator
which is efficient and safe to use from lcore threads. My impression is
that glibc malloc() and rte_malloc() are too slow for the packet
processing threads, and involve code paths taking locks shared with
non-EAL threads.
> This is orthogonal to the question from where the memory comes from.
> HEAP and LIBC sources could be built on top of USER source,
> which means that the concept of source is less relevant.
> Backup mechanism could instead be a way to add memory to the area,
> in which case HEAP and LIBC memarea would also be expandable.
> Memarea API could be defined as a structure with callbacks,
> and different types of memarea could be combined,
> for example, interlocked memarea on top of expandable memarea on top of
> memarea with a particular memory management algorithm.
>
> I'm not saying we should immediately build all this complexity.
The part with implementing runtime polymorphism using a struct with
function pointers, instead of the enum+switch-based-type-test approach,
doesn't sound like something that would add complexity. Rather the opposite.
Also, having a clear-cut separation of concern between
the-thing-that-allocates-and-frees-the-region and the region-internal
memory manager (what's called an algorithm in this patchset) also seems
like something that would simplify the code.
> On the contrary, I would merge the basic things first,
> then try to _stack_ new features on top,
> then look if interfaces emerge that can be used for composition.
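The function-pointer approach Mattias favors over the enum+switch type test can be sketched as below. This is a hypothetical shape, not the patchset's API: `struct region_ops` and the `libc_*` names are invented for illustration, with libc heap memory as the one concrete source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical ops table: each memarea "source" supplies callbacks,
 * instead of being selected by an enum inside a switch statement.
 */
struct region_ops {
    void *(*get_region)(size_t size);
    void  (*put_region)(void *region);
};

/* One concrete source: plain libc heap memory. */
static void *libc_get(size_t size)  { return malloc(size); }
static void  libc_put(void *region) { free(region); }

static const struct region_ops libc_ops = {
    .get_region = libc_get,
    .put_region = libc_put,
};

/* The core only sees the ops table; adding a new source (HEAP, USER, ...)
 * means adding a new table, not editing a switch in the core code.
 */
static void *create_region(const struct region_ops *ops, size_t size)
{
    return ops->get_region(size);
}

static void destroy_region(const struct region_ops *ops, void *region)
{
    ops->put_region(region);
}
```

The separation of concerns falls out naturally: the ops table owns region acquisition and release, and the region-internal allocator never needs to know where the memory came from.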
Hi Dmitry,
On 2022/10/11 23:58, Dmitry Kozlyuk wrote:
> 2022-10-11 12:17 (UTC+0000), Chengwen Feng:
>> This patch adds a memarea backup mechanism, where an allocation request
>> which cannot be met by the current memarea is deferred to its backup
>> memarea.
>
> This is a controversial feature.
>
> 1. It violates memarea property of freeing all allocated objects
> at once when the memarea itself is destroyed. Objects allocated
> in the backup memarea through the destroyed one will remain.
>
> 2. If there was an API to check that the object belongs to a memarea
> (the check from rte_memarea_update_refcnt() in this patch),
> it would be trivial to implement this feature on top of memarea API.
This patch adds a 'struct rte_memarea *owner' field to each object's metadata,
which is used to free objects that were allocated from the backup memarea. So this
problem will not exist.
>
> Nit: "Deferred" is about time -> "forwarded", "delegated", or "handled over".
ok
>
> A general note about this series.
> IMO, libraries should have limited scope and allow composition
> rather than accumulate features and control their use via options.
> The core idea of memarea is an allocator within a memory region,
> a fast one and with low overhead, usable to free all objects at once.
>
> This is orthogonal to the question from where the memory comes from.
> HEAP and LIBC sources could be built on top of USER source,
> which means that the concept of source is less relevant.
> Backup mechanism could instead be a way to add memory to the area,
> in which case HEAP and LIBC memarea would also be expandable.
> Memarea API could be defined as a structure with callbacks,
> and different types of memarea could be combined,
> for example, interlocked memarea on top of expandable memarea on top of
> memarea with a particular memory management algorithm.
>
> I'm not saying we should immediately build all this complexity.
> On the contrary, I would merge the basic things first,
> then try to _stack_ new features on top,
> then look if interfaces emerge that can be used for composition.
Agreed, I will drop this feature in the next version.
> .
>
Hi Mattias,
On 2022/10/12 4:26, Mattias Rönnblom wrote:
> On 2022-10-11 17:58, Dmitry Kozlyuk wrote:
>> 2022-10-11 12:17 (UTC+0000), Chengwen Feng:
>>> This patch adds a memarea backup mechanism, where an allocation request
>>> which cannot be met by the current memarea is deferred to its backup
>>> memarea.
>>
>> This is a controversial feature.
>>
>> 1. It violates memarea property of freeing all allocated objects
>> at once when the memarea itself is destroyed. Objects allocated
>> in the backup memarea through the destroyed one will remain.
>>
>> 2. If there was an API to check that the object belongs to a memarea
>> (the check from rte_memarea_update_refcnt() in this patch),
>> it would be trivial to implement this feature on top of memarea API.
>>
>> Nit: "Deferred" is about time -> "forwarded", "delegated", or "handled over".
>>
>> A general note about this series.
>> IMO, libraries should have limited scope and allow composition
>> rather than accumulate features and control their use via options.
>> The core idea of memarea is an allocator within a memory region,
>> a fast one and with low overhead, usable to free all objects at once.
>>
>
> What's a typical use case for a memory region? In a packet processing context.
We used it in a video system:

udp-packets -> splitter -+-> stream-1-reorder -> stream-1-decoder -+
                         |                                         +-> compose-picture -> picture-encoder -> udp-packets
                         +-> stream-2-reorder -> stream-2-decoder -+

Each stream decoder uses a dedicated memarea, which has the following advantages:
1. There is no global lock, unlike rte_malloc.
2. A memory object leak only impacts that decoder; it does not impact the global system.
3. It simplifies programming: the memarea is simply destroyed once the session finishes.

As you can see, this is a different approach to memory optimization: pktmbuf_pool uses
prealloc + per-lcore caches, which reduces lock contention via the lcore cache, but
does not limit the impact of a memory leak.
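The per-decoder pattern described above can be sketched without the library. This mock per-session arena (names invented here, a bump allocator standing in for the real memarea algorithm) shows the property being claimed: no global lock is taken, and destroying the arena reclaims everything the session allocated, even objects the decoder "leaked".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Per-session arena, standing in for one decoder's dedicated memarea. */
struct session_arena {
    unsigned char *base;
    size_t size;
    size_t used;
};

static struct session_arena *session_arena_create(size_t size)
{
    struct session_arena *a = malloc(sizeof(*a));
    if (a == NULL)
        return NULL;
    a->base = malloc(size);
    if (a->base == NULL) {
        free(a);
        return NULL;
    }
    a->size = size;
    a->used = 0;
    return a;
}

/* Lock-free by construction: only the owning decoder touches its arena. */
static void *session_arena_alloc(struct session_arena *a, size_t sz)
{
    if (a->used + sz > a->size)
        return NULL;
    void *p = a->base + a->used;
    a->used += sz;
    return p;
}

/* Called when the stream session ends: everything goes at once,
 * so a leaked object cannot outlive the session.
 */
static void session_arena_destroy(struct session_arena *a)
{
    free(a->base);
    free(a);
}
```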
>
> The ability to instantiate a variable number of heaps/regions seems useful, although it's not clear to me if the application should order that to happen on a per-lcore basis, on a per-NUMA node basis, a per-<some domain object>-basis, or something else entirely.
>
> It seems to me that DPDK is lacking a variable-size memory allocator which is efficient and safe to use from lcore threads. My impression is that glibc malloc() and rte_malloc() are too slow for the packet processing threads, and involves code paths taking locks shared with non-EAL threads.
>
>> This is orthogonal to the question from where the memory comes from.
>> HEAP and LIBC sources could be built on top of USER source,
>> which means that the concept of source is less relevant.
>> Backup mechanism could instead be a way to add memory to the area,
>> in which case HEAP and LIBC memarea would also be expandable.
>> Memarea API could be defined as a structure with callbacks,
>> and different types of memarea could be combined,
>> for example, interlocked memarea on top of expandable memarea on top of
>> memarea with a particular memory management algorithm.
>>
>> I'm not saying we should immediately build all this complexity.
>
> The part with implementing runtime polymorphism using a struct with function pointers, instead of the enum+switch-based-type-test approach, doesn't sound like something that would add complexity. Rather the opposite.
>
> Also, having a clear-cut separation of concern between the-thing-that-allocates-and-frees-the-region and the region-internal memory manager (what's called an algorithm in this patchset) also seems like something that would simplify the code.
>
>> On the contrary, I would merge the basic things first,
>> then try to _stack_ new features on top,
>> then look if interfaces emerge that can be used for composition.
>
> .
@@ -25,6 +25,10 @@ The main features are as follows:
* It supports MT-safe as long as it's specified at creation time.
+* It provides a backup memory mechanism: a memarea can use another memarea
+  as a backup. Allocation requests that the current memarea cannot satisfy
+  are attempted from the backup memarea.
+
Library API Overview
--------------------
@@ -14,6 +14,7 @@
struct memarea_elem {
TAILQ_ENTRY(memarea_elem) elem_node;
TAILQ_ENTRY(memarea_elem) free_node;
+ struct rte_memarea *owner;
size_t size;
uint32_t magic;
uint32_t cookie;
@@ -26,11 +27,13 @@ struct rte_memarea {
struct rte_memarea_param init;
rte_spinlock_t lock;
void *area_addr;
+ void *top_addr;
struct memarea_elem_list elem_list;
struct memarea_elem_list free_list;
uint64_t alloc_fails;
uint64_t refcnt_check_fails;
+ uint64_t bak_alloc_fails;
} __rte_cache_aligned;
#endif /* MEMAREA_PRIVATE_H */
@@ -132,9 +132,11 @@ rte_memarea_create(const struct rte_memarea_param *init)
TAILQ_INIT(&ma->elem_list);
TAILQ_INIT(&ma->free_list);
ma->area_addr = addr;
+ ma->top_addr = RTE_PTR_ADD(addr, init->total_sz - 1);
elem = addr;
TAILQ_INSERT_TAIL(&ma->elem_list, elem, elem_node);
TAILQ_INSERT_TAIL(&ma->free_list, elem, free_node);
+ elem->owner = NULL;
elem->size = init->total_sz - sizeof(struct memarea_elem);
elem->magic = MEMAREA_AVAILABLE_ELEM_MAGIC;
elem->cookie = MEMAREA_AVAILABLE_ELEM_COOKIE;
@@ -154,11 +156,41 @@ memarea_free_area(struct rte_memarea *ma)
rte_memarea_free(ma->init.src_memarea, ma->area_addr);
}
+static inline void memarea_lock(struct rte_memarea *ma);
+static inline void memarea_unlock(struct rte_memarea *ma);
+static inline void memarea_free_elem(struct rte_memarea *ma, struct memarea_elem *elem);
+
+static void
+memarea_free_owner_objs(struct rte_memarea *ma, struct rte_memarea *owner)
+{
+ struct memarea_elem *elem, *tmp_elem;
+
+ memarea_lock(ma);
+ /* TAILQ_FOREACH_SAFE is not defined in sys/queue.h, so open-code it here. */
+ for (elem = TAILQ_FIRST(&ma->elem_list);
+ elem && (tmp_elem = TAILQ_NEXT(elem, elem_node), 1);
+ elem = tmp_elem) {
+ if (elem->owner != owner)
+ continue;
+ elem->refcnt = 0;
+ memarea_free_elem(ma, elem);
+ }
+ if (ma->init.bak_memarea != NULL)
+ memarea_free_owner_objs(ma->init.bak_memarea, owner);
+ memarea_unlock(ma);
+}
+
void
rte_memarea_destroy(struct rte_memarea *ma)
{
if (ma == NULL)
return;
+ if (ma->init.bak_memarea != NULL) {
+ /* Some objects may have been allocated from the backup memarea; these
+ * objects must be freed when this memarea is destroyed.
+ */
+ memarea_free_owner_objs(ma->init.bak_memarea, ma);
+ }
memarea_free_area(ma);
rte_free(ma);
}
@@ -191,6 +223,7 @@ memarea_add_node(struct rte_memarea *ma, struct memarea_elem *elem, size_t need_
struct memarea_elem *new_elem;
new_elem = (struct memarea_elem *)RTE_PTR_ADD(elem, sizeof(struct memarea_elem) +
align_size);
+ new_elem->owner = NULL;
new_elem->size = elem->size - align_size - sizeof(struct memarea_elem);
new_elem->magic = MEMAREA_AVAILABLE_ELEM_MAGIC;
new_elem->cookie = MEMAREA_AVAILABLE_ELEM_COOKIE;
@@ -200,6 +233,23 @@ memarea_add_node(struct rte_memarea *ma, struct memarea_elem *elem, size_t need_
elem->size = align_size;
}
+static inline void
+memarea_mark_owner(struct rte_memarea *ma, void *ptr)
+{
+ struct memarea_elem *elem;
+ elem = (struct memarea_elem *)RTE_PTR_SUB(ptr, sizeof(struct memarea_elem));
+ elem->owner = ma;
+}
+
+static inline void *
+memarea_alloc_backup(struct rte_memarea *ma, size_t size, uint32_t cookie)
+{
+ void *ptr = rte_memarea_alloc(ma->init.bak_memarea, size, cookie);
+ if (unlikely(ptr == NULL))
+ ma->bak_alloc_fails++;
+ return ptr;
+}
+
void *
rte_memarea_alloc(struct rte_memarea *ma, size_t size, uint32_t cookie)
{
@@ -224,7 +274,11 @@ rte_memarea_alloc(struct rte_memarea *ma, size_t size, uint32_t cookie)
ptr = RTE_PTR_ADD(elem, sizeof(struct memarea_elem));
break;
}
- if (unlikely(ptr == NULL))
+ if (unlikely(ptr == NULL && ma->init.bak_memarea != NULL))
+ ptr = memarea_alloc_backup(ma, size, cookie);
+ if (likely(ptr != NULL))
+ memarea_mark_owner(ma, ptr);
+ else
ma->alloc_fails++;
memarea_unlock(ma);
@@ -261,6 +315,7 @@ memarea_free_elem(struct rte_memarea *ma, struct memarea_elem *elem)
{
struct memarea_elem *prev, *next;
bool merged = false;
+ elem->owner = NULL;
prev = TAILQ_PREV(elem, memarea_elem_list, elem_node);
next = TAILQ_NEXT(elem, elem_node);
if (prev != NULL && prev->refcnt == 0) {
@@ -296,6 +351,12 @@ rte_memarea_update_refcnt(struct rte_memarea *ma, void *ptr, int16_t value)
return;
}
+ if (unlikely(ptr < ma->area_addr || ptr > ma->top_addr)) {
+ rte_memarea_update_refcnt(ma->init.bak_memarea, ptr, value);
+ memarea_unlock(ma);
+ return;
+ }
+
if (unlikely(elem->refcnt <= 0 || elem->refcnt + value < 0)) {
RTE_LOG(ERR, MEMAREA,
"memarea: %s cookie: 0x%x curr refcnt: %d update refcnt: %d check fail!\n",
@@ -399,10 +460,14 @@ rte_memarea_dump(struct rte_memarea *ma, FILE *f, bool dump_all)
fprintf(f, " algorithm: %s\n", memarea_alg_name(ma->init.alg));
fprintf(f, " total-size: 0x%zx\n", ma->init.total_sz);
fprintf(f, " mt-safe: %s\n", ma->init.mt_safe ? "yes" : "no");
+ if (ma->init.bak_memarea)
+ fprintf(f, " backup-memarea-name: %s\n", ma->init.bak_memarea->init.name);
fprintf(f, " total-regions: %u\n", memarea_elem_list_num(ma));
fprintf(f, " total-free-regions: %u\n", memarea_free_list_num(ma));
fprintf(f, " alloc_fails: %" PRIu64 "\n", ma->alloc_fails);
fprintf(f, " refcnt_check_fails: %" PRIu64 "\n", ma->refcnt_check_fails);
+ if (ma->init.bak_memarea)
+ fprintf(f, " backup_alloc_fails: %" PRIu64 "\n", ma->bak_alloc_fails);
if (dump_all)
memarea_dump_all(ma, f);
memarea_unlock(ma);
@@ -28,6 +28,12 @@
* specified, all the functions of the memarea API are lock-free, and assume
* to not be invoked in parallel on different logical cores to work on the
* same memarea.
+ *
* - It provides a backup memory mechanism: a memarea can use another memarea
* as a backup. Allocation requests that the current memarea cannot satisfy
* are attempted from the backup memarea.
* @note If the backup of memarea-A is memarea-B, then memarea-A must be
* destroyed before memarea-B.
*/
#include <stdbool.h>
@@ -96,6 +102,10 @@ struct rte_memarea_param {
*/
struct rte_memarea *src_memarea;
};
+ /** Backup memarea, from which objects are allocated when the current
+ * memarea fails to satisfy an allocation request.
+ */
+ struct rte_memarea *bak_memarea;
};
/**