[v3,2/2] mempool: use shared memzone for rte_mempool_ops
Commit Message
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
The order of mempool initialization affects a mempool's index in the
rte_mempool_ops_table. For example, when building apps with:
$ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
the "bucket" mempool ops will be registered first, and their index
in the table is 0, while the index of the "ring" ops is 1. DPDK uses
mk/rte.app.mk to build apps, while others, for example Open vSwitch,
link against libdpdk.a or libdpdk.so. As a result, the mempool library
link order differs between DPDK and Open vSwitch.
A mempool can be shared between primary and secondary processes,
such as dpdk-pdump and pdump-pmd/Open vSwitch (pdump enabled).
A crash occurs because dpdk-pdump creates the "ring_mp_mc" ring,
whose index in its table is 0, while index 0 in Open vSwitch belongs
to the "bucket" ops. If Open vSwitch uses index 0 to get the mempool
ops and allocate memory from the mempool, the crash occurs:
bucket_dequeue (access null and crash)
rte_mempool_get_ops (should get "ring_mp_mc",
but get "bucket" mempool)
rte_mempool_ops_dequeue_bulk
...
rte_pktmbuf_alloc
rte_pktmbuf_copy
pdump_copy
pdump_rx
rte_eth_rx_burst
To avoid the crash, there are several possible solutions:
* constructor priority: give each mempool driver a different
priority in RTE_INIT, but that is not easy to maintain.
* change mk/rte.app.mk: make the order in mk/rte.app.mk the same as in
libdpdk.a/libdpdk.so, but whenever a new mempool driver is added in
the future, we must make sure the order stays consistent.
* register mempools in sorted order: sort the ops when registering, so
the link order will not affect the index in the mempool table; but the
number of mempool libraries linked into each process may still differ.
* shared memzone: the primary process allocates a struct in shared
memory (a memzone). When we register a mempool ops, we first get a
name and id from the shared struct: with the lock held, look up the
registered name and return its index; otherwise take the next free
id and copy the name into the struct.
Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
v2:
* fix checkpatch warning
---
lib/librte_mempool/rte_mempool.h | 28 +++++++++++-
lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
2 files changed, 96 insertions(+), 21 deletions(-)
Comments
13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
We are supposed to use pkg-config to link DPDK.
Does the problem appear between a DPDK compiled with meson
and an application linked with pkg-config information?
If the problem really needs to be solved,
the EAL patch (first of this series) needs to be discussed
and reviewed carefully. I don't imagine it being done in 20.05.
On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
>
> The mempool can be used between primary and secondary process,
> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> ring which index in table is 0, but the index of "bucket" ring
> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> mempool ops and malloc memory from mempool. The crash will occur:
>
> bucket_dequeue (access null and crash)
> rte_mempool_get_ops (should get "ring_mp_mc",
> but get "bucket" mempool)
> rte_mempool_ops_dequeue_bulk
> ...
> rte_pktmbuf_alloc
> rte_pktmbuf_copy
> pdump_copy
> pdump_rx
> rte_eth_rx_burst
>
> To avoid the crash, there are some solution:
> * constructor priority: Different mempool uses different
> priority in RTE_INIT, but it's not easy to maintain.
>
> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> driver in future, we must make sure the order.
>
> * register mempool orderly: Sort the mempool when registering,
> so the lib linked will not affect the index in mempool table.
> but the number of mempool libraries may be different.
>
> * shared memzone: The primary process allocates a struct in
> shared memory named memzone, When we register a mempool ops,
> we first get a name and id from the shared struct: with the lock held,
> lookup for the registered name and return its index, else
> get the last id and copy the name in the struct.
>
> Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
>
> Suggested-by: Olivier Matz <olivier.matz@6wind.com>
> Suggested-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
> v2:
> * fix checkpatch warning
> ---
> lib/librte_mempool/rte_mempool.h | 28 +++++++++++-
> lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
> 2 files changed, 96 insertions(+), 21 deletions(-)
>
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index c90cf31467b2..2709b9e1d51b 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -50,6 +50,7 @@
> #include <rte_ring.h>
> #include <rte_memcpy.h>
> #include <rte_common.h>
> +#include <rte_init.h>
>
> #ifdef __cplusplus
> extern "C" {
> @@ -678,7 +679,6 @@ struct rte_mempool_ops {
> */
> struct rte_mempool_ops_table {
> rte_spinlock_t sl; /**< Spinlock for add/delete. */
> - uint32_t num_ops; /**< Number of used ops structs in the table. */
> /**
> * Storage for all possible ops structs.
> */
> @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> */
> int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
>
> +struct rte_mempool_shared_ops {
> + size_t num_mempool_ops;
Is there any specific reason to change type from uint32_t used
above to size_t? I think that uint32_t is better here since
it is just a number, not a size of memory or related value.
> + struct {
> + char name[RTE_MEMPOOL_OPS_NAMESIZE];
> + } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> +
> + rte_spinlock_t mempool;
> +};
> +
> +static inline int
> +mempool_ops_register_cb(const void *arg)
> +{
> + const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> +
> + return rte_mempool_register_ops(h);
> +}
> +
> +static inline void
> +mempool_ops_register(const struct rte_mempool_ops *ops)
> +{
> + rte_init_register(mempool_ops_register_cb, (const void *)ops,
> + RTE_INIT_PRE);
> +}
> +
> /**
> * Macro to statically register the ops of a mempool handler.
> * Note that the rte_mempool_register_ops fails silently here when
> @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> #define MEMPOOL_REGISTER_OPS(ops) \
> RTE_INIT(mp_hdlr_init_##ops) \
> { \
> - rte_mempool_register_ops(&ops); \
> + mempool_ops_register(&ops); \
> }
>
> /**
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251eb068..b10fda662db6 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -14,43 +14,95 @@
> /* indirect jump table to support external memory pools. */
> struct rte_mempool_ops_table rte_mempool_ops_table = {
> .sl = RTE_SPINLOCK_INITIALIZER,
> - .num_ops = 0
> };
>
> -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> -int
> -rte_mempool_register_ops(const struct rte_mempool_ops *h)
> +static int
> +rte_mempool_register_shared_ops(const char *name)
> {
> - struct rte_mempool_ops *ops;
> - int16_t ops_index;
> + static bool mempool_shared_ops_inited;
> + struct rte_mempool_shared_ops *shared_ops;
> + const struct rte_memzone *mz;
> + uint32_t ops_index = 0;
> +
I think we should sanity check 'name' here to be not
empty string (see review notes below).
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> + !mempool_shared_ops_inited) {
> +
> + mz = rte_memzone_reserve("mempool_ops_shared",
> + sizeof(*shared_ops), 0, 0);
> + if (mz == NULL)
> + return -ENOMEM;
> +
> + shared_ops = mz->addr;
> + shared_ops->num_mempool_ops = 0;
> + rte_spinlock_init(&shared_ops->mempool);
> +
> + mempool_shared_ops_inited = true;
> + } else {
> + mz = rte_memzone_lookup("mempool_ops_shared");
> + if (mz == NULL)
> + return -ENOMEM;
> +
> + shared_ops = mz->addr;
> + }
>
> - rte_spinlock_lock(&rte_mempool_ops_table.sl);
> + rte_spinlock_lock(&shared_ops->mempool);
>
> - if (rte_mempool_ops_table.num_ops >=
> - RTE_MEMPOOL_MAX_OPS_IDX) {
> - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> + if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
> + rte_spinlock_unlock(&shared_ops->mempool);
> RTE_LOG(ERR, MEMPOOL,
> "Maximum number of mempool ops structs exceeded\n");
> return -ENOSPC;
> }
>
> + while (shared_ops->mempool_ops[ops_index].name[0]) {
Please, compare with '\0' as DPDK style guide says.
> + if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
> + rte_spinlock_unlock(&shared_ops->mempool);
> + return ops_index;
> + }
> +
> + ops_index++;
> + }
> +
> + strlcpy(shared_ops->mempool_ops[ops_index].name, name,
> + sizeof(shared_ops->mempool_ops[0].name));
> +
> + shared_ops->num_mempool_ops++;
> +
> + rte_spinlock_unlock(&shared_ops->mempool);
> + return ops_index;
> +}
> +
> +/* add a new ops struct in rte_mempool_ops_table, return its index. */
> +int
> +rte_mempool_register_ops(const struct rte_mempool_ops *h)
> +{
> + struct rte_mempool_ops *ops;
> + int16_t ops_index;
> +
> if (h->alloc == NULL || h->enqueue == NULL ||
> - h->dequeue == NULL || h->get_count == NULL) {
> - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> + h->dequeue == NULL || h->get_count == NULL) {
Changing formatting just makes review a bit harder.
> RTE_LOG(ERR, MEMPOOL,
> "Missing callback while registering mempool ops\n");
> + rte_errno = EINVAL;
Why is it done in the patch? For me it looks like logically
different change if it is really required (properly motivated).
> return -EINVAL;
> }
>
> if (strlen(h->name) >= sizeof(ops->name) - 1) {
> - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> - RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
> - __func__, h->name);
> + RTE_LOG(ERR, MEMPOOL,
> + "The registering mempool_ops <%s>: name too long\n",
> + h->name);
Why do you change from DEBUG to ERR here? It is not
directly related to the purpose of the patch.
> rte_errno = EEXIST;
> return -EEXIST;
> }
>
> - ops_index = rte_mempool_ops_table.num_ops++;
> + ops_index = rte_mempool_register_shared_ops(h->name);
> + if (ops_index < 0) {
> + rte_errno = -ops_index;
> + return ops_index;
> + }
> +
> + rte_spinlock_lock(&rte_mempool_ops_table.sl);
> +
> ops = &rte_mempool_ops_table.ops[ops_index];
> strlcpy(ops->name, h->name, sizeof(ops->name));
> ops->alloc = h->alloc;
> @@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> if (mp->flags & MEMPOOL_F_POOL_CREATED)
> return -EEXIST;
>
> - for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> - if (!strcmp(name,
> - rte_mempool_ops_table.ops[i].name)) {
> + for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> + if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
Since rte_mempool_ops_table is filled in which zeros,
name string is empty by default. So, request with empty name
will match the first unused entry. I guess it is not what we
want here. I think we should handle empty string before the
loop and return -EINVAL.
> ops = &rte_mempool_ops_table.ops[i];
> break;
> }
>
On Thu, Apr 23, 2020 at 9:38 PM Andrew Rybchenko
<arybchenko@solarflare.com> wrote:
>
> On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > The order of mempool initiation affects mempool index in the
> > rte_mempool_ops_table. For example, when building APPs with:
> >
> > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >
> > The "bucket" mempool will be registered firstly, and its index
> > in table is 0 while the index of "ring" mempool is 1. DPDK
> > uses the mk/rte.app.mk to build APPs, and others, for example,
> > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > The mempool lib linked in dpdk and Open vSwitch is different.
> >
> > The mempool can be used between primary and secondary process,
> > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > ring which index in table is 0, but the index of "bucket" ring
> > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > mempool ops and malloc memory from mempool. The crash will occur:
> >
> > bucket_dequeue (access null and crash)
> > rte_mempool_get_ops (should get "ring_mp_mc",
> > but get "bucket" mempool)
> > rte_mempool_ops_dequeue_bulk
> > ...
> > rte_pktmbuf_alloc
> > rte_pktmbuf_copy
> > pdump_copy
> > pdump_rx
> > rte_eth_rx_burst
> >
> > To avoid the crash, there are some solution:
> > * constructor priority: Different mempool uses different
> > priority in RTE_INIT, but it's not easy to maintain.
> >
> > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > driver in future, we must make sure the order.
> >
> > * register mempool orderly: Sort the mempool when registering,
> > so the lib linked will not affect the index in mempool table.
> > but the number of mempool libraries may be different.
> >
> > * shared memzone: The primary process allocates a struct in
> > shared memory named memzone, When we register a mempool ops,
> > we first get a name and id from the shared struct: with the lock held,
> > lookup for the registered name and return its index, else
> > get the last id and copy the name in the struct.
> >
> > Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
> >
> > Suggested-by: Olivier Matz <olivier.matz@6wind.com>
> > Suggested-by: Jerin Jacob <jerinj@marvell.com>
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > ---
> > v2:
> > * fix checkpatch warning
> > ---
> > lib/librte_mempool/rte_mempool.h | 28 +++++++++++-
> > lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
> > 2 files changed, 96 insertions(+), 21 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > index c90cf31467b2..2709b9e1d51b 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -50,6 +50,7 @@
> > #include <rte_ring.h>
> > #include <rte_memcpy.h>
> > #include <rte_common.h>
> > +#include <rte_init.h>
> >
> > #ifdef __cplusplus
> > extern "C" {
> > @@ -678,7 +679,6 @@ struct rte_mempool_ops {
> > */
> > struct rte_mempool_ops_table {
> > rte_spinlock_t sl; /**< Spinlock for add/delete. */
> > - uint32_t num_ops; /**< Number of used ops structs in the table. */
> > /**
> > * Storage for all possible ops structs.
> > */
> > @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > */
> > int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
> >
> > +struct rte_mempool_shared_ops {
> > + size_t num_mempool_ops;
>
> Is there any specific reason to change type from uint32_t used
> above to size_t? I think that uint32_t is better here since
> it is just a number, not a size of memory or related value.
Thanks for your review. I was busy with other patches, so this reply
was delayed. I will change that in the next version.
> > + struct {
> > + char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > + } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > +
> > + rte_spinlock_t mempool;
> > +};
> > +
> > +static inline int
> > +mempool_ops_register_cb(const void *arg)
> > +{
> > + const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> > +
> > + return rte_mempool_register_ops(h);
> > +}
> > +
> > +static inline void
> > +mempool_ops_register(const struct rte_mempool_ops *ops)
> > +{
> > + rte_init_register(mempool_ops_register_cb, (const void *)ops,
> > + RTE_INIT_PRE);
> > +}
> > +
> > /**
> > * Macro to statically register the ops of a mempool handler.
> > * Note that the rte_mempool_register_ops fails silently here when
> > @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > #define MEMPOOL_REGISTER_OPS(ops) \
> > RTE_INIT(mp_hdlr_init_##ops) \
> > { \
> > - rte_mempool_register_ops(&ops); \
> > + mempool_ops_register(&ops); \
> > }
> >
> > /**
> > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > index 22c5251eb068..b10fda662db6 100644
> > --- a/lib/librte_mempool/rte_mempool_ops.c
> > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > @@ -14,43 +14,95 @@
> > /* indirect jump table to support external memory pools. */
> > struct rte_mempool_ops_table rte_mempool_ops_table = {
> > .sl = RTE_SPINLOCK_INITIALIZER,
> > - .num_ops = 0
> > };
> >
> > -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > -int
> > -rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > +static int
> > +rte_mempool_register_shared_ops(const char *name)
> > {
> > - struct rte_mempool_ops *ops;
> > - int16_t ops_index;
> > + static bool mempool_shared_ops_inited;
> > + struct rte_mempool_shared_ops *shared_ops;
> > + const struct rte_memzone *mz;
> > + uint32_t ops_index = 0;
> > +
>
> I think we should sanity check 'name' here to be not
> empty string (see review notes below).
OK
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > + !mempool_shared_ops_inited) {
> > +
> > + mz = rte_memzone_reserve("mempool_ops_shared",
> > + sizeof(*shared_ops), 0, 0);
> > + if (mz == NULL)
> > + return -ENOMEM;
> > +
> > + shared_ops = mz->addr;
> > + shared_ops->num_mempool_ops = 0;
> > + rte_spinlock_init(&shared_ops->mempool);
> > +
> > + mempool_shared_ops_inited = true;
> > + } else {
> > + mz = rte_memzone_lookup("mempool_ops_shared");
> > + if (mz == NULL)
> > + return -ENOMEM;
> > +
> > + shared_ops = mz->addr;
> > + }
> >
> > - rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > + rte_spinlock_lock(&shared_ops->mempool);
> >
> > - if (rte_mempool_ops_table.num_ops >=
> > - RTE_MEMPOOL_MAX_OPS_IDX) {
> > - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > + if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
> > + rte_spinlock_unlock(&shared_ops->mempool);
> > RTE_LOG(ERR, MEMPOOL,
> > "Maximum number of mempool ops structs exceeded\n");
> > return -ENOSPC;
> > }
> >
> > + while (shared_ops->mempool_ops[ops_index].name[0]) {
>
> Please, compare with '\0' as DPDK style guide says.
>
> > + if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
> > + rte_spinlock_unlock(&shared_ops->mempool);
> > + return ops_index;
> > + }
> > +
> > + ops_index++;
> > + }
> > +
> > + strlcpy(shared_ops->mempool_ops[ops_index].name, name,
> > + sizeof(shared_ops->mempool_ops[0].name));
> > +
> > + shared_ops->num_mempool_ops++;
> > +
> > + rte_spinlock_unlock(&shared_ops->mempool);
> > + return ops_index;
> > +}
> > +
> > +/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > +int
> > +rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > +{
> > + struct rte_mempool_ops *ops;
> > + int16_t ops_index;
> > +
> > if (h->alloc == NULL || h->enqueue == NULL ||
> > - h->dequeue == NULL || h->get_count == NULL) {
> > - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > + h->dequeue == NULL || h->get_count == NULL) {
>
> Changing formatting just makes review a bit more harder.
>
> > RTE_LOG(ERR, MEMPOOL,
> > "Missing callback while registering mempool ops\n");
> > + rte_errno = EINVAL;
>
> Why is it done in the patch? For me it looks like logically
> different change if it is really required (properly motivated).
I guessed we should add the error code to rte_errno, but I am fine
not doing that.
> > return -EINVAL;
> > }
> >
> > if (strlen(h->name) >= sizeof(ops->name) - 1) {
> > - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > - RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
> > - __func__, h->name);
> > + RTE_LOG(ERR, MEMPOOL,
> > + "The registering mempool_ops <%s>: name too long\n",
> > + h->name);
>
> Why do you change from DEBUG to ERR here? It it not
> directly related to the purpose of the patch.
Yes, because it is an error, so I changed the log level. Should I
change it in a separate patch?
> > rte_errno = EEXIST;
> > return -EEXIST;
> > }
> >
> > - ops_index = rte_mempool_ops_table.num_ops++;
> > + ops_index = rte_mempool_register_shared_ops(h->name);
> > + if (ops_index < 0) {
> > + rte_errno = -ops_index;
> > + return ops_index;
> > + }
> > +
> > + rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > +
> > ops = &rte_mempool_ops_table.ops[ops_index];
> > strlcpy(ops->name, h->name, sizeof(ops->name));
> > ops->alloc = h->alloc;
> > @@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > if (mp->flags & MEMPOOL_F_POOL_CREATED)
> > return -EEXIST;
> >
> > - for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > - if (!strcmp(name,
> > - rte_mempool_ops_table.ops[i].name)) {
> > + for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> > + if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
>
> Since rte_mempool_ops_table is filled in which zeros,
> name string is empty by default. So, request with empty name
> will match the first unused entry. I guess it is not what we
> want here. I think we should handle empty string before the
> loop and return -EINVAL.
OK, thanks.
> > ops = &rte_mempool_ops_table.ops[i];
> > break;
> > }
> >
>
On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > The order of mempool initiation affects mempool index in the
> > rte_mempool_ops_table. For example, when building APPs with:
> >
> > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >
> > The "bucket" mempool will be registered firstly, and its index
> > in table is 0 while the index of "ring" mempool is 1. DPDK
> > uses the mk/rte.app.mk to build APPs, and others, for example,
> > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > The mempool lib linked in dpdk and Open vSwitch is different.
>
> We are supposed to use pkg-config to link DPDK.
> Does the problem appear between a DPDK compiled with meson
> and an application linked with pkg-config information?
>
> If the problem really needs to be solved,
> the EAL patch (first of this series) needs to be discussed
> and reviewed carefully. I don't imagine it being done in 20.05.
Should I stop updating the patches, or keep updating them until they
are ready and then decide whether they are applied?
>
>
27/04/2020 10:03, Tonghao Zhang:
> On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > The order of mempool initiation affects mempool index in the
> > > rte_mempool_ops_table. For example, when building APPs with:
> > >
> > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > >
> > > The "bucket" mempool will be registered firstly, and its index
> > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > The mempool lib linked in dpdk and Open vSwitch is different.
> >
> > We are supposed to use pkg-config to link DPDK.
> > Does the problem appear between a DPDK compiled with meson
> > and an application linked with pkg-config information?
> >
> > If the problem really needs to be solved,
> > the EAL patch (first of this series) needs to be discussed
> > and reviewed carefully. I don't imagine it being done in 20.05.
>
> will I stop update the patches or when the patch is ready and then
> decide applied or not?
As I said, it requires more discussion.
Please start by answering my question above.
On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 27/04/2020 10:03, Tonghao Zhang:
> > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > The order of mempool initiation affects mempool index in the
> > > > rte_mempool_ops_table. For example, when building APPs with:
> > > >
> > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > >
> > > > The "bucket" mempool will be registered firstly, and its index
> > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > >
> > > We are supposed to use pkg-config to link DPDK.
> > > Does the problem appear between a DPDK compiled with meson
> > > and an application linked with pkg-config information?
Hi Thomas,
The mempool library link order can trigger that problem, but the
problem is also triggered when the library is loaded dynamically.
As Olivier Matz said:
The fact that the ops index changes during mempool driver lifetime is
indeed frightening, especially knowing that this is a dynamic
registration that could happen at any moment in the life of the
application.
The message is at https://mails.dpdk.org/archives/dev/2020-March/159354.html
> > > If the problem really needs to be solved,
> > > the EAL patch (first of this series) needs to be discussed
> > > and reviewed carefully. I don't imagine it being done in 20.05.
> >
> > will I stop update the patches or when the patch is ready and then
> > decide applied or not?
>
> As I said, it requires more discussion.
> Please start by answering my question above.
I got it, Thanks.
>
On Mon, Apr 27, 2020 at 8:51 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 27/04/2020 10:03, Tonghao Zhang:
> > > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > > The order of mempool initiation affects mempool index in the
> > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > >
> > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > >
> > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > >
> > > > We are supposed to use pkg-config to link DPDK.
> > > > Does the problem appear between a DPDK compiled with meson
> > > > and an application linked with pkg-config information?
> Hi Thomas,
> The library mempool linked order can trigger that problem. but when
> the library is loaded
> dynamically, trigger that problem too.
> as Olivier Matz said:
> The fact that the ops index changes during mempool driver lifetime is
> indeed frightening, especially knowning that this is a dynamic
> registration that could happen at any moment in the life of the
> application.
>
> the message in https://mails.dpdk.org/archives/dev/2020-March/159354.html
Hi Thomas,
For the first patch, I propose supporting a callback for other
libraries; it makes the code much cleaner at the EAL layer.
Otherwise, if EAL initialized each library itself, it would have to
include their header files.
Is there a better solution?
> > > > If the problem really needs to be solved,
> > > > the EAL patch (first of this series) needs to be discussed
> > > > and reviewed carefully. I don't imagine it being done in 20.05.
> > >
> > > will I stop update the patches or when the patch is ready and then
> > > decide applied or not?
> >
> > As I said, it requires more discussion.
> > Please start by answering my question above.
> I got it, Thanks.
> >
>
>
> --
> Best regards, Tonghao
Hi,
On Tue, Apr 28, 2020 at 09:22:37PM +0800, Tonghao Zhang wrote:
> On Mon, Apr 27, 2020 at 8:51 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 27/04/2020 10:03, Tonghao Zhang:
> > > > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >
> > > > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > > > The order of mempool initiation affects mempool index in the
> > > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > > >
> > > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > > >
> > > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > > >
> > > > > We are supposed to use pkg-config to link DPDK.
> > > > > Does the problem appear between a DPDK compiled with meson
> > > > > and an application linked with pkg-config information?
> > Hi Thomas,
> > The library mempool linked order can trigger that problem. but when
> > the library is loaded
> > dynamically, trigger that problem too.
> > as Olivier Matz said:
> > The fact that the ops index changes during mempool driver lifetime is
> > indeed frightening, especially knowning that this is a dynamic
> > registration that could happen at any moment in the life of the
> > application.
> >
> > the message in https://mails.dpdk.org/archives/dev/2020-March/159354.html
> Hi Thomas,
> For first patch, I guess we support a callback for other library, it
> make the codes much cleaner
> at eal layer. Otherwise, if we init for library, we may include their
> header file.
> There is a better solution ?
To summarize my understanding of the issue encountered by Tonghao:
Currently, it is not possible to call memzone_register() from an init
function (registered with RTE_INIT()). This is needed if we want to
store the list of registered mempool ops in shared memory, available
from multiprocess.
Tonghao's patch 1/2 solves this issue. I tried to find alternatives
to this approach, but none of them seems satisfying:
- use RTE_PMD_REGISTER_VDEV() and rte_vdev_add_custom_scan() instead of
RTE_INIT() in the MEMPOOL_REGISTER_OPS() macro to delay the
initialization until after eal_init(). This looks too complex (I made a
POC of it, if someone is interested).
- synchronize mempool ops in shared memory when mempool_create() is
called in the primary: this would probably work most of the time, but
it is not a perfect solution, as we cannot ensure that the primary
application will create a mempool before the secondary comes up.
- introduce a mandatory call to rte_mempool_lib_init(): despite being the
usual way to initialize libs, this would break compatibility.
> > > > > If the problem really needs to be solved,
> > > > > the EAL patch (first of this series) needs to be discussed
> > > > > and reviewed carefully. I don't imagine it being done in 20.05.
> > > >
OK, let's discuss it once 20.05 is out.
Regards,
Olivier
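The shared-memzone approach summarized above boils down to a lock-protected lookup-or-append on a table of ops names, so every process resolves a given name to the same index regardless of registration order. The core logic can be sketched standalone as follows (a simplified stand-in, not the DPDK API: the memzone is replaced by a static table and the rte_spinlock locking is elided):

```c
#include <string.h>

#define OPS_NAMESIZE 32
#define MAX_OPS_IDX  16

/* Stand-in for the table the patch keeps in the shared memzone.
 * (Locking elided here; the patch guards it with an rte_spinlock.) */
static struct {
	char name[OPS_NAMESIZE];
} shared_ops[MAX_OPS_IDX];
static size_t num_ops;

/* Lookup-or-append: a name that was already registered keeps its
 * index, so the index no longer depends on link or load order. */
static int
ops_register(const char *name)
{
	int i;

	if (num_ops >= MAX_OPS_IDX)
		return -1; /* table full */

	for (i = 0; shared_ops[i].name[0] != '\0'; i++)
		if (strcmp(name, shared_ops[i].name) == 0)
			return i; /* known name: reuse its index */

	strncpy(shared_ops[i].name, name, OPS_NAMESIZE - 1);
	num_ops++;
	return i;
}
```

Whichever process registers a name first fixes its index; a process that happens to register the drivers in a different order gets the existing index back instead of taking the next free slot.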
On Mon, May 4, 2020 at 9:42 AM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Hi,
>
> On Tue, Apr 28, 2020 at 09:22:37PM +0800, Tonghao Zhang wrote:
> > On Mon, Apr 27, 2020 at 8:51 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 27/04/2020 10:03, Tonghao Zhang:
> > > > > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > > > > The order of mempool initiation affects mempool index in the
> > > > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > > > >
> > > > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > > > >
> > > > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > > > >
> > > > > > We are supposed to use pkg-config to link DPDK.
> > > > > > Does the problem appear between a DPDK compiled with meson
> > > > > > and an application linked with pkg-config information?
> > > Hi Thomas,
> > > The link order of the mempool libraries can trigger that problem,
> > > but the problem is also triggered when the library is loaded
> > > dynamically. As Olivier Matz said:
> > > The fact that the ops index changes during mempool driver lifetime is
> > > indeed frightening, especially knowing that this is a dynamic
> > > registration that could happen at any moment in the life of the
> > > application.
> > >
> > > the message in https://mails.dpdk.org/archives/dev/2020-March/159354.html
> > Hi Thomas,
> > For the first patch, I guess we should support a callback for other
> > libraries; it makes the code much cleaner at the EAL layer. Otherwise,
> > if we initialize each library explicitly, we may have to include their
> > header files.
> > Is there a better solution?
>
> To summarize my understanding of the issue encountered by Tonghao:
>
> Currently, it is not possible to call memzone_register() from an init
> function (registered with RTE_INIT()). This is needed if we want to
> store the list of registered mempool ops in a shared memory, available
> from multiprocess.
>
> Tonghao's patch 1/2 solves this issue. I tried to find alternatives
> to this approach, but none of them seems satisfying:
>
> - use RTE_PMD_REGISTER_VDEV() and rte_vdev_add_custom_scan() instead of
> RTE_INIT() in the MEMPOOL_REGISTER_OPS() macro to delay the
> initialization after eal_init(). This looks too complex (I made a POC
> of it, if someone is interested).
>
> - synchronize mempool ops in shared memory when mempool_create() is
> called in the primary: this would probably work most of the time, but
> it is not a perfect solution as we cannot ensure that the primary
> application will create a mempool before the secondary comes up.
>
> - introduce a mandatory call to rte_mempool_lib_init(): although it's the
> usual way to initialize libs, this will break compatibility.
>
> > > > > > If the problem really needs to be solved,
> > > > > > the EAL patch (first of this series) needs to be discussed
> > > > > > and reviewed carefully. I don't imagine it being done in 20.05.
> > > > >
>
> OK, let's discuss it once 20.05 is out.
>
Any news on this topic?
Is this issue still a problem?
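For reference, the mismatch under discussion can be shown with a toy model of the pre-patch behaviour, where each process numbers its mempool ops purely by registration order (a simplified stand-in, not the actual rte_mempool API):

```c
#include <string.h>

#define MAX_OPS 8

/* Per-process ops table as it worked before the patch:
 * the index is simply the registration order. */
struct ops_table {
	const char *name[MAX_OPS];
	int num;
};

static int
register_ops(struct ops_table *t, const char *name)
{
	t->name[t->num] = name;
	return t->num++; /* index depends only on call order */
}
```

With dpdk-pdump linking the ring driver first and Open vSwitch linking the bucket driver first, index 0 names different drivers in the two processes; a mempool carrying ops index 0 from one process then dereferences the wrong ops table entry in the other, which is the crash in the commit message.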
@@ -50,6 +50,7 @@
#include <rte_ring.h>
#include <rte_memcpy.h>
#include <rte_common.h>
+#include <rte_init.h>
#ifdef __cplusplus
extern "C" {
@@ -678,7 +679,6 @@ struct rte_mempool_ops {
*/
struct rte_mempool_ops_table {
rte_spinlock_t sl; /**< Spinlock for add/delete. */
- uint32_t num_ops; /**< Number of used ops structs in the table. */
/**
* Storage for all possible ops structs.
*/
@@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
*/
int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
+struct rte_mempool_shared_ops {
+ size_t num_mempool_ops;
+ struct {
+ char name[RTE_MEMPOOL_OPS_NAMESIZE];
+ } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
+
+ rte_spinlock_t mempool;
+};
+
+static inline int
+mempool_ops_register_cb(const void *arg)
+{
+ const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
+
+ return rte_mempool_register_ops(h);
+}
+
+static inline void
+mempool_ops_register(const struct rte_mempool_ops *ops)
+{
+ rte_init_register(mempool_ops_register_cb, (const void *)ops,
+ RTE_INIT_PRE);
+}
+
/**
* Macro to statically register the ops of a mempool handler.
* Note that the rte_mempool_register_ops fails silently here when
@@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
#define MEMPOOL_REGISTER_OPS(ops) \
RTE_INIT(mp_hdlr_init_##ops) \
{ \
- rte_mempool_register_ops(&ops); \
+ mempool_ops_register(&ops); \
}
/**
@@ -14,43 +14,95 @@
/* indirect jump table to support external memory pools. */
struct rte_mempool_ops_table rte_mempool_ops_table = {
.sl = RTE_SPINLOCK_INITIALIZER,
- .num_ops = 0
};
-/* add a new ops struct in rte_mempool_ops_table, return its index. */
-int
-rte_mempool_register_ops(const struct rte_mempool_ops *h)
+static int
+rte_mempool_register_shared_ops(const char *name)
{
- struct rte_mempool_ops *ops;
- int16_t ops_index;
+ static bool mempool_shared_ops_inited;
+ struct rte_mempool_shared_ops *shared_ops;
+ const struct rte_memzone *mz;
+ uint32_t ops_index = 0;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+ !mempool_shared_ops_inited) {
+
+ mz = rte_memzone_reserve("mempool_ops_shared",
+ sizeof(*shared_ops), 0, 0);
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ shared_ops->num_mempool_ops = 0;
+ rte_spinlock_init(&shared_ops->mempool);
+
+ mempool_shared_ops_inited = true;
+ } else {
+ mz = rte_memzone_lookup("mempool_ops_shared");
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ }
- rte_spinlock_lock(&rte_mempool_ops_table.sl);
+ rte_spinlock_lock(&shared_ops->mempool);
- if (rte_mempool_ops_table.num_ops >=
- RTE_MEMPOOL_MAX_OPS_IDX) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
+ rte_spinlock_unlock(&shared_ops->mempool);
RTE_LOG(ERR, MEMPOOL,
"Maximum number of mempool ops structs exceeded\n");
return -ENOSPC;
}
+ while (shared_ops->mempool_ops[ops_index].name[0]) {
+ if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+ }
+
+ ops_index++;
+ }
+
+ strlcpy(shared_ops->mempool_ops[ops_index].name, name,
+ sizeof(shared_ops->mempool_ops[0].name));
+
+ shared_ops->num_mempool_ops++;
+
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+}
+
+/* add a new ops struct in rte_mempool_ops_table, return its index. */
+int
+rte_mempool_register_ops(const struct rte_mempool_ops *h)
+{
+ struct rte_mempool_ops *ops;
+ int16_t ops_index;
+
if (h->alloc == NULL || h->enqueue == NULL ||
- h->dequeue == NULL || h->get_count == NULL) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ h->dequeue == NULL || h->get_count == NULL) {
RTE_LOG(ERR, MEMPOOL,
"Missing callback while registering mempool ops\n");
+ rte_errno = EINVAL;
return -EINVAL;
}
if (strlen(h->name) >= sizeof(ops->name) - 1) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
- RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
- __func__, h->name);
+ RTE_LOG(ERR, MEMPOOL,
+ "The registering mempool_ops <%s>: name too long\n",
+ h->name);
rte_errno = EEXIST;
return -EEXIST;
}
- ops_index = rte_mempool_ops_table.num_ops++;
+ ops_index = rte_mempool_register_shared_ops(h->name);
+ if (ops_index < 0) {
+ rte_errno = -ops_index;
+ return ops_index;
+ }
+
+ rte_spinlock_lock(&rte_mempool_ops_table.sl);
+
ops = &rte_mempool_ops_table.ops[ops_index];
strlcpy(ops->name, h->name, sizeof(ops->name));
ops->alloc = h->alloc;
@@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
if (mp->flags & MEMPOOL_F_POOL_CREATED)
return -EEXIST;
- for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
- if (!strcmp(name,
- rte_mempool_ops_table.ops[i].name)) {
+ for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
+ if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
ops = &rte_mempool_ops_table.ops[i];
break;
}