Message ID: 1586787714-13890-2-git-send-email-xiangxia.m.yue@gmail.com (mailing list archive)
State: Changes Requested, archived
Delegated to: David Marchand
Series: [v3,1/2] eal: introduce rte-init queue for libraries initialization
| Context              | Check   | Description          |
|----------------------|---------|----------------------|
| ci/checkpatch        | warning | coding style issues  |
| ci/Intel-compilation | fail    | Compilation issues   |
| ci/travis-robot      | warning | Travis build: failed |
13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.

We are supposed to use pkg-config to link DPDK.
Does the problem appear between a DPDK compiled with meson
and an application linked with pkg-config information?

If the problem really needs to be solved,
the EAL patch (first of this series) needs to be discussed
and reviewed carefully. I don't imagine it being done in 20.05.
On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
>
> The mempool can be used between primary and secondary process,
> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> ring which index in table is 0, but the index of "bucket" ring
> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> mempool ops and malloc memory from mempool. The crash will occur:
>
> bucket_dequeue (access null and crash)
> rte_mempool_get_ops (should get "ring_mp_mc",
>                      but get "bucket" mempool)
> rte_mempool_ops_dequeue_bulk
> ...
> rte_pktmbuf_alloc
> rte_pktmbuf_copy
> pdump_copy
> pdump_rx
> rte_eth_rx_burst
>
> To avoid the crash, there are some solution:
> * constructor priority: Different mempool uses different
>   priority in RTE_INIT, but it's not easy to maintain.
>
> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
>   be same as libdpdk.a/libdpdk.so, but when adding a new mempool
>   driver in future, we must make sure the order.
>
> * register mempool orderly: Sort the mempool when registering,
>   so the lib linked will not affect the index in mempool table.
>   but the number of mempool libraries may be different.
>
> * shared memzone: The primary process allocates a struct in
>   shared memory named memzone, When we register a mempool ops,
>   we first get a name and id from the shared struct: with the lock held,
>   lookup for the registered name and return its index, else
>   get the last id and copy the name in the struct.
>
> Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
>
> Suggested-by: Olivier Matz <olivier.matz@6wind.com>
> Suggested-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
> v2:
> * fix checkpatch warning
> ---
> lib/librte_mempool/rte_mempool.h     | 28 +++++++++++-
> lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
> 2 files changed, 96 insertions(+), 21 deletions(-)
>
> [...]
> +struct rte_mempool_shared_ops {
> +	size_t num_mempool_ops;

Is there any specific reason to change the type from the uint32_t used
above to size_t? I think that uint32_t is better here since it is just
a number, not a size of memory or a related value.

> [...]
> +static int
> +rte_mempool_register_shared_ops(const char *name)
> {
> [...]
> +	uint32_t ops_index = 0;
> +

I think we should sanity check 'name' here to be not an empty string
(see review notes below).

> [...]
> +	while (shared_ops->mempool_ops[ops_index].name[0]) {

Please, compare with '\0' as the DPDK style guide says.

> [...]
> 	if (h->alloc == NULL || h->enqueue == NULL ||
> -			h->dequeue == NULL || h->get_count == NULL) {
> +	    h->dequeue == NULL || h->get_count == NULL) {

Changing formatting just makes review a bit harder.

> 		RTE_LOG(ERR, MEMPOOL,
> 			"Missing callback while registering mempool ops\n");
> +		rte_errno = EINVAL;

Why is it done in this patch? For me it looks like a logically
different change, if it is really required (properly motivated).

> [...]
> 	if (strlen(h->name) >= sizeof(ops->name) - 1) {
> -		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> -		RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
> -			__func__, h->name);
> +		RTE_LOG(ERR, MEMPOOL,
> +			"The registering mempool_ops <%s>: name too long\n",
> +			h->name);

Why do you change from DEBUG to ERR here? It is not directly related
to the purpose of the patch.

> [...]
> -	for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> -		if (!strcmp(name,
> -				rte_mempool_ops_table.ops[i].name)) {
> +	for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> +		if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {

Since rte_mempool_ops_table is filled in with zeros, the name string
is empty by default. So a request with an empty name will match the
first unused entry. I guess it is not what we want here. I think we
should handle the empty string before the loop and return -EINVAL.

> 		ops = &rte_mempool_ops_table.ops[i];
> 		break;
> 	}
On Thu, Apr 23, 2020 at 9:38 PM Andrew Rybchenko <arybchenko@solarflare.com> wrote:
>
> On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> [...]
> > +struct rte_mempool_shared_ops {
> > +	size_t num_mempool_ops;
>
> Is there any specific reason to change the type from the uint32_t used
> above to size_t? I think that uint32_t is better here since it is just
> a number, not a size of memory or a related value.

Thanks for your review; I was busy with other patches, so this reply
was delayed. I will change that in the next version.

> [...]
> > +	uint32_t ops_index = 0;
> > +
>
> I think we should sanity check 'name' here to be not an empty string
> (see review notes below).

OK

> [...]
> > 		RTE_LOG(ERR, MEMPOOL,
> > 			"Missing callback while registering mempool ops\n");
> > +		rte_errno = EINVAL;
>
> Why is it done in this patch? For me it looks like a logically
> different change, if it is really required (properly motivated).

I guessed that we should add the error code to rte_errno, but I am
fine with not doing that.

> [...]
> > +		RTE_LOG(ERR, MEMPOOL,
> > +			"The registering mempool_ops <%s>: name too long\n",
> > +			h->name);
>
> Why do you change from DEBUG to ERR here? It is not directly related
> to the purpose of the patch.

Yes, because it is an error, so I changed the log level. Should I
change it in a separate patch?

> [...]
> > -	for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > -		if (!strcmp(name,
> > -				rte_mempool_ops_table.ops[i].name)) {
> > +	for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> > +		if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
>
> Since rte_mempool_ops_table is filled in with zeros, the name string
> is empty by default. So a request with an empty name will match the
> first unused entry. I guess it is not what we want here. I think we
> should handle the empty string before the loop and return -EINVAL.

OK, thanks.
On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > The order of mempool initiation affects mempool index in the
> > rte_mempool_ops_table. For example, when building APPs with:
> [...]
>
> We are supposed to use pkg-config to link DPDK.
> Does the problem appear between a DPDK compiled with meson
> and an application linked with pkg-config information?
>
> If the problem really needs to be solved,
> the EAL patch (first of this series) needs to be discussed
> and reviewed carefully. I don't imagine it being done in 20.05.

Should I stop updating the patches, or keep updating them until they
are ready and then decide whether they are applied?
27/04/2020 10:03, Tonghao Zhang:
> On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> [...]
> > We are supposed to use pkg-config to link DPDK.
> > Does the problem appear between a DPDK compiled with meson
> > and an application linked with pkg-config information?
> >
> > If the problem really needs to be solved,
> > the EAL patch (first of this series) needs to be discussed
> > and reviewed carefully. I don't imagine it being done in 20.05.
>
> Should I stop updating the patches, or keep updating them until they
> are ready and then decide whether they are applied?

As I said, it requires more discussion.
Please start by answering my question above.
On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 27/04/2020 10:03, Tonghao Zhang:
> [...]
> > > We are supposed to use pkg-config to link DPDK.
> > > Does the problem appear between a DPDK compiled with meson
> > > and an application linked with pkg-config information?

Hi Thomas,
The mempool library link order can trigger that problem, but the
problem is also triggered when the library is loaded dynamically.
As Olivier Matz said:
"The fact that the ops index changes during mempool driver lifetime is
indeed frightening, especially knowing that this is a dynamic
registration that could happen at any moment in the life of the
application."
The message is in https://mails.dpdk.org/archives/dev/2020-March/159354.html

> > > If the problem really needs to be solved,
> > > the EAL patch (first of this series) needs to be discussed
> > > and reviewed carefully. I don't imagine it being done in 20.05.
> >
> > Should I stop updating the patches, or keep updating them until they
> > are ready and then decide whether they are applied?
>
> As I said, it requires more discussion.
> Please start by answering my question above.

I got it, thanks.
On Mon, Apr 27, 2020 at 8:51 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> [...]
> > > > We are supposed to use pkg-config to link DPDK.
> > > > Does the problem appear between a DPDK compiled with meson
> > > > and an application linked with pkg-config information?
>
> Hi Thomas,
> The mempool library link order can trigger that problem, but the
> problem is also triggered when the library is loaded dynamically.
> [...]
> The message is in https://mails.dpdk.org/archives/dev/2020-March/159354.html

Hi Thomas,
For the first patch, I guess we should support a callback for other
libraries; it makes the code much cleaner at the EAL layer. Otherwise,
if we initialize each library from EAL, we may have to include their
header files. Is there a better solution?

> > > > If the problem really needs to be solved,
> > > > the EAL patch (first of this series) needs to be discussed
> > > > and reviewed carefully. I don't imagine it being done in 20.05.
> > >
> > > Should I stop updating the patches, or keep updating them until they
> > > are ready and then decide whether they are applied?
> >
> > As I said, it requires more discussion.
> > Please start by answering my question above.
> I got it, thanks.

--
Best regards, Tonghao
Hi,

On Tue, Apr 28, 2020 at 09:22:37PM +0800, Tonghao Zhang wrote:
> [...]
> Hi Thomas,
> For the first patch, I guess we should support a callback for other
> libraries; it makes the code much cleaner at the EAL layer. Otherwise,
> if we initialize each library from EAL, we may have to include their
> header files. Is there a better solution?

To summarize my understanding of the issue encountered by Tonghao:

Currently, it is not possible to call memzone_register() from an init
function (registered with RTE_INIT()). This is needed if we want to
store the list of registered mempool ops in shared memory, available
from multiprocess.

Tonghao's patch 1/2 solves this issue. I tried to find alternatives to
this approach, but none of them seems satisfying:

- use RTE_PMD_REGISTER_VDEV() and rte_vdev_add_custom_scan() instead of
  RTE_INIT() in the MEMPOOL_REGISTER_OPS() macro to delay the
  initialization until after eal_init(). This looks too complex (I made
  a POC of it, if someone is interested).

- synchronize mempool ops in shared memory when mempool_create() is
  called in the primary: this would probably work most of the time, but
  it is not a perfect solution, as we cannot ensure that the primary
  application will create a mempool before the secondary comes up.

- introduce a mandatory call to rte_mempool_lib_init(): despite it
  being the usual way to initialize libs, this will break compatibility.

> > > > > If the problem really needs to be solved,
> > > > > the EAL patch (first of this series) needs to be discussed
> > > > > and reviewed carefully. I don't imagine it being done in 20.05.

OK, let's discuss it once 20.05 is out.

Regards,
Olivier
On Mon, May 4, 2020 at 9:42 AM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Hi,
>
> On Tue, Apr 28, 2020 at 09:22:37PM +0800, Tonghao Zhang wrote:
> [...]
>
> To summarize my understanding of the issue encountered by Tonghao:
>
> Currently, it is not possible to call memzone_register() from an init
> function (registered with RTE_INIT()). This is needed if we want to
> store the list of registered mempool ops in shared memory, available
> from multiprocess.
>
> Tonghao's patch 1/2 solves this issue. I tried to find alternatives to
> this approach, but none of them seems satisfying:
> [...]
>
> OK, let's discuss it once 20.05 is out.

Any news on this topic? Is this issue still a problem?
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c90cf31467b2..2709b9e1d51b 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -50,6 +50,7 @@
 #include <rte_ring.h>
 #include <rte_memcpy.h>
 #include <rte_common.h>
+#include <rte_init.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -678,7 +679,6 @@ struct rte_mempool_ops {
  */
 struct rte_mempool_ops_table {
 	rte_spinlock_t sl;     /**< Spinlock for add/delete. */
-	uint32_t num_ops;      /**< Number of used ops structs in the table. */
 	/**
 	 * Storage for all possible ops structs.
 	 */
@@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
  */
 int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
 
+struct rte_mempool_shared_ops {
+	size_t num_mempool_ops;
+	struct {
+		char name[RTE_MEMPOOL_OPS_NAMESIZE];
+	} mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
+
+	rte_spinlock_t mempool;
+};
+
+static inline int
+mempool_ops_register_cb(const void *arg)
+{
+	const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
+
+	return rte_mempool_register_ops(h);
+}
+
+static inline void
+mempool_ops_register(const struct rte_mempool_ops *ops)
+{
+	rte_init_register(mempool_ops_register_cb, (const void *)ops,
+			  RTE_INIT_PRE);
+}
+
 /**
  * Macro to statically register the ops of a mempool handler.
  * Note that the rte_mempool_register_ops fails silently here when
@@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
 #define MEMPOOL_REGISTER_OPS(ops)				\
 	RTE_INIT(mp_hdlr_init_##ops)				\
 	{							\
-		rte_mempool_register_ops(&ops);			\
+		mempool_ops_register(&ops);			\
 	}
 
 /**
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 22c5251eb068..b10fda662db6 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -14,43 +14,95 @@
 /* indirect jump table to support external memory pools. */
 struct rte_mempool_ops_table rte_mempool_ops_table = {
 	.sl = RTE_SPINLOCK_INITIALIZER,
-	.num_ops = 0
 };
 
-/* add a new ops struct in rte_mempool_ops_table, return its index. */
-int
-rte_mempool_register_ops(const struct rte_mempool_ops *h)
+static int
+rte_mempool_register_shared_ops(const char *name)
 {
-	struct rte_mempool_ops *ops;
-	int16_t ops_index;
+	static bool mempool_shared_ops_inited;
+	struct rte_mempool_shared_ops *shared_ops;
+	const struct rte_memzone *mz;
+	uint32_t ops_index = 0;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+	    !mempool_shared_ops_inited) {
+
+		mz = rte_memzone_reserve("mempool_ops_shared",
+					 sizeof(*shared_ops), 0, 0);
+		if (mz == NULL)
+			return -ENOMEM;
+
+		shared_ops = mz->addr;
+		shared_ops->num_mempool_ops = 0;
+		rte_spinlock_init(&shared_ops->mempool);
+
+		mempool_shared_ops_inited = true;
+	} else {
+		mz = rte_memzone_lookup("mempool_ops_shared");
+		if (mz == NULL)
+			return -ENOMEM;
+
+		shared_ops = mz->addr;
+	}
 
-	rte_spinlock_lock(&rte_mempool_ops_table.sl);
+	rte_spinlock_lock(&shared_ops->mempool);
 
-	if (rte_mempool_ops_table.num_ops >=
-			RTE_MEMPOOL_MAX_OPS_IDX) {
-		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+	if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
+		rte_spinlock_unlock(&shared_ops->mempool);
 		RTE_LOG(ERR, MEMPOOL,
 			"Maximum number of mempool ops structs exceeded\n");
 		return -ENOSPC;
 	}
 
+	while (shared_ops->mempool_ops[ops_index].name[0]) {
+		if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
+			rte_spinlock_unlock(&shared_ops->mempool);
+			return ops_index;
+		}
+
+		ops_index++;
+	}
+
+	strlcpy(shared_ops->mempool_ops[ops_index].name, name,
+		sizeof(shared_ops->mempool_ops[0].name));
+
+	shared_ops->num_mempool_ops++;
+
+	rte_spinlock_unlock(&shared_ops->mempool);
+	return ops_index;
+}
+
+/* add a new ops struct in rte_mempool_ops_table, return its index. */
+int
+rte_mempool_register_ops(const struct rte_mempool_ops *h)
+{
+	struct rte_mempool_ops *ops;
+	int16_t ops_index;
+
 	if (h->alloc == NULL || h->enqueue == NULL ||
-			h->dequeue == NULL || h->get_count == NULL) {
-		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+	    h->dequeue == NULL || h->get_count == NULL) {
 		RTE_LOG(ERR, MEMPOOL,
 			"Missing callback while registering mempool ops\n");
+		rte_errno = EINVAL;
 		return -EINVAL;
 	}
 
 	if (strlen(h->name) >= sizeof(ops->name) - 1) {
-		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
-		RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
-				__func__, h->name);
+		RTE_LOG(ERR, MEMPOOL,
+			"The registering mempool_ops <%s>: name too long\n",
+			h->name);
 		rte_errno = EEXIST;
 		return -EEXIST;
 	}
 
-	ops_index = rte_mempool_ops_table.num_ops++;
+	ops_index = rte_mempool_register_shared_ops(h->name);
+	if (ops_index < 0) {
+		rte_errno = -ops_index;
+		return ops_index;
+	}
+
+	rte_spinlock_lock(&rte_mempool_ops_table.sl);
+
 	ops = &rte_mempool_ops_table.ops[ops_index];
 	strlcpy(ops->name, h->name, sizeof(ops->name));
 	ops->alloc = h->alloc;
@@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
 	if (mp->flags & MEMPOOL_F_POOL_CREATED)
 		return -EEXIST;
 
-	for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
-		if (!strcmp(name,
-				rte_mempool_ops_table.ops[i].name)) {
+	for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
+		if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
 			ops = &rte_mempool_ops_table.ops[i];
 			break;
 		}
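For illustration, here is a standalone sketch (not DPDK code) of the idea the patch implements in rte_mempool_register_shared_ops(): the ops index is keyed by name in a table that all processes share, so every process resolves the same name to the same slot regardless of registration order. The shared memzone and spinlock of the real patch are replaced here by a plain static array; `register_shared_ops()` is a hypothetical stand-in.

```c
#include <string.h>

#define MAX_OPS  16
#define NAMESIZE 32

/* Stand-in for the rte_mempool_shared_ops memzone contents. */
static struct {
	char name[NAMESIZE];
} shared_ops[MAX_OPS];

/* Return the existing index for `name`, or claim the next free slot.
 * Mirrors the while-loop in rte_mempool_register_shared_ops() above. */
static int register_shared_ops(const char *name)
{
	int idx = 0;

	while (idx < MAX_OPS && shared_ops[idx].name[0] != '\0') {
		if (strcmp(shared_ops[idx].name, name) == 0)
			return idx;	/* already registered: reuse index */
		idx++;
	}
	if (idx == MAX_OPS)
		return -1;		/* table full (-ENOSPC in the patch) */

	strncpy(shared_ops[idx].name, name, NAMESIZE - 1);
	return idx;
}
```

With this scheme, whichever process registers "bucket" and "ring_mp_mc" first fixes their slots for everyone; a process registering in the opposite order simply finds the names already present and reuses the same indices.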