[v8,1/5] mempool: populate mempool with page sized chunks of memory

Message ID 20190723053821.30227-2-vattunuru@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series: kni: add IOVA=VA support

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Performance-Testing fail build patch failure
ci/Intel-compilation fail Compilation issues

Commit Message

Vamsi Krishna Attunuru July 23, 2019, 5:38 a.m. UTC
  From: Vamsi Attunuru <vattunuru@marvell.com>

Patch adds a routine to populate a mempool from page-aligned and
page-sized chunks of memory, to ensure memory objects do not fall
across page boundaries. This is useful for applications that require
physically contiguous mbuf memory while running in IOVA=VA mode.

Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
---
 lib/librte_mempool/rte_mempool.c           | 59 ++++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool.h           | 17 +++++++++
 lib/librte_mempool/rte_mempool_version.map |  1 +
 3 files changed, 77 insertions(+)
  

Comments

Andrew Rybchenko July 23, 2019, 11:08 a.m. UTC | #1
On 7/23/19 8:38 AM, vattunuru@marvell.com wrote:
> From: Vamsi Attunuru <vattunuru@marvell.com>
>
> Patch adds a routine to populate mempool from page aligned and
> page sized chunks of memory to ensures memory objs do not fall
> across the page boundaries. It's useful for applications that
> require physically contiguous mbuf memory while running in
> IOVA=VA mode.
>
> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> ---
>   lib/librte_mempool/rte_mempool.c           | 59 ++++++++++++++++++++++++++++++
>   lib/librte_mempool/rte_mempool.h           | 17 +++++++++
>   lib/librte_mempool/rte_mempool_version.map |  1 +
>   3 files changed, 77 insertions(+)
>
> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> index 7260ce0..5312c8f 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -414,6 +414,65 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
>   	return ret;
>   }
>   
> +/* Function to populate mempool from page sized mem chunks, allocate page size
> + * of memory in memzone and populate them. Return the number of objects added,
> + * or a negative value on error.
> + */
> +int
> +rte_mempool_populate_from_pg_sz_chunks(struct rte_mempool *mp)
> +{
> +	char mz_name[RTE_MEMZONE_NAMESIZE];
> +	size_t align, pg_sz, pg_shift;
> +	const struct rte_memzone *mz;
> +	unsigned int mz_id, n;
> +	size_t chunk_size;

I think it would be better to keep the min_chunk_size name here.
It would make the code easier to read and understand.

> +	int ret;
> +
> +	ret = mempool_ops_alloc_once(mp);
> +	if (ret != 0)
> +		return ret;
> +
> +	if (mp->nb_mem_chunks != 0)
> +		return -EEXIST;
> +
> +	pg_sz = get_min_page_size(mp->socket_id);
> +	pg_shift = rte_bsf32(pg_sz);
> +
> +	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
> +
> +		rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
> +			     &chunk_size, &align);

It is incorrect to ignore the mempool ops and enforce the default
handler. Use rte_mempool_ops_calc_mem_size().
Also it is better to treat a negative return value as an error,
as the default function does.
(Maybe it is my mistake that the return value description does
not mention this.)

> +
> +		if (chunk_size > pg_sz)
> +			goto fail;
> +
> +		ret = snprintf(mz_name, sizeof(mz_name),
> +			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
> +		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
> +			ret = -ENAMETOOLONG;
> +			goto fail;
> +		}
> +
> +		mz = rte_memzone_reserve_aligned(mz_name, chunk_size,
> +				mp->socket_id, 0, align);

NULL return value must be handled.

> +
> +		ret = rte_mempool_populate_iova(mp, mz->addr,
> +				mz->iova, mz->len,
> +				rte_mempool_memchunk_mz_free,
> +				(void *)(uintptr_t)mz);
> +		if (ret < 0) {
> +			rte_memzone_free(mz);
> +			goto fail;
> +		}
> +	}
> +
> +	return mp->size;
> +
> +fail:
> +	rte_mempool_free_memchunks(mp);
> +	return ret;
> +}
> +
>   /* Default function to populate the mempool: allocate memory in memzones,
>    * and populate them. Return the number of objects added, or a negative
>    * value on error.
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index 8053f7a..73d6ada 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -1064,6 +1064,23 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
>   /**
>    * Add memory for objects in the pool at init

The difference from the default must be highlighted in the summary.

>    *
> + * This is the function used to populate the mempool with page aligned memzone
> + * memory. It ensures all mempool objects being on the page by allocating
> + * memzones with page size.
> + *
> + * @param mp
> + *   A pointer to the mempool structure.
> + * @return
> + *   The number of objects added on success.
> + *   On error, the chunk is not added in the memory list of the
> + *   mempool and a negative errno is returned.
> + */
> +__rte_experimental
> +int rte_mempool_populate_from_pg_sz_chunks(struct rte_mempool *mp);
> +
> +/**
> + * Add memory for objects in the pool at init
> + *
>    * This is the default function used by rte_mempool_create() to populate
>    * the mempool. It adds memory allocated using rte_memzone_reserve().
>    *
> diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
> index 17cbca4..9a6fe65 100644
> --- a/lib/librte_mempool/rte_mempool_version.map
> +++ b/lib/librte_mempool/rte_mempool_version.map
> @@ -57,4 +57,5 @@ EXPERIMENTAL {
>   	global:
>   
>   	rte_mempool_ops_get_info;
> +	rte_mempool_populate_from_pg_sz_chunks;
>   };
  
Vamsi Krishna Attunuru July 23, 2019, 12:28 p.m. UTC | #2
> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Tuesday, July 23, 2019 4:38 PM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
> olivier.matz@6wind.com; ferruh.yigit@intel.com;
> anatoly.burakov@intel.com; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>
> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with
> page sized chunks of memory
> 
> On 7/23/19 8:38 AM, vattunuru@marvell.com wrote:
> > From: Vamsi Attunuru <vattunuru@marvell.com>
> >
> > Patch adds a routine to populate mempool from page aligned and page
> > sized chunks of memory to ensures memory objs do not fall across the
> > page boundaries. It's useful for applications that require physically
> > contiguous mbuf memory while running in IOVA=VA mode.
> >
> > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> > ---
> >   lib/librte_mempool/rte_mempool.c           | 59
> ++++++++++++++++++++++++++++++
> >   lib/librte_mempool/rte_mempool.h           | 17 +++++++++
> >   lib/librte_mempool/rte_mempool_version.map |  1 +
> >   3 files changed, 77 insertions(+)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.c
> > b/lib/librte_mempool/rte_mempool.c
> > index 7260ce0..5312c8f 100644
> > --- a/lib/librte_mempool/rte_mempool.c
> > +++ b/lib/librte_mempool/rte_mempool.c
> > @@ -414,6 +414,65 @@ rte_mempool_populate_virt(struct rte_mempool
> *mp, char *addr,
> >   	return ret;
> >   }
> >
> > +/* Function to populate mempool from page sized mem chunks, allocate
> > +page size
> > + * of memory in memzone and populate them. Return the number of
> > +objects added,
> > + * or a negative value on error.
> > + */
> > +int
> > +rte_mempool_populate_from_pg_sz_chunks(struct rte_mempool *mp) {
> > +	char mz_name[RTE_MEMZONE_NAMESIZE];
> > +	size_t align, pg_sz, pg_shift;
> > +	const struct rte_memzone *mz;
> > +	unsigned int mz_id, n;
> > +	size_t chunk_size;
> 
> I think it would be better to keep min_chunk_size name here.
> It would make it easier to read and understand the code.

ack
> 
> > +	int ret;
> > +
> > +	ret = mempool_ops_alloc_once(mp);
> > +	if (ret != 0)
> > +		return ret;
> > +
> > +	if (mp->nb_mem_chunks != 0)
> > +		return -EEXIST;
> > +
> > +	pg_sz = get_min_page_size(mp->socket_id);
> > +	pg_shift = rte_bsf32(pg_sz);
> > +
> > +	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
> > +
> > +		rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
> > +			     &chunk_size, &align);
> 
> It is incorrect to ignore mempool pool ops and enforce default handler. Use
> rte_mempool_ops_calc_mem_size().
> Also it is better to treat negative return value as an error as default function
> does.
> (May be it my mistake in return value description that it is not mentioned).
> 

Yes, I thought so, but ops_calc_mem_size() would in turn call the mempool PMD's calc_mem_size() op, which
may or may not return the required chunk_size and align values in this case. Otherwise it would be skipped completely
and pg_sz used for both the memzone length and alignment; this page-sized alignment suits the PMD's specific alignment requirements anyway.
  
> > +
> > +		if (chunk_size > pg_sz)
> > +			goto fail;
> > +
> > +		ret = snprintf(mz_name, sizeof(mz_name),
> > +			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name,
> mz_id);
> > +		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
> > +			ret = -ENAMETOOLONG;
> > +			goto fail;
> > +		}
> > +
> > +		mz = rte_memzone_reserve_aligned(mz_name, chunk_size,
> > +				mp->socket_id, 0, align);
> 
> NULL return value must be handled.
> 
Ack.
> > +
> > +		ret = rte_mempool_populate_iova(mp, mz->addr,
> > +				mz->iova, mz->len,
> > +				rte_mempool_memchunk_mz_free,
> > +				(void *)(uintptr_t)mz);
> > +		if (ret < 0) {
> > +			rte_memzone_free(mz);
> > +			goto fail;
> > +		}
> > +	}
> > +
> > +	return mp->size;
> > +
> > +fail:
> > +	rte_mempool_free_memchunks(mp);
> > +	return ret;
> > +}
> > +
> >   /* Default function to populate the mempool: allocate memory in
> memzones,
> >    * and populate them. Return the number of objects added, or a negative
> >    * value on error.
> > diff --git a/lib/librte_mempool/rte_mempool.h
> > b/lib/librte_mempool/rte_mempool.h
> > index 8053f7a..73d6ada 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -1064,6 +1064,23 @@ rte_mempool_populate_virt(struct
> rte_mempool *mp, char *addr,
> >   /**
> >    * Add memory for objects in the pool at init
> 
> The different from default must be highlighted in the summary.

Sure, will add more details.
> 
> >    *
> > + * This is the function used to populate the mempool with page
> > +aligned memzone
> > + * memory. It ensures all mempool objects being on the page by
> > +allocating
> > + * memzones with page size.
> > + *
> > + * @param mp
> > + *   A pointer to the mempool structure.
> > + * @return
> > + *   The number of objects added on success.
> > + *   On error, the chunk is not added in the memory list of the
> > + *   mempool and a negative errno is returned.
> > + */
> > +__rte_experimental
> > +int rte_mempool_populate_from_pg_sz_chunks(struct rte_mempool
> *mp);
> > +
> > +/**
> > + * Add memory for objects in the pool at init
> > + *
> >    * This is the default function used by rte_mempool_create() to populate
> >    * the mempool. It adds memory allocated using rte_memzone_reserve().
> >    *
> > diff --git a/lib/librte_mempool/rte_mempool_version.map
> > b/lib/librte_mempool/rte_mempool_version.map
> > index 17cbca4..9a6fe65 100644
> > --- a/lib/librte_mempool/rte_mempool_version.map
> > +++ b/lib/librte_mempool/rte_mempool_version.map
> > @@ -57,4 +57,5 @@ EXPERIMENTAL {
> >   	global:
> >
> >   	rte_mempool_ops_get_info;
> > +	rte_mempool_populate_from_pg_sz_chunks;
> >   };
  
Andrew Rybchenko July 23, 2019, 7:33 p.m. UTC | #3
On 7/23/19 3:28 PM, Vamsi Krishna Attunuru wrote:
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Tuesday, July 23, 2019 4:38 PM
>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
>> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
>> olivier.matz@6wind.com; ferruh.yigit@intel.com;
>> anatoly.burakov@intel.com; Kiran Kumar Kokkilagadda
>> <kirankumark@marvell.com>
>> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with
>> page sized chunks of memory
>>
>> On 7/23/19 8:38 AM, vattunuru@marvell.com wrote:
>>> From: Vamsi Attunuru <vattunuru@marvell.com>
>>>
>>> Patch adds a routine to populate mempool from page aligned and page
>>> sized chunks of memory to ensures memory objs do not fall across the
>>> page boundaries. It's useful for applications that require physically
>>> contiguous mbuf memory while running in IOVA=VA mode.
>>>
>>> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
>>> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
>>> ---

<...>

>>> +	int ret;
>>> +
>>> +	ret = mempool_ops_alloc_once(mp);
>>> +	if (ret != 0)
>>> +		return ret;
>>> +
>>> +	if (mp->nb_mem_chunks != 0)
>>> +		return -EEXIST;
>>> +
>>> +	pg_sz = get_min_page_size(mp->socket_id);
>>> +	pg_shift = rte_bsf32(pg_sz);
>>> +
>>> +	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
>>> +
>>> +		rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
>>> +			     &chunk_size, &align);
>> It is incorrect to ignore mempool pool ops and enforce default handler. Use
>> rte_mempool_ops_calc_mem_size().
>> Also it is better to treat negative return value as an error as default function
>> does.
>> (May be it my mistake in return value description that it is not mentioned).
>>
> Yes, I thought so, but ops_calc_mem_size() would in turn call mempool pmd's calc_mem_size() op which
> may/may not return required chunk_size and align values in this case. Or else it would be skipped completely
> and use pg_sz for both memzone len and align, anyways this  page sized alignment will suits the pmd's specific align requirements.

Anyway it is incorrect to violate driver ops. The default is definitely
wrong for bucket.
min_chunk_size and align are mempool driver requirements. You can harden
them, but should not violate them.
  
Vamsi Krishna Attunuru July 24, 2019, 7:09 a.m. UTC | #4
> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Wednesday, July 24, 2019 1:04 AM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
> olivier.matz@6wind.com; ferruh.yigit@intel.com; anatoly.burakov@intel.com;
> Kiran Kumar Kokkilagadda <kirankumark@marvell.com>
> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with page
> sized chunks of memory
> 
> On 7/23/19 3:28 PM, Vamsi Krishna Attunuru wrote:
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >> Sent: Tuesday, July 23, 2019 4:38 PM
> >> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> >> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran
> >> <jerinj@marvell.com>; olivier.matz@6wind.com; ferruh.yigit@intel.com;
> >> anatoly.burakov@intel.com; Kiran Kumar Kokkilagadda
> >> <kirankumark@marvell.com>
> >> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with
> >> page sized chunks of memory
> >>
> >> On 7/23/19 8:38 AM, vattunuru@marvell.com wrote:
> >>> From: Vamsi Attunuru <vattunuru@marvell.com>
> >>>
> >>> Patch adds a routine to populate mempool from page aligned and page
> >>> sized chunks of memory to ensures memory objs do not fall across the
> >>> page boundaries. It's useful for applications that require
> >>> physically contiguous mbuf memory while running in IOVA=VA mode.
> >>>
> >>> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> >>> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> >>> ---
> 
> <...>
> 
> >>> +	int ret;
> >>> +
> >>> +	ret = mempool_ops_alloc_once(mp);
> >>> +	if (ret != 0)
> >>> +		return ret;
> >>> +
> >>> +	if (mp->nb_mem_chunks != 0)
> >>> +		return -EEXIST;
> >>> +
> >>> +	pg_sz = get_min_page_size(mp->socket_id);
> >>> +	pg_shift = rte_bsf32(pg_sz);
> >>> +
> >>> +	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
> >>> +
> >>> +		rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
> >>> +			     &chunk_size, &align);
> >> It is incorrect to ignore mempool pool ops and enforce default
> >> handler. Use rte_mempool_ops_calc_mem_size().
> >> Also it is better to treat negative return value as an error as
> >> default function does.
> >> (May be it my mistake in return value description that it is not mentioned).
> >>
> > Yes, I thought so, but ops_calc_mem_size() would in turn call mempool
> > pmd's calc_mem_size() op which may/may not return required chunk_size
> > and align values in this case. Or else it would be skipped completely and use
> pg_sz for both memzone len and align, anyways this  page sized alignment will
> suits the pmd's specific align requirements.
> 
> Anyway it is incorrect to violate driver ops. default is definitely wrong for
> bucket.
> min_chunk_size and align is mempool driver requirements. You can harden it,
> but should not violate it.

Fine, I will modify the routine as below: call the PMD's calc_mem_size() op and overwrite min_chunk_size if it does not suit this function's purpose.

+       total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+       if (total_elt_sz > pg_sz)
+               return ret;

+       for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {

-               rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
-                            &chunk_size, &align);
+               ret = rte_mempool_ops_calc_mem_size(mp, n, pg_shift,
+                               &min_chunk_size, &align);
+
+               if (ret < 0)
                        goto fail;

+               if (min_chunk_size > pg_sz)
+                       min_chunk_size = pg_sz;

Do these changes look fine?
  
Andrew Rybchenko July 24, 2019, 7:27 a.m. UTC | #5
On 7/24/19 10:09 AM, Vamsi Krishna Attunuru wrote:
>
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Wednesday, July 24, 2019 1:04 AM
>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
>> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
>> olivier.matz@6wind.com; ferruh.yigit@intel.com; anatoly.burakov@intel.com;
>> Kiran Kumar Kokkilagadda <kirankumark@marvell.com>
>> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with page
>> sized chunks of memory
>>
>> On 7/23/19 3:28 PM, Vamsi Krishna Attunuru wrote:
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>>>> Sent: Tuesday, July 23, 2019 4:38 PM
>>>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
>>>> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran
>>>> <jerinj@marvell.com>; olivier.matz@6wind.com; ferruh.yigit@intel.com;
>>>> anatoly.burakov@intel.com; Kiran Kumar Kokkilagadda
>>>> <kirankumark@marvell.com>
>>>> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with
>>>> page sized chunks of memory
>>>>
>>>> On 7/23/19 8:38 AM, vattunuru@marvell.com wrote:
>>>>> From: Vamsi Attunuru <vattunuru@marvell.com>
>>>>>
>>>>> Patch adds a routine to populate mempool from page aligned and page
>>>>> sized chunks of memory to ensures memory objs do not fall across the
>>>>> page boundaries. It's useful for applications that require
>>>>> physically contiguous mbuf memory while running in IOVA=VA mode.
>>>>>
>>>>> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
>>>>> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
>>>>> ---
>> <...>
>>
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = mempool_ops_alloc_once(mp);
>>>>> +	if (ret != 0)
>>>>> +		return ret;
>>>>> +
>>>>> +	if (mp->nb_mem_chunks != 0)
>>>>> +		return -EEXIST;
>>>>> +
>>>>> +	pg_sz = get_min_page_size(mp->socket_id);
>>>>> +	pg_shift = rte_bsf32(pg_sz);
>>>>> +
>>>>> +	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
>>>>> +
>>>>> +		rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
>>>>> +			     &chunk_size, &align);
>>>> It is incorrect to ignore mempool pool ops and enforce default
>>>> handler. Use rte_mempool_ops_calc_mem_size().
>>>> Also it is better to treat negative return value as an error as
>>>> default function does.
>>>> (May be it my mistake in return value description that it is not mentioned).
>>>>
>>> Yes, I thought so, but ops_calc_mem_size() would in turn call mempool
>>> pmd's calc_mem_size() op which may/may not return required chunk_size
>>> and align values in this case. Or else it would be skipped completely and use
>> pg_sz for both memzone len and align, anyways this  page sized alignment will
>> suits the pmd's specific align requirements.
>>
>> Anyway it is incorrect to violate driver ops. default is definitely wrong for
>> bucket.
>> min_chunk_size and align is mempool driver requirements. You can harden it,
>> but should not violate it.
> fine, I will modify the routine as below,  call pmd's calc_mem_size() op and over write min_chunk_size if it does not suit for this function's purpose.
>
> +       total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
> +       if (total_elt_sz > pg_sz)
> +               return ret;
>
> +       for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
>
> -               rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
> -                            &chunk_size, &align);
> +               ret = rte_mempool_ops_calc_mem_size(mp, n, pg_shift,
> +                               &min_chunk_size, &align);
> +
> +               if (ret < 0)
>                          goto fail;
>
> +               if (min_chunk_size > pg_sz)
> +                       min_chunk_size = pg_sz;
>
> Changes look fine.?

No, you can't violate the min_chunk_size requirement. If it is unacceptable,
an error should be returned. So you should not check total_elt_sz above
separately.
  
Vamsi Krishna Attunuru July 29, 2019, 6:25 a.m. UTC | #6
Hi,

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Wednesday, July 24, 2019 12:58 PM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
> olivier.matz@6wind.com; ferruh.yigit@intel.com; anatoly.burakov@intel.com;
> Kiran Kumar Kokkilagadda <kirankumark@marvell.com>
> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with page
> sized chunks of memory
> 
> On 7/24/19 10:09 AM, Vamsi Krishna Attunuru wrote:
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >> Sent: Wednesday, July 24, 2019 1:04 AM
> >> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> >> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran
> >> <jerinj@marvell.com>; olivier.matz@6wind.com; ferruh.yigit@intel.com;
> >> anatoly.burakov@intel.com; Kiran Kumar Kokkilagadda
> >> <kirankumark@marvell.com>
> >> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool with
> >> page sized chunks of memory
> >>
> >> On 7/23/19 3:28 PM, Vamsi Krishna Attunuru wrote:
> >>>> -----Original Message-----
> >>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >>>> Sent: Tuesday, July 23, 2019 4:38 PM
> >>>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> >>>> Cc: thomas@monjalon.net; Jerin Jacob Kollanukkaran
> >>>> <jerinj@marvell.com>; olivier.matz@6wind.com;
> >>>> ferruh.yigit@intel.com; anatoly.burakov@intel.com; Kiran Kumar
> >>>> Kokkilagadda <kirankumark@marvell.com>
> >>>> Subject: Re: [dpdk-dev] [PATCH v8 1/5] mempool: populate mempool
> >>>> with page sized chunks of memory
> >>>>
> >>>> On 7/23/19 8:38 AM, vattunuru@marvell.com wrote:
> >>>>> From: Vamsi Attunuru <vattunuru@marvell.com>
> >>>>>
> >>>>> Patch adds a routine to populate mempool from page aligned and
> >>>>> page sized chunks of memory to ensures memory objs do not fall
> >>>>> across the page boundaries. It's useful for applications that
> >>>>> require physically contiguous mbuf memory while running in IOVA=VA
> mode.
> >>>>>
> >>>>> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> >>>>> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> >>>>> ---
> >> <...>
> >>
> >>>>> +	int ret;
> >>>>> +
> >>>>> +	ret = mempool_ops_alloc_once(mp);
> >>>>> +	if (ret != 0)
> >>>>> +		return ret;
> >>>>> +
> >>>>> +	if (mp->nb_mem_chunks != 0)
> >>>>> +		return -EEXIST;
> >>>>> +
> >>>>> +	pg_sz = get_min_page_size(mp->socket_id);
> >>>>> +	pg_shift = rte_bsf32(pg_sz);
> >>>>> +
> >>>>> +	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
> >>>>> +
> >>>>> +		rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
> >>>>> +			     &chunk_size, &align);
> >>>> It is incorrect to ignore mempool pool ops and enforce default
> >>>> handler. Use rte_mempool_ops_calc_mem_size().
> >>>> Also it is better to treat negative return value as an error as
> >>>> default function does.
> >>>> (May be it my mistake in return value description that it is not mentioned).
> >>>>
> >>> Yes, I thought so, but ops_calc_mem_size() would in turn call
> >>> mempool pmd's calc_mem_size() op which may/may not return required
> >>> chunk_size and align values in this case. Or else it would be
> >>> skipped completely and use
> >> pg_sz for both memzone len and align, anyways this  page sized
> >> alignment will suits the pmd's specific align requirements.
> >>
> >> Anyway it is incorrect to violate driver ops. default is definitely
> >> wrong for bucket.
> >> min_chunk_size and align is mempool driver requirements. You can
> >> harden it, but should not violate it.
> > fine, I will modify the routine as below,  call pmd's calc_mem_size() op and
> over write min_chunk_size if it does not suit for this function's purpose.
> >
> > +       total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
> > +       if (total_elt_sz > pg_sz)
> > +               return ret;
> >
> > +       for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
> >
> > -               rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
> > -                            &chunk_size, &align);
> > +               ret = rte_mempool_ops_calc_mem_size(mp, n, pg_shift,
> > +                               &min_chunk_size, &align);
> > +
> > +               if (ret < 0)
> >                          goto fail;
> >
> > +               if (min_chunk_size > pg_sz)
> > +                       min_chunk_size = pg_sz;
> >
> > Changes look fine.?
> 
> No, you can't violate min_chunk_size requirement. If it is unacceptable, error
> should be returned. So, you should not check total_elt_sz above separately.

Fine, I will put in the min_chunk_size requirement check and return an error if it's inappropriate.
I will send v9 addressing all the other comments.
Btw, sorry for the delayed response.
  

Patch

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 7260ce0..5312c8f 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -414,6 +414,65 @@  rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	return ret;
 }
 
+/* Function to populate mempool from page sized mem chunks, allocate page size
+ * of memory in memzone and populate them. Return the number of objects added,
+ * or a negative value on error.
+ */
+int
+rte_mempool_populate_from_pg_sz_chunks(struct rte_mempool *mp)
+{
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	size_t align, pg_sz, pg_shift;
+	const struct rte_memzone *mz;
+	unsigned int mz_id, n;
+	size_t chunk_size;
+	int ret;
+
+	ret = mempool_ops_alloc_once(mp);
+	if (ret != 0)
+		return ret;
+
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+
+	pg_sz = get_min_page_size(mp->socket_id);
+	pg_shift = rte_bsf32(pg_sz);
+
+	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
+
+		rte_mempool_op_calc_mem_size_default(mp, n, pg_shift,
+			     &chunk_size, &align);
+
+		if (chunk_size > pg_sz)
+			goto fail;
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			ret = -ENAMETOOLONG;
+			goto fail;
+		}
+
+		mz = rte_memzone_reserve_aligned(mz_name, chunk_size,
+				mp->socket_id, 0, align);
+
+		ret = rte_mempool_populate_iova(mp, mz->addr,
+				mz->iova, mz->len,
+				rte_mempool_memchunk_mz_free,
+				(void *)(uintptr_t)mz);
+		if (ret < 0) {
+			rte_memzone_free(mz);
+			goto fail;
+		}
+	}
+
+	return mp->size;
+
+fail:
+	rte_mempool_free_memchunks(mp);
+	return ret;
+}
+
 /* Default function to populate the mempool: allocate memory in memzones,
  * and populate them. Return the number of objects added, or a negative
  * value on error.
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8053f7a..73d6ada 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1064,6 +1064,23 @@  rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 /**
  * Add memory for objects in the pool at init
  *
+ * This is the function used to populate the mempool with page aligned memzone
+ * memory. It ensures all mempool objects being on the page by allocating
+ * memzones with page size.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+__rte_experimental
+int rte_mempool_populate_from_pg_sz_chunks(struct rte_mempool *mp);
+
+/**
+ * Add memory for objects in the pool at init
+ *
  * This is the default function used by rte_mempool_create() to populate
  * the mempool. It adds memory allocated using rte_memzone_reserve().
  *
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 17cbca4..9a6fe65 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -57,4 +57,5 @@  EXPERIMENTAL {
 	global:
 
 	rte_mempool_ops_get_info;
+	rte_mempool_populate_from_pg_sz_chunks;
 };