[v5] mempool cache: add zero-copy get and put functions

Message ID 20221227151700.80887-1-mb@smartsharesystems.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Series [v5] mempool cache: add zero-copy get and put functions

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-broadcom-Performance success Performance Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/github-robot: build success github build: passed
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS

Commit Message

Morten Brørup Dec. 27, 2022, 3:17 p.m. UTC
  Zero-copy access to mempool caches is beneficial for PMD performance, and
must be provided by the mempool library to fix [Bug 1052] without a
performance regression.

[Bug 1052]: https://bugs.dpdk.org/show_bug.cgi?id=1052

Bugzilla ID: 1052

v5:
* Bugfix: Compare zero-copy get request to the cache size instead of the
  flush threshold; otherwise refill could overflow the memory allocated
  for the cache. (Andrew)
* Split the zero-copy put function into an internal function doing the
  work, and a public function with trace.
* Avoid code duplication by rewriting rte_mempool_do_generic_put() to use
  the internal zero-copy put function. (Andrew)
* Corrected the return type of rte_mempool_cache_zc_put_bulk() from void *
  to void **; it returns a pointer to an array of objects.
* Fix coding style: Add missing curly brackets. (Andrew)
v4:
* Fix checkpatch warnings.
v3:
* Bugfix: Respect the cache size; compare to the flush threshold instead
  of RTE_MEMPOOL_CACHE_MAX_SIZE.
* Added 'rewind' function for incomplete 'put' operations. (Konstantin)
* Replace RTE_ASSERTs with runtime checks of the request size.
  Instead of failing, return NULL if the request is too big. (Konstantin)
* Modified comparison to prevent overflow if n is really huge and len is
  non-zero.
* Updated the comments in the code.
v2:
* Fix checkpatch warnings.
* Fix missing registration of trace points.
* The functions are inline, so they don't go into the map file.
v1 changes from the RFC:
* Removed run-time parameter checks. (Honnappa)
  This is a hot fast path function; it requires correct application
  behaviour, i.e. function parameters must be valid.
* Added RTE_ASSERT for parameters instead.
  Code for this is only generated if built with RTE_ENABLE_ASSERT.
* Removed fallback when 'cache' parameter is not set. (Honnappa)
* Chose the simple get function; i.e. do not move the existing objects in
  the cache to the top of the new stack, just leave them at the bottom.
* Renamed the functions. Other suggestions are welcome, of course. ;-)
* Updated the function descriptions.
* Added the functions to trace_fp and version.map.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/mempool/mempool_trace_points.c |   9 ++
 lib/mempool/rte_mempool.h          | 237 +++++++++++++++++++++++++----
 lib/mempool/rte_mempool_trace_fp.h |  23 +++
 lib/mempool/version.map            |   5 +
 4 files changed, 245 insertions(+), 29 deletions(-)
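
For orientation, the sketch below shows how a driver's TX free path might use
the new zero-copy put API. Only the rte_mempool_cache_zc_put_bulk() call and
its NULL-return contract come from this patch; tx_free_bufs(), the txep array
and the fallback policy are hypothetical illustrations.

#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Hypothetical TX completion helper: free n transmitted mbufs by writing
 * their pointers directly into the lcore's mempool cache. Assumes the pool
 * has a per-lcore cache. */
static void
tx_free_bufs(struct rte_mbuf **txep, unsigned int n)
{
	struct rte_mempool *mp = txep[0]->pool;
	struct rte_mempool_cache *cache =
		rte_mempool_default_cache(mp, rte_lcore_id());
	void **cache_objs;
	unsigned int i;

	/* Reserve n slots in the cache; may flush it to the backend first. */
	cache_objs = rte_mempool_cache_zc_put_bulk(cache, mp, n);
	if (cache_objs != NULL) {
		/* Zero-copy: store the object pointers straight into the cache. */
		for (i = 0; i < n; i++)
			cache_objs[i] = txep[i];
	} else {
		/* Request exceeds the flush threshold; use the normal path. */
		rte_mempool_put_bulk(mp, (void **)txep, n);
	}
}

If the caller ends up copying only m < n objects into the returned slots,
rte_mempool_cache_zc_put_rewind(cache, n - m) backs the reservation out again.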
  

Comments

Konstantin Ananyev Jan. 22, 2023, 8:34 p.m. UTC | #1
Hi Morten,

Few nits, see below.
Also I still think we do need a test case for _zc_get_ before
accepting it in the mainline.
With that in place:
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
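
A rough sketch of what such a test could look like, using only the API
contract documented in this patch (the function name, the fixed request size
and the pass/fail handling are hypothetical; a real test would follow the
app/test conventions):

/* Lease n objects via the zero-copy get, verify them, and hand them back
 * through the zero-copy put. */
static int
test_mempool_cache_zc_get(struct rte_mempool *mp,
		struct rte_mempool_cache *cache)
{
	const unsigned int n = 4;
	void **objs;
	void **slots;
	unsigned int i;

	objs = rte_mempool_cache_zc_get_bulk(cache, mp, n);
	if (objs == NULL)
		return -1; /* rte_errno holds the cause */

	/* The caller now owns the n object pointers in objs[0..n-1]. */
	for (i = 0; i < n; i++)
		if (objs[i] == NULL)
			return -1;

	/* Return the objects through the zero-copy put path. */
	slots = rte_mempool_cache_zc_put_bulk(cache, mp, n);
	if (slots == NULL)
		return -1;
	for (i = 0; i < n; i++)
		slots[i] = objs[i];

	return 0;
}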

[...]
> 
> diff --git a/lib/mempool/mempool_trace_points.c b/lib/mempool/mempool_trace_points.c
> index 4ad76deb34..83d353a764 100644
> --- a/lib/mempool/mempool_trace_points.c
> +++ b/lib/mempool/mempool_trace_points.c
> @@ -77,3 +77,12 @@ RTE_TRACE_POINT_REGISTER(rte_mempool_trace_ops_free,
>   
>   RTE_TRACE_POINT_REGISTER(rte_mempool_trace_set_ops_byname,
>   	lib.mempool.set.ops.byname)
> +
> +RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_put_bulk,
> +	lib.mempool.cache.zc.put.bulk)
> +
> +RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_put_rewind,
> +	lib.mempool.cache.zc.put.rewind)
> +
> +RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_get_bulk,
> +	lib.mempool.cache.zc.get.bulk)
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 9f530db24b..5efd3c2b5b 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -47,6 +47,7 @@
>   #include <rte_ring.h>
>   #include <rte_memcpy.h>
>   #include <rte_common.h>
> +#include <rte_errno.h>
>   
>   #include "rte_mempool_trace_fp.h"
>   
> @@ -1346,6 +1347,197 @@ rte_mempool_cache_flush(struct rte_mempool_cache *cache,
>   	cache->len = 0;
>   }
>   
> +/**
> + * @internal used by rte_mempool_cache_zc_put_bulk() and rte_mempool_do_generic_put().
> + *
> + * Zero-copy put objects in a user-owned mempool cache backed by the specified mempool.
> + *
> + * @param cache
> + *   A pointer to the mempool cache.
> + * @param mp
> + *   A pointer to the mempool.
> + * @param n
> + *   The number of objects to be put in the mempool cache.
> + * @return
> + *   The pointer to where to put the objects in the mempool cache.
> + *   NULL if the request itself is too big for the cache, i.e.
> + *   exceeds the cache flush threshold.
> + */
> +static __rte_always_inline void **
> +__rte_mempool_cache_zc_put_bulk(struct rte_mempool_cache *cache,
> +		struct rte_mempool *mp,
> +		unsigned int n)
> +{
> +	void **cache_objs;
> +
> +	RTE_ASSERT(cache != NULL);
> +	RTE_ASSERT(mp != NULL);
> +
> +	if (n <= cache->flushthresh - cache->len) {
> +		/*
> +		 * The objects can be added to the cache without crossing the
> +		 * flush threshold.
> +		 */
> +		cache_objs = &cache->objs[cache->len];
> +		cache->len += n;
> +	} else if (likely(n <= cache->flushthresh)) {
> +		/*
> +		 * The request itself fits into the cache.
> +		 * But first, the cache must be flushed to the backend, so
> +		 * adding the objects does not cross the flush threshold.
> +		 */
> +		cache_objs = &cache->objs[0];
> +		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> +		cache->len = n;
> +	} else {
> +		/* The request itself is too big for the cache. */
> +		return NULL;
> +	}
> +
> +	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> +	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> +
> +	return cache_objs;
> +}
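
As a worked example of the branch logic above (numbers illustrative): with
flushthresh = 307 and len = 300, a put of n = 8 takes the second branch;
the 300 cached objects are flushed to the backend and the caller's 8 objects
go to objs[0..7], leaving len = 8. A put of n = 308 would return NULL.
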
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
> + *
> + * Zero-copy put objects in a user-owned mempool cache backed by the specified mempool.
> + *
> + * @param cache
> + *   A pointer to the mempool cache.
> + * @param mp
> + *   A pointer to the mempool.
> + * @param n
> + *   The number of objects to be put in the mempool cache.
> + * @return
> + *   The pointer to where to put the objects in the mempool cache.
> + *   NULL if the request itself is too big for the cache, i.e.
> + *   exceeds the cache flush threshold.
> + */
> +__rte_experimental
> +static __rte_always_inline void **
> +rte_mempool_cache_zc_put_bulk(struct rte_mempool_cache *cache,
> +		struct rte_mempool *mp,
> +		unsigned int n)
> +{
> +	RTE_ASSERT(cache != NULL);
> +	RTE_ASSERT(mp != NULL);
> +
> +	rte_mempool_trace_cache_zc_put_bulk(cache, mp, n);
> +	return __rte_mempool_cache_zc_put_bulk(cache, mp, n);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
> + *
> + * Zero-copy un-put objects in a user-owned mempool cache.
> + *
> + * @param cache
> + *   A pointer to the mempool cache.
> + * @param n
> + *   The number of objects not put in the mempool cache after calling
> + *   rte_mempool_cache_zc_put_bulk().
> + */
> +__rte_experimental
> +static __rte_always_inline void
> +rte_mempool_cache_zc_put_rewind(struct rte_mempool_cache *cache,
> +		unsigned int n)
> +{
> +	RTE_ASSERT(cache != NULL);
> +	RTE_ASSERT(n <= cache->len);
> +
> +	rte_mempool_trace_cache_zc_put_rewind(cache, n);
> +
> +	cache->len -= n;
> +
> +	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, (int)-n);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
> + *
> + * Zero-copy get objects from a user-owned mempool cache backed by the specified mempool.
> + *
> + * @param cache
> + *   A pointer to the mempool cache.
> + * @param mp
> + *   A pointer to the mempool.
> + * @param n
> + *   The number of objects to prefetch into the mempool cache.

Why not 'get' instead of 'prefetch'?


> + * @return
> + *   The pointer to the objects in the mempool cache.
> + *   NULL on error; i.e. the cache + the pool does not contain 'n' objects.
> + *   With rte_errno set to the error code of the mempool dequeue function,
> + *   or EINVAL if the request itself is too big for the cache, i.e.
> + *   exceeds the cache flush threshold.
> + */
> +__rte_experimental
> +static __rte_always_inline void *
> +rte_mempool_cache_zc_get_bulk(struct rte_mempool_cache *cache,
> +		struct rte_mempool *mp,
> +		unsigned int n)
> +{
> +	unsigned int len, size;
> +
> +	RTE_ASSERT(cache != NULL);
> +	RTE_ASSERT(mp != NULL);
> +
> +	rte_mempool_trace_cache_zc_get_bulk(cache, mp, n);
> +
> +	len = cache->len;
> +	size = cache->size;
> +
> +	if (n <= len) {
> +		/* The request can be satisfied from the cache as is. */
> +		len -= n;
> +	} else if (likely(n <= size)) {
> +		/*
> +		 * The request itself can be satisfied from the cache.
> +		 * But first, the cache must be filled from the backend;
> +		 * fetch size + requested - len objects.
> +		 */
> +		int ret;
> +
> +		ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], size + n - len);
> +		if (unlikely(ret < 0)) {
> +			/*
> +			 * We are buffer constrained.
> +			 * Do not fill the cache, just satisfy the request.
> +			 */
> +			ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], n - len);
> +			if (unlikely(ret < 0)) {
> +				/* Unable to satisfy the request. */
> +
> +				RTE_MEMPOOL_STAT_ADD(mp, get_fail_bulk, 1);
> +				RTE_MEMPOOL_STAT_ADD(mp, get_fail_objs, n);
> +
> +				rte_errno = -ret;
> +				return NULL;
> +			}
> +
> +			len = 0;
> +		} else {
> +			len = size;
> +		}
> +	} else {
> +		/* The request itself is too big for the cache. */
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
> +
> +	cache->len = len;
> +
> +	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
> +	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
> +
> +	return &cache->objs[len];
> +}
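
As a worked example of the refill above (numbers illustrative): with
size = 256 and len = 10, a get of n = 32 dequeues size + n - len = 278
objects from the backend into objs[10..287]; the request is then served from
objs[256..287] and the cache is left holding exactly size = 256 objects.
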
> +
>   /**
>    * @internal Put several objects back in the mempool; used internally.
>    * @param mp
> @@ -1364,32 +1556,25 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
>   {
>   	void **cache_objs;
>   
> -	/* No cache provided */
> -	if (unlikely(cache == NULL))
> -		goto driver_enqueue;
> +	/* No cache provided? */
> +	if (unlikely(cache == NULL)) {
> +		/* Increment stats now, adding in mempool always succeeds. */
> +		RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> +		RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
>   
> -	/* increment stat now, adding in mempool always success */
> -	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> -	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> +		goto driver_enqueue;
> +	}
>   
> -	/* The request itself is too big for the cache */
> -	if (unlikely(n > cache->flushthresh))
> -		goto driver_enqueue_stats_incremented;
> +	/* Prepare to add the objects to the cache. */
> +	cache_objs = __rte_mempool_cache_zc_put_bulk(cache, mp, n);
>   
> -	/*
> -	 * The cache follows the following algorithm:
> -	 *   1. If the objects cannot be added to the cache without crossing
> -	 *      the flush threshold, flush the cache to the backend.
> -	 *   2. Add the objects to the cache.
> -	 */
> +	/* The request itself is too big for the cache? */
> +	if (unlikely(cache_objs == NULL)) {
> +		/* Increment stats now, adding in mempool always succeeds. */
> +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);

Shouldn't it be RTE_MEMPOOL_STAT_ADD() here?

>   
> -	if (cache->len + n <= cache->flushthresh) {
> -		cache_objs = &cache->objs[cache->len];
> -		cache->len += n;
> -	} else {
> -		cache_objs = &cache->objs[0];
> -		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> -		cache->len = n;
> +		goto driver_enqueue;
>   	}
>   
>   	/* Add the objects to the cache. */
> @@ -1399,13 +1584,7 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
>   
>   driver_enqueue:
>   
> -	/* increment stat now, adding in mempool always success */
> -	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> -	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> -
> -driver_enqueue_stats_incremented:
> -
> -	/* push objects to the backend */
> +	/* Push the objects to the backend. */
>   	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
>   }
>   
> diff --git a/lib/mempool/rte_mempool_trace_fp.h b/lib/mempool/rte_mempool_trace_fp.h
> index ed060e887c..14666457f7 100644
> --- a/lib/mempool/rte_mempool_trace_fp.h
> +++ b/lib/mempool/rte_mempool_trace_fp.h
> @@ -109,6 +109,29 @@ RTE_TRACE_POINT_FP(
>   	rte_trace_point_emit_ptr(mempool);
>   )
>   
> +RTE_TRACE_POINT_FP(
> +	rte_mempool_trace_cache_zc_put_bulk,
> +	RTE_TRACE_POINT_ARGS(void *cache, void *mempool, uint32_t nb_objs),
> +	rte_trace_point_emit_ptr(cache);
> +	rte_trace_point_emit_ptr(mempool);
> +	rte_trace_point_emit_u32(nb_objs);
> +)
> +
> +RTE_TRACE_POINT_FP(
> +	rte_mempool_trace_cache_zc_put_rewind,
> +	RTE_TRACE_POINT_ARGS(void *cache, uint32_t nb_objs),
> +	rte_trace_point_emit_ptr(cache);
> +	rte_trace_point_emit_u32(nb_objs);
> +)
> +
> +RTE_TRACE_POINT_FP(
> +	rte_mempool_trace_cache_zc_get_bulk,
> +	RTE_TRACE_POINT_ARGS(void *cache, void *mempool, uint32_t nb_objs),
> +	rte_trace_point_emit_ptr(cache);
> +	rte_trace_point_emit_ptr(mempool);
> +	rte_trace_point_emit_u32(nb_objs);
> +)
> +
>   #ifdef __cplusplus
>   }
>   #endif
> diff --git a/lib/mempool/version.map b/lib/mempool/version.map
> index b67d7aace7..1383ae6db2 100644
> --- a/lib/mempool/version.map
> +++ b/lib/mempool/version.map
> @@ -63,6 +63,11 @@ EXPERIMENTAL {
>   	__rte_mempool_trace_ops_alloc;
>   	__rte_mempool_trace_ops_free;
>   	__rte_mempool_trace_set_ops_byname;
> +
> +	# added in 23.03
> +	__rte_mempool_trace_cache_zc_put_bulk;
> +	__rte_mempool_trace_cache_zc_put_rewind;
> +	__rte_mempool_trace_cache_zc_get_bulk;
>   };
>   
>   INTERNAL {
  
Morten Brørup Jan. 22, 2023, 9:17 p.m. UTC | #2
> From: Konstantin Ananyev [mailto:konstantin.v.ananyev@yandex.ru]
> Sent: Sunday, 22 January 2023 21.35
> 
> Hi Morten,
> 
> Few nits, see below.
> Also I still think we do need a test case for _zc_get_ before
> accepting it in the mainline.

Poking at my bad conscience... :-)

It's on my todo-list. Apparently not high enough. ;-)

> With that in place:
> Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> 
[...]
> > + * @param n
> > + *   The number of objects to prefetch into the mempool cache.
> 
> Why not 'get' instead of 'prefetch'?

This was my thinking:

The function "prefetches" the objects into the cache. It is the application itself that "gets" the objects from the cache after having called the function.
You might also notice that the n parameter for the zc_put() function is described as "to be put" (future), not "to put" (now) in the cache.

On the other hand, I chose "Zero-copy get" for the function headline to keep it simple.

If you think "get" is a more correct description of the n parameter, I can change it.

Alternatively, I can use the same style as zc_put(), i.e. "to be gotten from the mempool cache" - but that would require input from a natively English speaking person, because Danish and English grammar is very different, and I am highly uncertain about my English grammar here! I originally considered this phrase, but concluded that the "prefetch" description was easier to understand - especially for non-native English readers.
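
To make the intended call pattern concrete, a short sketch (use_object() is a
hypothetical consumer):

/* The call moves n object pointers into the cache and returns their
 * location; the application then reads ("gets") them from that array. */
void **objs = rte_mempool_cache_zc_get_bulk(cache, mp, n);
if (objs != NULL) {
	for (unsigned int i = 0; i < n; i++)
		use_object(objs[i]);
}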

[...]

> > +	/* The request itself is too big for the cache? */
> > +	if (unlikely(cache_objs == NULL)) {
> > +		/* Increment stats now, adding in mempool always succeeds. */
> > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> 
> Shouldn't it be RTE_MEMPOOL_STAT_ADD() here?

I can see why you are wondering, but the answer is no. The statistics in mempool cache are not related to the cache, they are related to the mempool; they are there to provide faster per-lcore update access [1].

[1]: https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L94

  
Konstantin Ananyev Jan. 23, 2023, 11:53 a.m. UTC | #3
> > Few nits, see below.
> > Also I still think we do need a test case for _zc_get_ before
> > accepting it in the mainline.
> 
> Poking at my bad conscience... :-)
> 
> It's on my todo-list. Apparently not high enough. ;-)
> 
> > With that in place:
> > Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> >
[...]
> > > + * @param n
> > > + *   The number of objects to prefetch into the mempool cache.
> >
> > Why not 'get' instead of 'prefetch'?
> 
> This was my thinking:
> 
> The function "prefetches" the objects into the cache. It is the application
> itself that "gets" the objects from the cache after having called the
> function. You might also notice that the n parameter for the zc_put()
> function is described as "to be put" (future), not "to put" (now) in the
> cache.
> 
> On the other hand, I chose "Zero-copy get" for the function headline to keep
> it simple.
> 
> If you think "get" is a more correct description of the n parameter, I can
> change it.
> 
> Alternatively, I can use the same style as zc_put(), i.e. "to be gotten from
> the mempool cache" - but that would require input from a natively English
> speaking person, because Danish and English grammar is very different, and I
> am highly uncertain about my English grammar here! I originally considered
> this phrase, but concluded that the "prefetch" description was easier to
> understand - especially for non-native English readers.

For me 'prefetch' seems a bit unclear in that situation...
Probably: "number of objects that the user plans to extract from the cache"?
But again, I am not a native English speaker either, so maybe someone can suggest a better option.

[...]

> > > +	/* The request itself is too big for the cache? */
> > > +	if (unlikely(cache_objs == NULL)) {
> > > +		/* Increment stats now, adding in mempool always succeeds. */
> > > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> >
> > Shouldn't it be RTE_MEMPOOL_STAT_ADD() here?
> 
> I can see why you are wondering, but the answer is no. The statistics in mempool cache are not related to the cache, they are related
> to the mempool; they are there to provide faster per-lcore update access [1].
> 
> [1]: https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L94

But the condition above:
if (unlikely(cache_objs == NULL))
means that we can't put these objects into the cache and have to put the
objects straight into the pool (skipping the cache completely), right?
If so, then why update the cache stats instead of the pool stats?

  
Morten Brørup Jan. 23, 2023, 12:23 p.m. UTC | #4
> From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> Sent: Monday, 23 January 2023 12.54
> 
> > > Few nits, see below.
> > > Also I still think we do need a test case for _zc_get_ before
> > > accepting it in the mainline.
> >
> > Poking at my bad conscience... :-)
> >
> > It's on my todo-list. Apparently not high enough. ;-)
> >
> > > With that in place:
> > > Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > >

[...]

> > > > + * Zero-copy get objects from a user-owned mempool cache backed by
> > > > + * the specified mempool.
> > > > + *
> > > > + * @param cache
> > > > + *   A pointer to the mempool cache.
> > > > + * @param mp
> > > > + *   A pointer to the mempool.
> > > > + * @param n
> > > > + *   The number of objects to prefetch into the mempool cache.
> > >
> > > Why not 'get' instead of 'prefetch'?
> > [...]
> 
> For me 'prefetch' seems a bit unclear in that situation...
> Probably: "number of objects that the user plans to extract from the cache"?
> But again, I am not a native English speaker either, so maybe someone can
> suggest a better option.
> 

@Bruce (or any other native English speaking person), your input would be appreciated here!


[...]

> > > > +	/* The request itself is too big for the cache? */
> > > > +	if (unlikely(cache_objs == NULL)) {
> > > > +		/* Increment stats now, adding in mempool always succeeds. */
> > > > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > > > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> > >
> > > Shouldn't it be RTE_MEMPOOL_STAT_ADD() here?
> >
> > I can see why you are wondering, but the answer is no. The statistics
> in mempool cache are not related to the cache, they are related
> > to the mempool; they are there to provide faster per-lcore update
> access [1].
> >
> > [1]:
> https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L94
> 
> But the condition above:
> if (unlikely(cache_objs == NULL))
> means that we can't put these objects to the cache and have to put
> objects straight to the pool (skipping the cache completely), right?

Correct.

> If so, then why to update cache stats instead of pool stats?

Because updating the stats in the cache structure is faster than updating the stats in the pool structure. Refer to the two macros: RTE_MEMPOOL_STAT_ADD() [2] is effectively five lines of code, but RTE_MEMPOOL_CACHE_STAT_ADD(cache, name, n) [3] is a one-liner: ((cache)->stats.name += (n)).

[2]: https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L325
[3]: https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L348

And to reiterate that this is the correct behavior here, I will rephrase my previous response: The stats kept in the cache are part of the pool stats, they are not stats for the cache itself.
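For illustration, here is roughly what the two macros look like (paraphrased and abbreviated from the linked v22.11 source; see [2] and [3] for the exact definitions):

/* Per-pool stats: the macro must first locate this lcore's stats slot. */
#define RTE_MEMPOOL_STAT_ADD(mp, name, n) do {                      \
		unsigned int __lcore_id = rte_lcore_id();           \
		if (__lcore_id < RTE_MAX_LCORE)                     \
			(mp)->stats[__lcore_id].name += (n);        \
	} while (0)

/* Per-cache stats: a single add on a structure the lcore already owns. */
#define RTE_MEMPOOL_CACHE_STAT_ADD(cache, name, n) ((cache)->stats.name += (n))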

> > > >
> > > > -	if (cache->len + n <= cache->flushthresh) {
> > > > -		cache_objs = &cache->objs[cache->len];
> > > > -		cache->len += n;
> > > > -	} else {
> > > > -		cache_objs = &cache->objs[0];
> > > > -		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> > > > -		cache->len = n;
> > > > +		goto driver_enqueue;
> > > >   	}
> > > >
> > > >   	/* Add the objects to the cache. */
  
Konstantin Ananyev Jan. 23, 2023, 12:52 p.m. UTC | #5
> > > > > @@ -1364,32 +1556,25 @@ rte_mempool_do_generic_put(struct
> > rte_mempool
> > > > *mp, void * const *obj_table,
> > > > >   {
> > > > >   	void **cache_objs;
> > > > >
> > > > > -	/* No cache provided */
> > > > > -	if (unlikely(cache == NULL))
> > > > > -		goto driver_enqueue;
> > > > > +	/* No cache provided? */
> > > > > +	if (unlikely(cache == NULL)) {
> > > > > +		/* Increment stats now, adding in mempool always
> > succeeds.
> > > > */
> > > > > +		RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> > > > > +		RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> > > > >
> > > > > -	/* increment stat now, adding in mempool always success */
> > > > > -	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > > > > -	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> > > > > +		goto driver_enqueue;
> > > > > +	}
> > > > >
> > > > > -	/* The request itself is too big for the cache */
> > > > > -	if (unlikely(n > cache->flushthresh))
> > > > > -		goto driver_enqueue_stats_incremented;
> > > > > +	/* Prepare to add the objects to the cache. */
> > > > > +	cache_objs = __rte_mempool_cache_zc_put_bulk(cache, mp, n);
> > > > >
> > > > > -	/*
> > > > > -	 * The cache follows the following algorithm:
> > > > > -	 *   1. If the objects cannot be added to the cache without
> > > > crossing
> > > > > -	 *      the flush threshold, flush the cache to the
> > backend.
> > > > > -	 *   2. Add the objects to the cache.
> > > > > -	 */
> > > > > +	/* The request itself is too big for the cache? */
> > > > > +	if (unlikely(cache_objs == NULL)) {
> > > > > +		/* Increment stats now, adding in mempool always
> > succeeds.
> > > > */
> > > > > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > > > > +		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> > > >
> > > > Shouldn't it be RTE_MEMPOOL_STAT_ADD() here?
> > >
> > > I can see why you are wondering, but the answer is no. The statistics
> > in mempool cache are not related to the cache, they are related
> > > to the mempool; they are there to provide faster per-lcore update
> > access [1].
> > >
> > > [1]:
> > https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L94
> >
> > But the condition above:
> > if (unlikely(cache_objs == NULL))
> > means that we can't put these objects to the cache and have to put
> > objects straight to the pool (skipping the cache completely), right?
> 
> Correct.
> 
> > If so, then why to update cache stats instead of pool stats?
> 
> Because updating the stats in the cache structure is faster than updating the stats in the pool structure. Refer to the two macros:
> RTE_MEMPOOL_STAT_ADD() [2] is effectively five lines of code, but RTE_MEMPOOL_CACHE_STAT_ADD(cache, name, n) [3] is a one-
> liner: ((cache)->stats.name += (n)).
> 
> [2]: https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L325
> [3]: https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/mempool/rte_mempool.h#L348
> 
> And to reiterate that this is the correct behavior here, I will rephrase my previous response: The stats kept in the cache are part of the
> pool stats, they are not stats for the cache itself.

Ah ok, that's the same as the current behavior.
It still looks a bit strange to me that we are incrementing the cache (not pool) stats here.
But that's another story, so no extra comments from me on that case.

> 
> > > > >
> > > > > -	if (cache->len + n <= cache->flushthresh) {
> > > > > -		cache_objs = &cache->objs[cache->len];
> > > > > -		cache->len += n;
> > > > > -	} else {
> > > > > -		cache_objs = &cache->objs[0];
> > > > > -		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> > > > > -		cache->len = n;
> > > > > +		goto driver_enqueue;
> > > > >   	}
> > > > >
> > > > >   	/* Add the objects to the cache. */
  
Bruce Richardson Jan. 23, 2023, 2:30 p.m. UTC | #6
On Mon, Jan 23, 2023 at 01:23:50PM +0100, Morten Brørup wrote:
> > From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> > Sent: Monday, 23 January 2023 12.54
> > 
> > > > Few nits, see below.
> > > > Also I still think we do need a test case for _zc_get_ before
> > > > accepting it in the mainline.
> > >
> > > Poking at my bad conscience... :-)
> > >
> > > It's on my todo-list. Apparently not high enough. ;-)
> > >
> > > > With that in place:
> > > > Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > > >
> 
> [...]
> 
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: This API may change, or be removed, without
> > > > prior notice.
> > > > > + *
> > > > > + * Zero-copy put objects in a user-owned mempool cache backed by
> > the
> > > > specified mempool.
> > > > > + *
> > > > > + * @param cache
> > > > > + *   A pointer to the mempool cache.
> > > > > + * @param mp
> > > > > + *   A pointer to the mempool.
> > > > > + * @param n
> > > > > + *   The number of objects to be put in the mempool cache.
> > > > > + * @return
> > > > > + *   The pointer to where to put the objects in the mempool
> > cache.
> > > > > + *   NULL if the request itself is too big for the cache, i.e.
> > > > > + *   exceeds the cache flush threshold.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static __rte_always_inline void **
> > > > > +rte_mempool_cache_zc_put_bulk(struct rte_mempool_cache *cache,
> > > > > +		struct rte_mempool *mp,
> > > > > +		unsigned int n)
> > > > > +{
> > > > > +	RTE_ASSERT(cache != NULL);
> > > > > +	RTE_ASSERT(mp != NULL);
> > > > > +
> > > > > +	rte_mempool_trace_cache_zc_put_bulk(cache, mp, n);
> > > > > +	return __rte_mempool_cache_zc_put_bulk(cache, mp, n);
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: This API may change, or be removed, without
> > > > prior notice.
> > > > > + *
> > > > > + * Zero-copy un-put objects in a user-owned mempool cache.
> > > > > + *
> > > > > + * @param cache
> > > > > + *   A pointer to the mempool cache.
> > > > > + * @param n
> > > > > + *   The number of objects not put in the mempool cache after
> > > > calling
> > > > > + *   rte_mempool_cache_zc_put_bulk().
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static __rte_always_inline void
> > > > > +rte_mempool_cache_zc_put_rewind(struct rte_mempool_cache *cache,
> > > > > +		unsigned int n)
> > > > > +{
> > > > > +	RTE_ASSERT(cache != NULL);
> > > > > +	RTE_ASSERT(n <= cache->len);
> > > > > +
> > > > > +	rte_mempool_trace_cache_zc_put_rewind(cache, n);
> > > > > +
> > > > > +	cache->len -= n;
> > > > > +
> > > > > +	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, (int)-n);
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: This API may change, or be removed, without
> > > > prior notice.
> > > > > + *
> > > > > + * Zero-copy get objects from a user-owned mempool cache backed
> > by
> > > > the specified mempool.
> > > > > + *
> > > > > + * @param cache
> > > > > + *   A pointer to the mempool cache.
> > > > > + * @param mp
> > > > > + *   A pointer to the mempool.
> > > > > + * @param n
> > > > > + *   The number of objects to prefetch into the mempool cache.
> > > >
> > > > Why not 'get' instead of 'prefetch'?
> > >
> > > This was my thinking:
> > >
> > > The function "prefetches" the objects into the cache. It is the
> > application itself that "gets" the objects from the cache after having
> > > called the function.
> > > You might also notice that the n parameter for the zc_put() function
> > is described as "to be put" (future), not "to put" (now) in the
> > > cache.
> > >
> > > On the other hand, I chose "Zero-copy get" for the function headline
> > to keep it simple.
> > >
> > > If you think "get" is a more correct description of the n parameter,
> > I can change it.
> > >
> > > Alternatively, I can use the same style as zc_put(), i.e. "to be
> > gotten from the mempool cache" - but that would require input from a
> > > natively English speaking person, because Danish and English grammar
> > is very different, and I am highly uncertain about my English
> > > grammar here! I originally considered this phrase, but concluded that
> > the "prefetch" description was easier to understand - especially
> > > for non-native English readers.
> > 
> > For me 'prefetch' seems a bit unclear in that situation...
> > Probably: "number of objects that user plans to extract from the
> > cache"?
> > But again, I am not a native English speaker either, so maybe someone can
> > suggest a better option.
> > 
> 
> @Bruce (or any other native English speaking person), your input would be appreciated here!
> 
I was happily ignoring this thread until you went and dragged me in with a
hard question. :-)

I think the longer the explanation, the clearer it is likely to be. How about
"number of objects to be made available for extraction from the cache"? I
don't like the reference to "the user" in the longer suggestion above, but
otherwise consider it clearer than talking of prefetching or "getting".

My 2c.

/Bruce
  
Kamalakshitha Aligeri Jan. 24, 2023, 1:53 a.m. UTC | #7
-----Original Message-----
From: Bruce Richardson <bruce.richardson@intel.com> 
Sent: Monday, January 23, 2023 6:31 AM
To: Morten Brørup <mb@smartsharesystems.com>
Cc: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>; olivier.matz@6wind.com; andrew.rybchenko@oktetlabs.ru; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Kamalakshitha Aligeri <Kamalakshitha.Aligeri@arm.com>; dev@dpdk.org; nd <nd@arm.com>
Subject: Re: [PATCH v5] mempool cache: add zero-copy get and put functions

On Mon, Jan 23, 2023 at 01:23:50PM +0100, Morten Brørup wrote:
> > From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> > Sent: Monday, 23 January 2023 12.54
> > 
> > > > Few nits, see below.
> > > > Also I still think we do need a test case for _zc_get_ before 
> > > > accepting it in the mainline.
> > >
I am working on the test cases. Will submit them soon.
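For reference, a rough sketch of what a minimal test for the zero-copy get path could look like (hypothetical code, not the actual test patch; it assumes 'mp' was created with a non-zero per-lcore cache):

#include <rte_lcore.h>
#include <rte_mempool.h>

static int
test_zc_get(struct rte_mempool *mp)
{
	struct rte_mempool_cache *cache;
	void *local[4];
	void **objs;
	unsigned int i;

	cache = rte_mempool_default_cache(mp, rte_lcore_id());
	if (cache == NULL)
		return -1;

	/* Zero-copy get: 'objs' points directly into the cache array. */
	objs = rte_mempool_cache_zc_get_bulk(cache, mp, 4);
	if (objs == NULL)
		return -1;

	/* Copy the object pointers out before the cache is reused. */
	for (i = 0; i < 4; i++)
		local[i] = objs[i];

	/* Return the objects through the ordinary (copying) API. */
	rte_mempool_put_bulk(mp, local, 4);

	return 0;
}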

> > > Poking at my bad conscience... :-)
> > >
> > > It's on my todo-list. Apparently not high enough. ;-)
> > >
> > > > With that in place:
> > > > Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> > > >
> 
> [...]
> 
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: This API may change, or be removed, 
> > > > > +without
> > > > prior notice.
> > > > > + *
> > > > > + * Zero-copy put objects in a user-owned mempool cache backed 
> > > > > + by
> > the
> > > > specified mempool.
> > > > > + *
> > > > > + * @param cache
> > > > > + *   A pointer to the mempool cache.
> > > > > + * @param mp
> > > > > + *   A pointer to the mempool.
> > > > > + * @param n
> > > > > + *   The number of objects to be put in the mempool cache.
> > > > > + * @return
> > > > > + *   The pointer to where to put the objects in the mempool
> > cache.
> > > > > + *   NULL if the request itself is too big for the cache, i.e.
> > > > > + *   exceeds the cache flush threshold.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static __rte_always_inline void ** 
> > > > > +rte_mempool_cache_zc_put_bulk(struct rte_mempool_cache *cache,
> > > > > +		struct rte_mempool *mp,
> > > > > +		unsigned int n)
> > > > > +{
> > > > > +	RTE_ASSERT(cache != NULL);
> > > > > +	RTE_ASSERT(mp != NULL);
> > > > > +
> > > > > +	rte_mempool_trace_cache_zc_put_bulk(cache, mp, n);
> > > > > +	return __rte_mempool_cache_zc_put_bulk(cache, mp, n); }
> > > > > +
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: This API may change, or be removed, 
> > > > > +without
> > > > prior notice.
> > > > > + *
> > > > > + * Zero-copy un-put objects in a user-owned mempool cache.
Why is it written as "user-owned" mempool cache? The API expects a pointer to a mempool cache, right? Whether it is the default or a user-owned cache does not make any difference.
Please correct me if I am wrong.
> > > > > + *
> > > > > + * @param cache
> > > > > + *   A pointer to the mempool cache.
> > > > > + * @param n
> > > > > + *   The number of objects not put in the mempool cache after
> > > > calling
> > > > > + *   rte_mempool_cache_zc_put_bulk().
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static __rte_always_inline void 
> > > > > +rte_mempool_cache_zc_put_rewind(struct rte_mempool_cache *cache,
> > > > > +		unsigned int n)
> > > > > +{
> > > > > +	RTE_ASSERT(cache != NULL);
> > > > > +	RTE_ASSERT(n <= cache->len);
> > > > > +
> > > > > +	rte_mempool_trace_cache_zc_put_rewind(cache, n);
> > > > > +
> > > > > +	cache->len -= n;
> > > > > +
> > > > > +	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, (int)-n); }
> > > > > +
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: This API may change, or be removed, 
> > > > > +without
> > > > prior notice.
> > > > > + *
> > > > > + * Zero-copy get objects from a user-owned mempool cache 
> > > > > + backed
> > by
> > > > the specified mempool.
> > > > > + *
> > > > > + * @param cache
> > > > > + *   A pointer to the mempool cache.
> > > > > + * @param mp
> > > > > + *   A pointer to the mempool.
> > > > > + * @param n
> > > > > + *   The number of objects to prefetch into the mempool cache.
> > > >
> > > > Why not 'get' instead of 'prefetch'?
> > >
> > > This was my thinking:
> > >
> > > The function "prefetches" the objects into the cache. It is the
> > application itself that "gets" the objects from the cache after 
> > having
> > > called the function.
> > > You might also notice that the n parameter for the zc_put() 
> > > function
> > is described as "to be put" (future), not "to put" (now) in the
> > > cache.
> > >
> > > On the other hand, I chose "Zero-copy get" for the function 
> > > headline
> > to keep it simple.
> > >
> > > If you think "get" is a more correct description of the n 
> > > parameter,
> > I can change it.
> > >
> > > Alternatively, I can use the same style as zc_put(), i.e. "to be
> > gotten from the mempool cache" - but that would require input from a
> > > natively English speaking person, because Danish and English 
> > > grammar
> > is very different, and I am highly uncertain about my English
> > > grammar here! I originally considered this phrase, but concluded 
> > > that
> > the "prefetch" description was easier to understand - especially
> > > for non-native English readers.
> > 
> > For me 'prefetch' seems a bit unclear in that situation...
> > Probably: "number of objects that user plans to extract from the 
> > cache"?
> > But again, I am not a native English speaker either, so maybe someone
> > can suggest a better option.
> > 
> 
> @Bruce (or any other native English speaking person), your input would be appreciated here!
> 
I was happily ignoring this thread until you went and dragged me in with a hard question. :-)

I think the longer the explanation, the clearer it is likely to be. How about "number of objects to be made available for extraction from the cache"? I don't like the reference to "the user" in the longer suggestion above, but otherwise consider it clearer than talking of prefetching or "getting".

My 2c.

/Bruce
  

Patch

diff --git a/lib/mempool/mempool_trace_points.c b/lib/mempool/mempool_trace_points.c
index 4ad76deb34..83d353a764 100644
--- a/lib/mempool/mempool_trace_points.c
+++ b/lib/mempool/mempool_trace_points.c
@@ -77,3 +77,12 @@  RTE_TRACE_POINT_REGISTER(rte_mempool_trace_ops_free,
 
 RTE_TRACE_POINT_REGISTER(rte_mempool_trace_set_ops_byname,
 	lib.mempool.set.ops.byname)
+
+RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_put_bulk,
+	lib.mempool.cache.zc.put.bulk)
+
+RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_put_rewind,
+	lib.mempool.cache.zc.put.rewind)
+
+RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_get_bulk,
+	lib.mempool.cache.zc.get.bulk)
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..5efd3c2b5b 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -47,6 +47,7 @@ 
 #include <rte_ring.h>
 #include <rte_memcpy.h>
 #include <rte_common.h>
+#include <rte_errno.h>
 
 #include "rte_mempool_trace_fp.h"
 
@@ -1346,6 +1347,197 @@  rte_mempool_cache_flush(struct rte_mempool_cache *cache,
 	cache->len = 0;
 }
 
+/**
+ * @internal used by rte_mempool_cache_zc_put_bulk() and rte_mempool_do_generic_put().
+ *
+ * Zero-copy put objects in a user-owned mempool cache backed by the specified mempool.
+ *
+ * @param cache
+ *   A pointer to the mempool cache.
+ * @param mp
+ *   A pointer to the mempool.
+ * @param n
+ *   The number of objects to be put in the mempool cache.
+ * @return
+ *   The pointer to where to put the objects in the mempool cache.
+ *   NULL if the request itself is too big for the cache, i.e.
+ *   exceeds the cache flush threshold.
+ */
+static __rte_always_inline void **
+__rte_mempool_cache_zc_put_bulk(struct rte_mempool_cache *cache,
+		struct rte_mempool *mp,
+		unsigned int n)
+{
+	void **cache_objs;
+
+	RTE_ASSERT(cache != NULL);
+	RTE_ASSERT(mp != NULL);
+
+	if (n <= cache->flushthresh - cache->len) {
+		/*
+		 * The objects can be added to the cache without crossing the
+		 * flush threshold.
+		 */
+		cache_objs = &cache->objs[cache->len];
+		cache->len += n;
+	} else if (likely(n <= cache->flushthresh)) {
+		/*
+		 * The request itself fits into the cache.
+		 * But first, the cache must be flushed to the backend, so
+		 * adding the objects does not cross the flush threshold.
+		 */
+		cache_objs = &cache->objs[0];
+		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
+		cache->len = n;
+	} else {
+		/* The request itself is too big for the cache. */
+		return NULL;
+	}
+
+	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
+	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
+
+	return cache_objs;
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
+ *
+ * Zero-copy put objects in a user-owned mempool cache backed by the specified mempool.
+ *
+ * @param cache
+ *   A pointer to the mempool cache.
+ * @param mp
+ *   A pointer to the mempool.
+ * @param n
+ *   The number of objects to be put in the mempool cache.
+ * @return
+ *   The pointer to where to put the objects in the mempool cache.
+ *   NULL if the request itself is too big for the cache, i.e.
+ *   exceeds the cache flush threshold.
+ */
+__rte_experimental
+static __rte_always_inline void **
+rte_mempool_cache_zc_put_bulk(struct rte_mempool_cache *cache,
+		struct rte_mempool *mp,
+		unsigned int n)
+{
+	RTE_ASSERT(cache != NULL);
+	RTE_ASSERT(mp != NULL);
+
+	rte_mempool_trace_cache_zc_put_bulk(cache, mp, n);
+	return __rte_mempool_cache_zc_put_bulk(cache, mp, n);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
+ *
+ * Zero-copy un-put objects in a user-owned mempool cache.
+ *
+ * @param cache
+ *   A pointer to the mempool cache.
+ * @param n
+ *   The number of objects not put in the mempool cache after calling
+ *   rte_mempool_cache_zc_put_bulk().
+ */
+__rte_experimental
+static __rte_always_inline void
+rte_mempool_cache_zc_put_rewind(struct rte_mempool_cache *cache,
+		unsigned int n)
+{
+	RTE_ASSERT(cache != NULL);
+	RTE_ASSERT(n <= cache->len);
+
+	rte_mempool_trace_cache_zc_put_rewind(cache, n);
+
+	cache->len -= n;
+
+	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, (int)-n);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
+ *
+ * Zero-copy get objects from a user-owned mempool cache backed by the specified mempool.
+ *
+ * @param cache
+ *   A pointer to the mempool cache.
+ * @param mp
+ *   A pointer to the mempool.
+ * @param n
+ *   The number of objects to prefetch into the mempool cache.
+ * @return
+ *   The pointer to the objects in the mempool cache.
+ *   NULL on error; i.e. the cache + the pool does not contain 'n' objects.
+ *   With rte_errno set to the error code of the mempool dequeue function,
+ *   or EINVAL if the request itself is too big for the cache, i.e.
+ *   exceeds the cache flush threshold.
+ */
+__rte_experimental
+static __rte_always_inline void *
+rte_mempool_cache_zc_get_bulk(struct rte_mempool_cache *cache,
+		struct rte_mempool *mp,
+		unsigned int n)
+{
+	unsigned int len, size;
+
+	RTE_ASSERT(cache != NULL);
+	RTE_ASSERT(mp != NULL);
+
+	rte_mempool_trace_cache_zc_get_bulk(cache, mp, n);
+
+	len = cache->len;
+	size = cache->size;
+
+	if (n <= len) {
+		/* The request can be satisfied from the cache as is. */
+		len -= n;
+	} else if (likely(n <= size)) {
+		/*
+		 * The request itself can be satisfied from the cache.
+		 * But first, the cache must be filled from the backend;
+		 * fetch size + requested - len objects.
+		 */
+		int ret;
+
+		ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], size + n - len);
+		if (unlikely(ret < 0)) {
+			/*
+			 * We are buffer constrained.
+			 * Do not fill the cache, just satisfy the request.
+			 */
+			ret = rte_mempool_ops_dequeue_bulk(mp, &cache->objs[len], n - len);
+			if (unlikely(ret < 0)) {
+				/* Unable to satisfy the request. */
+
+				RTE_MEMPOOL_STAT_ADD(mp, get_fail_bulk, 1);
+				RTE_MEMPOOL_STAT_ADD(mp, get_fail_objs, n);
+
+				rte_errno = -ret;
+				return NULL;
+			}
+
+			len = 0;
+		} else {
+			len = size;
+		}
+	} else {
+		/* The request itself is too big for the cache. */
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	cache->len = len;
+
+	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
+	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
+
+	return &cache->objs[len];
+}
+
 /**
  * @internal Put several objects back in the mempool; used internally.
  * @param mp
@@ -1364,32 +1556,25 @@  rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 {
 	void **cache_objs;
 
-	/* No cache provided */
-	if (unlikely(cache == NULL))
-		goto driver_enqueue;
+	/* No cache provided? */
+	if (unlikely(cache == NULL)) {
+		/* Increment stats now, adding in mempool always succeeds. */
+		RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
+		RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
 
-	/* increment stat now, adding in mempool always success */
-	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
-	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
+		goto driver_enqueue;
+	}
 
-	/* The request itself is too big for the cache */
-	if (unlikely(n > cache->flushthresh))
-		goto driver_enqueue_stats_incremented;
+	/* Prepare to add the objects to the cache. */
+	cache_objs = __rte_mempool_cache_zc_put_bulk(cache, mp, n);
 
-	/*
-	 * The cache follows the following algorithm:
-	 *   1. If the objects cannot be added to the cache without crossing
-	 *      the flush threshold, flush the cache to the backend.
-	 *   2. Add the objects to the cache.
-	 */
+	/* The request itself is too big for the cache? */
+	if (unlikely(cache_objs == NULL)) {
+		/* Increment stats now, adding in mempool always succeeds. */
+		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
+		RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
 
-	if (cache->len + n <= cache->flushthresh) {
-		cache_objs = &cache->objs[cache->len];
-		cache->len += n;
-	} else {
-		cache_objs = &cache->objs[0];
-		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
-		cache->len = n;
+		goto driver_enqueue;
 	}
 
 	/* Add the objects to the cache. */
@@ -1399,13 +1584,7 @@  rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 
 driver_enqueue:
 
-	/* increment stat now, adding in mempool always success */
-	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
-	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
-
-driver_enqueue_stats_incremented:
-
-	/* push objects to the backend */
+	/* Push the objects to the backend. */
 	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
 }
 
diff --git a/lib/mempool/rte_mempool_trace_fp.h b/lib/mempool/rte_mempool_trace_fp.h
index ed060e887c..14666457f7 100644
--- a/lib/mempool/rte_mempool_trace_fp.h
+++ b/lib/mempool/rte_mempool_trace_fp.h
@@ -109,6 +109,29 @@  RTE_TRACE_POINT_FP(
 	rte_trace_point_emit_ptr(mempool);
 )
 
+RTE_TRACE_POINT_FP(
+	rte_mempool_trace_cache_zc_put_bulk,
+	RTE_TRACE_POINT_ARGS(void *cache, void *mempool, uint32_t nb_objs),
+	rte_trace_point_emit_ptr(cache);
+	rte_trace_point_emit_ptr(mempool);
+	rte_trace_point_emit_u32(nb_objs);
+)
+
+RTE_TRACE_POINT_FP(
+	rte_mempool_trace_cache_zc_put_rewind,
+	RTE_TRACE_POINT_ARGS(void *cache, uint32_t nb_objs),
+	rte_trace_point_emit_ptr(cache);
+	rte_trace_point_emit_u32(nb_objs);
+)
+
+RTE_TRACE_POINT_FP(
+	rte_mempool_trace_cache_zc_get_bulk,
+	RTE_TRACE_POINT_ARGS(void *cache, void *mempool, uint32_t nb_objs),
+	rte_trace_point_emit_ptr(cache);
+	rte_trace_point_emit_ptr(mempool);
+	rte_trace_point_emit_u32(nb_objs);
+)
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/mempool/version.map b/lib/mempool/version.map
index b67d7aace7..1383ae6db2 100644
--- a/lib/mempool/version.map
+++ b/lib/mempool/version.map
@@ -63,6 +63,11 @@  EXPERIMENTAL {
 	__rte_mempool_trace_ops_alloc;
 	__rte_mempool_trace_ops_free;
 	__rte_mempool_trace_set_ops_byname;
+
+	# added in 23.03
+	__rte_mempool_trace_cache_zc_put_bulk;
+	__rte_mempool_trace_cache_zc_put_rewind;
+	__rte_mempool_trace_cache_zc_get_bulk;
 };
 
 INTERNAL {
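For context, here is a minimal sketch of how a driver's mbuf free path might use the new put-side API (hypothetical code; free_mbufs_zc and its parameters are illustrative, only the rte_mempool_cache_zc_* and rte_mempool_put_bulk calls come from the actual API):

#include <rte_branch_prediction.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Announce 'n' objects, store the 'nb_valid' that actually materialized
 * directly into the cache, and rewind the difference. */
static void
free_mbufs_zc(struct rte_mempool *mp, struct rte_mempool_cache *cache,
		struct rte_mbuf **mbufs, unsigned int n, unsigned int nb_valid)
{
	void **cache_objs;
	unsigned int i;

	cache_objs = rte_mempool_cache_zc_put_bulk(cache, mp, n);
	if (unlikely(cache_objs == NULL)) {
		/* Request exceeds the flush threshold; fall back to the
		 * ordinary copying API. */
		rte_mempool_put_bulk(mp, (void **)mbufs, nb_valid);
		return;
	}

	/* No intermediate array: write the pointers straight into the cache. */
	for (i = 0; i < nb_valid; i++)
		cache_objs[i] = mbufs[i];

	/* Fewer objects than announced: un-put the rest. */
	if (nb_valid < n)
		rte_mempool_cache_zc_put_rewind(cache, n - nb_valid);
}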