diff mbox series

gpudev: introduce memory API

Message ID 20210602203531.2288645-1-thomas@monjalon.net (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers show
Series gpudev: introduce memory API | expand

Checks

Context Check Description
ci/iol-testing success Testing PASS
ci/github-robot fail github build: failed
ci/iol-abi-testing success Testing PASS
ci/intel-Testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/checkpatch success coding style OK

Commit Message

Thomas Monjalon June 2, 2021, 8:35 p.m. UTC
From: Elena Agostini <eagostini@nvidia.com>

The new library gpudev is for dealing with GPU from a DPDK application
in a vendor-agnostic way.

As a first step, the features are focused on memory management.
A function allows to allocate memory inside the GPU,
while another one allows to use main (CPU) memory from the GPU.

The infrastructure is prepared to welcome drivers in drivers/gpu/
as the upcoming NVIDIA one, implementing the gpudev API.
Other additions planned for next revisions:
  - C implementation file
  - guide documentation
  - unit tests
  - integration in testpmd to enable Rx/Tx to/from GPU memory.

The next step should focus on GPU processing task control.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 .gitignore                           |   1 +
 MAINTAINERS                          |   6 +
 doc/api/doxy-api-index.md            |   1 +
 doc/api/doxy-api.conf.in             |   1 +
 doc/guides/conf.py                   |   8 ++
 doc/guides/gpus/features/default.ini |  13 ++
 doc/guides/gpus/index.rst            |  11 ++
 doc/guides/gpus/overview.rst         |   7 +
 doc/guides/index.rst                 |   1 +
 doc/guides/prog_guide/gpu.rst        |   5 +
 doc/guides/prog_guide/index.rst      |   1 +
 drivers/gpu/meson.build              |   4 +
 drivers/meson.build                  |   1 +
 lib/gpudev/gpu_driver.h              |  44 +++++++
 lib/gpudev/meson.build               |   9 ++
 lib/gpudev/rte_gpudev.h              | 183 +++++++++++++++++++++++++++
 lib/gpudev/version.map               |  11 ++
 lib/meson.build                      |   1 +
 18 files changed, 308 insertions(+)
 create mode 100644 doc/guides/gpus/features/default.ini
 create mode 100644 doc/guides/gpus/index.rst
 create mode 100644 doc/guides/gpus/overview.rst
 create mode 100644 doc/guides/prog_guide/gpu.rst
 create mode 100644 drivers/gpu/meson.build
 create mode 100644 lib/gpudev/gpu_driver.h
 create mode 100644 lib/gpudev/meson.build
 create mode 100644 lib/gpudev/rte_gpudev.h
 create mode 100644 lib/gpudev/version.map

Comments

Stephen Hemminger June 2, 2021, 8:46 p.m. UTC | #1
On Wed,  2 Jun 2021 22:35:31 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:

> +/** Store a list of info for a given GPU. */
> +struct rte_gpu_info {
> +	/** GPU device ID. */
> +	uint16_t gpu_id;
> +	/** Unique identifier name. */
> +	char name[RTE_GPU_NAME_MAX_LEN];
> +	/** Total memory available on device. */
> +	size_t total_memory;
> +	/** Total processors available on device. */
> +	int processor_count;

Nit: shouldn't processor_count be unsigned.
Thomas Monjalon June 2, 2021, 8:48 p.m. UTC | #2
02/06/2021 22:46, Stephen Hemminger:
> On Wed,  2 Jun 2021 22:35:31 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > +/** Store a list of info for a given GPU. */
> > +struct rte_gpu_info {
> > +	/** GPU device ID. */
> > +	uint16_t gpu_id;
> > +	/** Unique identifier name. */
> > +	char name[RTE_GPU_NAME_MAX_LEN];
> > +	/** Total memory available on device. */
> > +	size_t total_memory;
> > +	/** Total processors available on device. */
> > +	int processor_count;
> 
> Nit: shouldn't processor_count be unsigned.

Absolutely yes, thanks for the catch.
Andrew Rybchenko June 3, 2021, 7:06 a.m. UTC | #3
On 6/2/21 11:35 PM, Thomas Monjalon wrote:
> From: Elena Agostini <eagostini@nvidia.com>
> 
> The new library gpudev is for dealing with GPU from a DPDK application
> in a vendor-agnostic way.
> 
> As a first step, the features are focused on memory management.
> A function allows to allocate memory inside the GPU,
> while another one allows to use main (CPU) memory from the GPU.
> 
> The infrastructure is prepared to welcome drivers in drivers/gpu/
> as the upcoming NVIDIA one, implementing the gpudev API.
> Other additions planned for next revisions:
>   - C implementation file
>   - guide documentation
>   - unit tests
>   - integration in testpmd to enable Rx/Tx to/from GPU memory.
> 
> The next step should focus on GPU processing task control.
> 
> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>


LGTM as an RFC. It is definitely to a patch to apply
since implementation is missing. See my notes below.

> ---
>  .gitignore                           |   1 +
>  MAINTAINERS                          |   6 +
>  doc/api/doxy-api-index.md            |   1 +
>  doc/api/doxy-api.conf.in             |   1 +
>  doc/guides/conf.py                   |   8 ++
>  doc/guides/gpus/features/default.ini |  13 ++
>  doc/guides/gpus/index.rst            |  11 ++
>  doc/guides/gpus/overview.rst         |   7 +
>  doc/guides/index.rst                 |   1 +
>  doc/guides/prog_guide/gpu.rst        |   5 +
>  doc/guides/prog_guide/index.rst      |   1 +
>  drivers/gpu/meson.build              |   4 +
>  drivers/meson.build                  |   1 +
>  lib/gpudev/gpu_driver.h              |  44 +++++++
>  lib/gpudev/meson.build               |   9 ++
>  lib/gpudev/rte_gpudev.h              | 183 +++++++++++++++++++++++++++
>  lib/gpudev/version.map               |  11 ++
>  lib/meson.build                      |   1 +
>  18 files changed, 308 insertions(+)
>  create mode 100644 doc/guides/gpus/features/default.ini
>  create mode 100644 doc/guides/gpus/index.rst
>  create mode 100644 doc/guides/gpus/overview.rst
>  create mode 100644 doc/guides/prog_guide/gpu.rst
>  create mode 100644 drivers/gpu/meson.build
>  create mode 100644 lib/gpudev/gpu_driver.h
>  create mode 100644 lib/gpudev/meson.build
>  create mode 100644 lib/gpudev/rte_gpudev.h
>  create mode 100644 lib/gpudev/version.map
> 
> diff --git a/.gitignore b/.gitignore
> index b19c0717e6..49494e0c6c 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -14,6 +14,7 @@ doc/guides/compressdevs/overview_feature_table.txt
>  doc/guides/regexdevs/overview_feature_table.txt
>  doc/guides/vdpadevs/overview_feature_table.txt
>  doc/guides/bbdevs/overview_feature_table.txt
> +doc/guides/gpus/overview_feature_table.txt
>  
>  # ignore generated ctags/cscope files
>  cscope.out.po
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5877a16971..c4755dfe9a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -452,6 +452,12 @@ F: app/test-regex/
>  F: doc/guides/prog_guide/regexdev.rst
>  F: doc/guides/regexdevs/features/default.ini
>  
> +GPU API - EXPERIMENTAL
> +M: Elena Agostini <eagostini@nvidia.com>
> +F: lib/gpudev/
> +F: doc/guides/prog_guide/gpu.rst
> +F: doc/guides/gpus/features/default.ini
> +
>  Eventdev API
>  M: Jerin Jacob <jerinj@marvell.com>
>  T: git://dpdk.org/next/dpdk-next-eventdev
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 1992107a03..bd10342ca2 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
>    [compressdev]        (@ref rte_compressdev.h),
>    [compress]           (@ref rte_comp.h),
>    [regexdev]           (@ref rte_regexdev.h),
> +  [gpudev]             (@ref rte_gpudev.h),
>    [eventdev]           (@ref rte_eventdev.h),
>    [event_eth_rx_adapter]   (@ref rte_event_eth_rx_adapter.h),
>    [event_eth_tx_adapter]   (@ref rte_event_eth_tx_adapter.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index 325a0195c6..831b9a6b33 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -40,6 +40,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
>                            @TOPDIR@/lib/eventdev \
>                            @TOPDIR@/lib/fib \
>                            @TOPDIR@/lib/flow_classify \
> +                          @TOPDIR@/lib/gpudev \
>                            @TOPDIR@/lib/graph \
>                            @TOPDIR@/lib/gro \
>                            @TOPDIR@/lib/gso \
> diff --git a/doc/guides/conf.py b/doc/guides/conf.py
> index 67d2dd62c7..7930da9ceb 100644
> --- a/doc/guides/conf.py
> +++ b/doc/guides/conf.py
> @@ -152,6 +152,9 @@ def generate_overview_table(output_filename, table_id, section, table_name, titl
>          name = ini_filename[:-4]
>          name = name.replace('_vf', 'vf')
>          pmd_names.append(name)
> +    if not pmd_names:
> +        # Add an empty column if table is empty (required by RST syntax)
> +        pmd_names.append(' ')
>  
>      # Pad the table header names.
>      max_header_len = len(max(pmd_names, key=len))
> @@ -388,6 +391,11 @@ def setup(app):
>                              'Features',
>                              'Features availability in bbdev drivers',
>                              'Feature')
> +    table_file = dirname(__file__) + '/gpus/overview_feature_table.txt'
> +    generate_overview_table(table_file, 1,
> +                            'Features',
> +                            'Features availability in GPU drivers',
> +                            'Feature')
>  
>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
>          print('Upgrade sphinx to version >= 1.3.1 for '
> diff --git a/doc/guides/gpus/features/default.ini b/doc/guides/gpus/features/default.ini
> new file mode 100644
> index 0000000000..c363447b0d
> --- /dev/null
> +++ b/doc/guides/gpus/features/default.ini
> @@ -0,0 +1,13 @@
> +;
> +; Features of a GPU driver.
> +;
> +; This file defines the features that are valid for inclusion in
> +; the other driver files and also the order that they appear in
> +; the features table in the documentation. The feature description
> +; string should not exceed feature_str_len defined in conf.py.
> +;
> +[Features]
> +Get device info                =
> +Share CPU memory with GPU      =
> +Allocate GPU memory            =
> +Free memory                    =
> diff --git a/doc/guides/gpus/index.rst b/doc/guides/gpus/index.rst
> new file mode 100644
> index 0000000000..f9c62aeb36
> --- /dev/null
> +++ b/doc/guides/gpus/index.rst
> @@ -0,0 +1,11 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +GPU Drivers
> +===========
> +
> +.. toctree::
> +   :maxdepth: 2
> +   :numbered:
> +
> +   overview
> diff --git a/doc/guides/gpus/overview.rst b/doc/guides/gpus/overview.rst
> new file mode 100644
> index 0000000000..e7f985e98b
> --- /dev/null
> +++ b/doc/guides/gpus/overview.rst
> @@ -0,0 +1,7 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +Overview of GPU Drivers
> +=======================
> +
> +.. include:: overview_feature_table.txt
> diff --git a/doc/guides/index.rst b/doc/guides/index.rst
> index 857f0363d3..ee4d79a4eb 100644
> --- a/doc/guides/index.rst
> +++ b/doc/guides/index.rst
> @@ -21,6 +21,7 @@ DPDK documentation
>     compressdevs/index
>     vdpadevs/index
>     regexdevs/index
> +   gpus/index
>     eventdevs/index
>     rawdevs/index
>     mempool/index
> diff --git a/doc/guides/prog_guide/gpu.rst b/doc/guides/prog_guide/gpu.rst
> new file mode 100644
> index 0000000000..54f9fa8300
> --- /dev/null
> +++ b/doc/guides/prog_guide/gpu.rst
> @@ -0,0 +1,5 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +GPU Library
> +===========
> diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
> index 2dce507f46..dfddf90b51 100644
> --- a/doc/guides/prog_guide/index.rst
> +++ b/doc/guides/prog_guide/index.rst
> @@ -27,6 +27,7 @@ Programmer's Guide
>      cryptodev_lib
>      compressdev
>      regexdev
> +    gpu
>      rte_security
>      rawdev
>      link_bonding_poll_mode_drv_lib
> diff --git a/drivers/gpu/meson.build b/drivers/gpu/meson.build
> new file mode 100644
> index 0000000000..5189950616
> --- /dev/null
> +++ b/drivers/gpu/meson.build
> @@ -0,0 +1,4 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +drivers = []
> diff --git a/drivers/meson.build b/drivers/meson.build
> index bc6f4f567f..f607040d79 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -18,6 +18,7 @@ subdirs = [
>          'vdpa',           # depends on common, bus and mempool.
>          'event',          # depends on common, bus, mempool and net.
>          'baseband',       # depends on common and bus.
> +        'gpu',            # depends on common and bus.
>  ]
>  
>  if meson.is_cross_build()
> diff --git a/lib/gpudev/gpu_driver.h b/lib/gpudev/gpu_driver.h
> new file mode 100644
> index 0000000000..5ff609e49d
> --- /dev/null
> +++ b/lib/gpudev/gpu_driver.h
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#ifndef GPU_DRIVER_H
> +#define GPU_DRIVER_H
> +
> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +
> +#include "rte_gpudev.h"
> +
> +struct rte_gpu_dev;
> +
> +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
> +typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);

Not that important but I always prefer to typedef
function prototypes w/o pointer and use pointer in
the structure below. I.e.

typedef int (gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void
**ptr);

It allows to specify that corresponding callback
must comply to the prototype and produce build
error otherwise (and do not rely on warnings), e.g.

static gpu_malloc_t mlx5_gpu_malloc;
static int
mlx5_gpu_malloc(struct rte_gpu_dev *dev, size_t size, void **ptr)
{
     ...
}

May be a new library should go this way.

> +
> +struct rte_gpu_dev {
> +	/* Backing device. */
> +	struct rte_device *device;
> +	/* GPU info structure. */
> +	struct rte_gpu_info info;
> +	/* Counter of processes using the device. */
> +	uint16_t process_cnt;
> +	/* If device is currently used or not. */
> +	enum rte_gpu_state state;
> +	/* FUNCTION: Allocate memory on the GPU. */
> +	gpu_malloc_t gpu_malloc;
> +	/* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> +	gpu_malloc_t gpu_malloc_visible;
> +	/* FUNCTION: Free allocated memory on the GPU. */
> +	gpu_free_t gpu_free;

Don't we need a callback to get dev_info?

> +	/* Device interrupt handle. */
> +	struct rte_intr_handle *intr_handle;
> +	/* Driver-specific private data. */
> +	void *dev_private;
> +} __rte_cache_aligned;
> +
> +struct rte_gpu_dev *rte_gpu_dev_allocate(const char *name);
> +struct rte_gpu_dev *rte_gpu_dev_get_by_name(const char *name);
> +int rte_gpu_dev_release(struct rte_gpu_dev *gpudev);
> +
> +#endif /* GPU_DRIVER_H */
> diff --git a/lib/gpudev/meson.build b/lib/gpudev/meson.build
> new file mode 100644
> index 0000000000..f05459e18d
> --- /dev/null
> +++ b/lib/gpudev/meson.build
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +headers = files(
> +        'rte_gpudev.h',
> +)
> +
> +sources = files(
> +)
> diff --git a/lib/gpudev/rte_gpudev.h b/lib/gpudev/rte_gpudev.h
> new file mode 100644
> index 0000000000..b12f35c17e
> --- /dev/null
> +++ b/lib/gpudev/rte_gpudev.h
> @@ -0,0 +1,183 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#ifndef RTE_GPUDEV_H
> +#define RTE_GPUDEV_H
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#include <rte_common.h>
> +
> +/**
> + * @file
> + * Generic library to interact with a GPU.
> + *
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/** Maximum number of GPU engines. */
> +#define RTE_GPU_MAX_DEVS UINT16_C(32)
> +/** Maximum length of device name. */
> +#define RTE_GPU_NAME_MAX_LEN 128
> +
> +/** Flags indicate current state of GPU device. */
> +enum rte_gpu_state {
> +	RTE_GPU_STATE_UNUSED,        /**< not initialized */
> +	RTE_GPU_STATE_INITIALIZED,   /**< initialized */
> +};
> +
> +/** Store a list of info for a given GPU. */
> +struct rte_gpu_info {
> +	/** GPU device ID. */
> +	uint16_t gpu_id;
> +	/** Unique identifier name. */
> +	char name[RTE_GPU_NAME_MAX_LEN];
> +	/** Total memory available on device. */
> +	size_t total_memory;
> +	/** Total processors available on device. */
> +	int processor_count;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of GPUs detected and associated to DPDK.
> + *
> + * @return
> + *   The number of available GPUs.
> + */
> +__rte_experimental
> +uint16_t rte_gpu_dev_count_avail(void);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Check if the device is valid and initialized in DPDK.
> + *
> + * @param gpu_id
> + *   The input GPU ID.
> + *
> + * @return
> + *   - True if gpu_id is a valid and initialized GPU.
> + *   - False otherwise.
> + */
> +__rte_experimental
> +bool rte_gpu_dev_is_valid(uint16_t gpu_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the GPU ID of the next valid GPU initialized in DPDK.
> + *
> + * @param gpu_id
> + *   The initial GPU ID to start the research.
> + *
> + * @return
> + *   Next GPU ID corresponding to a valid and initialized GPU device.
> + */
> +__rte_experimental
> +uint16_t rte_gpu_dev_find_next(uint16_t gpu_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Macro to iterate over all valid GPUs.
> + *
> + * @param gpu_id
> + *   The ID of the next possible valid GPU.
> + * @return
> + *   Next valid GPU ID, RTE_GPU_MAX_DEVS if there is none.
> + */
> +#define RTE_GPU_FOREACH_DEV(gpu_id) \
> +	for (gpu_id = rte_gpu_find_next(0); \
> +	     gpu_id < RTE_GPU_MAX_DEVS; \
> +	     gpu_id = rte_gpu_find_next(gpu_id + 1))
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Return GPU specific info.
> + *
> + * @param gpu_id
> + *   GPU ID to get info.
> + * @param info
> + *   Memory structure to fill with the info.
> + *
> + * @return
> + *   0 on success, -1 otherwise.
> + */
> +__rte_experimental
> +int rte_gpu_dev_info_get(uint16_t gpu_id, struct rte_gpu_info **info);

Hm, I think it is better to have 'struct rte_gpu_info *info'.
Why should it allocate and return memory to be freed by caller?

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate a chunk of memory on the GPU.

Looking a below function it is required to clarify here if
the memory is visible or invisible to GPU (or both allowed).

> + *
> + * @param gpu_id
> + *   GPU ID to allocate memory.
> + * @param size
> + *   Number of bytes to allocate.

Is behaviour defined if zero size is requested?
IMHO, it would be good to define.

> + * @param ptr
> + *   Pointer to store the address of the allocated memory.
> + *
> + * @return
> + *   0 on success, -1 otherwise.

Don't we want to differentiate various errors using
negative errno as it is done in many DPDK libraries?

> + */
> +__rte_experimental
> +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);

May be *malloc() should return a pointer and "negative"
values used to report various errnos?

The problem with the approach that comparison vs NULL will
not work in this case and we need special macro or small
inline function to check error condition.

Returned pointer is definitely more convenient, but above
not may result in bugs.

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate a chunk of memory on the CPU that is visible from the GPU.
> + *
> + * @param gpu_id
> + *   Reference GPU ID.
> + * @param size
> + *   Number of bytes to allocate.
> + * @param ptr
> + *   Pointer to store the address of the allocated memory.
> + *
> + * @return
> + *   0 on success, -1 otherwise.

Same here

> + */
> +__rte_experimental
> +int rte_gpu_malloc_visible(uint16_t gpu_id, size_t size, void **ptr);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Deallocate a chunk of memory allocated with rte_gpu_malloc*.
> + *
> + * @param gpu_id
> + *   Reference GPU ID.
> + * @param ptr
> + *   Pointer to the memory area to be deallocated.

I think it should be NOP in the case of NULL pointer and it
should be documented. If not, it must be documented as well.

> + *
> + * @return
> + *   0 on success, -1 otherwise.

Same here

> + */
> +__rte_experimental
> +int rte_gpu_free(uint16_t gpu_id, void *ptr);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_GPUDEV_H */
> diff --git a/lib/gpudev/version.map b/lib/gpudev/version.map
> new file mode 100644
> index 0000000000..9e0f218e8b
> --- /dev/null
> +++ b/lib/gpudev/version.map
> @@ -0,0 +1,11 @@
> +EXPERIMENTAL {
> +	global:
> +
> +	rte_gpu_dev_count_avail;
> +	rte_gpu_dev_find_next;
> +	rte_gpu_dev_info_get;
> +	rte_gpu_dev_is_valid;
> +	rte_gpu_free;
> +	rte_gpu_malloc;
> +	rte_gpu_malloc_visible;
> +};
> diff --git a/lib/meson.build b/lib/meson.build
> index 4a64756a68..ffefc64c69 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -33,6 +33,7 @@ libraries = [
>          'distributor',
>          'efd',
>          'eventdev',
> +        'gpudev',
>          'gro',
>          'gso',
>          'ip_frag',
>
David Marchand June 3, 2021, 7:18 a.m. UTC | #4
Quick pass:

On Wed, Jun 2, 2021 at 10:36 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> diff --git a/lib/gpudev/gpu_driver.h b/lib/gpudev/gpu_driver.h
> new file mode 100644
> index 0000000000..5ff609e49d
> --- /dev/null
> +++ b/lib/gpudev/gpu_driver.h
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#ifndef GPU_DRIVER_H
> +#define GPU_DRIVER_H
> +
> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +
> +#include "rte_gpudev.h"
> +
> +struct rte_gpu_dev;
> +
> +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
> +typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
> +

Great to see this structure hidden in a driver-only header.


> +struct rte_gpu_dev {

We could have a name[] field here, that will be later pointed at, in
rte_gpu_info.
Who is responsible for deciding of the device name?


> +       /* Backing device. */
> +       struct rte_device *device;
> +       /* GPU info structure. */
> +       struct rte_gpu_info info;
> +       /* Counter of processes using the device. */
> +       uint16_t process_cnt;
> +       /* If device is currently used or not. */
> +       enum rte_gpu_state state;
> +       /* FUNCTION: Allocate memory on the GPU. */
> +       gpu_malloc_t gpu_malloc;
> +       /* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> +       gpu_malloc_t gpu_malloc_visible;
> +       /* FUNCTION: Free allocated memory on the GPU. */
> +       gpu_free_t gpu_free;
> +       /* Device interrupt handle. */
> +       struct rte_intr_handle *intr_handle;
> +       /* Driver-specific private data. */
> +       void *dev_private;
> +} __rte_cache_aligned;
> +
> +struct rte_gpu_dev *rte_gpu_dev_allocate(const char *name);
> +struct rte_gpu_dev *rte_gpu_dev_get_by_name(const char *name);

Those symbols will have to be marked internal (__rte_internal +
version.map) for drivers to see them.


> +int rte_gpu_dev_release(struct rte_gpu_dev *gpudev);
> +
> +#endif /* GPU_DRIVER_H */


> diff --git a/lib/gpudev/rte_gpudev.h b/lib/gpudev/rte_gpudev.h
> new file mode 100644
> index 0000000000..b12f35c17e
> --- /dev/null
> +++ b/lib/gpudev/rte_gpudev.h
> @@ -0,0 +1,183 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#ifndef RTE_GPUDEV_H
> +#define RTE_GPUDEV_H
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#include <rte_common.h>
> +
> +/**
> + * @file
> + * Generic library to interact with a GPU.
> + *
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/** Maximum number of GPU engines. */
> +#define RTE_GPU_MAX_DEVS UINT16_C(32)

Bleh.
Let's stop with max values.
The iterator _next should return a special value indicating there is
no more devs to list.



> +/** Maximum length of device name. */
> +#define RTE_GPU_NAME_MAX_LEN 128
> +
> +/** Flags indicate current state of GPU device. */
> +enum rte_gpu_state {
> +       RTE_GPU_STATE_UNUSED,        /**< not initialized */
> +       RTE_GPU_STATE_INITIALIZED,   /**< initialized */
> +};
> +
> +/** Store a list of info for a given GPU. */
> +struct rte_gpu_info {
> +       /** GPU device ID. */
> +       uint16_t gpu_id;
> +       /** Unique identifier name. */
> +       char name[RTE_GPU_NAME_MAX_LEN];

const char *name;

Then the gpu generic layer simply fills this with the
rte_gpu_dev->name field I proposed above.


> +       /** Total memory available on device. */
> +       size_t total_memory;
> +       /** Total processors available on device. */
> +       int processor_count;
> +};
Thomas Monjalon June 3, 2021, 7:26 a.m. UTC | #5
03/06/2021 09:06, Andrew Rybchenko:
> On 6/2/21 11:35 PM, Thomas Monjalon wrote:
> > From: Elena Agostini <eagostini@nvidia.com>
> > 
> > The new library gpudev is for dealing with GPU from a DPDK application
> > in a vendor-agnostic way.
> > 
> > As a first step, the features are focused on memory management.
> > A function allows to allocate memory inside the GPU,
> > while another one allows to use main (CPU) memory from the GPU.
> > 
> > The infrastructure is prepared to welcome drivers in drivers/gpu/
> > as the upcoming NVIDIA one, implementing the gpudev API.
> > Other additions planned for next revisions:
> >   - C implementation file
> >   - guide documentation
> >   - unit tests
> >   - integration in testpmd to enable Rx/Tx to/from GPU memory.
> > 
> > The next step should focus on GPU processing task control.
> > 
> > Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> 
> 
> LGTM as an RFC. It is definitely to a patch to apply
> since implementation is missing. See my notes below.

Yes sorry I forgot the RFC tag when sending.

[...]
> > +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
> > +typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
> 
> Not that important but I always prefer to typedef
> function prototypes w/o pointer and use pointer in
> the structure below. I.e.
> 
> typedef int (gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void
> **ptr);
> 
> It allows to specify that corresponding callback
> must comply to the prototype and produce build
> error otherwise (and do not rely on warnings), e.g.
> 
> static gpu_malloc_t mlx5_gpu_malloc;
> static int
> mlx5_gpu_malloc(struct rte_gpu_dev *dev, size_t size, void **ptr)
> {
>      ...
> }
> 
> May be a new library should go this way.

I agree.
> 
> > +
> > +struct rte_gpu_dev {
> > +	/* Backing device. */
> > +	struct rte_device *device;
> > +	/* GPU info structure. */
> > +	struct rte_gpu_info info;
> > +	/* Counter of processes using the device. */
> > +	uint16_t process_cnt;
> > +	/* If device is currently used or not. */
> > +	enum rte_gpu_state state;
> > +	/* FUNCTION: Allocate memory on the GPU. */
> > +	gpu_malloc_t gpu_malloc;
> > +	/* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> > +	gpu_malloc_t gpu_malloc_visible;
> > +	/* FUNCTION: Free allocated memory on the GPU. */
> > +	gpu_free_t gpu_free;
> 
> Don't we need a callback to get dev_info?

Yes it's my miss.

[...]
> > +__rte_experimental
> > +int rte_gpu_dev_info_get(uint16_t gpu_id, struct rte_gpu_info **info);
> 
> Hm, I think it is better to have 'struct rte_gpu_info *info'.
> Why should it allocate and return memory to be freed by caller?

No you're right, I overlooked it.

[...]
> > + * Allocate a chunk of memory on the GPU.
> 
> Looking a below function it is required to clarify here if
> the memory is visible or invisible to GPU (or both allowed).

This function allocates on the GPU so it is visible by the GPU.
I feel I misunderstand your question.

> > + *
> > + * @param gpu_id
> > + *   GPU ID to allocate memory.
> > + * @param size
> > + *   Number of bytes to allocate.
> 
> Is behaviour defined if zero size is requested?
> IMHO, it would be good to define.

OK

> > + * @param ptr
> > + *   Pointer to store the address of the allocated memory.
> > + *
> > + * @return
> > + *   0 on success, -1 otherwise.
> 
> Don't we want to differentiate various errors using
> negative errno as it is done in many DPDK libraries?

Yes I think so, I was just too much lazy to do it in this RFC.

> > + */
> > +__rte_experimental
> > +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
> 
> May be *malloc() should return a pointer and "negative"
> values used to report various errnos?

I don't understand what you mean by negative values if it is a pointer.
We could return a pointer and use rte_errno.

> The problem with the approach that comparison vs NULL will
> not work in this case and we need special macro or small
> inline function to check error condition.
> 
> Returned pointer is definitely more convenient, but above
> not may result in bugs.

I don't know what is better.

[...]
> > + * Deallocate a chunk of memory allocated with rte_gpu_malloc*.
> > + *
> > + * @param gpu_id
> > + *   Reference GPU ID.
> > + * @param ptr
> > + *   Pointer to the memory area to be deallocated.
> 
> I think it should be NOP in the case of NULL pointer and it
> should be documented. If not, it must be documented as well.

OK for NOP.
Thomas Monjalon June 3, 2021, 7:30 a.m. UTC | #6
03/06/2021 09:18, David Marchand:
> Quick pass:
> 
> On Wed, Jun 2, 2021 at 10:36 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > diff --git a/lib/gpudev/gpu_driver.h b/lib/gpudev/gpu_driver.h
> > new file mode 100644
> > index 0000000000..5ff609e49d
> > --- /dev/null
> > +++ b/lib/gpudev/gpu_driver.h
[...]
> > +struct rte_gpu_dev;
> > +
> > +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
> > +typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
> > +
> 
> Great to see this structure hidden in a driver-only header.
> 
> 
> > +struct rte_gpu_dev {
> 
> We could have a name[] field here, that will be later pointed at, in
> rte_gpu_info.
> Who is responsible for deciding of the device name?

The driver is responsible for the name of the device.
Yes I agree to store the name here.

> > +       /* Backing device. */
> > +       struct rte_device *device;
> > +       /* GPU info structure. */
> > +       struct rte_gpu_info info;
> > +       /* Counter of processes using the device. */
> > +       uint16_t process_cnt;
> > +       /* If device is currently used or not. */
> > +       enum rte_gpu_state state;
> > +       /* FUNCTION: Allocate memory on the GPU. */
> > +       gpu_malloc_t gpu_malloc;
> > +       /* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> > +       gpu_malloc_t gpu_malloc_visible;
> > +       /* FUNCTION: Free allocated memory on the GPU. */
> > +       gpu_free_t gpu_free;
> > +       /* Device interrupt handle. */
> > +       struct rte_intr_handle *intr_handle;
> > +       /* Driver-specific private data. */
> > +       void *dev_private;
> > +} __rte_cache_aligned;
> > +
> > +struct rte_gpu_dev *rte_gpu_dev_allocate(const char *name);
> > +struct rte_gpu_dev *rte_gpu_dev_get_by_name(const char *name);
> 
> Those symbols will have to be marked internal (__rte_internal +
> version.map) for drivers to see them.

OK, good catch.

[...]
> > +/** Maximum number of GPU engines. */
> > +#define RTE_GPU_MAX_DEVS UINT16_C(32)
> 
> Bleh.
> Let's stop with max values.
> The iterator _next should return a special value indicating there is
> no more devs to list.

I fully agree.
I would like to define gpu_id as an int to simplify comparisons
with error value. int or int16_t ?

> > +/** Maximum length of device name. */
> > +#define RTE_GPU_NAME_MAX_LEN 128

Will be dropped as well.

> > +
> > +/** Flags indicate current state of GPU device. */
> > +enum rte_gpu_state {
> > +       RTE_GPU_STATE_UNUSED,        /**< not initialized */
> > +       RTE_GPU_STATE_INITIALIZED,   /**< initialized */
> > +};
> > +
> > +/** Store a list of info for a given GPU. */
> > +struct rte_gpu_info {
> > +       /** GPU device ID. */
> > +       uint16_t gpu_id;
> > +       /** Unique identifier name. */
> > +       char name[RTE_GPU_NAME_MAX_LEN];
> 
> const char *name;
> 
> Then the gpu generic layer simply fills this with the
> rte_gpu_dev->name field I proposed above.

Yes.

> > +       /** Total memory available on device. */
> > +       size_t total_memory;
> > +       /** Total processors available on device. */
> > +       int processor_count;
> > +};
Jerin Jacob June 3, 2021, 7:47 a.m. UTC | #7
On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> From: Elena Agostini <eagostini@nvidia.com>
>
> The new library gpudev is for dealing with GPU from a DPDK application
> in a vendor-agnostic way.
>
> As a first step, the features are focused on memory management.
> A function allows to allocate memory inside the GPU,
> while another one allows to use main (CPU) memory from the GPU.
>
> The infrastructure is prepared to welcome drivers in drivers/gpu/
> as the upcoming NVIDIA one, implementing the gpudev API.
> Other additions planned for next revisions:
>   - C implementation file
>   - guide documentation
>   - unit tests
>   - integration in testpmd to enable Rx/Tx to/from GPU memory.
>
> The next step should focus on GPU processing task control.
>
> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  .gitignore                           |   1 +
>  MAINTAINERS                          |   6 +
>  doc/api/doxy-api-index.md            |   1 +
>  doc/api/doxy-api.conf.in             |   1 +
>  doc/guides/conf.py                   |   8 ++
>  doc/guides/gpus/features/default.ini |  13 ++
>  doc/guides/gpus/index.rst            |  11 ++
>  doc/guides/gpus/overview.rst         |   7 +
>  doc/guides/index.rst                 |   1 +
>  doc/guides/prog_guide/gpu.rst        |   5 +
>  doc/guides/prog_guide/index.rst      |   1 +
>  drivers/gpu/meson.build              |   4 +
>  drivers/meson.build                  |   1 +
>  lib/gpudev/gpu_driver.h              |  44 +++++++
>  lib/gpudev/meson.build               |   9 ++
>  lib/gpudev/rte_gpudev.h              | 183 +++++++++++++++++++++++++++
>  lib/gpudev/version.map               |  11 ++
>  lib/meson.build                      |   1 +
>  18 files changed, 308 insertions(+)
>  create mode 100644 doc/guides/gpus/features/default.ini
>  create mode 100644 doc/guides/gpus/index.rst
>  create mode 100644 doc/guides/gpus/overview.rst
>  create mode 100644 doc/guides/prog_guide/gpu.rst
>  create mode 100644 drivers/gpu/meson.build
>  create mode 100644 lib/gpudev/gpu_driver.h
>  create mode 100644 lib/gpudev/meson.build
>  create mode 100644 lib/gpudev/rte_gpudev.h
>  create mode 100644 lib/gpudev/version.map
>
> diff --git a/.gitignore b/.gitignore
> index b19c0717e6..49494e0c6c 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -14,6 +14,7 @@ doc/guides/compressdevs/overview_feature_table.txt
>  doc/guides/regexdevs/overview_feature_table.txt
>  doc/guides/vdpadevs/overview_feature_table.txt
>  doc/guides/bbdevs/overview_feature_table.txt
> +doc/guides/gpus/overview_feature_table.txt
>
>  # ignore generated ctags/cscope files
>  cscope.out.po
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5877a16971..c4755dfe9a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -452,6 +452,12 @@ F: app/test-regex/
>  F: doc/guides/prog_guide/regexdev.rst
>  F: doc/guides/regexdevs/features/default.ini
>
> +GPU API - EXPERIMENTAL
> +M: Elena Agostini <eagostini@nvidia.com>
> +F: lib/gpudev/
> +F: doc/guides/prog_guide/gpu.rst
> +F: doc/guides/gpus/features/default.ini
> +
>  Eventdev API
>  M: Jerin Jacob <jerinj@marvell.com>
>  T: git://dpdk.org/next/dpdk-next-eventdev
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 1992107a03..bd10342ca2 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
>    [compressdev]        (@ref rte_compressdev.h),
>    [compress]           (@ref rte_comp.h),
>    [regexdev]           (@ref rte_regexdev.h),
> +  [gpudev]             (@ref rte_gpudev.h),

Since this device does not have a queue etc? Shouldn't make it a
library like mempool with vendor-defined ops?
Any specific reason for making it a device? The reason why I am asking
this is, as other DPDK devices as symmetry in queue(s), configure,
start, stop operation etc.


> +
> +struct rte_gpu_dev {
> +       /* Backing device. */
> +       struct rte_device *device;

See above?

> +       /* GPU info structure. */
Andrew Rybchenko June 3, 2021, 7:49 a.m. UTC | #8
On 6/3/21 10:26 AM, Thomas Monjalon wrote:
> 03/06/2021 09:06, Andrew Rybchenko:
>> On 6/2/21 11:35 PM, Thomas Monjalon wrote:
>>> + * Allocate a chunk of memory on the GPU.
>>
>> Looking a below function it is required to clarify here if
>> the memory is visible or invisible to GPU (or both allowed).
> 
> This function allocates on the GPU so it is visible by the GPU.
> I feel I misunderstand your question.

Below function says rte_gpu_malloc_visible() and its
description highlights that allocated memory is visible to GPU.
My problem that I don't understand what's the difference
between these two functions.

>>> + */
>>> +__rte_experimental
>>> +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
>>
>> May be *malloc() should return a pointer and "negative"
>> values used to report various errnos?
> 
> I don't understand what you mean by negative values if it is a pointer.
> We could return a pointer and use rte_errno.

I was talking about something like (void *)(-ENOMEM), but it is
a bad idea. NULL + rte_errno is much better.

However, may be I'd kept callback as is set rte_error in lib if
negative value is returned by the callback. This way we'll be
safe against lost rte_errno update in drivers.
Thomas Monjalon June 3, 2021, 8:26 a.m. UTC | #9
03/06/2021 09:49, Andrew Rybchenko:
> On 6/3/21 10:26 AM, Thomas Monjalon wrote:
> > 03/06/2021 09:06, Andrew Rybchenko:
> >> On 6/2/21 11:35 PM, Thomas Monjalon wrote:
> >>> + * Allocate a chunk of memory on the GPU.
> >>
> >> Looking a below function it is required to clarify here if
> >> the memory is visible or invisible to GPU (or both allowed).
> > 
> > This function allocates on the GPU so it is visible by the GPU.
> > I feel I misunderstand your question.
> 
> Below function says rte_gpu_malloc_visible() and its
> description highlights that allocated memory is visible to GPU.
> My problem that I don't understand what's the difference
> between these two functions.

One function allocates in GPU mem, the other allows the GPU to use CPU mem.

> >>> + */
> >>> +__rte_experimental
> >>> +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
> >>
> >> May be *malloc() should return a pointer and "negative"
> >> values used to report various errnos?
> > 
> > I don't understand what you mean by negative values if it is a pointer.
> > We could return a pointer and use rte_errno.
> 
> I was talking about something like (void *)(-ENOMEM), but it is
> a bad idea. NULL + rte_errno is much better.
> 
> However, may be I'd kept callback as is set rte_error in lib if
> negative value is returned by the callback. This way we'll be
> safe against lost rte_errno update in drivers.

OK
Thomas Monjalon June 3, 2021, 8:28 a.m. UTC | #10
03/06/2021 09:47, Jerin Jacob:
> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > --- a/doc/api/doxy-api-index.md
> > +++ b/doc/api/doxy-api-index.md
> > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> >    [compressdev]        (@ref rte_compressdev.h),
> >    [compress]           (@ref rte_comp.h),
> >    [regexdev]           (@ref rte_regexdev.h),
> > +  [gpudev]             (@ref rte_gpudev.h),
> 
> Since this device does not have a queue etc? Shouldn't make it a
> library like mempool with vendor-defined ops?
> Any specific reason for making it a device? The reason why I am asking
> this is, as other DPDK devices as symmetry in queue(s), configure,
> start, stop operation etc.
> 
> 
> > +
> > +struct rte_gpu_dev {
> > +       /* Backing device. */
> > +       struct rte_device *device;
> 
> See above?

There is a PCI device probed.
I don't understand why it would not be represented as a device.
Jerin Jacob June 3, 2021, 8:41 a.m. UTC | #11
On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/06/2021 09:47, Jerin Jacob:
> > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > --- a/doc/api/doxy-api-index.md
> > > +++ b/doc/api/doxy-api-index.md
> > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > >    [compressdev]        (@ref rte_compressdev.h),
> > >    [compress]           (@ref rte_comp.h),
> > >    [regexdev]           (@ref rte_regexdev.h),
> > > +  [gpudev]             (@ref rte_gpudev.h),
> >
> > Since this device does not have a queue etc? Shouldn't make it a
> > library like mempool with vendor-defined ops?
> > Any specific reason for making it a device? The reason why I am asking
> > this is, as other DPDK devices as symmetry in queue(s), configure,
> > start, stop operation etc.
> >
> >
> > > +
> > > +struct rte_gpu_dev {
> > > +       /* Backing device. */
> > > +       struct rte_device *device;
> >
> > See above?
>
> There is a PCI device probed.
> I don't understand why it would not be represented as a device.

All other DPDK device has symmetry in structures like queue and
symmetry in operation like it has configure, start, stop etc.
This one seems more like mempool to me all we want set of
vendor-defined ops. So any justification on
make it a device ? why not like mempool library?
(driver/mempool/octeontx2 Mempool HW is also PCI device, but
we don't take device path for mempool. So I would like to understand
any technical reason for making it a device).



>
>
Thomas Monjalon June 3, 2021, 8:43 a.m. UTC | #12
03/06/2021 10:41, Jerin Jacob:
> On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 03/06/2021 09:47, Jerin Jacob:
> > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > --- a/doc/api/doxy-api-index.md
> > > > +++ b/doc/api/doxy-api-index.md
> > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > >    [compressdev]        (@ref rte_compressdev.h),
> > > >    [compress]           (@ref rte_comp.h),
> > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > +  [gpudev]             (@ref rte_gpudev.h),
> > >
> > > Since this device does not have a queue etc? Shouldn't make it a
> > > library like mempool with vendor-defined ops?
> > > Any specific reason for making it a device? The reason why I am asking
> > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > start, stop operation etc.
> > >
> > >
> > > > +
> > > > +struct rte_gpu_dev {
> > > > +       /* Backing device. */
> > > > +       struct rte_device *device;
> > >
> > > See above?
> >
> > There is a PCI device probed.
> > I don't understand why it would not be represented as a device.
> 
> All other DPDK device has symmetry in structures like queue and
> symmetry in operation like it has configure, start, stop etc.
> This one seems more like mempool to me all we want set of
> vendor-defined ops. So any justification on
> make it a device ? why not like mempool library?
> (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> we don't take device path for mempool. So I would like to understand
> any technical reason for making it a device).

I don't understand what you mean by "symmetry".
Jerin Jacob June 3, 2021, 8:47 a.m. UTC | #13
On Thu, Jun 3, 2021 at 2:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/06/2021 10:41, Jerin Jacob:
> > On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 03/06/2021 09:47, Jerin Jacob:
> > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > --- a/doc/api/doxy-api-index.md
> > > > > +++ b/doc/api/doxy-api-index.md
> > > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > > >    [compressdev]        (@ref rte_compressdev.h),
> > > > >    [compress]           (@ref rte_comp.h),
> > > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > > +  [gpudev]             (@ref rte_gpudev.h),
> > > >
> > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > library like mempool with vendor-defined ops?
> > > > Any specific reason for making it a device? The reason why I am asking
> > > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > > start, stop operation etc.
> > > >
> > > >
> > > > > +
> > > > > +struct rte_gpu_dev {
> > > > > +       /* Backing device. */
> > > > > +       struct rte_device *device;
> > > >
> > > > See above?
> > >
> > > There is a PCI device probed.
> > > I don't understand why it would not be represented as a device.
> >
> > All other DPDK device has symmetry in structures like queue and
> > symmetry in operation like it has configure, start, stop etc.
> > This one seems more like mempool to me all we want set of
> > vendor-defined ops. So any justification on
> > make it a device ? why not like mempool library?
> > (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> > we don't take device path for mempool. So I would like to understand
> > any technical reason for making it a device).
>
> I don't understand what you mean by "symmetry".

The common attributes. or similarity

>
>
>
Thomas Monjalon June 3, 2021, 8:53 a.m. UTC | #14
03/06/2021 10:47, Jerin Jacob:
> On Thu, Jun 3, 2021 at 2:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 03/06/2021 10:41, Jerin Jacob:
> > > On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 03/06/2021 09:47, Jerin Jacob:
> > > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > --- a/doc/api/doxy-api-index.md
> > > > > > +++ b/doc/api/doxy-api-index.md
> > > > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > > > >    [compressdev]        (@ref rte_compressdev.h),
> > > > > >    [compress]           (@ref rte_comp.h),
> > > > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > > > +  [gpudev]             (@ref rte_gpudev.h),
> > > > >
> > > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > > library like mempool with vendor-defined ops?
> > > > > Any specific reason for making it a device? The reason why I am asking
> > > > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > > > start, stop operation etc.
> > > > >
> > > > >
> > > > > > +
> > > > > > +struct rte_gpu_dev {
> > > > > > +       /* Backing device. */
> > > > > > +       struct rte_device *device;
> > > > >
> > > > > See above?
> > > >
> > > > There is a PCI device probed.
> > > > I don't understand why it would not be represented as a device.
> > >
> > > All other DPDK device has symmetry in structures like queue and
> > > symmetry in operation like it has configure, start, stop etc.
> > > This one seems more like mempool to me all we want set of
> > > vendor-defined ops. So any justification on
> > > make it a device ? why not like mempool library?
> > > (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> > > we don't take device path for mempool. So I would like to understand
> > > any technical reason for making it a device).
> >
> > I don't understand what you mean by "symmetry".
> 
> The common attributes. or similarity

The common attributes of a device are:
	- driver
	- bus
	- devargs
We have these attributes for a GPU.

About configure/start/stop usual functions,
I think we'll have something similar in the second step
for running tasks.
Andrew Rybchenko June 3, 2021, 8:57 a.m. UTC | #15
On 6/3/21 11:26 AM, Thomas Monjalon wrote:
> 03/06/2021 09:49, Andrew Rybchenko:
>> On 6/3/21 10:26 AM, Thomas Monjalon wrote:
>>> 03/06/2021 09:06, Andrew Rybchenko:
>>>> On 6/2/21 11:35 PM, Thomas Monjalon wrote:
>>>>> + * Allocate a chunk of memory on the GPU.
>>>>
>>>> Looking a below function it is required to clarify here if
>>>> the memory is visible or invisible to GPU (or both allowed).
>>>
>>> This function allocates on the GPU so it is visible by the GPU.
>>> I feel I misunderstand your question.
>>
>> Below function says rte_gpu_malloc_visible() and its
>> description highlights that allocated memory is visible to GPU.
>> My problem that I don't understand what's the difference
>> between these two functions.
> 
> One function allocates in GPU mem, the other allows the GPU to use CPU mem.

Ah, I see no. G looks like C and I've not noticed the
difference. Now it is clear. My bad.
Jerin Jacob June 3, 2021, 9:20 a.m. UTC | #16
On Thu, Jun 3, 2021 at 2:23 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/06/2021 10:47, Jerin Jacob:
> > On Thu, Jun 3, 2021 at 2:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 03/06/2021 10:41, Jerin Jacob:
> > > > On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >
> > > > > 03/06/2021 09:47, Jerin Jacob:
> > > > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > --- a/doc/api/doxy-api-index.md
> > > > > > > +++ b/doc/api/doxy-api-index.md
> > > > > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > > > > >    [compressdev]        (@ref rte_compressdev.h),
> > > > > > >    [compress]           (@ref rte_comp.h),
> > > > > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > > > > +  [gpudev]             (@ref rte_gpudev.h),
> > > > > >
> > > > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > > > library like mempool with vendor-defined ops?
> > > > > > Any specific reason for making it a device? The reason why I am asking
> > > > > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > > > > start, stop operation etc.
> > > > > >
> > > > > >
> > > > > > > +
> > > > > > > +struct rte_gpu_dev {
> > > > > > > +       /* Backing device. */
> > > > > > > +       struct rte_device *device;
> > > > > >
> > > > > > See above?
> > > > >
> > > > > There is a PCI device probed.
> > > > > I don't understand why it would not be represented as a device.
> > > >
> > > > All other DPDK device has symmetry in structures like queue and
> > > > symmetry in operation like it has configure, start, stop etc.
> > > > This one seems more like mempool to me all we want set of
> > > > vendor-defined ops. So any justification on
> > > > make it a device ? why not like mempool library?
> > > > (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> > > > we don't take device path for mempool. So I would like to understand
> > > > any technical reason for making it a device).
> > >
> > > I don't understand what you mean by "symmetry".
> >
> > The common attributes. or similarity
>
> The common attributes of a device are:
>         - driver
>         - bus
>         - devargs
> We have these attributes for a GPU.

Yes. Those are attributes of rte_device. That does not mean and
library can not use rte_device.(mempool library driver is using
rte_device which is backed by PCI)
In terms of similarity, all other device libraries(not devices) have
queue, enqueue() and dequeue() kind of scheme
in ethdev, cryptodev, compressdev, eventdev, bbdev, rawdev. regexdev.
i.e existing DPDK device libraries,
This one des not have have that, So question why to call it libgpudev vs libgpu.

The functions you have are memory allocation etc. That's more of a
library candidate.

>
> About configure/start/stop usual functions,
> I think we'll have something similar in the second step

Do you think or it will be there?. I think, it is import decision. The
device needs have a queue kind of structure
and it is mapping to core to have a notion of configure. queue_setup,
start and stop etc
Something similar to
http://code.dpdk.org/dpdk/v21.05/source/lib/regexdev/rte_regexdev.h#L27
Could you share how "running tasks" translates to the above scheme
like other her dpdk device libraries?



> for running tasks.
>
>
Ferruh Yigit June 3, 2021, 9:33 a.m. UTC | #17
On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>>
>> From: Elena Agostini <eagostini@nvidia.com>
>>
>> The new library gpudev is for dealing with GPU from a DPDK application
>> in a vendor-agnostic way.
>>
>> As a first step, the features are focused on memory management.
>> A function allows to allocate memory inside the GPU,
>> while another one allows to use main (CPU) memory from the GPU.
>>
>> The infrastructure is prepared to welcome drivers in drivers/gpu/
>> as the upcoming NVIDIA one, implementing the gpudev API.
>> Other additions planned for next revisions:
>>   - C implementation file
>>   - guide documentation
>>   - unit tests
>>   - integration in testpmd to enable Rx/Tx to/from GPU memory.
>>
>> The next step should focus on GPU processing task control.
>>
>> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>> ---
>>  .gitignore                           |   1 +
>>  MAINTAINERS                          |   6 +
>>  doc/api/doxy-api-index.md            |   1 +
>>  doc/api/doxy-api.conf.in             |   1 +
>>  doc/guides/conf.py                   |   8 ++
>>  doc/guides/gpus/features/default.ini |  13 ++
>>  doc/guides/gpus/index.rst            |  11 ++
>>  doc/guides/gpus/overview.rst         |   7 +
>>  doc/guides/index.rst                 |   1 +
>>  doc/guides/prog_guide/gpu.rst        |   5 +
>>  doc/guides/prog_guide/index.rst      |   1 +
>>  drivers/gpu/meson.build              |   4 +
>>  drivers/meson.build                  |   1 +
>>  lib/gpudev/gpu_driver.h              |  44 +++++++
>>  lib/gpudev/meson.build               |   9 ++
>>  lib/gpudev/rte_gpudev.h              | 183 +++++++++++++++++++++++++++
>>  lib/gpudev/version.map               |  11 ++
>>  lib/meson.build                      |   1 +
>>  18 files changed, 308 insertions(+)
>>  create mode 100644 doc/guides/gpus/features/default.ini
>>  create mode 100644 doc/guides/gpus/index.rst
>>  create mode 100644 doc/guides/gpus/overview.rst
>>  create mode 100644 doc/guides/prog_guide/gpu.rst
>>  create mode 100644 drivers/gpu/meson.build
>>  create mode 100644 lib/gpudev/gpu_driver.h
>>  create mode 100644 lib/gpudev/meson.build
>>  create mode 100644 lib/gpudev/rte_gpudev.h
>>  create mode 100644 lib/gpudev/version.map
>>
>> diff --git a/.gitignore b/.gitignore
>> index b19c0717e6..49494e0c6c 100644
>> --- a/.gitignore
>> +++ b/.gitignore
>> @@ -14,6 +14,7 @@ doc/guides/compressdevs/overview_feature_table.txt
>>  doc/guides/regexdevs/overview_feature_table.txt
>>  doc/guides/vdpadevs/overview_feature_table.txt
>>  doc/guides/bbdevs/overview_feature_table.txt
>> +doc/guides/gpus/overview_feature_table.txt
>>
>>  # ignore generated ctags/cscope files
>>  cscope.out.po
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 5877a16971..c4755dfe9a 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -452,6 +452,12 @@ F: app/test-regex/
>>  F: doc/guides/prog_guide/regexdev.rst
>>  F: doc/guides/regexdevs/features/default.ini
>>
>> +GPU API - EXPERIMENTAL
>> +M: Elena Agostini <eagostini@nvidia.com>
>> +F: lib/gpudev/
>> +F: doc/guides/prog_guide/gpu.rst
>> +F: doc/guides/gpus/features/default.ini
>> +
>>  Eventdev API
>>  M: Jerin Jacob <jerinj@marvell.com>
>>  T: git://dpdk.org/next/dpdk-next-eventdev
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index 1992107a03..bd10342ca2 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
>>    [compressdev]        (@ref rte_compressdev.h),
>>    [compress]           (@ref rte_comp.h),
>>    [regexdev]           (@ref rte_regexdev.h),
>> +  [gpudev]             (@ref rte_gpudev.h),
> 
> Since this device does not have a queue etc? Shouldn't make it a
> library like mempool with vendor-defined ops?

+1

Current RFC announces additional memory allocation capabilities, which can suits
better as extension to existing memory related library instead of a new device
abstraction library.

> Any specific reason for making it a device? The reason why I am asking
> this is, as other DPDK devices as symmetry in queue(s), configure,
> start, stop operation etc.
> 
> 
>> +
>> +struct rte_gpu_dev {
>> +       /* Backing device. */
>> +       struct rte_device *device;
> 
> See above?
> 
>> +       /* GPU info structure. */
Thomas Monjalon June 3, 2021, 9:36 a.m. UTC | #18
03/06/2021 11:20, Jerin Jacob:
> On Thu, Jun 3, 2021 at 2:23 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 03/06/2021 10:47, Jerin Jacob:
> > > On Thu, Jun 3, 2021 at 2:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 03/06/2021 10:41, Jerin Jacob:
> > > > > On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > 03/06/2021 09:47, Jerin Jacob:
> > > > > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > > --- a/doc/api/doxy-api-index.md
> > > > > > > > +++ b/doc/api/doxy-api-index.md
> > > > > > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > > > > > >    [compressdev]        (@ref rte_compressdev.h),
> > > > > > > >    [compress]           (@ref rte_comp.h),
> > > > > > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > > > > > +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > >
> > > > > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > > > > library like mempool with vendor-defined ops?
> > > > > > > Any specific reason for making it a device? The reason why I am asking
> > > > > > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > > > > > start, stop operation etc.
> > > > > > >
> > > > > > >
> > > > > > > > +
> > > > > > > > +struct rte_gpu_dev {
> > > > > > > > +       /* Backing device. */
> > > > > > > > +       struct rte_device *device;
> > > > > > >
> > > > > > > See above?
> > > > > >
> > > > > > There is a PCI device probed.
> > > > > > I don't understand why it would not be represented as a device.
> > > > >
> > > > > All other DPDK device has symmetry in structures like queue and
> > > > > symmetry in operation like it has configure, start, stop etc.
> > > > > This one seems more like mempool to me all we want set of
> > > > > vendor-defined ops. So any justification on
> > > > > make it a device ? why not like mempool library?
> > > > > (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> > > > > we don't take device path for mempool. So I would like to understand
> > > > > any technical reason for making it a device).
> > > >
> > > > I don't understand what you mean by "symmetry".
> > >
> > > The common attributes. or similarity
> >
> > The common attributes of a device are:
> >         - driver
> >         - bus
> >         - devargs
> > We have these attributes for a GPU.
> 
> Yes. Those are attributes of rte_device. That does not mean and
> library can not use rte_device.(mempool library driver is using
> rte_device which is backed by PCI)
> In terms of similarity, all other device libraries(not devices) have
> queue, enqueue() and dequeue() kind of scheme
> in ethdev, cryptodev, compressdev, eventdev, bbdev, rawdev. regexdev.
> i.e existing DPDK device libraries,
> This one des not have have that, So question why to call it libgpudev vs libgpu.
> 
> The functions you have are memory allocation etc. That's more of a
> library candidate.
> 
> > About configure/start/stop usual functions,
> > I think we'll have something similar in the second step
> 
> Do you think or it will be there?. I think, it is import decision.

That's an important discussion we need to have.
We are preparing a proposal.

> The device needs have a queue kind of structure
> and it is mapping to core to have a notion of configure. queue_setup,
> start and stop etc

Why is it a requirement to call it a device API?

> Something similar to
> http://code.dpdk.org/dpdk/v21.05/source/lib/regexdev/rte_regexdev.h#L27
> Could you share how "running tasks" translates to the above scheme
> like other her dpdk device libraries?

We will share our view soon but what to control in GPU execution
must be a community discussed requirement.
Jerin Jacob June 3, 2021, 10:04 a.m. UTC | #19
On Thu, Jun 3, 2021 at 3:06 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/06/2021 11:20, Jerin Jacob:
> > On Thu, Jun 3, 2021 at 2:23 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 03/06/2021 10:47, Jerin Jacob:
> > > > On Thu, Jun 3, 2021 at 2:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >
> > > > > 03/06/2021 10:41, Jerin Jacob:
> > > > > > On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > >
> > > > > > > 03/06/2021 09:47, Jerin Jacob:
> > > > > > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > > > --- a/doc/api/doxy-api-index.md
> > > > > > > > > +++ b/doc/api/doxy-api-index.md
> > > > > > > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > > > > > > >    [compressdev]        (@ref rte_compressdev.h),
> > > > > > > > >    [compress]           (@ref rte_comp.h),
> > > > > > > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > > > > > > +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > > >
> > > > > > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > > > > > library like mempool with vendor-defined ops?
> > > > > > > > Any specific reason for making it a device? The reason why I am asking
> > > > > > > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > > > > > > start, stop operation etc.
> > > > > > > >
> > > > > > > >
> > > > > > > > > +
> > > > > > > > > +struct rte_gpu_dev {
> > > > > > > > > +       /* Backing device. */
> > > > > > > > > +       struct rte_device *device;
> > > > > > > >
> > > > > > > > See above?
> > > > > > >
> > > > > > > There is a PCI device probed.
> > > > > > > I don't understand why it would not be represented as a device.
> > > > > >
> > > > > > All other DPDK device has symmetry in structures like queue and
> > > > > > symmetry in operation like it has configure, start, stop etc.
> > > > > > This one seems more like mempool to me all we want set of
> > > > > > vendor-defined ops. So any justification on
> > > > > > make it a device ? why not like mempool library?
> > > > > > (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> > > > > > we don't take device path for mempool. So I would like to understand
> > > > > > any technical reason for making it a device).
> > > > >
> > > > > I don't understand what you mean by "symmetry".
> > > >
> > > > The common attributes. or similarity
> > >
> > > The common attributes of a device are:
> > >         - driver
> > >         - bus
> > >         - devargs
> > > We have these attributes for a GPU.
> >
> > Yes. Those are attributes of rte_device. That does not mean and
> > library can not use rte_device.(mempool library driver is using
> > rte_device which is backed by PCI)
> > In terms of similarity, all other device libraries(not devices) have
> > queue, enqueue() and dequeue() kind of scheme
> > in ethdev, cryptodev, compressdev, eventdev, bbdev, rawdev. regexdev.
> > i.e existing DPDK device libraries,
> > This one des not have have that, So question why to call it libgpudev vs libgpu.

See below[1]

> >
> > The functions you have are memory allocation etc. That's more of a
> > library candidate.
> >
> > > About configure/start/stop usual functions,
> > > I think we'll have something similar in the second step
> >
> > Do you think or it will be there?. I think, it is import decision.
>
> That's an important discussion we need to have.
> We are preparing a proposal.

Ack.

>
> > The device needs have a queue kind of structure
> > and it is mapping to core to have a notion of configure. queue_setup,
> > start and stop etc
>
> Why is it a requirement to call it a device API?

Then we need to define what needs to call as device library vs library and how?
Why mempool is not called a  device library vs library?  and why all
other device library has a common structure like queues and
it binding core etc. I tried to explain above the similar attributes
for dpdk device libraries[1] which I think, it a requirement so
that the end user will have familiarity with device libraries rather
than each one has separate General guidelines and principles.

I think, it is more TB discussion topic and decides on this because I
don't see in technical issue in calling it a library.

>
> > Something similar to
> > http://code.dpdk.org/dpdk/v21.05/source/lib/regexdev/rte_regexdev.h#L27
> > Could you share how "running tasks" translates to the above scheme
> > like other her dpdk device libraries?
>
> We will share our view soon but what to control in GPU execution
> must be a community discussed requirement.

Makes sense.

>
>
>
Thomas Monjalon June 3, 2021, 10:30 a.m. UTC | #20
03/06/2021 12:04, Jerin Jacob:
> On Thu, Jun 3, 2021 at 3:06 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 03/06/2021 11:20, Jerin Jacob:
> > > On Thu, Jun 3, 2021 at 2:23 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 03/06/2021 10:47, Jerin Jacob:
> > > > > On Thu, Jun 3, 2021 at 2:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > 03/06/2021 10:41, Jerin Jacob:
> > > > > > > On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > >
> > > > > > > > 03/06/2021 09:47, Jerin Jacob:
> > > > > > > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > > > > --- a/doc/api/doxy-api-index.md
> > > > > > > > > > +++ b/doc/api/doxy-api-index.md
> > > > > > > > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > > > > > > > >    [compressdev]        (@ref rte_compressdev.h),
> > > > > > > > > >    [compress]           (@ref rte_comp.h),
> > > > > > > > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > > > > > > > +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > > > >
> > > > > > > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > > > > > > library like mempool with vendor-defined ops?
> > > > > > > > > Any specific reason for making it a device? The reason why I am asking
> > > > > > > > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > > > > > > > start, stop operation etc.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > +
> > > > > > > > > > +struct rte_gpu_dev {
> > > > > > > > > > +       /* Backing device. */
> > > > > > > > > > +       struct rte_device *device;
> > > > > > > > >
> > > > > > > > > See above?
> > > > > > > >
> > > > > > > > There is a PCI device probed.
> > > > > > > > I don't understand why it would not be represented as a device.
> > > > > > >
> > > > > > > All other DPDK device has symmetry in structures like queue and
> > > > > > > symmetry in operation like it has configure, start, stop etc.
> > > > > > > This one seems more like mempool to me all we want set of
> > > > > > > vendor-defined ops. So any justification on
> > > > > > > make it a device ? why not like mempool library?
> > > > > > > (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> > > > > > > we don't take device path for mempool. So I would like to understand
> > > > > > > any technical reason for making it a device).
> > > > > >
> > > > > > I don't understand what you mean by "symmetry".
> > > > >
> > > > > The common attributes. or similarity
> > > >
> > > > The common attributes of a device are:
> > > >         - driver
> > > >         - bus
> > > >         - devargs
> > > > We have these attributes for a GPU.
> > >
> > > Yes. Those are attributes of rte_device. That does not mean and
> > > library can not use rte_device.(mempool library driver is using
> > > rte_device which is backed by PCI)
> > > In terms of similarity, all other device libraries(not devices) have
> > > queue, enqueue() and dequeue() kind of scheme
> > > in ethdev, cryptodev, compressdev, eventdev, bbdev, rawdev. regexdev.
> > > i.e existing DPDK device libraries,
> > > This one des not have have that, So question why to call it libgpudev vs libgpu.
> 
> See below[1]
> 
> > >
> > > The functions you have are memory allocation etc. That's more of a
> > > library candidate.
> > >
> > > > About configure/start/stop usual functions,
> > > > I think we'll have something similar in the second step
> > >
> > > Do you think or it will be there?. I think, it is import decision.
> >
> > That's an important discussion we need to have.
> > We are preparing a proposal.
> 
> Ack.
> 
> >
> > > The device needs have a queue kind of structure
> > > and it is mapping to core to have a notion of configure. queue_setup,
> > > start and stop etc
> >
> > Why is it a requirement to call it a device API?
> 
> Then we need to define what needs to call as device library vs library and how?
> Why mempool is not called a  device library vs library?

My view is simple:
if it has drivers, it is a device API, except bus and mempool libs.
About mempool, it started as a standard lib and got extended for HW support.

> and why all
> other device library has a common structure like queues and
> it binding core etc. I tried to explain above the similar attributes
> for dpdk device libraries[1] which I think, it a requirement so
> that the end user will have familiarity with device libraries rather
> than each one has separate General guidelines and principles.
> 
> I think, it is more TB discussion topic and decides on this because I
> don't see in technical issue in calling it a library.

The naming is just a choice.
Yesterday morning it was called lib/gpu/
and in the evening it was renamed lib/gpudev/
so no technical issue :)

But the design of the API with queues or other paradigm
is something I would like to discuss here.
Note: there was no intent to publish GPU processing control
in DPDK 21.08. We want to focus on GPU memory in 21.08,
but I understand it is a key decision in the big picture.
What would be your need and would you design such API?

> > > Something similar to
> > > http://code.dpdk.org/dpdk/v21.05/source/lib/regexdev/rte_regexdev.h#L27
> > > Could you share how "running tasks" translates to the above scheme
> > > like other her dpdk device libraries?
> >
> > We will share our view soon but what to control in GPU execution
> > must be a community discussed requirement.
> 
> Makes sense.
Jerin Jacob June 3, 2021, 11:38 a.m. UTC | #21
On Thu, Jun 3, 2021 at 4:00 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/06/2021 12:04, Jerin Jacob:
> > On Thu, Jun 3, 2021 at 3:06 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 03/06/2021 11:20, Jerin Jacob:
> > > > On Thu, Jun 3, 2021 at 2:23 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >
> > > > > 03/06/2021 10:47, Jerin Jacob:
> > > > > > On Thu, Jun 3, 2021 at 2:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > >
> > > > > > > 03/06/2021 10:41, Jerin Jacob:
> > > > > > > > On Thu, Jun 3, 2021 at 1:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > > >
> > > > > > > > > 03/06/2021 09:47, Jerin Jacob:
> > > > > > > > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > > > > > --- a/doc/api/doxy-api-index.md
> > > > > > > > > > > +++ b/doc/api/doxy-api-index.md
> > > > > > > > > > > @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
> > > > > > > > > > >    [compressdev]        (@ref rte_compressdev.h),
> > > > > > > > > > >    [compress]           (@ref rte_comp.h),
> > > > > > > > > > >    [regexdev]           (@ref rte_regexdev.h),
> > > > > > > > > > > +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > > > > >
> > > > > > > > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > > > > > > > library like mempool with vendor-defined ops?
> > > > > > > > > > Any specific reason for making it a device? The reason why I am asking
> > > > > > > > > > this is, as other DPDK devices as symmetry in queue(s), configure,
> > > > > > > > > > start, stop operation etc.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > +
> > > > > > > > > > > +struct rte_gpu_dev {
> > > > > > > > > > > +       /* Backing device. */
> > > > > > > > > > > +       struct rte_device *device;
> > > > > > > > > >
> > > > > > > > > > See above?
> > > > > > > > >
> > > > > > > > > There is a PCI device probed.
> > > > > > > > > I don't understand why it would not be represented as a device.
> > > > > > > >
> > > > > > > > All other DPDK device has symmetry in structures like queue and
> > > > > > > > symmetry in operation like it has configure, start, stop etc.
> > > > > > > > This one seems more like mempool to me all we want set of
> > > > > > > > vendor-defined ops. So any justification on
> > > > > > > > make it a device ? why not like mempool library?
> > > > > > > > (driver/mempool/octeontx2 Mempool HW is also PCI device, but
> > > > > > > > we don't take device path for mempool. So I would like to understand
> > > > > > > > any technical reason for making it a device).
> > > > > > >
> > > > > > > I don't understand what you mean by "symmetry".
> > > > > >
> > > > > > The common attributes. or similarity
> > > > >
> > > > > The common attributes of a device are:
> > > > >         - driver
> > > > >         - bus
> > > > >         - devargs
> > > > > We have these attributes for a GPU.
> > > >
> > > > Yes. Those are attributes of rte_device. That does not mean and
> > > > library can not use rte_device.(mempool library driver is using
> > > > rte_device which is backed by PCI)
> > > > In terms of similarity, all other device libraries(not devices) have
> > > > queue, enqueue() and dequeue() kind of scheme
> > > > in ethdev, cryptodev, compressdev, eventdev, bbdev, rawdev. regexdev.
> > > > i.e existing DPDK device libraries,
> > > > This one des not have have that, So question why to call it libgpudev vs libgpu.
> >
> > See below[1]
> >
> > > >
> > > > The functions you have are memory allocation etc. That's more of a
> > > > library candidate.
> > > >
> > > > > About configure/start/stop usual functions,
> > > > > I think we'll have something similar in the second step
> > > >
> > > > Do you think or it will be there?. I think, it is import decision.
> > >
> > > That's an important discussion we need to have.
> > > We are preparing a proposal.
> >
> > Ack.
> >
> > >
> > > > The device needs have a queue kind of structure
> > > > and it is mapping to core to have a notion of configure. queue_setup,
> > > > start and stop etc
> > >
> > > Why is it a requirement to call it a device API?
> >
> > Then we need to define what needs to call as device library vs library and how?
> > Why mempool is not called a  device library vs library?
>
> My view is simple:
> if it has drivers, it is a device API, except bus and mempool libs.

rte_secuity has drivers but it is not called a device library.

> About mempool, it started as a standard lib and got extended for HW support.

Yes. We did not change to device library as it was fundamentally
different than another DPDK deices
when we added the device support.

>
> > and why all
> > other device library has a common structure like queues and
> > it binding core etc. I tried to explain above the similar attributes
> > for dpdk device libraries[1] which I think, it a requirement so
> > that the end user will have familiarity with device libraries rather
> > than each one has separate General guidelines and principles.
> >
> > I think, it is more TB discussion topic and decides on this because I
> > don't see in technical issue in calling it a library.
>
> The naming is just a choice.

Not sure.

> Yesterday morning it was called lib/gpu/
> and in the evening it was renamed lib/gpudev/
> so no technical issue :)
>
> But the design of the API with queues or other paradigm
> is something I would like to discuss here.

Yeah, That is important. IMO, That defines what needs to be a device library.

> Note: there was no intent to publish GPU processing control
> in DPDK 21.08. We want to focus on GPU memory in 21.08,
> but I understand it is a key decision in the big picture.

if the scope is only memory allocation, IMO, it is better to make a library.

> What would be your need and would you design such API?

For me, there is no need for gpu library(as of now). May GPU consumers
can define what
they need to control using the library.


>
> > > > Something similar to
> > > > http://code.dpdk.org/dpdk/v21.05/source/lib/regexdev/rte_regexdev.h#L27
> > > > Could you share how "running tasks" translates to the above scheme
> > > > like other her dpdk device libraries?
> > >
> > > We will share our view soon but what to control in GPU execution
> > > must be a community discussed requirement.
> >
> > Makes sense.
>
>
>
Wang, Haiyue June 4, 2021, 5:51 a.m. UTC | #22
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> Sent: Thursday, June 3, 2021 04:36
> To: dev@dpdk.org
> Cc: Elena Agostini <eagostini@nvidia.com>
> Subject: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> From: Elena Agostini <eagostini@nvidia.com>
> 
> The new library gpudev is for dealing with GPU from a DPDK application
> in a vendor-agnostic way.
> 
> As a first step, the features are focused on memory management.
> A function allows to allocate memory inside the GPU,
> while another one allows to use main (CPU) memory from the GPU.
> 
> The infrastructure is prepared to welcome drivers in drivers/gpu/
> as the upcoming NVIDIA one, implementing the gpudev API.
> Other additions planned for next revisions:
>   - C implementation file
>   - guide documentation
>   - unit tests
>   - integration in testpmd to enable Rx/Tx to/from GPU memory.
> 
> The next step should focus on GPU processing task control.
> 

Is this patch for 'L2FWD-NV Workload on GPU' on P26 ?
https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9730-packet-processing-on-gpu-at-100gbe-line-rate.pdf


> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  .gitignore                           |   1 +
>  MAINTAINERS                          |   6 +
>  doc/api/doxy-api-index.md            |   1 +
>  doc/api/doxy-api.conf.in             |   1 +
>  doc/guides/conf.py                   |   8 ++
>  doc/guides/gpus/features/default.ini |  13 ++
>  doc/guides/gpus/index.rst            |  11 ++
>  doc/guides/gpus/overview.rst         |   7 +
>  doc/guides/index.rst                 |   1 +
>  doc/guides/prog_guide/gpu.rst        |   5 +
>  doc/guides/prog_guide/index.rst      |   1 +
>  drivers/gpu/meson.build              |   4 +
>  drivers/meson.build                  |   1 +
>  lib/gpudev/gpu_driver.h              |  44 +++++++
>  lib/gpudev/meson.build               |   9 ++
>  lib/gpudev/rte_gpudev.h              | 183 +++++++++++++++++++++++++++
>  lib/gpudev/version.map               |  11 ++
>  lib/meson.build                      |   1 +
>  18 files changed, 308 insertions(+)
>  create mode 100644 doc/guides/gpus/features/default.ini
>  create mode 100644 doc/guides/gpus/index.rst
>  create mode 100644 doc/guides/gpus/overview.rst
>  create mode 100644 doc/guides/prog_guide/gpu.rst
>  create mode 100644 drivers/gpu/meson.build
>  create mode 100644 lib/gpudev/gpu_driver.h
>  create mode 100644 lib/gpudev/meson.build
>  create mode 100644 lib/gpudev/rte_gpudev.h
>  create mode 100644 lib/gpudev/version.map
> 


> --
> 2.31.1
Thomas Monjalon June 4, 2021, 8:15 a.m. UTC | #23
04/06/2021 07:51, Wang, Haiyue:
> > From: Elena Agostini <eagostini@nvidia.com>
> > 
> > The new library gpudev is for dealing with GPU from a DPDK application
> > in a vendor-agnostic way.
> > 
> > As a first step, the features are focused on memory management.
> > A function allows to allocate memory inside the GPU,
> > while another one allows to use main (CPU) memory from the GPU.
> > 
> > The infrastructure is prepared to welcome drivers in drivers/gpu/
> > as the upcoming NVIDIA one, implementing the gpudev API.
> > Other additions planned for next revisions:
> >   - C implementation file
> >   - guide documentation
> >   - unit tests
> >   - integration in testpmd to enable Rx/Tx to/from GPU memory.
> > 
> > The next step should focus on GPU processing task control.
> > 
> 
> Is this patch for 'L2FWD-NV Workload on GPU' on P26 ?
> https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9730-packet-processing-on-gpu-at-100gbe-line-rate.pdf

Yes this is the same project: use GPU in DPDK workload.
Thomas Monjalon June 4, 2021, 10:28 a.m. UTC | #24
03/06/2021 11:33, Ferruh Yigit:
> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> >> +  [gpudev]             (@ref rte_gpudev.h),
> > 
> > Since this device does not have a queue etc? Shouldn't make it a
> > library like mempool with vendor-defined ops?
> 
> +1
> 
> Current RFC announces additional memory allocation capabilities, which can suits
> better as extension to existing memory related library instead of a new device
> abstraction library.

It is not replacing mempool.
It is more at the same level as EAL memory management:
allocate simple buffer, but with the exception it is done
on a specific device, so it requires a device ID.

The other reason it needs to be a full library is that
it will start a workload on the GPU and get completion notification
so we can integrate the GPU workload in a packet processing pipeline.
Wang, Haiyue June 4, 2021, 11:07 a.m. UTC | #25
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> Sent: Thursday, June 3, 2021 04:36
> To: dev@dpdk.org
> Cc: Elena Agostini <eagostini@nvidia.com>
> Subject: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> From: Elena Agostini <eagostini@nvidia.com>
> 
> The new library gpudev is for dealing with GPU from a DPDK application
> in a vendor-agnostic way.
> 
> As a first step, the features are focused on memory management.
> A function allows to allocate memory inside the GPU,
> while another one allows to use main (CPU) memory from the GPU.
> 
> The infrastructure is prepared to welcome drivers in drivers/gpu/
> as the upcoming NVIDIA one, implementing the gpudev API.
> Other additions planned for next revisions:
>   - C implementation file
>   - guide documentation
>   - unit tests
>   - integration in testpmd to enable Rx/Tx to/from GPU memory.
> 
> The next step should focus on GPU processing task control.
> 
> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  .gitignore                           |   1 +
>  MAINTAINERS                          |   6 +
>  doc/api/doxy-api-index.md            |   1 +
>  doc/api/doxy-api.conf.in             |   1 +
>  doc/guides/conf.py                   |   8 ++
>  doc/guides/gpus/features/default.ini |  13 ++
>  doc/guides/gpus/index.rst            |  11 ++
>  doc/guides/gpus/overview.rst         |   7 +
>  doc/guides/index.rst                 |   1 +
>  doc/guides/prog_guide/gpu.rst        |   5 +
>  doc/guides/prog_guide/index.rst      |   1 +
>  drivers/gpu/meson.build              |   4 +
>  drivers/meson.build                  |   1 +
>  lib/gpudev/gpu_driver.h              |  44 +++++++
>  lib/gpudev/meson.build               |   9 ++
>  lib/gpudev/rte_gpudev.h              | 183 +++++++++++++++++++++++++++
>  lib/gpudev/version.map               |  11 ++
>  lib/meson.build                      |   1 +
>  18 files changed, 308 insertions(+)
>  create mode 100644 doc/guides/gpus/features/default.ini
>  create mode 100644 doc/guides/gpus/index.rst
>  create mode 100644 doc/guides/gpus/overview.rst
>  create mode 100644 doc/guides/prog_guide/gpu.rst
>  create mode 100644 drivers/gpu/meson.build
>  create mode 100644 lib/gpudev/gpu_driver.h
>  create mode 100644 lib/gpudev/meson.build
>  create mode 100644 lib/gpudev/rte_gpudev.h
>  create mode 100644 lib/gpudev/version.map
> 


> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +
> +#include "rte_gpudev.h"
> +
> +struct rte_gpu_dev;
> +
> +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
> +typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
> +
> +struct rte_gpu_dev {
> +	/* Backing device. */
> +	struct rte_device *device;
> +	/* GPU info structure. */
> +	struct rte_gpu_info info;
> +	/* Counter of processes using the device. */
> +	uint16_t process_cnt;
> +	/* If device is currently used or not. */
> +	enum rte_gpu_state state;
> +	/* FUNCTION: Allocate memory on the GPU. */
> +	gpu_malloc_t gpu_malloc;
> +	/* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> +	gpu_malloc_t gpu_malloc_visible;
> +	/* FUNCTION: Free allocated memory on the GPU. */
> +	gpu_free_t gpu_free;


I'm wondering that we can define the malloc type as:

typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr,
					unsigned int flags)

#define RTE_GPU_MALLOC_F_CPU_VISIBLE 0x01u --> gpu_malloc_visible

Then only one malloc function member is needed, paired with 'gpu_free'.

> +	/* Device interrupt handle. */
> +	struct rte_intr_handle *intr_handle;
> +	/* Driver-specific private data. */
> +	void *dev_private;
> +} __rte_cache_aligned;
> +


> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate a chunk of memory on the GPU.
> + *
> + * @param gpu_id
> + *   GPU ID to allocate memory.
> + * @param size
> + *   Number of bytes to allocate.
> + * @param ptr
> + *   Pointer to store the address of the allocated memory.
> + *
> + * @return
> + *   0 on success, -1 otherwise.
> + */
> +__rte_experimental
> +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate a chunk of memory on the CPU that is visible from the GPU.
> + *
> + * @param gpu_id
> + *   Reference GPU ID.
> + * @param size
> + *   Number of bytes to allocate.
> + * @param ptr
> + *   Pointer to store the address of the allocated memory.
> + *
> + * @return
> + *   0 on success, -1 otherwise.
> + */
> +__rte_experimental
> +int rte_gpu_malloc_visible(uint16_t gpu_id, size_t size, void **ptr);

Then 'rte_gpu_malloc_visible' is no needed, and the new call is:

rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr, RTE_GPU_MALLOC_F_CPU_VISIBLE).

Also, we can define more flags for feature extension. ;-)

> +
> +#ifdef __cplusplus
> +}
> --
> 2.31.1
Jerin Jacob June 4, 2021, 11:09 a.m. UTC | #26
On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/06/2021 11:33, Ferruh Yigit:
> > On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >> +  [gpudev]             (@ref rte_gpudev.h),
> > >
> > > Since this device does not have a queue etc? Shouldn't make it a
> > > library like mempool with vendor-defined ops?
> >
> > +1
> >
> > Current RFC announces additional memory allocation capabilities, which can suits
> > better as extension to existing memory related library instead of a new device
> > abstraction library.
>
> It is not replacing mempool.
> It is more at the same level as EAL memory management:
> allocate simple buffer, but with the exception it is done
> on a specific device, so it requires a device ID.
>
> The other reason it needs to be a full library is that
> it will start a workload on the GPU and get completion notification
> so we can integrate the GPU workload in a packet processing pipeline.

I might have confused you. My intention is not to make to fit under mempool API.

I agree that we need a separate library for this. My objection is only
to not call libgpudev and
call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
it not like existing "device libraries" in DPDK and
it like other "libraries" in DPDK.



>
>
Thomas Monjalon June 4, 2021, 12:43 p.m. UTC | #27
04/06/2021 13:07, Wang, Haiyue:
> > From: Elena Agostini <eagostini@nvidia.com>
> > +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
> > +typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
> > +
[...]
> > +	/* FUNCTION: Allocate memory on the GPU. */
> > +	gpu_malloc_t gpu_malloc;
> > +	/* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> > +	gpu_malloc_t gpu_malloc_visible;
> > +	/* FUNCTION: Free allocated memory on the GPU. */
> > +	gpu_free_t gpu_free;
> 
> 
> I'm wondering that we can define the malloc type as:
> 
> typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr,
> 					unsigned int flags)
> 
> #define RTE_GPU_MALLOC_F_CPU_VISIBLE 0x01u --> gpu_malloc_visible
> 
> Then only one malloc function member is needed, paired with 'gpu_free'.
[...]
> > +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
[...]
> > +int rte_gpu_malloc_visible(uint16_t gpu_id, size_t size, void **ptr);
> 
> Then 'rte_gpu_malloc_visible' is no needed, and the new call is:
> 
> rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr, RTE_GPU_MALLOC_F_CPU_VISIBLE).
> 
> Also, we can define more flags for feature extension. ;-)

Yes it is a good idea.

Another question is about the function rte_gpu_free().
How do we recognize that a memory chunk is from the CPU and GPU visible,
or just from GPU?
Thomas Monjalon June 4, 2021, 12:46 p.m. UTC | #28
04/06/2021 13:09, Jerin Jacob:
> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 03/06/2021 11:33, Ferruh Yigit:
> > > On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > > On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >> +  [gpudev]             (@ref rte_gpudev.h),
> > > >
> > > > Since this device does not have a queue etc? Shouldn't make it a
> > > > library like mempool with vendor-defined ops?
> > >
> > > +1
> > >
> > > Current RFC announces additional memory allocation capabilities, which can suits
> > > better as extension to existing memory related library instead of a new device
> > > abstraction library.
> >
> > It is not replacing mempool.
> > It is more at the same level as EAL memory management:
> > allocate simple buffer, but with the exception it is done
> > on a specific device, so it requires a device ID.
> >
> > The other reason it needs to be a full library is that
> > it will start a workload on the GPU and get completion notification
> > so we can integrate the GPU workload in a packet processing pipeline.
> 
> I might have confused you. My intention is not to make to fit under mempool API.
> 
> I agree that we need a separate library for this. My objection is only
> to not call libgpudev and
> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
> it not like existing "device libraries" in DPDK and
> it like other "libraries" in DPDK.

I think we should define a queue of processing actions,
so it looks like other device libraries.
And anyway I think a library managing a device class,
and having some device drivers deserves the name of device library.

I would like to read more opinions.
Thomas Monjalon June 4, 2021, 12:55 p.m. UTC | #29
03/06/2021 13:38, Jerin Jacob:
> On Thu, Jun 3, 2021 at 4:00 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 03/06/2021 12:04, Jerin Jacob:
> > > On Thu, Jun 3, 2021 at 3:06 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 03/06/2021 11:20, Jerin Jacob:
> > > > > The device needs have a queue kind of structure
> > > > > and it is mapping to core to have a notion of configure. queue_setup,
> > > > > start and stop etc
> > > >
> > > > Why is it a requirement to call it a device API?
> > >
> > > Then we need to define what needs to call as device library vs library and how?
> > > Why mempool is not called a  device library vs library?
> >
> > My view is simple:
> > if it has drivers, it is a device API, except bus and mempool libs.
> 
> rte_secuity has drivers but it is not called a device library.

rte_security is a monster beast :)
Yes it has rte_security_ops implemented in net and crypto drivers,
but it is an API extension only, there is no driver dedicated to security.

> > About mempool, it started as a standard lib and got extended for HW support.
> 
> Yes. We did not change to device library as it was fundamentally
> different than another DPDK deices
> when we added the device support.
> 
> > > and why all
> > > other device library has a common structure like queues and
> > > it binding core etc. I tried to explain above the similar attributes
> > > for dpdk device libraries[1] which I think, it a requirement so
> > > that the end user will have familiarity with device libraries rather
> > > than each one has separate General guidelines and principles.
> > >
> > > I think, it is more TB discussion topic and decides on this because I
> > > don't see in technical issue in calling it a library.
> >
> > The naming is just a choice.
> 
> Not sure.
> 
> > Yesterday morning it was called lib/gpu/
> > and in the evening it was renamed lib/gpudev/
> > so no technical issue :)
> >
> > But the design of the API with queues or other paradigm
> > is something I would like to discuss here.
> 
> Yeah, That is important. IMO, That defines what needs to be a device library.
> 
> > Note: there was no intent to publish GPU processing control
> > in DPDK 21.08. We want to focus on GPU memory in 21.08,
> > but I understand it is a key decision in the big picture.
> 
> if the scope is only memory allocation, IMO, it is better to make a library.

No it is only the first step.

> > What would be your need and would you design such API?
> 
> For me, there is no need for gpu library(as of now). May GPU consumers
> can define what they need to control using the library.

We need to integrate GPU processing workload in the DPDK workflow
as a generic API.
There could be 2 modes:
	- queue of tasks
	- tasks in an infinite loop
In both modes, we could get completion notifications
with an interrupt/callback or by polling a shared memory.
Andrew Rybchenko June 4, 2021, 1:05 p.m. UTC | #30
On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> 04/06/2021 13:09, Jerin Jacob:
>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>>> 03/06/2021 11:33, Ferruh Yigit:
>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
>>>>>
>>>>> Since this device does not have a queue etc? Shouldn't make it a
>>>>> library like mempool with vendor-defined ops?
>>>>
>>>> +1
>>>>
>>>> Current RFC announces additional memory allocation capabilities, which can suits
>>>> better as extension to existing memory related library instead of a new device
>>>> abstraction library.
>>>
>>> It is not replacing mempool.
>>> It is more at the same level as EAL memory management:
>>> allocate simple buffer, but with the exception it is done
>>> on a specific device, so it requires a device ID.
>>>
>>> The other reason it needs to be a full library is that
>>> it will start a workload on the GPU and get completion notification
>>> so we can integrate the GPU workload in a packet processing pipeline.
>>
>> I might have confused you. My intention is not to make to fit under mempool API.
>>
>> I agree that we need a separate library for this. My objection is only
>> to not call libgpudev and
>> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
>> it not like existing "device libraries" in DPDK and
>> it like other "libraries" in DPDK.
> 
> I think we should define a queue of processing actions,
> so it looks like other device libraries.
> And anyway I think a library managing a device class,
> and having some device drivers deserves the name of device library.
> 
> I would like to read more opinions.

Since the library is an unified interface to GPU device drivers
I think it should be named as in the patch - gpudev.

Mempool looks like an exception here - initially it was pure SW
library, but not there are HW backends and corresponding device
drivers.

What I don't understand where is GPU specifics here?
I.e. why GPU? NIC can have own memory and provide
corresponding API.

What's the difference of "the memory on the CPU that is visible from the
GPU" from existing memzones which are DMA mapped?
Thomas Monjalon June 4, 2021, 1:18 p.m. UTC | #31
04/06/2021 15:05, Andrew Rybchenko:
> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > 04/06/2021 13:09, Jerin Jacob:
> >> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >>> 03/06/2021 11:33, Ferruh Yigit:
> >>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> >>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> >>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> >>>>>
> >>>>> Since this device does not have a queue etc? Shouldn't make it a
> >>>>> library like mempool with vendor-defined ops?
> >>>>
> >>>> +1
> >>>>
> >>>> Current RFC announces additional memory allocation capabilities, which can suits
> >>>> better as extension to existing memory related library instead of a new device
> >>>> abstraction library.
> >>>
> >>> It is not replacing mempool.
> >>> It is more at the same level as EAL memory management:
> >>> allocate simple buffer, but with the exception it is done
> >>> on a specific device, so it requires a device ID.
> >>>
> >>> The other reason it needs to be a full library is that
> >>> it will start a workload on the GPU and get completion notification
> >>> so we can integrate the GPU workload in a packet processing pipeline.
> >>
> >> I might have confused you. My intention is not to make to fit under mempool API.
> >>
> >> I agree that we need a separate library for this. My objection is only
> >> to not call libgpudev and
> >> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
> >> it not like existing "device libraries" in DPDK and
> >> it like other "libraries" in DPDK.
> > 
> > I think we should define a queue of processing actions,
> > so it looks like other device libraries.
> > And anyway I think a library managing a device class,
> > and having some device drivers deserves the name of device library.
> > 
> > I would like to read more opinions.
> 
> Since the library is an unified interface to GPU device drivers
> I think it should be named as in the patch - gpudev.
> 
> Mempool looks like an exception here - initially it was pure SW
> library, but not there are HW backends and corresponding device
> drivers.
> 
> What I don't understand where is GPU specifics here?

That's an interesting question.
Let's ask first what is a GPU for DPDK?
I think it is like a sub-CPU with high parallel execution capabilities,
and it is controlled by the CPU.

> I.e. why GPU? NIC can have own memory and provide corresponding API.

So far we don't need to explicitly allocate memory on the NIC.
The packets are received or copied to the CPU memory.
In the GPU case, the NIC could save the packets directly
in the GPU memory, thus the need to manage the GPU memory.

Also, because the GPU program is dynamically loaded,
there is no fixed API to interact with the GPU workload except via memory.

> What's the difference of "the memory on the CPU that is visible from the
> GPU" from existing memzones which are DMA mapped?

The only difference is that the GPU must map the CPU memory
in its program logic.
Wang, Haiyue June 4, 2021, 1:25 p.m. UTC | #32
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, June 4, 2021 20:44
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: dev@dpdk.org; Elena Agostini <eagostini@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> 04/06/2021 13:07, Wang, Haiyue:
> > > From: Elena Agostini <eagostini@nvidia.com>
> > > +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
> > > +typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
> > > +
> [...]
> > > +	/* FUNCTION: Allocate memory on the GPU. */
> > > +	gpu_malloc_t gpu_malloc;
> > > +	/* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> > > +	gpu_malloc_t gpu_malloc_visible;
> > > +	/* FUNCTION: Free allocated memory on the GPU. */
> > > +	gpu_free_t gpu_free;
> >
> >
> > I'm wondering that we can define the malloc type as:
> >
> > typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr,
> > 					unsigned int flags)
> >
> > #define RTE_GPU_MALLOC_F_CPU_VISIBLE 0x01u --> gpu_malloc_visible
> >
> > Then only one malloc function member is needed, paired with 'gpu_free'.
> [...]
> > > +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
> [...]
> > > +int rte_gpu_malloc_visible(uint16_t gpu_id, size_t size, void **ptr);
> >
> > Then 'rte_gpu_malloc_visible' is no needed, and the new call is:
> >
> > rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr, RTE_GPU_MALLOC_F_CPU_VISIBLE).
> >
> > Also, we can define more flags for feature extension. ;-)
> 
> Yes it is a good idea.
> 
> Another question is about the function rte_gpu_free().
> How do we recognize that a memory chunk is from the CPU and GPU visible,
> or just from GPU?
> 

I didn't find the rte_gpu_free_visible definition, and the rte_gpu_free's
comment just says: deallocate a chunk of memory allocated with rte_gpu_malloc*

Looks like the rte_gpu_free can handle this case ?

And from the definition "rte_gpu_free(uint16_t gpu_id, void *ptr)", the
free needs to check whether this memory belong to the GPU or not, so it
also can recognize the memory type, I think.
Andrew Rybchenko June 4, 2021, 1:59 p.m. UTC | #33
On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> 04/06/2021 15:05, Andrew Rybchenko:
>> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
>>> 04/06/2021 13:09, Jerin Jacob:
>>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>> 03/06/2021 11:33, Ferruh Yigit:
>>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
>>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
>>>>>>>
>>>>>>> Since this device does not have a queue etc? Shouldn't make it a
>>>>>>> library like mempool with vendor-defined ops?
>>>>>>
>>>>>> +1
>>>>>>
>>>>>> Current RFC announces additional memory allocation capabilities, which can suits
>>>>>> better as extension to existing memory related library instead of a new device
>>>>>> abstraction library.
>>>>>
>>>>> It is not replacing mempool.
>>>>> It is more at the same level as EAL memory management:
>>>>> allocate simple buffer, but with the exception it is done
>>>>> on a specific device, so it requires a device ID.
>>>>>
>>>>> The other reason it needs to be a full library is that
>>>>> it will start a workload on the GPU and get completion notification
>>>>> so we can integrate the GPU workload in a packet processing pipeline.
>>>>
>>>> I might have confused you. My intention is not to make to fit under mempool API.
>>>>
>>>> I agree that we need a separate library for this. My objection is only
>>>> to not call libgpudev and
>>>> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
>>>> it not like existing "device libraries" in DPDK and
>>>> it like other "libraries" in DPDK.
>>>
>>> I think we should define a queue of processing actions,
>>> so it looks like other device libraries.
>>> And anyway I think a library managing a device class,
>>> and having some device drivers deserves the name of device library.
>>>
>>> I would like to read more opinions.
>>
>> Since the library is an unified interface to GPU device drivers
>> I think it should be named as in the patch - gpudev.
>>
>> Mempool looks like an exception here - initially it was pure SW
>> library, but not there are HW backends and corresponding device
>> drivers.
>>
>> What I don't understand where is GPU specifics here?
> 
> That's an interesting question.
> Let's ask first what is a GPU for DPDK?
> I think it is like a sub-CPU with high parallel execution capabilities,
> and it is controlled by the CPU.

I have no good ideas how to name it in accordance with
above description to avoid "G" which for "Graphics" if
understand correctly. However, may be it is not required.
No strong opinion on the topic, but unbinding from
"Graphics" would be nice.

>> I.e. why GPU? NIC can have own memory and provide corresponding API.
> 
> So far we don't need to explicitly allocate memory on the NIC.
> The packets are received or copied to the CPU memory.
> In the GPU case, the NIC could save the packets directly
> in the GPU memory, thus the need to manage the GPU memory.
> 
> Also, because the GPU program is dynamically loaded,
> there is no fixed API to interact with the GPU workload except via memory.
> 
>> What's the difference of "the memory on the CPU that is visible from the
>> GPU" from existing memzones which are DMA mapped?
> 
> The only difference is that the GPU must map the CPU memory
> in its program logic.

I see. Thanks for the explanations.
Thomas Monjalon June 4, 2021, 2:06 p.m. UTC | #34
04/06/2021 15:25, Wang, Haiyue:
> From: Thomas Monjalon <thomas@monjalon.net>
> > Another question is about the function rte_gpu_free().
> > How do we recognize that a memory chunk is from the CPU and GPU visible,
> > or just from GPU?
> > 
> 
> I didn't find the rte_gpu_free_visible definition, and the rte_gpu_free's
> comment just says: deallocate a chunk of memory allocated with rte_gpu_malloc*
> 
> Looks like the rte_gpu_free can handle this case ?

This is the proposal, yes.

> And from the definition "rte_gpu_free(uint16_t gpu_id, void *ptr)", the
> free needs to check whether this memory belong to the GPU or not, so it
> also can recognize the memory type, I think.

Yes that's the idea behind having a single free function.
We could have some metadata in front of the memory chunk.
My question is to confirm whether it is a good design or not,
and whether it should be driver specific or have a common struct in the lib.

Opinions?
Thomas Monjalon June 4, 2021, 2:09 p.m. UTC | #35
04/06/2021 15:59, Andrew Rybchenko:
> On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > 04/06/2021 15:05, Andrew Rybchenko:
> >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> >>> 04/06/2021 13:09, Jerin Jacob:
> >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >>>>> 03/06/2021 11:33, Ferruh Yigit:
> >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> >>>>>>>
> >>>>>>> Since this device does not have a queue etc? Shouldn't make it a
> >>>>>>> library like mempool with vendor-defined ops?
> >>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>> Current RFC announces additional memory allocation capabilities, which can suits
> >>>>>> better as extension to existing memory related library instead of a new device
> >>>>>> abstraction library.
> >>>>>
> >>>>> It is not replacing mempool.
> >>>>> It is more at the same level as EAL memory management:
> >>>>> allocate simple buffer, but with the exception it is done
> >>>>> on a specific device, so it requires a device ID.
> >>>>>
> >>>>> The other reason it needs to be a full library is that
> >>>>> it will start a workload on the GPU and get completion notification
> >>>>> so we can integrate the GPU workload in a packet processing pipeline.
> >>>>
> >>>> I might have confused you. My intention is not to make to fit under mempool API.
> >>>>
> >>>> I agree that we need a separate library for this. My objection is only
> >>>> to not call libgpudev and
> >>>> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
> >>>> it not like existing "device libraries" in DPDK and
> >>>> it like other "libraries" in DPDK.
> >>>
> >>> I think we should define a queue of processing actions,
> >>> so it looks like other device libraries.
> >>> And anyway I think a library managing a device class,
> >>> and having some device drivers deserves the name of device library.
> >>>
> >>> I would like to read more opinions.
> >>
> >> Since the library is an unified interface to GPU device drivers
> >> I think it should be named as in the patch - gpudev.
> >>
> >> Mempool looks like an exception here - initially it was pure SW
> >> library, but not there are HW backends and corresponding device
> >> drivers.
> >>
> >> What I don't understand where is GPU specifics here?
> > 
> > That's an interesting question.
> > Let's ask first what is a GPU for DPDK?
> > I think it is like a sub-CPU with high parallel execution capabilities,
> > and it is controlled by the CPU.
> 
> I have no good ideas how to name it in accordance with
> above description to avoid "G" which for "Graphics" if
> understand correctly. However, may be it is not required.
> No strong opinion on the topic, but unbinding from
> "Graphics" would be nice.

That's a question I ask myself for months now.
I am not able to find a better name,
and I start thinking that "GPU" is famous enough in high-load computing
to convey the idea of what we can expect.
Jerin Jacob June 4, 2021, 3:05 p.m. UTC | #36
On Fri, Jun 4, 2021 at 6:25 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/06/2021 13:38, Jerin Jacob:
> > On Thu, Jun 3, 2021 at 4:00 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 03/06/2021 12:04, Jerin Jacob:
> > > > On Thu, Jun 3, 2021 at 3:06 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > 03/06/2021 11:20, Jerin Jacob:
> > > > > > The device needs have a queue kind of structure
> > > > > > and it is mapping to core to have a notion of configure. queue_setup,
> > > > > > start and stop etc
> > > > >
> > > > > Why is it a requirement to call it a device API?
> > > >
> > > > Then we need to define what needs to call as device library vs library and how?
> > > > Why mempool is not called a  device library vs library?
> > >
> > > My view is simple:
> > > if it has drivers, it is a device API, except bus and mempool libs.
> >
> > rte_secuity has drivers but it is not called a device library.
>
> rte_security is a monster beast :)
> Yes it has rte_security_ops implemented in net and crypto drivers,
> but it is an API extension only, there is no driver dedicated to security.
>
> > > About mempool, it started as a standard lib and got extended for HW support.
> >
> > Yes. We did not change to device library as it was fundamentally
> > different than another DPDK deices
> > when we added the device support.
> >
> > > > and why all
> > > > other device library has a common structure like queues and
> > > > it binding core etc. I tried to explain above the similar attributes
> > > > for dpdk device libraries[1] which I think, it a requirement so
> > > > that the end user will have familiarity with device libraries rather
> > > > than each one has separate General guidelines and principles.
> > > >
> > > > I think, it is more TB discussion topic and decides on this because I
> > > > don't see in technical issue in calling it a library.
> > >
> > > The naming is just a choice.
> >
> > Not sure.
> >
> > > Yesterday morning it was called lib/gpu/
> > > and in the evening it was renamed lib/gpudev/
> > > so no technical issue :)
> > >
> > > But the design of the API with queues or other paradigm
> > > is something I would like to discuss here.
> >
> > Yeah, That is important. IMO, That defines what needs to be a device library.
> >
> > > Note: there was no intent to publish GPU processing control
> > > in DPDK 21.08. We want to focus on GPU memory in 21.08,
> > > but I understand it is a key decision in the big picture.
> >
> > if the scope is only memory allocation, IMO, it is better to make a library.
>
> No it is only the first step.
>
> > > What would be your need and would you design such API?
> >
> > For me, there is no need for gpu library(as of now). May GPU consumers
> > can define what they need to control using the library.
>
> We need to integrate GPU processing workload in the DPDK workflow
> as a generic API.
> There could be 2 modes:
>         - queue of tasks
>         - tasks in an infinite loop
> In both modes, we could get completion notifications
> with an interrupt/callback or by polling a shared memory.


OK. If we have enqeue/dequeue kind operation and with queue model then it
makes sense to have a device model. It was not there in your initial
patch, but if we are adding
in the future then it OK.



>
>
>
Jerin Jacob June 4, 2021, 3:20 p.m. UTC | #37
On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 04/06/2021 15:59, Andrew Rybchenko:
> > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > 04/06/2021 15:05, Andrew Rybchenko:
> > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > >>> 04/06/2021 13:09, Jerin Jacob:
> > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > >>>>>>>
> > >>>>>>> Since this device does not have a queue etc? Shouldn't make it a
> > >>>>>>> library like mempool with vendor-defined ops?
> > >>>>>>
> > >>>>>> +1
> > >>>>>>
> > >>>>>> Current RFC announces additional memory allocation capabilities, which can suits
> > >>>>>> better as extension to existing memory related library instead of a new device
> > >>>>>> abstraction library.
> > >>>>>
> > >>>>> It is not replacing mempool.
> > >>>>> It is more at the same level as EAL memory management:
> > >>>>> allocate simple buffer, but with the exception it is done
> > >>>>> on a specific device, so it requires a device ID.
> > >>>>>
> > >>>>> The other reason it needs to be a full library is that
> > >>>>> it will start a workload on the GPU and get completion notification
> > >>>>> so we can integrate the GPU workload in a packet processing pipeline.
> > >>>>
> > >>>> I might have confused you. My intention is not to make to fit under mempool API.
> > >>>>
> > >>>> I agree that we need a separate library for this. My objection is only
> > >>>> to not call libgpudev and
> > >>>> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
> > >>>> it not like existing "device libraries" in DPDK and
> > >>>> it like other "libraries" in DPDK.
> > >>>
> > >>> I think we should define a queue of processing actions,
> > >>> so it looks like other device libraries.
> > >>> And anyway I think a library managing a device class,
> > >>> and having some device drivers deserves the name of device library.
> > >>>
> > >>> I would like to read more opinions.
> > >>
> > >> Since the library is an unified interface to GPU device drivers
> > >> I think it should be named as in the patch - gpudev.
> > >>
> > >> Mempool looks like an exception here - initially it was pure SW
> > >> library, but not there are HW backends and corresponding device
> > >> drivers.
> > >>
> > >> What I don't understand where is GPU specifics here?
> > >
> > > That's an interesting question.
> > > Let's ask first what is a GPU for DPDK?
> > > I think it is like a sub-CPU with high parallel execution capabilities,
> > > and it is controlled by the CPU.
> >
> > I have no good ideas how to name it in accordance with
> > above description to avoid "G" which for "Graphics" if
> > understand correctly. However, may be it is not required.
> > No strong opinion on the topic, but unbinding from
> > "Graphics" would be nice.
>
> That's a question I ask myself for months now.
> I am not able to find a better name,
> and I start thinking that "GPU" is famous enough in high-load computing
> to convey the idea of what we can expect.


The closest I can think of is big-little architecture in ARM SoC.
https://www.arm.com/why-arm/technologies/big-little

We do have similar architecture, Where the "coprocessor" is part of
the main CPU.
It is operations are:
- Download firmware
- Memory mapping for Main CPU memory by the co-processor
- Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.

If your scope is something similar and No Graphics involved here then
we can remove G.

Coincidentally, Yesterday, I had an interaction with Elena for the
same for BaseBand related work in ORAN where
GPU used as Baseband processing instead of Graphics.(So I can
understand the big picture of this library)

I can think of "coprocessor-dev" as one of the name. We do have
similar machine learning co-processors(for compute)
if we can keep a generic name and it is for the above functions we may
use this subsystem as well in the future.










>
>
>
Thomas Monjalon June 4, 2021, 3:51 p.m. UTC | #38
04/06/2021 17:20, Jerin Jacob:
> On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 04/06/2021 15:59, Andrew Rybchenko:
> > > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > > 04/06/2021 15:05, Andrew Rybchenko:
> > > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > > >>> 04/06/2021 13:09, Jerin Jacob:
> > > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > > >>>>>>>
> > > >>>>>>> Since this device does not have a queue etc? Shouldn't make it a
> > > >>>>>>> library like mempool with vendor-defined ops?
> > > >>>>>>
> > > >>>>>> +1
> > > >>>>>>
> > > >>>>>> Current RFC announces additional memory allocation capabilities, which can suits
> > > >>>>>> better as extension to existing memory related library instead of a new device
> > > >>>>>> abstraction library.
> > > >>>>>
> > > >>>>> It is not replacing mempool.
> > > >>>>> It is more at the same level as EAL memory management:
> > > >>>>> allocate simple buffer, but with the exception it is done
> > > >>>>> on a specific device, so it requires a device ID.
> > > >>>>>
> > > >>>>> The other reason it needs to be a full library is that
> > > >>>>> it will start a workload on the GPU and get completion notification
> > > >>>>> so we can integrate the GPU workload in a packet processing pipeline.
> > > >>>>
> > > >>>> I might have confused you. My intention is not to make to fit under mempool API.
> > > >>>>
> > > >>>> I agree that we need a separate library for this. My objection is only
> > > >>>> to not call libgpudev and
> > > >>>> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
> > > >>>> it not like existing "device libraries" in DPDK and
> > > >>>> it like other "libraries" in DPDK.
> > > >>>
> > > >>> I think we should define a queue of processing actions,
> > > >>> so it looks like other device libraries.
> > > >>> And anyway I think a library managing a device class,
> > > >>> and having some device drivers deserves the name of device library.
> > > >>>
> > > >>> I would like to read more opinions.
> > > >>
> > > >> Since the library is an unified interface to GPU device drivers
> > > >> I think it should be named as in the patch - gpudev.
> > > >>
> > > >> Mempool looks like an exception here - initially it was pure SW
> > > >> library, but not there are HW backends and corresponding device
> > > >> drivers.
> > > >>
> > > >> What I don't understand where is GPU specifics here?
> > > >
> > > > That's an interesting question.
> > > > Let's ask first what is a GPU for DPDK?
> > > > I think it is like a sub-CPU with high parallel execution capabilities,
> > > > and it is controlled by the CPU.
> > >
> > > I have no good ideas how to name it in accordance with
> > > above description to avoid "G" which for "Graphics" if
> > > understand correctly. However, may be it is not required.
> > > No strong opinion on the topic, but unbinding from
> > > "Graphics" would be nice.
> >
> > That's a question I ask myself for months now.
> > I am not able to find a better name,
> > and I start thinking that "GPU" is famous enough in high-load computing
> > to convey the idea of what we can expect.
> 
> 
> The closest I can think of is big-little architecture in ARM SoC.
> https://www.arm.com/why-arm/technologies/big-little
> 
> We do have similar architecture, Where the "coprocessor" is part of
> the main CPU.
> It is operations are:
> - Download firmware
> - Memory mapping for Main CPU memory by the co-processor
> - Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.

Yes it looks like the exact same scope.
I like the word "co-processor" in this context.

> If your scope is something similar and No Graphics involved here then
> we can remove G.

Indeed no graphics in DPDK :)
By removing the G, you mean keeping only PU? like "pudev"?
We could also define the G as "General".

> Coincidentally, Yesterday, I had an interaction with Elena for the
> same for BaseBand related work in ORAN where
> GPU used as Baseband processing instead of Graphics.(So I can
> understand the big picture of this library)

Yes baseband processing is one possible usage of GPU with DPDK.
We could also imagine some security analysis, or any machine learning...

> I can think of "coprocessor-dev" as one of the name.

"coprocessor" looks too long as prefix of the functions.

> We do have similar machine learning co-processors(for compute)
> if we can keep a generic name and it is for the above functions we may
> use this subsystem as well in the future.

Yes that's the idea to share a common synchronization mechanism
with different HW.

That's cool to have such a big interest in the community for this patch.
Wang, Haiyue June 4, 2021, 6:04 p.m. UTC | #39
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, June 4, 2021 22:06
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: dev@dpdk.org; Elena Agostini <eagostini@nvidia.com>; andrew.rybchenko@oktetlabs.ru; Yigit, Ferruh
> <ferruh.yigit@intel.com>; jerinj@marvell.com
> Subject: Re: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> 04/06/2021 15:25, Wang, Haiyue:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > Another question is about the function rte_gpu_free().
> > > How do we recognize that a memory chunk is from the CPU and GPU visible,
> > > or just from GPU?
> > >
> >
> > I didn't find the rte_gpu_free_visible definition, and the rte_gpu_free's
> > comment just says: deallocate a chunk of memory allocated with rte_gpu_malloc*
> >
> > Looks like the rte_gpu_free can handle this case ?
> 
> This is the proposal, yes.
> 
> > And from the definition "rte_gpu_free(uint16_t gpu_id, void *ptr)", the
> > free needs to check whether this memory belong to the GPU or not, so it
> > also can recognize the memory type, I think.
> 
> Yes that's the idea behind having a single free function.
> We could have some metadata in front of the memory chunk.
> My question is to confirm whether it is a good design or not,
> and whether it should be driver specific or have a common struct in the lib.
> 
> Opinions?
> 

Make the GPU memory to be registered into the common lib API with the metadata
like address, size etc, and also some GPU specific callbacks like to handle how
to make GPU memory visible to CPU ?

And the memory register can be like the exist external memory function:

int
rte_extmem_register(void *va_addr, size_t len, rte_iova_t iova_addrs[],
		unsigned int n_pages, size_t page_sz)
Wang, Haiyue June 4, 2021, 6:20 p.m. UTC | #40
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> Sent: Friday, June 4, 2021 23:51
> To: Jerin Jacob <jerinjacobk@gmail.com>
> Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; Yigit, Ferruh <ferruh.yigit@intel.com>; dpdk-dev <dev@dpdk.org>;
> Elena Agostini <eagostini@nvidia.com>; David Marchand <david.marchand@redhat.com>
> Subject: Re: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> 04/06/2021 17:20, Jerin Jacob:
> > On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 04/06/2021 15:59, Andrew Rybchenko:
> > > > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > > > 04/06/2021 15:05, Andrew Rybchenko:
> > > > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > > > >>> 04/06/2021 13:09, Jerin Jacob:
> > > > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > > > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > > > >>>>>>>
> > > > >>>>>>> Since this device does not have a queue etc? Shouldn't make it a
> > > > >>>>>>> library like mempool with vendor-defined ops?
> > > > >>>>>>
> > > > >>>>>> +1
> > > > >>>>>>
> > > > >>>>>> Current RFC announces additional memory allocation capabilities, which can suits
> > > > >>>>>> better as extension to existing memory related library instead of a new device
> > > > >>>>>> abstraction library.
> > > > >>>>>
> > > > >>>>> It is not replacing mempool.
> > > > >>>>> It is more at the same level as EAL memory management:
> > > > >>>>> allocate simple buffer, but with the exception it is done
> > > > >>>>> on a specific device, so it requires a device ID.
> > > > >>>>>
> > > > >>>>> The other reason it needs to be a full library is that
> > > > >>>>> it will start a workload on the GPU and get completion notification
> > > > >>>>> so we can integrate the GPU workload in a packet processing pipeline.
> > > > >>>>
> > > > >>>> I might have confused you. My intention is not to make to fit under mempool API.
> > > > >>>>
> > > > >>>> I agree that we need a separate library for this. My objection is only
> > > > >>>> to not call libgpudev and
> > > > >>>> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
> > > > >>>> it not like existing "device libraries" in DPDK and
> > > > >>>> it like other "libraries" in DPDK.
> > > > >>>
> > > > >>> I think we should define a queue of processing actions,
> > > > >>> so it looks like other device libraries.
> > > > >>> And anyway I think a library managing a device class,
> > > > >>> and having some device drivers deserves the name of device library.
> > > > >>>
> > > > >>> I would like to read more opinions.
> > > > >>
> > > > >> Since the library is an unified interface to GPU device drivers
> > > > >> I think it should be named as in the patch - gpudev.
> > > > >>
> > > > >> Mempool looks like an exception here - initially it was pure SW
> > > > >> library, but not there are HW backends and corresponding device
> > > > >> drivers.
> > > > >>
> > > > >> What I don't understand where is GPU specifics here?
> > > > >
> > > > > That's an interesting question.
> > > > > Let's ask first what is a GPU for DPDK?
> > > > > I think it is like a sub-CPU with high parallel execution capabilities,
> > > > > and it is controlled by the CPU.
> > > >
> > > > I have no good ideas how to name it in accordance with
> > > > above description to avoid "G" which for "Graphics" if
> > > > understand correctly. However, may be it is not required.
> > > > No strong opinion on the topic, but unbinding from
> > > > "Graphics" would be nice.
> > >
> > > That's a question I ask myself for months now.
> > > I am not able to find a better name,
> > > and I start thinking that "GPU" is famous enough in high-load computing
> > > to convey the idea of what we can expect.
> >
> >
> > The closest I can think of is big-little architecture in ARM SoC.
> > https://www.arm.com/why-arm/technologies/big-little
> >
> > We do have similar architecture, Where the "coprocessor" is part of
> > the main CPU.
> > It is operations are:
> > - Download firmware
> > - Memory mapping for Main CPU memory by the co-processor
> > - Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.
> 
> Yes it looks like the exact same scope.
> I like the word "co-processor" in this context.
> 
> > If your scope is something similar and No Graphics involved here then
> > we can remove G.
> 
> Indeed no graphics in DPDK :)
> By removing the G, you mean keeping only PU? like "pudev"?
> We could also define the G as "General".
> 
> > Coincidentally, Yesterday, I had an interaction with Elena for the
> > same for BaseBand related work in ORAN where
> > GPU used as Baseband processing instead of Graphics.(So I can
> > understand the big picture of this library)
> 
> Yes baseband processing is one possible usage of GPU with DPDK.
> We could also imagine some security analysis, or any machine learning...
> 
> > I can think of "coprocessor-dev" as one of the name.
> 
> "coprocessor" looks too long as prefix of the functions.
> 
> > We do have similar machine learning co-processors(for compute)
> > if we can keep a generic name and it is for the above functions we may
> > use this subsystem as well in the future.
> 

Accelerator, 'acce_dev' ? ;-)

> Yes that's the idea to share a common synchronization mechanism
> with different HW.
> 
> That's cool to have such a big interest in the community for this patch.
>
Jerin Jacob June 5, 2021, 5:09 a.m. UTC | #41
On Fri, Jun 4, 2021 at 11:50 PM Wang, Haiyue <haiyue.wang@intel.com> wrote:
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> > Sent: Friday, June 4, 2021 23:51
> > To: Jerin Jacob <jerinjacobk@gmail.com>
> > Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Andrew Rybchenko
> > <andrew.rybchenko@oktetlabs.ru>; Yigit, Ferruh <ferruh.yigit@intel.com>; dpdk-dev <dev@dpdk.org>;
> > Elena Agostini <eagostini@nvidia.com>; David Marchand <david.marchand@redhat.com>
> > Subject: Re: [dpdk-dev] [PATCH] gpudev: introduce memory API
> >
> > 04/06/2021 17:20, Jerin Jacob:
> > > On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 04/06/2021 15:59, Andrew Rybchenko:
> > > > > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > > > > 04/06/2021 15:05, Andrew Rybchenko:
> > > > > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > > > > >>> 04/06/2021 13:09, Jerin Jacob:
> > > > > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > > > > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > > > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > > > > >>>>>>>
> > > > > >>>>>>> Since this device does not have a queue etc? Shouldn't make it a
> > > > > >>>>>>> library like mempool with vendor-defined ops?
> > > > > >>>>>>
> > > > > >>>>>> +1
> > > > > >>>>>>
> > > > > >>>>>> Current RFC announces additional memory allocation capabilities, which can suits
> > > > > >>>>>> better as extension to existing memory related library instead of a new device
> > > > > >>>>>> abstraction library.
> > > > > >>>>>
> > > > > >>>>> It is not replacing mempool.
> > > > > >>>>> It is more at the same level as EAL memory management:
> > > > > >>>>> allocate simple buffer, but with the exception it is done
> > > > > >>>>> on a specific device, so it requires a device ID.
> > > > > >>>>>
> > > > > >>>>> The other reason it needs to be a full library is that
> > > > > >>>>> it will start a workload on the GPU and get completion notification
> > > > > >>>>> so we can integrate the GPU workload in a packet processing pipeline.
> > > > > >>>>
> > > > > >>>> I might have confused you. My intention is not to make to fit under mempool API.
> > > > > >>>>
> > > > > >>>> I agree that we need a separate library for this. My objection is only
> > > > > >>>> to not call libgpudev and
> > > > > >>>> call it libgpu. And have APIs with rte_gpu_ instead of rte_gpu_dev as
> > > > > >>>> it not like existing "device libraries" in DPDK and
> > > > > >>>> it like other "libraries" in DPDK.
> > > > > >>>
> > > > > >>> I think we should define a queue of processing actions,
> > > > > >>> so it looks like other device libraries.
> > > > > >>> And anyway I think a library managing a device class,
> > > > > >>> and having some device drivers deserves the name of device library.
> > > > > >>>
> > > > > >>> I would like to read more opinions.
> > > > > >>
> > > > > >> Since the library is an unified interface to GPU device drivers
> > > > > >> I think it should be named as in the patch - gpudev.
> > > > > >>
> > > > > >> Mempool looks like an exception here - initially it was pure SW
> > > > > >> library, but not there are HW backends and corresponding device
> > > > > >> drivers.
> > > > > >>
> > > > > >> What I don't understand where is GPU specifics here?
> > > > > >
> > > > > > That's an interesting question.
> > > > > > Let's ask first what is a GPU for DPDK?
> > > > > > I think it is like a sub-CPU with high parallel execution capabilities,
> > > > > > and it is controlled by the CPU.
> > > > >
> > > > > I have no good ideas how to name it in accordance with
> > > > > above description to avoid "G" which for "Graphics" if
> > > > > understand correctly. However, may be it is not required.
> > > > > No strong opinion on the topic, but unbinding from
> > > > > "Graphics" would be nice.
> > > >
> > > > That's a question I ask myself for months now.
> > > > I am not able to find a better name,
> > > > and I start thinking that "GPU" is famous enough in high-load computing
> > > > to convey the idea of what we can expect.
> > >
> > >
> > > The closest I can think of is big-little architecture in ARM SoC.
> > > https://www.arm.com/why-arm/technologies/big-little
> > >
> > > We do have similar architecture, Where the "coprocessor" is part of
> > > the main CPU.
> > > It is operations are:
> > > - Download firmware
> > > - Memory mapping for Main CPU memory by the co-processor
> > > - Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.
> >
> > Yes it looks like the exact same scope.
> > I like the word "co-processor" in this context.
> >
> > > If your scope is something similar and No Graphics involved here then
> > > we can remove G.
> >
> > Indeed no graphics in DPDK :)
> > By removing the G, you mean keeping only PU? like "pudev"?
> > We could also define the G as "General".
> >
> > > Coincidentally, Yesterday, I had an interaction with Elena for the
> > > same for BaseBand related work in ORAN where
> > > GPU used as Baseband processing instead of Graphics.(So I can
> > > understand the big picture of this library)
> >
> > Yes baseband processing is one possible usage of GPU with DPDK.
> > We could also imagine some security analysis, or any machine learning...
> >
> > > I can think of "coprocessor-dev" as one of the name.
> >
> > "coprocessor" looks too long as prefix of the functions.

Yes. Libray name can be lengthy, but API prefix should be 3 letters
kind short form will be required.


> >
> > > We do have similar machine learning co-processors(for compute)
> > > if we can keep a generic name and it is for the above functions we may
> > > use this subsystem as well in the future.
> >
>
> Accelerator, 'acce_dev' ? ;-)

It may get confused with HW accelerators.


Some of the options I can think of. Sorting in my preference.

library name, API prefix
1) libhpc-dev, rte_hpc_ (hpc-> Heterogeneous processor compute)
2) libhc-dev, rte_hc_
(https://en.wikipedia.org/wiki/Heterogeneous_computing see: Example
hardware)
3) libpu-dev, rte_pu_ (pu -> processing unit)
4) libhp-dev, rte_hp_ (hp->heterogeneous processor)
5) libcoprocessor-dev, rte_cps_ ?
6) libcompute-dev, rte_cpt_ ?
7) libgpu-dev, rte_gpu_




>
> > Yes that's the idea to share a common synchronization mechanism
> > with different HW.
> >
> > That's cool to have such a big interest in the community for this patch.
> >
>
Thomas Monjalon June 5, 2021, 7:49 a.m. UTC | #42
04/06/2021 20:04, Wang, Haiyue:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 04/06/2021 15:25, Wang, Haiyue:
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > Another question is about the function rte_gpu_free().
> > > > How do we recognize that a memory chunk is from the CPU and GPU visible,
> > > > or just from GPU?
> > > >
> > >
> > > I didn't find the rte_gpu_free_visible definition, and the rte_gpu_free's
> > > comment just says: deallocate a chunk of memory allocated with rte_gpu_malloc*
> > >
> > > Looks like the rte_gpu_free can handle this case ?
> > 
> > This is the proposal, yes.
> > 
> > > And from the definition "rte_gpu_free(uint16_t gpu_id, void *ptr)", the
> > > free needs to check whether this memory belong to the GPU or not, so it
> > > also can recognize the memory type, I think.
> > 
> > Yes that's the idea behind having a single free function.
> > We could have some metadata in front of the memory chunk.
> > My question is to confirm whether it is a good design or not,
> > and whether it should be driver specific or have a common struct in the lib.
> > 
> > Opinions?
> > 
> 
> Make the GPU memory to be registered into the common lib API with the metadata
> like address, size etc, and also some GPU specific callbacks like to handle how
> to make GPU memory visible to CPU ?
> 
> And the memory register can be like the exist external memory function:
> 
> int
> rte_extmem_register(void *va_addr, size_t len, rte_iova_t iova_addrs[],
> 		unsigned int n_pages, size_t page_sz)

How do you specify the device ID?
I may have missed something.
Wang, Haiyue June 5, 2021, 11:09 a.m. UTC | #43
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Saturday, June 5, 2021 15:49
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: dev@dpdk.org; Elena Agostini <eagostini@nvidia.com>; andrew.rybchenko@oktetlabs.ru; Yigit, Ferruh
> <ferruh.yigit@intel.com>; jerinj@marvell.com
> Subject: Re: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> 04/06/2021 20:04, Wang, Haiyue:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 04/06/2021 15:25, Wang, Haiyue:
> > > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > > Another question is about the function rte_gpu_free().
> > > > > How do we recognize that a memory chunk is from the CPU and GPU visible,
> > > > > or just from GPU?
> > > > >
> > > >
> > > > I didn't find the rte_gpu_free_visible definition, and the rte_gpu_free's
> > > > comment just says: deallocate a chunk of memory allocated with rte_gpu_malloc*
> > > >
> > > > Looks like the rte_gpu_free can handle this case ?
> > >
> > > This is the proposal, yes.
> > >
> > > > And from the definition "rte_gpu_free(uint16_t gpu_id, void *ptr)", the
> > > > free needs to check whether this memory belong to the GPU or not, so it
> > > > also can recognize the memory type, I think.
> > >
> > > Yes that's the idea behind having a single free function.
> > > We could have some metadata in front of the memory chunk.
> > > My question is to confirm whether it is a good design or not,
> > > and whether it should be driver specific or have a common struct in the lib.
> > >
> > > Opinions?
> > >
> >
> > Make the GPU memory to be registered into the common lib API with the metadata
> > like address, size etc, and also some GPU specific callbacks like to handle how
> > to make GPU memory visible to CPU ?
> >
> > And the memory register can be like the exist external memory function:
> >
> > int
> > rte_extmem_register(void *va_addr, size_t len, rte_iova_t iova_addrs[],
> > 		unsigned int n_pages, size_t page_sz)
> 
> How do you specify the device ID

I mean that take the current external memory register as an example, it is not
a real proto-type.

The GPU memory management library can provide the this kind of API for GPU driver
to register its memory at probe time or start time ?  

> I may have missed something.
> 



>
Honnappa Nagarahalli June 6, 2021, 1:10 a.m. UTC | #44
<snip>

> 
> From: Elena Agostini <eagostini@nvidia.com>
> 
> The new library gpudev is for dealing with GPU from a DPDK application in a
> vendor-agnostic way.
It would be good to explain how the application using GPU+DPDK would look like.

Which parts of the workload need DPDK's support?

Any requirements on co-existence of GPU with other accelerators?

> 
> As a first step, the features are focused on memory management.
> A function allows to allocate memory inside the GPU, while another one
> allows to use main (CPU) memory from the GPU.
Is this memory for packet buffers or something else?

> 
> The infrastructure is prepared to welcome drivers in drivers/gpu/ as the
> upcoming NVIDIA one, implementing the gpudev API.
> Other additions planned for next revisions:
>   - C implementation file
>   - guide documentation
>   - unit tests
>   - integration in testpmd to enable Rx/Tx to/from GPU memory.
> 
> The next step should focus on GPU processing task control.
> 
> Signed-off-by: Elena Agostini <eagostini@nvidia.com>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  .gitignore                           |   1 +
>  MAINTAINERS                          |   6 +
>  doc/api/doxy-api-index.md            |   1 +
>  doc/api/doxy-api.conf.in             |   1 +
>  doc/guides/conf.py                   |   8 ++
>  doc/guides/gpus/features/default.ini |  13 ++
>  doc/guides/gpus/index.rst            |  11 ++
>  doc/guides/gpus/overview.rst         |   7 +
>  doc/guides/index.rst                 |   1 +
>  doc/guides/prog_guide/gpu.rst        |   5 +
>  doc/guides/prog_guide/index.rst      |   1 +
>  drivers/gpu/meson.build              |   4 +
>  drivers/meson.build                  |   1 +
>  lib/gpudev/gpu_driver.h              |  44 +++++++
>  lib/gpudev/meson.build               |   9 ++
>  lib/gpudev/rte_gpudev.h              | 183 +++++++++++++++++++++++++++
>  lib/gpudev/version.map               |  11 ++
>  lib/meson.build                      |   1 +
>  18 files changed, 308 insertions(+)
>  create mode 100644 doc/guides/gpus/features/default.ini
>  create mode 100644 doc/guides/gpus/index.rst  create mode 100644
> doc/guides/gpus/overview.rst  create mode 100644
> doc/guides/prog_guide/gpu.rst  create mode 100644
> drivers/gpu/meson.build  create mode 100644 lib/gpudev/gpu_driver.h
> create mode 100644 lib/gpudev/meson.build  create mode 100644
> lib/gpudev/rte_gpudev.h  create mode 100644 lib/gpudev/version.map
> 
> diff --git a/.gitignore b/.gitignore
> index b19c0717e6..49494e0c6c 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -14,6 +14,7 @@ doc/guides/compressdevs/overview_feature_table.txt
>  doc/guides/regexdevs/overview_feature_table.txt
>  doc/guides/vdpadevs/overview_feature_table.txt
>  doc/guides/bbdevs/overview_feature_table.txt
> +doc/guides/gpus/overview_feature_table.txt
> 
>  # ignore generated ctags/cscope files
>  cscope.out.po
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5877a16971..c4755dfe9a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -452,6 +452,12 @@ F: app/test-regex/
>  F: doc/guides/prog_guide/regexdev.rst
>  F: doc/guides/regexdevs/features/default.ini
> 
> +GPU API - EXPERIMENTAL
> +M: Elena Agostini <eagostini@nvidia.com>
> +F: lib/gpudev/
> +F: doc/guides/prog_guide/gpu.rst
> +F: doc/guides/gpus/features/default.ini
> +
>  Eventdev API
>  M: Jerin Jacob <jerinj@marvell.com>
>  T: git://dpdk.org/next/dpdk-next-eventdev
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index
> 1992107a03..bd10342ca2 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -21,6 +21,7 @@ The public API headers are grouped by topics:
>    [compressdev]        (@ref rte_compressdev.h),
>    [compress]           (@ref rte_comp.h),
>    [regexdev]           (@ref rte_regexdev.h),
> +  [gpudev]             (@ref rte_gpudev.h),
>    [eventdev]           (@ref rte_eventdev.h),
>    [event_eth_rx_adapter]   (@ref rte_event_eth_rx_adapter.h),
>    [event_eth_tx_adapter]   (@ref rte_event_eth_tx_adapter.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index
> 325a0195c6..831b9a6b33 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -40,6 +40,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-
> index.md \
>                            @TOPDIR@/lib/eventdev \
>                            @TOPDIR@/lib/fib \
>                            @TOPDIR@/lib/flow_classify \
> +                          @TOPDIR@/lib/gpudev \
>                            @TOPDIR@/lib/graph \
>                            @TOPDIR@/lib/gro \
>                            @TOPDIR@/lib/gso \ diff --git a/doc/guides/conf.py
> b/doc/guides/conf.py index 67d2dd62c7..7930da9ceb 100644
> --- a/doc/guides/conf.py
> +++ b/doc/guides/conf.py
> @@ -152,6 +152,9 @@ def generate_overview_table(output_filename,
> table_id, section, table_name, titl
>          name = ini_filename[:-4]
>          name = name.replace('_vf', 'vf')
>          pmd_names.append(name)
> +    if not pmd_names:
> +        # Add an empty column if table is empty (required by RST syntax)
> +        pmd_names.append(' ')
> 
>      # Pad the table header names.
>      max_header_len = len(max(pmd_names, key=len)) @@ -388,6 +391,11
> @@ def setup(app):
>                              'Features',
>                              'Features availability in bbdev drivers',
>                              'Feature')
> +    table_file = dirname(__file__) + '/gpus/overview_feature_table.txt'
> +    generate_overview_table(table_file, 1,
> +                            'Features',
> +                            'Features availability in GPU drivers',
> +                            'Feature')
> 
>      if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
>          print('Upgrade sphinx to version >= 1.3.1 for '
> diff --git a/doc/guides/gpus/features/default.ini
> b/doc/guides/gpus/features/default.ini
> new file mode 100644
> index 0000000000..c363447b0d
> --- /dev/null
> +++ b/doc/guides/gpus/features/default.ini
> @@ -0,0 +1,13 @@
> +;
> +; Features of a GPU driver.
> +;
> +; This file defines the features that are valid for inclusion in ; the
> +other driver files and also the order that they appear in ; the
> +features table in the documentation. The feature description ; string
> +should not exceed feature_str_len defined in conf.py.
> +;
> +[Features]
> +Get device info                =
> +Share CPU memory with GPU      =
> +Allocate GPU memory            =
> +Free memory                    =
> diff --git a/doc/guides/gpus/index.rst b/doc/guides/gpus/index.rst new file
> mode 100644 index 0000000000..f9c62aeb36
> --- /dev/null
> +++ b/doc/guides/gpus/index.rst
> @@ -0,0 +1,11 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +GPU Drivers
> +===========
> +
> +.. toctree::
> +   :maxdepth: 2
> +   :numbered:
> +
> +   overview
> diff --git a/doc/guides/gpus/overview.rst b/doc/guides/gpus/overview.rst
> new file mode 100644 index 0000000000..e7f985e98b
> --- /dev/null
> +++ b/doc/guides/gpus/overview.rst
> @@ -0,0 +1,7 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +Overview of GPU Drivers
> +=======================
> +
> +.. include:: overview_feature_table.txt
> diff --git a/doc/guides/index.rst b/doc/guides/index.rst index
> 857f0363d3..ee4d79a4eb 100644
> --- a/doc/guides/index.rst
> +++ b/doc/guides/index.rst
> @@ -21,6 +21,7 @@ DPDK documentation
>     compressdevs/index
>     vdpadevs/index
>     regexdevs/index
> +   gpus/index
>     eventdevs/index
>     rawdevs/index
>     mempool/index
> diff --git a/doc/guides/prog_guide/gpu.rst b/doc/guides/prog_guide/gpu.rst
> new file mode 100644 index 0000000000..54f9fa8300
> --- /dev/null
> +++ b/doc/guides/prog_guide/gpu.rst
> @@ -0,0 +1,5 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright 2021 NVIDIA Corporation & Affiliates
> +
> +GPU Library
> +===========
> diff --git a/doc/guides/prog_guide/index.rst
> b/doc/guides/prog_guide/index.rst index 2dce507f46..dfddf90b51 100644
> --- a/doc/guides/prog_guide/index.rst
> +++ b/doc/guides/prog_guide/index.rst
> @@ -27,6 +27,7 @@ Programmer's Guide
>      cryptodev_lib
>      compressdev
>      regexdev
> +    gpu
>      rte_security
>      rawdev
>      link_bonding_poll_mode_drv_lib
> diff --git a/drivers/gpu/meson.build b/drivers/gpu/meson.build new file mode
> 100644 index 0000000000..5189950616
> --- /dev/null
> +++ b/drivers/gpu/meson.build
> @@ -0,0 +1,4 @@
> +# SPDX-License-Identifier: BSD-3-Clause # Copyright 2021 NVIDIA
> +Corporation & Affiliates
> +
> +drivers = []
> diff --git a/drivers/meson.build b/drivers/meson.build index
> bc6f4f567f..f607040d79 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -18,6 +18,7 @@ subdirs = [
>          'vdpa',           # depends on common, bus and mempool.
>          'event',          # depends on common, bus, mempool and net.
>          'baseband',       # depends on common and bus.
> +        'gpu',            # depends on common and bus.
>  ]
> 
>  if meson.is_cross_build()
> diff --git a/lib/gpudev/gpu_driver.h b/lib/gpudev/gpu_driver.h new file mode
> 100644 index 0000000000..5ff609e49d
> --- /dev/null
> +++ b/lib/gpudev/gpu_driver.h
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 NVIDIA Corporation & Affiliates  */
> +
> +#ifndef GPU_DRIVER_H
> +#define GPU_DRIVER_H
> +
> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +
> +#include "rte_gpudev.h"
> +
> +struct rte_gpu_dev;
> +
> +typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void
> +**ptr); typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
> +
> +struct rte_gpu_dev {
> +	/* Backing device. */
> +	struct rte_device *device;
> +	/* GPU info structure. */
> +	struct rte_gpu_info info;
> +	/* Counter of processes using the device. */
> +	uint16_t process_cnt;
> +	/* If device is currently used or not. */
> +	enum rte_gpu_state state;
> +	/* FUNCTION: Allocate memory on the GPU. */
> +	gpu_malloc_t gpu_malloc;
> +	/* FUNCTION: Allocate memory on the CPU visible from the GPU. */
> +	gpu_malloc_t gpu_malloc_visible;
> +	/* FUNCTION: Free allocated memory on the GPU. */
> +	gpu_free_t gpu_free;
> +	/* Device interrupt handle. */
> +	struct rte_intr_handle *intr_handle;
> +	/* Driver-specific private data. */
> +	void *dev_private;
> +} __rte_cache_aligned;
> +
> +struct rte_gpu_dev *rte_gpu_dev_allocate(const char *name); struct
> +rte_gpu_dev *rte_gpu_dev_get_by_name(const char *name); int
> +rte_gpu_dev_release(struct rte_gpu_dev *gpudev);
> +
> +#endif /* GPU_DRIVER_H */
> diff --git a/lib/gpudev/meson.build b/lib/gpudev/meson.build new file mode
> 100644 index 0000000000..f05459e18d
> --- /dev/null
> +++ b/lib/gpudev/meson.build
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: BSD-3-Clause # Copyright 2021 NVIDIA
> +Corporation & Affiliates
> +
> +headers = files(
> +        'rte_gpudev.h',
> +)
> +
> +sources = files(
> +)
> diff --git a/lib/gpudev/rte_gpudev.h b/lib/gpudev/rte_gpudev.h new file mode
> 100644 index 0000000000..b12f35c17e
> --- /dev/null
> +++ b/lib/gpudev/rte_gpudev.h
> @@ -0,0 +1,183 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 NVIDIA Corporation & Affiliates  */
> +
> +#ifndef RTE_GPUDEV_H
> +#define RTE_GPUDEV_H
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#include <rte_common.h>
> +
> +/**
> + * @file
> + * Generic library to interact with a GPU.
> + *
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/** Maximum number of GPU engines. */
> +#define RTE_GPU_MAX_DEVS UINT16_C(32)
> +/** Maximum length of device name. */
> +#define RTE_GPU_NAME_MAX_LEN 128
> +
> +/** Flags indicate current state of GPU device. */ enum rte_gpu_state {
> +	RTE_GPU_STATE_UNUSED,        /**< not initialized */
> +	RTE_GPU_STATE_INITIALIZED,   /**< initialized */
> +};
> +
> +/** Store a list of info for a given GPU. */ struct rte_gpu_info {
> +	/** GPU device ID. */
> +	uint16_t gpu_id;
> +	/** Unique identifier name. */
> +	char name[RTE_GPU_NAME_MAX_LEN];
> +	/** Total memory available on device. */
> +	size_t total_memory;
> +	/** Total processors available on device. */
> +	int processor_count;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of GPUs detected and associated to DPDK.
> + *
> + * @return
> + *   The number of available GPUs.
> + */
> +__rte_experimental
> +uint16_t rte_gpu_dev_count_avail(void);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Check if the device is valid and initialized in DPDK.
> + *
> + * @param gpu_id
> + *   The input GPU ID.
> + *
> + * @return
> + *   - True if gpu_id is a valid and initialized GPU.
> + *   - False otherwise.
> + */
> +__rte_experimental
> +bool rte_gpu_dev_is_valid(uint16_t gpu_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the GPU ID of the next valid GPU initialized in DPDK.
> + *
> + * @param gpu_id
> + *   The initial GPU ID to start the research.
> + *
> + * @return
> + *   Next GPU ID corresponding to a valid and initialized GPU device.
> + */
> +__rte_experimental
> +uint16_t rte_gpu_dev_find_next(uint16_t gpu_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Macro to iterate over all valid GPUs.
> + *
> + * @param gpu_id
> + *   The ID of the next possible valid GPU.
> + * @return
> + *   Next valid GPU ID, RTE_GPU_MAX_DEVS if there is none.
> + */
> +#define RTE_GPU_FOREACH_DEV(gpu_id) \
> +	for (gpu_id = rte_gpu_find_next(0); \
> +	     gpu_id < RTE_GPU_MAX_DEVS; \
> +	     gpu_id = rte_gpu_find_next(gpu_id + 1))
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Return GPU specific info.
> + *
> + * @param gpu_id
> + *   GPU ID to get info.
> + * @param info
> + *   Memory structure to fill with the info.
> + *
> + * @return
> + *   0 on success, -1 otherwise.
> + */
> +__rte_experimental
> +int rte_gpu_dev_info_get(uint16_t gpu_id, struct rte_gpu_info **info);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate a chunk of memory on the GPU.
> + *
> + * @param gpu_id
> + *   GPU ID to allocate memory.
> + * @param size
> + *   Number of bytes to allocate.
> + * @param ptr
> + *   Pointer to store the address of the allocated memory.
> + *
> + * @return
> + *   0 on success, -1 otherwise.
> + */
> +__rte_experimental
> +int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate a chunk of memory on the CPU that is visible from the GPU.
> + *
> + * @param gpu_id
> + *   Reference GPU ID.
> + * @param size
> + *   Number of bytes to allocate.
> + * @param ptr
> + *   Pointer to store the address of the allocated memory.
> + *
> + * @return
> + *   0 on success, -1 otherwise.
> + */
> +__rte_experimental
> +int rte_gpu_malloc_visible(uint16_t gpu_id, size_t size, void **ptr);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Deallocate a chunk of memory allocated with rte_gpu_malloc*.
> + *
> + * @param gpu_id
> + *   Reference GPU ID.
> + * @param ptr
> + *   Pointer to the memory area to be deallocated.
> + *
> + * @return
> + *   0 on success, -1 otherwise.
> + */
> +__rte_experimental
> +int rte_gpu_free(uint16_t gpu_id, void *ptr);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_GPUDEV_H */
> diff --git a/lib/gpudev/version.map b/lib/gpudev/version.map new file mode
> 100644 index 0000000000..9e0f218e8b
> --- /dev/null
> +++ b/lib/gpudev/version.map
> @@ -0,0 +1,11 @@
> +EXPERIMENTAL {
> +	global:
> +
> +	rte_gpu_dev_count_avail;
> +	rte_gpu_dev_find_next;
> +	rte_gpu_dev_info_get;
> +	rte_gpu_dev_is_valid;
> +	rte_gpu_free;
> +	rte_gpu_malloc;
> +	rte_gpu_malloc_visible;
> +};
> diff --git a/lib/meson.build b/lib/meson.build index 4a64756a68..ffefc64c69
> 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -33,6 +33,7 @@ libraries = [
>          'distributor',
>          'efd',
>          'eventdev',
> +        'gpudev',
>          'gro',
>          'gso',
>          'ip_frag',
> --
> 2.31.1
Honnappa Nagarahalli June 6, 2021, 1:13 a.m. UTC | #45
<snip>

> > >
> > > 04/06/2021 17:20, Jerin Jacob:
> > > > On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > > > > 04/06/2021 15:59, Andrew Rybchenko:
> > > > > > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > > > > > 04/06/2021 15:05, Andrew Rybchenko:
> > > > > > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > > > > > >>> 04/06/2021 13:09, Jerin Jacob:
> > > > > > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > > > > > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > > > > > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > > > > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > > > > > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > >>>>>>>
> > > > > > >>>>>>> Since this device does not have a queue etc? Shouldn't
> > > > > > >>>>>>> make it a library like mempool with vendor-defined ops?
> > > > > > >>>>>>
> > > > > > >>>>>> +1
> > > > > > >>>>>>
> > > > > > >>>>>> Current RFC announces additional memory allocation
> > > > > > >>>>>> capabilities, which can suits better as extension to
> > > > > > >>>>>> existing memory related library instead of a new device
> abstraction library.
> > > > > > >>>>>
> > > > > > >>>>> It is not replacing mempool.
> > > > > > >>>>> It is more at the same level as EAL memory management:
> > > > > > >>>>> allocate simple buffer, but with the exception it is
> > > > > > >>>>> done on a specific device, so it requires a device ID.
> > > > > > >>>>>
> > > > > > >>>>> The other reason it needs to be a full library is that
> > > > > > >>>>> it will start a workload on the GPU and get completion
> > > > > > >>>>> notification so we can integrate the GPU workload in a packet
> processing pipeline.
> > > > > > >>>>
> > > > > > >>>> I might have confused you. My intention is not to make to fit
> under mempool API.
> > > > > > >>>>
> > > > > > >>>> I agree that we need a separate library for this. My
> > > > > > >>>> objection is only to not call libgpudev and call it
> > > > > > >>>> libgpu. And have APIs with rte_gpu_ instead of
> > > > > > >>>> rte_gpu_dev as it not like existing "device libraries" in
> > > > > > >>>> DPDK and it like other "libraries" in DPDK.
> > > > > > >>>
> > > > > > >>> I think we should define a queue of processing actions, so
> > > > > > >>> it looks like other device libraries.
> > > > > > >>> And anyway I think a library managing a device class, and
> > > > > > >>> having some device drivers deserves the name of device library.
> > > > > > >>>
> > > > > > >>> I would like to read more opinions.
> > > > > > >>
> > > > > > >> Since the library is an unified interface to GPU device
> > > > > > >> drivers I think it should be named as in the patch - gpudev.
> > > > > > >>
> > > > > > >> Mempool looks like an exception here - initially it was
> > > > > > >> pure SW library, but not there are HW backends and
> > > > > > >> corresponding device drivers.
> > > > > > >>
> > > > > > >> What I don't understand where is GPU specifics here?
> > > > > > >
> > > > > > > That's an interesting question.
> > > > > > > Let's ask first what is a GPU for DPDK?
> > > > > > > I think it is like a sub-CPU with high parallel execution
> > > > > > > capabilities, and it is controlled by the CPU.
> > > > > >
> > > > > > I have no good ideas how to name it in accordance with above
> > > > > > description to avoid "G" which for "Graphics" if understand
> > > > > > correctly. However, may be it is not required.
> > > > > > No strong opinion on the topic, but unbinding from "Graphics"
> > > > > > would be nice.
> > > > >
> > > > > That's a question I ask myself for months now.
> > > > > I am not able to find a better name, and I start thinking that
> > > > > "GPU" is famous enough in high-load computing to convey the idea
> > > > > of what we can expect.
> > > >
> > > >
> > > > The closest I can think of is big-little architecture in ARM SoC.
> > > > https://www.arm.com/why-arm/technologies/big-little
From the application pov, big-little arch is nothing but SMT. Not sure how it is similar to another device on PCIe.

> > > >
> > > > We do have similar architecture, Where the "coprocessor" is part
> > > > of the main CPU.
> > > > It is operations are:
> > > > - Download firmware
> > > > - Memory mapping for Main CPU memory by the co-processor
> > > > - Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.
> > >
> > > Yes it looks like the exact same scope.
> > > I like the word "co-processor" in this context.
> > >
> > > > If your scope is something similar and No Graphics involved here
> > > > then we can remove G.
> > >
> > > Indeed no graphics in DPDK :)
> > > By removing the G, you mean keeping only PU? like "pudev"?
> > > We could also define the G as "General".
> > >
> > > > Coincidentally, Yesterday, I had an interaction with Elena for the
> > > > same for BaseBand related work in ORAN where GPU used as Baseband
> > > > processing instead of Graphics.(So I can understand the big
> > > > picture of this library)
This patch does not provide the big picture view of what the processing looks like using GPU. It would be good to explain that.
For ex:
1) Will the notion of GPU hidden from the application? i.e. is the application allowed to launch kernels?
	1a) Will DPDK provide abstract APIs to launch kernels?
     This would require us to have the notion of GPU in DPDK and the application would depend on the availability of GPU in the system.
2) Is launching kernels hidden? i.e. the application still calls DPDK abstract APIs (such as encryption/decryption APIs) without knowing that the encryption/decryption is happening on GPU.
     This does not require us to have a notion of GPU in DPDK at the API level

If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own local memory. May be some of the APIs could use generic names. For ex: instead of calling it as "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts its own memory that need to be managed by the application, can use these APIs.
	

> > >
> > > Yes baseband processing is one possible usage of GPU with DPDK.
> > > We could also imagine some security analysis, or any machine learning...
> > >
> > > > I can think of "coprocessor-dev" as one of the name.
> > >
> > > "coprocessor" looks too long as prefix of the functions.
> 
> Yes. Libray name can be lengthy, but API prefix should be 3 letters kind short
> form will be required.
> 
> 
> > >
> > > > We do have similar machine learning co-processors(for compute) if
> > > > we can keep a generic name and it is for the above functions we
> > > > may use this subsystem as well in the future.
> > >
> >
> > Accelerator, 'acce_dev' ? ;-)
> 
> It may get confused with HW accelerators.
> 
> 
> Some of the options I can think of. Sorting in my preference.
> 
> library name, API prefix
> 1) libhpc-dev, rte_hpc_ (hpc-> Heterogeneous processor compute)
> 2) libhc-dev, rte_hc_
> (https://en.wikipedia.org/wiki/Heterogeneous_computing see: Example
> hardware)
> 3) libpu-dev, rte_pu_ (pu -> processing unit)
> 4) libhp-dev, rte_hp_ (hp->heterogeneous processor)
> 5) libcoprocessor-dev, rte_cps_ ?
> 6) libcompute-dev, rte_cpt_ ?
> 7) libgpu-dev, rte_gpu_
These seem to assume that the application can launch its own workload on the device? Does DPDK need to provide abstract APIs for launching work on a device?


> 
> 
> 
> 
> >
> > > Yes that's the idea to share a common synchronization mechanism with
> > > different HW.
> > >
> > > That's cool to have such a big interest in the community for this patch.
> > >
> >
Jerin Jacob June 6, 2021, 5:28 a.m. UTC | #46
On Sun, Jun 6, 2021 at 6:44 AM Honnappa Nagarahalli
<Honnappa.Nagarahalli@arm.com> wrote:
>
> <snip>
>
> > > >
> > > > 04/06/2021 17:20, Jerin Jacob:
> > > > > On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > > > 04/06/2021 15:59, Andrew Rybchenko:
> > > > > > > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > > > > > > 04/06/2021 15:05, Andrew Rybchenko:
> > > > > > > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > > > > > > >>> 04/06/2021 13:09, Jerin Jacob:
> > > > > > > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > > > > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > > > > > > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > > > > > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > > > > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Since this device does not have a queue etc? Shouldn't
> > > > > > > >>>>>>> make it a library like mempool with vendor-defined ops?
> > > > > > > >>>>>>
> > > > > > > >>>>>> +1
> > > > > > > >>>>>>
> > > > > > > >>>>>> Current RFC announces additional memory allocation
> > > > > > > >>>>>> capabilities, which can suits better as extension to
> > > > > > > >>>>>> existing memory related library instead of a new device
> > abstraction library.
> > > > > > > >>>>>
> > > > > > > >>>>> It is not replacing mempool.
> > > > > > > >>>>> It is more at the same level as EAL memory management:
> > > > > > > >>>>> allocate simple buffer, but with the exception it is
> > > > > > > >>>>> done on a specific device, so it requires a device ID.
> > > > > > > >>>>>
> > > > > > > >>>>> The other reason it needs to be a full library is that
> > > > > > > >>>>> it will start a workload on the GPU and get completion
> > > > > > > >>>>> notification so we can integrate the GPU workload in a packet
> > processing pipeline.
> > > > > > > >>>>
> > > > > > > >>>> I might have confused you. My intention is not to make to fit
> > under mempool API.
> > > > > > > >>>>
> > > > > > > >>>> I agree that we need a separate library for this. My
> > > > > > > >>>> objection is only to not call libgpudev and call it
> > > > > > > >>>> libgpu. And have APIs with rte_gpu_ instead of
> > > > > > > >>>> rte_gpu_dev as it not like existing "device libraries" in
> > > > > > > >>>> DPDK and it like other "libraries" in DPDK.
> > > > > > > >>>
> > > > > > > >>> I think we should define a queue of processing actions, so
> > > > > > > >>> it looks like other device libraries.
> > > > > > > >>> And anyway I think a library managing a device class, and
> > > > > > > >>> having some device drivers deserves the name of device library.
> > > > > > > >>>
> > > > > > > >>> I would like to read more opinions.
> > > > > > > >>
> > > > > > > >> Since the library is an unified interface to GPU device
> > > > > > > >> drivers I think it should be named as in the patch - gpudev.
> > > > > > > >>
> > > > > > > >> Mempool looks like an exception here - initially it was
> > > > > > > >> pure SW library, but not there are HW backends and
> > > > > > > >> corresponding device drivers.
> > > > > > > >>
> > > > > > > >> What I don't understand where is GPU specifics here?
> > > > > > > >
> > > > > > > > That's an interesting question.
> > > > > > > > Let's ask first what is a GPU for DPDK?
> > > > > > > > I think it is like a sub-CPU with high parallel execution
> > > > > > > > capabilities, and it is controlled by the CPU.
> > > > > > >
> > > > > > > I have no good ideas how to name it in accordance with above
> > > > > > > description to avoid "G" which for "Graphics" if understand
> > > > > > > correctly. However, may be it is not required.
> > > > > > > No strong opinion on the topic, but unbinding from "Graphics"
> > > > > > > would be nice.
> > > > > >
> > > > > > That's a question I ask myself for months now.
> > > > > > I am not able to find a better name, and I start thinking that
> > > > > > "GPU" is famous enough in high-load computing to convey the idea
> > > > > > of what we can expect.
> > > > >
> > > > >
> > > > > The closest I can think of is big-little architecture in ARM SoC.
> > > > > https://www.arm.com/why-arm/technologies/big-little
> From the application pov, big-little arch is nothing but SMT. Not sure how it is similar to another device on PCIe.


Yes. It may not be a device sitting on a PCIe bus, However,  It can
access it via some bus from the main CPU.


>
> > > > >
> > > > > We do have similar architecture, Where the "coprocessor" is part
> > > > > of the main CPU.
> > > > > It is operations are:
> > > > > - Download firmware
> > > > > - Memory mapping for Main CPU memory by the co-processor
> > > > > - Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.
> > > >
> > > > Yes it looks like the exact same scope.
> > > > I like the word "co-processor" in this context.
> > > >
> > > > > If your scope is something similar and No Graphics involved here
> > > > > then we can remove G.
> > > >
> > > > Indeed no graphics in DPDK :)
> > > > By removing the G, you mean keeping only PU? like "pudev"?
> > > > We could also define the G as "General".
> > > >
> > > > > Coincidentally, Yesterday, I had an interaction with Elena for the
> > > > > same for BaseBand related work in ORAN where GPU used as Baseband
> > > > > processing instead of Graphics.(So I can understand the big
> > > > > picture of this library)
> This patch does not provide the big picture view of what the processing looks like using GPU. It would be good to explain that.
> For ex:
> 1) Will the notion of GPU hidden from the application? i.e. is the application allowed to launch kernels?
>         1a) Will DPDK provide abstract APIs to launch kernels?
>      This would require us to have the notion of GPU in DPDK and the application would depend on the availability of GPU in the system.
> 2) Is launching kernels hidden? i.e. the application still calls DPDK abstract APIs (such as encryption/decryption APIs) without knowing that the encryption/decryption is happening on GPU.
>      This does not require us to have a notion of GPU in DPDK at the API level

I will leave this to Thomas.

>
> If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own local memory. May be some of the APIs could use generic names. For ex: instead of calling it as "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts its own memory that need to be managed by the application, can use these APIs.

That is a good thought. it is possible to hook the download firmware,
memory management, Job management(as messages to/from device) to
rte_device itself.
I think, one needs to consider, how to integrate with the existing
DPDK subsystem, for example: If one decided to implement bbdev or
regexdev with such computing device,
Need to consider, Is it better to have bbdev driver has depended
gpudev or rte_device has this callback and use with bbdev driver.




>
>
> > > >
> > > > Yes baseband processing is one possible usage of GPU with DPDK.
> > > > We could also imagine some security analysis, or any machine learning...
> > > >
> > > > > I can think of "coprocessor-dev" as one of the name.
> > > >
> > > > "coprocessor" looks too long as prefix of the functions.
> >
> > Yes. Libray name can be lengthy, but API prefix should be 3 letters kind short
> > form will be required.
> >
> >
> > > >
> > > > > We do have similar machine learning co-processors(for compute) if
> > > > > we can keep a generic name and it is for the above functions we
> > > > > may use this subsystem as well in the future.
> > > >
> > >
> > > Accelerator, 'acce_dev' ? ;-)
> >
> > It may get confused with HW accelerators.
> >
> >
> > Some of the options I can think of. Sorting in my preference.
> >
> > library name, API prefix
> > 1) libhpc-dev, rte_hpc_ (hpc-> Heterogeneous processor compute)
> > 2) libhc-dev, rte_hc_
> > (https://en.wikipedia.org/wiki/Heterogeneous_computing see: Example
> > hardware)
> > 3) libpu-dev, rte_pu_ (pu -> processing unit)
> > 4) libhp-dev, rte_hp_ (hp->heterogeneous processor)
> > 5) libcoprocessor-dev, rte_cps_ ?
> > 6) libcompute-dev, rte_cpt_ ?
> > 7) libgpu-dev, rte_gpu_
> These seem to assume that the application can launch its own workload on the device? Does DPDK need to provide abstract APIs for launching work on a device?
>
>
> >
> >
> >
> >
> > >
> > > > Yes that's the idea to share a common synchronization mechanism with
> > > > different HW.
> > > >
> > > > That's cool to have such a big interest in the community for this patch.
> > > >
> > >
Wang, Haiyue June 7, 2021, 7:20 a.m. UTC | #47
> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Sunday, June 6, 2021 09:14
> To: Jerin Jacob <jerinjacobk@gmail.com>; Wang, Haiyue <haiyue.wang@intel.com>
> Cc: thomas@monjalon.net; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; dpdk-dev <dev@dpdk.org>; Elena Agostini <eagostini@nvidia.com>; David
> Marchand <david.marchand@redhat.com>; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> <snip>
> 
> > > >
> > > > 04/06/2021 17:20, Jerin Jacob:
> > > > > On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > > > 04/06/2021 15:59, Andrew Rybchenko:
> > > > > > > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > > > > > > 04/06/2021 15:05, Andrew Rybchenko:
> > > > > > > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > > > > > > >>> 04/06/2021 13:09, Jerin Jacob:
> > > > > > > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > > > > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > > > > > > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > > > > > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > > > > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Since this device does not have a queue etc? Shouldn't
> > > > > > > >>>>>>> make it a library like mempool with vendor-defined ops?
> > > > > > > >>>>>>
> > > > > > > >>>>>> +1
> > > > > > > >>>>>>
> > > > > > > >>>>>> Current RFC announces additional memory allocation
> > > > > > > >>>>>> capabilities, which can suits better as extension to
> > > > > > > >>>>>> existing memory related library instead of a new device
> > abstraction library.
> > > > > > > >>>>>
> > > > > > > >>>>> It is not replacing mempool.
> > > > > > > >>>>> It is more at the same level as EAL memory management:
> > > > > > > >>>>> allocate simple buffer, but with the exception it is
> > > > > > > >>>>> done on a specific device, so it requires a device ID.
> > > > > > > >>>>>
> > > > > > > >>>>> The other reason it needs to be a full library is that
> > > > > > > >>>>> it will start a workload on the GPU and get completion
> > > > > > > >>>>> notification so we can integrate the GPU workload in a packet
> > processing pipeline.
> > > > > > > >>>>
> > > > > > > >>>> I might have confused you. My intention is not to make to fit
> > under mempool API.
> > > > > > > >>>>
> > > > > > > >>>> I agree that we need a separate library for this. My
> > > > > > > >>>> objection is only to not call libgpudev and call it
> > > > > > > >>>> libgpu. And have APIs with rte_gpu_ instead of
> > > > > > > >>>> rte_gpu_dev as it not like existing "device libraries" in
> > > > > > > >>>> DPDK and it like other "libraries" in DPDK.
> > > > > > > >>>
> > > > > > > >>> I think we should define a queue of processing actions, so
> > > > > > > >>> it looks like other device libraries.
> > > > > > > >>> And anyway I think a library managing a device class, and
> > > > > > > >>> having some device drivers deserves the name of device library.
> > > > > > > >>>
> > > > > > > >>> I would like to read more opinions.
> > > > > > > >>
> > > > > > > >> Since the library is an unified interface to GPU device
> > > > > > > >> drivers I think it should be named as in the patch - gpudev.
> > > > > > > >>
> > > > > > > >> Mempool looks like an exception here - initially it was
> > > > > > > >> pure SW library, but not there are HW backends and
> > > > > > > >> corresponding device drivers.
> > > > > > > >>
> > > > > > > >> What I don't understand where is GPU specifics here?
> > > > > > > >
> > > > > > > > That's an interesting question.
> > > > > > > > Let's ask first what is a GPU for DPDK?
> > > > > > > > I think it is like a sub-CPU with high parallel execution
> > > > > > > > capabilities, and it is controlled by the CPU.
> > > > > > >
> > > > > > > I have no good ideas how to name it in accordance with above
> > > > > > > description to avoid "G" which for "Graphics" if understand
> > > > > > > correctly. However, may be it is not required.
> > > > > > > No strong opinion on the topic, but unbinding from "Graphics"
> > > > > > > would be nice.
> > > > > >
> > > > > > That's a question I ask myself for months now.
> > > > > > I am not able to find a better name, and I start thinking that
> > > > > > "GPU" is famous enough in high-load computing to convey the idea
> > > > > > of what we can expect.
> > > > >
> > > > >
> > > > > The closest I can think of is big-little architecture in ARM SoC.
> > > > > https://www.arm.com/why-arm/technologies/big-little
> From the application pov, big-little arch is nothing but SMT. Not sure how it is similar to another
> device on PCIe.
> 
> > > > >
> > > > > We do have similar architecture, Where the "coprocessor" is part
> > > > > of the main CPU.
> > > > > It is operations are:
> > > > > - Download firmware
> > > > > - Memory mapping for Main CPU memory by the co-processor
> > > > > - Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.
> > > >
> > > > Yes it looks like the exact same scope.
> > > > I like the word "co-processor" in this context.
> > > >
> > > > > If your scope is something similar and No Graphics involved here
> > > > > then we can remove G.
> > > >
> > > > Indeed no graphics in DPDK :)
> > > > By removing the G, you mean keeping only PU? like "pudev"?
> > > > We could also define the G as "General".
> > > >
> > > > > Coincidentally, Yesterday, I had an interaction with Elena for the
> > > > > same for BaseBand related work in ORAN where GPU used as Baseband
> > > > > processing instead of Graphics.(So I can understand the big
> > > > > picture of this library)
> This patch does not provide the big picture view of what the processing looks like using GPU. It would
> be good to explain that.
> For ex:
> 1) Will the notion of GPU hidden from the application? i.e. is the application allowed to launch
> kernels?
> 	1a) Will DPDK provide abstract APIs to launch kernels?
>      This would require us to have the notion of GPU in DPDK and the application would depend on the
> availability of GPU in the system.
> 2) Is launching kernels hidden? i.e. the application still calls DPDK abstract APIs (such as
> encryption/decryption APIs) without knowing that the encryption/decryption is happening on GPU.
>      This does not require us to have a notion of GPU in DPDK at the API level
> 
> If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> its own memory that need to be managed by the application, can use these APIs.
> 

"rte_dev_malloc" sounds a good name, then looks like we need to enhance the
'struct rte_device' with some new ops as:

eal: move DMA mapping from bus-specific to generic driver

https://patchwork.dpdk.org/project/dpdk/patch/20210331224547.2217759-1-thomas@monjalon.net/

> 
> > > >
> > > > Yes baseband processing is one possible usage of GPU with DPDK.
> > > > We could also imagine some security analysis, or any machine learning...
> > > >
> > > > > I can think of "coprocessor-dev" as one of the name.
> > > >
> > > > "coprocessor" looks too long as prefix of the functions.
> >
> > Yes. Libray name can be lengthy, but API prefix should be 3 letters kind short
> > form will be required.
> >
> >
> > > >
> > > > > We do have similar machine learning co-processors(for compute) if
> > > > > we can keep a generic name and it is for the above functions we
> > > > > may use this subsystem as well in the future.
> > > >
> > >
> > > Accelerator, 'acce_dev' ? ;-)
> >
> > It may get confused with HW accelerators.
> >
> >
> > Some of the options I can think of. Sorting in my preference.
> >
> > library name, API prefix
> > 1) libhpc-dev, rte_hpc_ (hpc-> Heterogeneous processor compute)
> > 2) libhc-dev, rte_hc_
> > (https://en.wikipedia.org/wiki/Heterogeneous_computing see: Example
> > hardware)
> > 3) libpu-dev, rte_pu_ (pu -> processing unit)
> > 4) libhp-dev, rte_hp_ (hp->heterogeneous processor)
> > 5) libcoprocessor-dev, rte_cps_ ?
> > 6) libcompute-dev, rte_cpt_ ?
> > 7) libgpu-dev, rte_gpu_
> These seem to assume that the application can launch its own workload on the device? Does DPDK need to
> provide abstract APIs for launching work on a device?
> 
> 
> >
> >
> >
> >
> > >
> > > > Yes that's the idea to share a common synchronization mechanism with
> > > > different HW.
> > > >
> > > > That's cool to have such a big interest in the community for this patch.
> > > >
> > >
Thomas Monjalon June 7, 2021, 10:29 a.m. UTC | #48
06/06/2021 07:28, Jerin Jacob:
> On Sun, Jun 6, 2021 at 6:44 AM Honnappa Nagarahalli
> > This patch does not provide the big picture view of what the processing looks like using GPU. It would be good to explain that.
> > For ex:
> > 1) Will the notion of GPU hidden from the application? i.e. is the application allowed to launch kernels?
> >         1a) Will DPDK provide abstract APIs to launch kernels?
> >      This would require us to have the notion of GPU in DPDK and the application would depend on the availability of GPU in the system.

Not sure "kernels" is a well known word in this context.
I propose talking about computing tasks.
The DPDK application running on the CPU must be synchronized
with the tasks running on devices, so yes we need a way
to decide what to launch and when from the DPDK application.

> > 2) Is launching kernels hidden? i.e. the application still calls DPDK abstract APIs (such as encryption/decryption APIs) without knowing that the encryption/decryption is happening on GPU.
> >      This does not require us to have a notion of GPU in DPDK at the API level
> 
> I will leave this to Thomas.

The general need is to allow running any kind of processing on devices.
Some processing may be very specific, others could fit in the existing
class API like crypto and regex.
I think implementing such specific class drivers based on tasks
dynamically loaded on the device may be done as a second step.

Thank you for the questions, it helps defining the big picture
for the next revision of the patch.

> > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own local memory. May be some of the APIs could use generic names. For ex: instead of calling it as "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts its own memory that need to be managed by the application, can use these APIs.
> 
> That is a good thought. it is possible to hook the download firmware,
> memory management, Job management(as messages to/from device) to
> rte_device itself.
> I think, one needs to consider, how to integrate with the existing
> DPDK subsystem, for example: If one decided to implement bbdev or
> regexdev with such computing device,
> Need to consider, Is it better to have bbdev driver has depended
> gpudev or rte_device has this callback and use with bbdev driver.

Absolutely. If a specialized driver class fits with a workload,
it is best handled with a driver in its specific class.

> > > > > Yes baseband processing is one possible usage of GPU with DPDK.
> > > > > We could also imagine some security analysis, or any machine learning...
> > > > >
> > > > > > I can think of "coprocessor-dev" as one of the name.
> > > > >
> > > > > "coprocessor" looks too long as prefix of the functions.
> > >
> > > Yes. Libray name can be lengthy, but API prefix should be 3 letters kind short
> > > form will be required.
> > >
> > >
> > > > >
> > > > > > We do have similar machine learning co-processors(for compute) if
> > > > > > we can keep a generic name and it is for the above functions we
> > > > > > may use this subsystem as well in the future.
> > > > >
> > > >
> > > > Accelerator, 'acce_dev' ? ;-)
> > >
> > > It may get confused with HW accelerators.
> > >
> > >
> > > Some of the options I can think of. Sorting in my preference.
> > >
> > > library name, API prefix
> > > 1) libhpc-dev, rte_hpc_ (hpc-> Heterogeneous processor compute)
> > > 2) libhc-dev, rte_hc_
> > > (https://en.wikipedia.org/wiki/Heterogeneous_computing see: Example
> > > hardware)
> > > 3) libpu-dev, rte_pu_ (pu -> processing unit)
> > > 4) libhp-dev, rte_hp_ (hp->heterogeneous processor)
> > > 5) libcoprocessor-dev, rte_cps_ ?
> > > 6) libcompute-dev, rte_cpt_ ?
> > > 7) libgpu-dev, rte_gpu_
> > 
> > These seem to assume that the application can launch its own workload on the device? Does DPDK need to provide abstract APIs for launching work on a device?

That's the difficult part.
We should not try to re-invent CUDA or OpenCL.
I think this part should not be in DPDK.
We only need to synchronize with dynamic nature of the device workload.
We will be more specific in the v2.
Thomas Monjalon June 7, 2021, 10:43 a.m. UTC | #49
07/06/2021 09:20, Wang, Haiyue:
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> > local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> > its own memory that need to be managed by the application, can use these APIs.
> > 
> 
> "rte_dev_malloc" sounds a good name,

Yes I like the idea.
2 concerns:

1/ Device memory allocation requires a device handle.
So far we avoided exposing rte_device to the application.
How should we get a device handle from a DPDK application?

2/ Implementation must be done in a driver.
Should it be a callback defined at rte_device level?

> then looks like we need to enhance the
> 'struct rte_device' with some new ops as:
> 
> eal: move DMA mapping from bus-specific to generic driver
> 
> https://patchwork.dpdk.org/project/dpdk/patch/20210331224547.2217759-1-thomas@monjalon.net/

Not sure the above patch is a good idea.
Let's discuss this DMA detail later :)
Thomas Monjalon June 7, 2021, 10:50 a.m. UTC | #50
06/06/2021 03:10, Honnappa Nagarahalli:
> > The new library gpudev is for dealing with GPU from a DPDK application in a
> > vendor-agnostic way.
> 
> It would be good to explain how the application using GPU+DPDK would look like.

It can be anything to invent.
We can give few more words saying it is a computing done in parallel,
being either a loop or sequence of tasks. In both cases,
the DPDK application running on the CPU must be in control.

> Which parts of the workload need DPDK's support?

Control:
- start task (by function or shared memory variable)
- get completion notification (most probably by shared memory)

We will add such requirements in v2.

> Any requirements on co-existence of GPU with other accelerators?

It must be generic to allow any workload with any device.

> > As a first step, the features are focused on memory management.
> > A function allows to allocate memory inside the GPU, while another one
> > allows to use main (CPU) memory from the GPU.
> 
> Is this memory for packet buffers or something else?

Can be for packets or for control variables.
Jerin Jacob June 7, 2021, 1:54 p.m. UTC | #51
On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 07/06/2021 09:20, Wang, Haiyue:
> > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> > > local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> > > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> > > its own memory that need to be managed by the application, can use these APIs.
> > >
> >
> > "rte_dev_malloc" sounds a good name,
>
> Yes I like the idea.
> 2 concerns:
>
> 1/ Device memory allocation requires a device handle.
> So far we avoided exposing rte_device to the application.
> How should we get a device handle from a DPDK application?

Each device behaves differently at this level. In the view of the
generic application, the architecture should like

< Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
^
|
< DPDK driver>
^
|
<rte_device with this new callbacks >

An implementation may decide to have "in tree" or "out of tree"
drivers or rte_device implementaion.
But generic DPDK applications should not use devices directly. i.e
rte_device need to have this callback and
mlx ethdev/crypto driver use this driver to implement public API.
Otherwise, it is the same as rawdev in DPDK.
So not sure what it brings other than raw dev here if we are not
taking the above architecture.

>
> 2/ Implementation must be done in a driver.
> Should it be a callback defined at rte_device level?

IMO, Yes and DPDK subsystem drivers to use it.

>
> > then looks like we need to enhance the
> > 'struct rte_device' with some new ops as:
> >
> > eal: move DMA mapping from bus-specific to generic driver
> >
> > https://patchwork.dpdk.org/project/dpdk/patch/20210331224547.2217759-1-thomas@monjalon.net/
>
> Not sure the above patch is a good idea.
> Let's discuss this DMA detail later :)
>
>
>
Thomas Monjalon June 7, 2021, 4:47 p.m. UTC | #52
07/06/2021 15:54, Jerin Jacob:
> On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 07/06/2021 09:20, Wang, Haiyue:
> > > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> > > > local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> > > > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> > > > its own memory that need to be managed by the application, can use these APIs.
> > > >
> > >
> > > "rte_dev_malloc" sounds a good name,
> >
> > Yes I like the idea.
> > 2 concerns:
> >
> > 1/ Device memory allocation requires a device handle.
> > So far we avoided exposing rte_device to the application.
> > How should we get a device handle from a DPDK application?
> 
> Each device behaves differently at this level. In the view of the
> generic application, the architecture should like
> 
> < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
> ^
> |
> < DPDK driver>
> ^
> |
> <rte_device with this new callbacks >

I think the formatting went wrong above.

I would add more to the block diagram:

class device API      - computing device API
        |            |              |
class device driver -   computing device driver
        |                           |
       EAL device with memory callback

The idea above is that the class device driver can use services
of the new computing device library.
One basic API service is to provide a device ID for the memory callback.
Other services are for execution control.

> An implementation may decide to have "in tree" or "out of tree"
> drivers or rte_device implementaion.
> But generic DPDK applications should not use devices directly. i.e
> rte_device need to have this callback and
> mlx ethdev/crypto driver use this driver to implement public API.
> Otherwise, it is the same as rawdev in DPDK.
> So not sure what it brings other than raw dev here if we are not
> taking the above architecture.
> 
> >
> > 2/ Implementation must be done in a driver.
> > Should it be a callback defined at rte_device level?
> 
> IMO, Yes and DPDK subsystem drivers to use it.

I'm not sure subsystems should bypass the API for device memory.
We could do some generic work in the API function and call
the driver callback only for device-specific stuff.
In such case the callback and the API would be
in the library computing device library.
On the other hand, having the callback and API in EAL would allow
having a common function for memory allocation in EAL.

Another thought: I would like to unify memory allocation in DPDK
with the same set of flags in an unique function.
A flag could be used to target devices instead of the running CPU,
and the same parameter could be shared for the device ID or NUMA node.
Honnappa Nagarahalli June 7, 2021, 11:31 p.m. UTC | #53
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net>
> wrote:
> >
> > 07/06/2021 09:20, Wang, Haiyue:
> > > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > If we keep CXL in mind, I would imagine that in the future the
> > > > devices on PCIe could have their own local memory. May be some of
> > > > the APIs could use generic names. For ex: instead of calling it as
> > > > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way
> any future device which hosts its own memory that need to be managed by
> the application, can use these APIs.
> > > >
> > >
> > > "rte_dev_malloc" sounds a good name,
> >
> > Yes I like the idea.
> > 2 concerns:
> >
> > 1/ Device memory allocation requires a device handle.
> > So far we avoided exposing rte_device to the application.
> > How should we get a device handle from a DPDK application?
> 
> Each device behaves differently at this level. In the view of the generic
> application, the architecture should like
> 
> < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function > ^
> |
> < DPDK driver>
> ^
> |
> <rte_device with this new callbacks >
> 
> An implementation may decide to have "in tree" or "out of tree"
> drivers or rte_device implementaion.
> But generic DPDK applications should not use devices directly. i.e rte_device
> need to have this callback and mlx ethdev/crypto driver use this driver to
> implement public API.
> Otherwise, it is the same as rawdev in DPDK.
> So not sure what it brings other than raw dev here if we are not taking the
> above architecture.
Agree, I think it is important to hide the device under the APIs for the application to benefit.

> 
> >
> > 2/ Implementation must be done in a driver.
> > Should it be a callback defined at rte_device level?
> 
> IMO, Yes and DPDK subsystem drivers to use it.
> 
> >
> > > then looks like we need to enhance the 'struct rte_device' with some
> > > new ops as:
> > >
> > > eal: move DMA mapping from bus-specific to generic driver
> > >
> > >
> https://patchwork.dpdk.org/project/dpdk/patch/20210331224547.2217759
> > > -1-thomas@monjalon.net/
> >
> > Not sure the above patch is a good idea.
> > Let's discuss this DMA detail later :)
> >
> >
> >
Jerin Jacob June 8, 2021, 4:10 a.m. UTC | #54
On Mon, Jun 7, 2021 at 10:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 07/06/2021 15:54, Jerin Jacob:
> > On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 07/06/2021 09:20, Wang, Haiyue:
> > > > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> > > > > local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> > > > > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> > > > > its own memory that need to be managed by the application, can use these APIs.
> > > > >
> > > >
> > > > "rte_dev_malloc" sounds a good name,
> > >
> > > Yes I like the idea.
> > > 2 concerns:
> > >
> > > 1/ Device memory allocation requires a device handle.
> > > So far we avoided exposing rte_device to the application.
> > > How should we get a device handle from a DPDK application?
> >
> > Each device behaves differently at this level. In the view of the
> > generic application, the architecture should like
> >
> > < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
> > ^
> > |
> > < DPDK driver>
> > ^
> > |
> > <rte_device with this new callbacks >
>
> I think the formatting went wrong above.
>
> I would add more to the block diagram:
>
> class device API      - computing device API
>         |            |              |
> class device driver -   computing device driver
>         |                           |
>        EAL device with memory callback
>
> The idea above is that the class device driver can use services
> of the new computing device library.

Yes. The question is, do we need any public DPDK _application_ APIs for that?
If it is public API then the scope is much bigger than that as the application
can use it directly and it makes it non portable.

if the scope is only, the class driver consumption then the existing
"bus"  _kind of_
abstraction/API makes sense to me.

Where it abstracts,
-FW download of device
-Memory management of device
-Opaque way to enq/deque jobs to the device.

And above should be consumed by "class driver" not "application".

If the application doing do that, we are in rte_raw device territory.


> One basic API service is to provide a device ID for the memory callback.
> Other services are for execution control.
>
> > An implementation may decide to have "in tree" or "out of tree"
> > drivers or rte_device implementaion.
> > But generic DPDK applications should not use devices directly. i.e
> > rte_device need to have this callback and
> > mlx ethdev/crypto driver use this driver to implement public API.
> > Otherwise, it is the same as rawdev in DPDK.
> > So not sure what it brings other than raw dev here if we are not
> > taking the above architecture.
> >
> > >
> > > 2/ Implementation must be done in a driver.
> > > Should it be a callback defined at rte_device level?
> >
> > IMO, Yes and DPDK subsystem drivers to use it.
>
> I'm not sure subsystems should bypass the API for device memory.
> We could do some generic work in the API function and call
> the driver callback only for device-specific stuff.
> In such case the callback and the API would be
> in the library computing device library.
> On the other hand, having the callback and API in EAL would allow
> having a common function for memory allocation in EAL.
>
> Another thought: I would like to unify memory allocation in DPDK
> with the same set of flags in an unique function.
> A flag could be used to target devices instead of the running CPU,
> and the same parameter could be shared for the device ID or NUMA node.
>
>
Thomas Monjalon June 8, 2021, 6:34 a.m. UTC | #55
08/06/2021 06:10, Jerin Jacob:
> On Mon, Jun 7, 2021 at 10:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 07/06/2021 15:54, Jerin Jacob:
> > > On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 07/06/2021 09:20, Wang, Haiyue:
> > > > > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > > > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> > > > > > local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> > > > > > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> > > > > > its own memory that need to be managed by the application, can use these APIs.
> > > > > >
> > > > >
> > > > > "rte_dev_malloc" sounds a good name,
> > > >
> > > > Yes I like the idea.
> > > > 2 concerns:
> > > >
> > > > 1/ Device memory allocation requires a device handle.
> > > > So far we avoided exposing rte_device to the application.
> > > > How should we get a device handle from a DPDK application?
> > >
> > > Each device behaves differently at this level. In the view of the
> > > generic application, the architecture should like
> > >
> > > < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
> > > ^
> > > |
> > > < DPDK driver>
> > > ^
> > > |
> > > <rte_device with this new callbacks >
> >
> > I think the formatting went wrong above.
> >
> > I would add more to the block diagram:
> >
> > class device API      - computing device API
> >         |            |              |
> > class device driver -   computing device driver
> >         |                           |
> >        EAL device with memory callback
> >
> > The idea above is that the class device driver can use services
> > of the new computing device library.
> 
> Yes. The question is, do we need any public DPDK _application_ APIs for that?

To have something generic!

> If it is public API then the scope is much bigger than that as the application
> can use it directly and it makes it non portable.

It is a non-sense. If we make an API, it will be better portable.
The only part which is non-portable is the program on the device
which may be different per computing device.
The synchronization with the DPDK application should be portable
if we define some good API.

> if the scope is only, the class driver consumption then the existing
> "bus"  _kind of_
> abstraction/API makes sense to me.
> 
> Where it abstracts,
> -FW download of device
> -Memory management of device
> -Opaque way to enq/deque jobs to the device.
> 
> And above should be consumed by "class driver" not "application".
> 
> If the application doing do that, we are in rte_raw device territory.

I'm sorry I don't understand what you make such assertion.
It seems you don't want generic API (which is the purpose of DPDK).
Jerin Jacob June 8, 2021, 7:09 a.m. UTC | #56
On Tue, Jun 8, 2021 at 12:05 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 08/06/2021 06:10, Jerin Jacob:
> > On Mon, Jun 7, 2021 at 10:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 07/06/2021 15:54, Jerin Jacob:
> > > > On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > 07/06/2021 09:20, Wang, Haiyue:
> > > > > > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > > > > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> > > > > > > local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> > > > > > > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> > > > > > > its own memory that need to be managed by the application, can use these APIs.
> > > > > > >
> > > > > >
> > > > > > "rte_dev_malloc" sounds a good name,
> > > > >
> > > > > Yes I like the idea.
> > > > > 2 concerns:
> > > > >
> > > > > 1/ Device memory allocation requires a device handle.
> > > > > So far we avoided exposing rte_device to the application.
> > > > > How should we get a device handle from a DPDK application?
> > > >
> > > > Each device behaves differently at this level. In the view of the
> > > > generic application, the architecture should like
> > > >
> > > > < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
> > > > ^
> > > > |
> > > > < DPDK driver>
> > > > ^
> > > > |
> > > > <rte_device with this new callbacks >
> > >
> > > I think the formatting went wrong above.
> > >
> > > I would add more to the block diagram:
> > >
> > > class device API      - computing device API
> > >         |            |              |
> > > class device driver -   computing device driver
> > >         |                           |
> > >        EAL device with memory callback
> > >
> > > The idea above is that the class device driver can use services
> > > of the new computing device library.
> >
> > Yes. The question is, do we need any public DPDK _application_ APIs for that?
>
> To have something generic!
>
> > If it is public API then the scope is much bigger than that as the application
> > can use it directly and it makes it non portable.
>
> It is a non-sense. If we make an API, it will be better portable.

The portal application will be using class device API.
For example, when application needs to call rte_gpu_malloc() vs rte_malloc() ?
Is it better the use of drivers specific functions used in "class
device driver" not exposed?



> The only part which is non-portable is the program on the device
> which may be different per computing device.
> The synchronization with the DPDK application should be portable
> if we define some good API.
>
> > if the scope is only, the class driver consumption then the existing
> > "bus"  _kind of_
> > abstraction/API makes sense to me.
> >
> > Where it abstracts,
> > -FW download of device
> > -Memory management of device
> > -Opaque way to enq/deque jobs to the device.
> >
> > And above should be consumed by "class driver" not "application".
> >
> > If the application doing do that, we are in rte_raw device territory.
>
> I'm sorry I don't understand what you make such assertion.
> It seems you don't want generic API (which is the purpose of DPDK).

I would like to have a generic _application_ API if the application
_needs_ to use it.

The v1 nowhere close to any compute device description.

It has a memory allocation API. It is the device attribute, not
strictly tied to ONLY TO computing device.

So at least, I am asking to have concrete
proposal on "compute device" schematic rather than start with memory API
and rubber stamp as new device adds anything in future.

When we added any all the class devices to DPDK, Everyone had a complete view
of it is function(at RFC of each subsystem had enough API to express
the "basic" usage)
and purpose from the _application_ PoV. I see that is missing here.


>
>
Thomas Monjalon June 8, 2021, 7:32 a.m. UTC | #57
08/06/2021 09:09, Jerin Jacob:
> On Tue, Jun 8, 2021 at 12:05 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 08/06/2021 06:10, Jerin Jacob:
> > > On Mon, Jun 7, 2021 at 10:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 07/06/2021 15:54, Jerin Jacob:
> > > > > On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > 07/06/2021 09:20, Wang, Haiyue:
> > > > > > > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > > > > > If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> > > > > > > > local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> > > > > > > > "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> > > > > > > > its own memory that need to be managed by the application, can use these APIs.
> > > > > > > >
> > > > > > >
> > > > > > > "rte_dev_malloc" sounds a good name,
> > > > > >
> > > > > > Yes I like the idea.
> > > > > > 2 concerns:
> > > > > >
> > > > > > 1/ Device memory allocation requires a device handle.
> > > > > > So far we avoided exposing rte_device to the application.
> > > > > > How should we get a device handle from a DPDK application?
> > > > >
> > > > > Each device behaves differently at this level. In the view of the
> > > > > generic application, the architecture should like
> > > > >
> > > > > < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
> > > > > ^
> > > > > |
> > > > > < DPDK driver>
> > > > > ^
> > > > > |
> > > > > <rte_device with this new callbacks >
> > > >
> > > > I think the formatting went wrong above.
> > > >
> > > > I would add more to the block diagram:
> > > >
> > > > class device API      - computing device API
> > > >         |            |              |
> > > > class device driver -   computing device driver
> > > >         |                           |
> > > >        EAL device with memory callback
> > > >
> > > > The idea above is that the class device driver can use services
> > > > of the new computing device library.
> > >
> > > Yes. The question is, do we need any public DPDK _application_ APIs for that?
> >
> > To have something generic!
> >
> > > If it is public API then the scope is much bigger than that as the application
> > > can use it directly and it makes it non portable.
> >
> > It is a non-sense. If we make an API, it will be better portable.
> 
> The portal application will be using class device API.
> For example, when application needs to call rte_gpu_malloc() vs rte_malloc() ?
> Is it better the use of drivers specific functions used in "class
> device driver" not exposed?
> 
> 
> 
> > The only part which is non-portable is the program on the device
> > which may be different per computing device.
> > The synchronization with the DPDK application should be portable
> > if we define some good API.
> >
> > > if the scope is only, the class driver consumption then the existing
> > > "bus"  _kind of_
> > > abstraction/API makes sense to me.
> > >
> > > Where it abstracts,
> > > -FW download of device
> > > -Memory management of device
> > > -Opaque way to enq/deque jobs to the device.
> > >
> > > And above should be consumed by "class driver" not "application".
> > >
> > > If the application doing do that, we are in rte_raw device territory.
> >
> > I'm sorry I don't understand what you make such assertion.
> > It seems you don't want generic API (which is the purpose of DPDK).
> 
> I would like to have a generic _application_ API if the application
> _needs_ to use it.
> 
> The v1 nowhere close to any compute device description.

As I said, I forgot the RFC tag.
I just wanted to start the discussion and it was fruitful, no regret.

> It has a memory allocation API. It is the device attribute, not
> strictly tied to ONLY TO computing device.
> 
> So at least, I am asking to have concrete
> proposal on "compute device" schematic rather than start with memory API
> and rubber stamp as new device adds anything in future.
> 
> When we added any all the class devices to DPDK, Everyone had a complete view
> of it is function(at RFC of each subsystem had enough API to express
> the "basic" usage)
> and purpose from the _application_ PoV. I see that is missing here.

I keep explaining in emails while preparing a v2.
Now that we go into circles, let's wait the v2 which will address
a lot of comments.
Ferruh Yigit June 15, 2021, 6:24 p.m. UTC | #58
On 6/8/2021 7:34 AM, Thomas Monjalon wrote:
> 08/06/2021 06:10, Jerin Jacob:
>> On Mon, Jun 7, 2021 at 10:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>>>
>>> 07/06/2021 15:54, Jerin Jacob:
>>>> On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>> 07/06/2021 09:20, Wang, Haiyue:
>>>>>> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
>>>>>>> If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
>>>>>>> local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
>>>>>>> "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
>>>>>>> its own memory that need to be managed by the application, can use these APIs.
>>>>>>>
>>>>>>
>>>>>> "rte_dev_malloc" sounds a good name,
>>>>>
>>>>> Yes I like the idea.
>>>>> 2 concerns:
>>>>>
>>>>> 1/ Device memory allocation requires a device handle.
>>>>> So far we avoided exposing rte_device to the application.
>>>>> How should we get a device handle from a DPDK application?
>>>>
>>>> Each device behaves differently at this level. In the view of the
>>>> generic application, the architecture should like
>>>>
>>>> < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
>>>> ^
>>>> |
>>>> < DPDK driver>
>>>> ^
>>>> |
>>>> <rte_device with this new callbacks >
>>>
>>> I think the formatting went wrong above.
>>>
>>> I would add more to the block diagram:
>>>
>>> class device API      - computing device API
>>>         |            |              |
>>> class device driver -   computing device driver
>>>         |                           |
>>>        EAL device with memory callback
>>>
>>> The idea above is that the class device driver can use services
>>> of the new computing device library.
>>
>> Yes. The question is, do we need any public DPDK _application_ APIs for that?
> 
> To have something generic!
> 
>> If it is public API then the scope is much bigger than that as the application
>> can use it directly and it makes it non portable.
> 
> It is a non-sense. If we make an API, it will be better portable.
> The only part which is non-portable is the program on the device
> which may be different per computing device.
> The synchronization with the DPDK application should be portable
> if we define some good API.
> 
>> if the scope is only, the class driver consumption then the existing
>> "bus"  _kind of_
>> abstraction/API makes sense to me.
>>
>> Where it abstracts,
>> -FW download of device
>> -Memory management of device
>> -Opaque way to enq/deque jobs to the device.
>>
>> And above should be consumed by "class driver" not "application".
>>
>> If the application doing do that, we are in rte_raw device territory.
> 
> I'm sorry I don't understand what you make such assertion.
> It seems you don't want generic API (which is the purpose of DPDK).
> 

The FW/kernel/"computing tasks" in the co-processor can be doing anything, as it
has been in FPGA/rawdev.

If there is no defined input & output of that computing task, an application
developed using it will be specific to that computing task, this is not portable
and feels like how rawdev works.

It is possible to have a generic API for control, to start the task and get
completion notification, but not having common input/output interface with
computing task still has same problem I think.

If the application is strictly depends to what computing task does, why not
extending rawdev to have the control APIs? Instead of new library.
And as you already said for memory, generic APIs can be used with additional
flags and using rawdev handler.

Or another option can be defining computing task a little more, have a common
interface, like mbuf, and add some capabilities/flags to let application know
more about computing task and give decision based on it, is this the intention?
Thomas Monjalon June 15, 2021, 6:54 p.m. UTC | #59
15/06/2021 20:24, Ferruh Yigit:
> On 6/8/2021 7:34 AM, Thomas Monjalon wrote:
> > 08/06/2021 06:10, Jerin Jacob:
> >> On Mon, Jun 7, 2021 at 10:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >>>
> >>> 07/06/2021 15:54, Jerin Jacob:
> >>>> On Mon, Jun 7, 2021 at 4:13 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >>>>> 07/06/2021 09:20, Wang, Haiyue:
> >>>>>> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> >>>>>>> If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> >>>>>>> local memory. May be some of the APIs could use generic names. For ex: instead of calling it as
> >>>>>>> "rte_gpu_malloc" may be we could call it as "rte_dev_malloc". This way any future device which hosts
> >>>>>>> its own memory that need to be managed by the application, can use these APIs.
> >>>>>>>
> >>>>>>
> >>>>>> "rte_dev_malloc" sounds a good name,
> >>>>>
> >>>>> Yes I like the idea.
> >>>>> 2 concerns:
> >>>>>
> >>>>> 1/ Device memory allocation requires a device handle.
> >>>>> So far we avoided exposing rte_device to the application.
> >>>>> How should we get a device handle from a DPDK application?
> >>>>
> >>>> Each device behaves differently at this level. In the view of the
> >>>> generic application, the architecture should like
> >>>>
> >>>> < Use DPDK subsystem as rte_ethdev, rte_bbdev etc for SPECIFIC function >
> >>>> ^
> >>>> |
> >>>> < DPDK driver>
> >>>> ^
> >>>> |
> >>>> <rte_device with this new callbacks >
> >>>
> >>> I think the formatting went wrong above.
> >>>
> >>> I would add more to the block diagram:
> >>>
> >>> class device API      - computing device API
> >>>         |            |              |
> >>> class device driver -   computing device driver
> >>>         |                           |
> >>>        EAL device with memory callback
> >>>
> >>> The idea above is that the class device driver can use services
> >>> of the new computing device library.
> >>
> >> Yes. The question is, do we need any public DPDK _application_ APIs for that?
> > 
> > To have something generic!
> > 
> >> If it is public API then the scope is much bigger than that as the application
> >> can use it directly and it makes it non portable.
> > 
> > It is a non-sense. If we make an API, it will be better portable.
> > The only part which is non-portable is the program on the device
> > which may be different per computing device.
> > The synchronization with the DPDK application should be portable
> > if we define some good API.
> > 
> >> if the scope is only, the class driver consumption then the existing
> >> "bus"  _kind of_
> >> abstraction/API makes sense to me.
> >>
> >> Where it abstracts,
> >> -FW download of device
> >> -Memory management of device
> >> -Opaque way to enq/deque jobs to the device.
> >>
> >> And above should be consumed by "class driver" not "application".
> >>
> >> If the application doing do that, we are in rte_raw device territory.
> > 
> > I'm sorry I don't understand what you make such assertion.
> > It seems you don't want generic API (which is the purpose of DPDK).
> > 
> 
> The FW/kernel/"computing tasks" in the co-processor can be doing anything, as it
> has been in FPGA/rawdev.
> 
> If there is no defined input & output of that computing task, an application
> developed using it will be specific to that computing task, this is not portable
> and feels like how rawdev works.
> 
> It is possible to have a generic API for control, to start the task and get
> completion notification, but not having common input/output interface with
> computing task still has same problem I think.
> 
> If the application is strictly depends to what computing task does, why not
> extending rawdev to have the control APIs? Instead of new library.
> And as you already said for memory, generic APIs can be used with additional
> flags and using rawdev handler.
> 
> Or another option can be defining computing task a little more, have a common
> interface, like mbuf, and add some capabilities/flags to let application know
> more about computing task and give decision based on it, is this the intention?

I think we'll propose a thin layer to allow device memory management with
generic API in EAL and mbuf.
The task should be defined and controlled by the application,
and there is not much DPDK can do generically.

Stay tuned, and thanks for all the feedbacks.
diff mbox series

Patch

diff --git a/.gitignore b/.gitignore
index b19c0717e6..49494e0c6c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -14,6 +14,7 @@  doc/guides/compressdevs/overview_feature_table.txt
 doc/guides/regexdevs/overview_feature_table.txt
 doc/guides/vdpadevs/overview_feature_table.txt
 doc/guides/bbdevs/overview_feature_table.txt
+doc/guides/gpus/overview_feature_table.txt
 
 # ignore generated ctags/cscope files
 cscope.out.po
diff --git a/MAINTAINERS b/MAINTAINERS
index 5877a16971..c4755dfe9a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -452,6 +452,12 @@  F: app/test-regex/
 F: doc/guides/prog_guide/regexdev.rst
 F: doc/guides/regexdevs/features/default.ini
 
+GPU API - EXPERIMENTAL
+M: Elena Agostini <eagostini@nvidia.com>
+F: lib/gpudev/
+F: doc/guides/prog_guide/gpu.rst
+F: doc/guides/gpus/features/default.ini
+
 Eventdev API
 M: Jerin Jacob <jerinj@marvell.com>
 T: git://dpdk.org/next/dpdk-next-eventdev
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 1992107a03..bd10342ca2 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -21,6 +21,7 @@  The public API headers are grouped by topics:
   [compressdev]        (@ref rte_compressdev.h),
   [compress]           (@ref rte_comp.h),
   [regexdev]           (@ref rte_regexdev.h),
+  [gpudev]             (@ref rte_gpudev.h),
   [eventdev]           (@ref rte_eventdev.h),
   [event_eth_rx_adapter]   (@ref rte_event_eth_rx_adapter.h),
   [event_eth_tx_adapter]   (@ref rte_event_eth_tx_adapter.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 325a0195c6..831b9a6b33 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -40,6 +40,7 @@  INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/eventdev \
                           @TOPDIR@/lib/fib \
                           @TOPDIR@/lib/flow_classify \
+                          @TOPDIR@/lib/gpudev \
                           @TOPDIR@/lib/graph \
                           @TOPDIR@/lib/gro \
                           @TOPDIR@/lib/gso \
diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 67d2dd62c7..7930da9ceb 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -152,6 +152,9 @@  def generate_overview_table(output_filename, table_id, section, table_name, titl
         name = ini_filename[:-4]
         name = name.replace('_vf', 'vf')
         pmd_names.append(name)
+    if not pmd_names:
+        # Add an empty column if table is empty (required by RST syntax)
+        pmd_names.append(' ')
 
     # Pad the table header names.
     max_header_len = len(max(pmd_names, key=len))
@@ -388,6 +391,11 @@  def setup(app):
                             'Features',
                             'Features availability in bbdev drivers',
                             'Feature')
+    table_file = dirname(__file__) + '/gpus/overview_feature_table.txt'
+    generate_overview_table(table_file, 1,
+                            'Features',
+                            'Features availability in GPU drivers',
+                            'Feature')
 
     if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
         print('Upgrade sphinx to version >= 1.3.1 for '
diff --git a/doc/guides/gpus/features/default.ini b/doc/guides/gpus/features/default.ini
new file mode 100644
index 0000000000..c363447b0d
--- /dev/null
+++ b/doc/guides/gpus/features/default.ini
@@ -0,0 +1,13 @@ 
+;
+; Features of a GPU driver.
+;
+; This file defines the features that are valid for inclusion in
+; the other driver files and also the order that they appear in
+; the features table in the documentation. The feature description
+; string should not exceed feature_str_len defined in conf.py.
+;
+[Features]
+Get device info                =
+Share CPU memory with GPU      =
+Allocate GPU memory            =
+Free memory                    =
diff --git a/doc/guides/gpus/index.rst b/doc/guides/gpus/index.rst
new file mode 100644
index 0000000000..f9c62aeb36
--- /dev/null
+++ b/doc/guides/gpus/index.rst
@@ -0,0 +1,11 @@ 
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright 2021 NVIDIA Corporation & Affiliates
+
+GPU Drivers
+===========
+
+.. toctree::
+   :maxdepth: 2
+   :numbered:
+
+   overview
diff --git a/doc/guides/gpus/overview.rst b/doc/guides/gpus/overview.rst
new file mode 100644
index 0000000000..e7f985e98b
--- /dev/null
+++ b/doc/guides/gpus/overview.rst
@@ -0,0 +1,7 @@ 
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright 2021 NVIDIA Corporation & Affiliates
+
+Overview of GPU Drivers
+=======================
+
+.. include:: overview_feature_table.txt
diff --git a/doc/guides/index.rst b/doc/guides/index.rst
index 857f0363d3..ee4d79a4eb 100644
--- a/doc/guides/index.rst
+++ b/doc/guides/index.rst
@@ -21,6 +21,7 @@  DPDK documentation
    compressdevs/index
    vdpadevs/index
    regexdevs/index
+   gpus/index
    eventdevs/index
    rawdevs/index
    mempool/index
diff --git a/doc/guides/prog_guide/gpu.rst b/doc/guides/prog_guide/gpu.rst
new file mode 100644
index 0000000000..54f9fa8300
--- /dev/null
+++ b/doc/guides/prog_guide/gpu.rst
@@ -0,0 +1,5 @@ 
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright 2021 NVIDIA Corporation & Affiliates
+
+GPU Library
+===========
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 2dce507f46..dfddf90b51 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -27,6 +27,7 @@  Programmer's Guide
     cryptodev_lib
     compressdev
     regexdev
+    gpu
     rte_security
     rawdev
     link_bonding_poll_mode_drv_lib
diff --git a/drivers/gpu/meson.build b/drivers/gpu/meson.build
new file mode 100644
index 0000000000..5189950616
--- /dev/null
+++ b/drivers/gpu/meson.build
@@ -0,0 +1,4 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2021 NVIDIA Corporation & Affiliates
+
+drivers = []
diff --git a/drivers/meson.build b/drivers/meson.build
index bc6f4f567f..f607040d79 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -18,6 +18,7 @@  subdirs = [
         'vdpa',           # depends on common, bus and mempool.
         'event',          # depends on common, bus, mempool and net.
         'baseband',       # depends on common and bus.
+        'gpu',            # depends on common and bus.
 ]
 
 if meson.is_cross_build()
diff --git a/lib/gpudev/gpu_driver.h b/lib/gpudev/gpu_driver.h
new file mode 100644
index 0000000000..5ff609e49d
--- /dev/null
+++ b/lib/gpudev/gpu_driver.h
@@ -0,0 +1,44 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2021 NVIDIA Corporation & Affiliates
+ */
+
+#ifndef GPU_DRIVER_H
+#define GPU_DRIVER_H
+
+#include <stdint.h>
+
+#include <rte_common.h>
+
+#include "rte_gpudev.h"
+
+struct rte_gpu_dev;
+
+typedef int (*gpu_malloc_t)(struct rte_gpu_dev *dev, size_t size, void **ptr);
+typedef int (*gpu_free_t)(struct rte_gpu_dev *dev, void *ptr);
+
+struct rte_gpu_dev {
+	/* Backing device. */
+	struct rte_device *device;
+	/* GPU info structure. */
+	struct rte_gpu_info info;
+	/* Counter of processes using the device. */
+	uint16_t process_cnt;
+	/* If device is currently used or not. */
+	enum rte_gpu_state state;
+	/* FUNCTION: Allocate memory on the GPU. */
+	gpu_malloc_t gpu_malloc;
+	/* FUNCTION: Allocate memory on the CPU visible from the GPU. */
+	gpu_malloc_t gpu_malloc_visible;
+	/* FUNCTION: Free allocated memory on the GPU. */
+	gpu_free_t gpu_free;
+	/* Device interrupt handle. */
+	struct rte_intr_handle *intr_handle;
+	/* Driver-specific private data. */
+	void *dev_private;
+} __rte_cache_aligned;
+
+struct rte_gpu_dev *rte_gpu_dev_allocate(const char *name);
+struct rte_gpu_dev *rte_gpu_dev_get_by_name(const char *name);
+int rte_gpu_dev_release(struct rte_gpu_dev *gpudev);
+
+#endif /* GPU_DRIVER_H */
diff --git a/lib/gpudev/meson.build b/lib/gpudev/meson.build
new file mode 100644
index 0000000000..f05459e18d
--- /dev/null
+++ b/lib/gpudev/meson.build
@@ -0,0 +1,9 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2021 NVIDIA Corporation & Affiliates
+
+headers = files(
+        'rte_gpudev.h',
+)
+
+sources = files(
+)
diff --git a/lib/gpudev/rte_gpudev.h b/lib/gpudev/rte_gpudev.h
new file mode 100644
index 0000000000..b12f35c17e
--- /dev/null
+++ b/lib/gpudev/rte_gpudev.h
@@ -0,0 +1,183 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2021 NVIDIA Corporation & Affiliates
+ */
+
+#ifndef RTE_GPUDEV_H
+#define RTE_GPUDEV_H
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#include <rte_common.h>
+
+/**
+ * @file
+ * Generic library to interact with a GPU.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/** Maximum number of GPU engines. */
+#define RTE_GPU_MAX_DEVS UINT16_C(32)
+/** Maximum length of device name. */
+#define RTE_GPU_NAME_MAX_LEN 128
+
+/** Flags indicate current state of GPU device. */
+enum rte_gpu_state {
+	RTE_GPU_STATE_UNUSED,        /**< not initialized */
+	RTE_GPU_STATE_INITIALIZED,   /**< initialized */
+};
+
+/** Store a list of info for a given GPU. */
+struct rte_gpu_info {
+	/** GPU device ID. */
+	uint16_t gpu_id;
+	/** Unique identifier name. */
+	char name[RTE_GPU_NAME_MAX_LEN];
+	/** Total memory available on device. */
+	size_t total_memory;
+	/** Total processors available on device. */
+	int processor_count;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Returns the number of GPUs detected and associated to DPDK.
+ *
+ * @return
+ *   The number of available GPUs.
+ */
+__rte_experimental
+uint16_t rte_gpu_dev_count_avail(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Check if the device is valid and initialized in DPDK.
+ *
+ * @param gpu_id
+ *   The input GPU ID.
+ *
+ * @return
+ *   - True if gpu_id is a valid and initialized GPU.
+ *   - False otherwise.
+ */
+__rte_experimental
+bool rte_gpu_dev_is_valid(uint16_t gpu_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the GPU ID of the next valid GPU initialized in DPDK.
+ *
+ * @param gpu_id
+ *   The initial GPU ID to start the research.
+ *
+ * @return
+ *   Next GPU ID corresponding to a valid and initialized GPU device.
+ */
+__rte_experimental
+uint16_t rte_gpu_dev_find_next(uint16_t gpu_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Macro to iterate over all valid GPUs.
+ *
+ * @param gpu_id
+ *   The ID of the next possible valid GPU.
+ * @return
+ *   Next valid GPU ID, RTE_GPU_MAX_DEVS if there is none.
+ */
+#define RTE_GPU_FOREACH_DEV(gpu_id) \
+	for (gpu_id = rte_gpu_find_next(0); \
+	     gpu_id < RTE_GPU_MAX_DEVS; \
+	     gpu_id = rte_gpu_find_next(gpu_id + 1))
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Return GPU specific info.
+ *
+ * @param gpu_id
+ *   GPU ID to get info.
+ * @param info
+ *   Memory structure to fill with the info.
+ *
+ * @return
+ *   0 on success, -1 otherwise.
+ */
+__rte_experimental
+int rte_gpu_dev_info_get(uint16_t gpu_id, struct rte_gpu_info **info);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate a chunk of memory on the GPU.
+ *
+ * @param gpu_id
+ *   GPU ID to allocate memory.
+ * @param size
+ *   Number of bytes to allocate.
+ * @param ptr
+ *   Pointer to store the address of the allocated memory.
+ *
+ * @return
+ *   0 on success, -1 otherwise.
+ */
+__rte_experimental
+int rte_gpu_malloc(uint16_t gpu_id, size_t size, void **ptr);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate a chunk of memory on the CPU that is visible from the GPU.
+ *
+ * @param gpu_id
+ *   Reference GPU ID.
+ * @param size
+ *   Number of bytes to allocate.
+ * @param ptr
+ *   Pointer to store the address of the allocated memory.
+ *
+ * @return
+ *   0 on success, -1 otherwise.
+ */
+__rte_experimental
+int rte_gpu_malloc_visible(uint16_t gpu_id, size_t size, void **ptr);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Deallocate a chunk of memory allocated with rte_gpu_malloc*.
+ *
+ * @param gpu_id
+ *   Reference GPU ID.
+ * @param ptr
+ *   Pointer to the memory area to be deallocated.
+ *
+ * @return
+ *   0 on success, -1 otherwise.
+ */
+__rte_experimental
+int rte_gpu_free(uint16_t gpu_id, void *ptr);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_GPUDEV_H */
diff --git a/lib/gpudev/version.map b/lib/gpudev/version.map
new file mode 100644
index 0000000000..9e0f218e8b
--- /dev/null
+++ b/lib/gpudev/version.map
@@ -0,0 +1,11 @@ 
+EXPERIMENTAL {
+	global:
+
+	rte_gpu_dev_count_avail;
+	rte_gpu_dev_find_next;
+	rte_gpu_dev_info_get;
+	rte_gpu_dev_is_valid;
+	rte_gpu_free;
+	rte_gpu_malloc;
+	rte_gpu_malloc_visible;
+};
diff --git a/lib/meson.build b/lib/meson.build
index 4a64756a68..ffefc64c69 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -33,6 +33,7 @@  libraries = [
         'distributor',
         'efd',
         'eventdev',
+        'gpudev',
         'gro',
         'gso',
         'ip_frag',