
[v19,1/7] dmadev: introduce DMA device library public APIs

Message ID: 1630588395-2804-2-git-send-email-fengchengwen@huawei.com (mailing list archive)
State: Superseded, archived
Delegated to: Thomas Monjalon
Series: support dmadev

Checks

Context         Check     Description
ci/checkpatch   warning   coding style issues

Commit Message

Fengchengwen Sept. 2, 2021, 1:13 p.m. UTC
The 'dmadevice' is a generic type of DMA device.

This patch introduces the 'dmadevice' public APIs, which expose generic
operations that enable configuration of and I/O with the DMA devices.

Maintainers update is also included in this patch.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
---
 MAINTAINERS                            |   4 +
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf.in               |   1 +
 doc/guides/rel_notes/release_21_11.rst |   5 +
 lib/dmadev/meson.build                 |   4 +
 lib/dmadev/rte_dmadev.h                | 949 +++++++++++++++++++++++++++++++++
 lib/dmadev/version.map                 |  24 +
 lib/meson.build                        |   1 +
 8 files changed, 989 insertions(+)
 create mode 100644 lib/dmadev/meson.build
 create mode 100644 lib/dmadev/rte_dmadev.h
 create mode 100644 lib/dmadev/version.map

Comments

Gagandeep Singh Sept. 3, 2021, 11:42 a.m. UTC | #1
Hi,

<snip>
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Close a DMA device.
> + *
> + * The device cannot be restarted after this call.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   0 on success. Otherwise negative value is returned.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_close(uint16_t dev_id);
> +
> +/**
> + * rte_dma_direction - DMA transfer direction defines.
> + */
> +enum rte_dma_direction {
> +	RTE_DMA_DIR_MEM_TO_MEM,
> +	/**< DMA transfer direction - from memory to memory.
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +	RTE_DMA_DIR_MEM_TO_DEV,
> +	/**< DMA transfer direction - from memory to device.
> +	 * In a typical scenario, the SoCs are installed on host servers as
> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> +	 * EP(endpoint) mode, it could initiate a DMA move request from memory
> +	 * (which is SoCs memory) to device (which is host memory).
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +	RTE_DMA_DIR_DEV_TO_MEM,
> +	/**< DMA transfer direction - from device to memory.
> +	 * In a typical scenario, the SoCs are installed on host servers as
> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
> +	 * (which is host memory) to memory (which is SoCs memory).
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +	RTE_DMA_DIR_DEV_TO_DEV,
> +	/**< DMA transfer direction - from device to device.
> +	 * In a typical scenario, the SoCs are installed on host servers as
> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
> +	 * (which is host memory) to the device (which is another host memory).
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +};
> +
> +/**
>..
The enum rte_dma_direction must have a member RTE_DMA_DIR_ANY for a channel that supports all 4 directions.
<snip>


Regards,
Gagan
Bruce Richardson Sept. 3, 2021, 1:03 p.m. UTC | #2
On Thu, Sep 02, 2021 at 09:13:09PM +0800, Chengwen Feng wrote:
> The 'dmadevice' is a generic type of DMA device.
> 
> This patch introduce the 'dmadevice' public APIs which expose generic
> operations that can enable configuration and I/O with the DMA devices.
> 
> Maintainers update is also included in this patch.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> ---
>  MAINTAINERS                            |   4 +
>  doc/api/doxy-api-index.md              |   1 +
>  doc/api/doxy-api.conf.in               |   1 +
>  doc/guides/rel_notes/release_21_11.rst |   5 +
>  lib/dmadev/meson.build                 |   4 +
>  lib/dmadev/rte_dmadev.h                | 949 +++++++++++++++++++++++++++++++++
>  lib/dmadev/version.map                 |  24 +
>  lib/meson.build                        |   1 +
>  8 files changed, 989 insertions(+)
>  create mode 100644 lib/dmadev/meson.build
>  create mode 100644 lib/dmadev/rte_dmadev.h
>  create mode 100644 lib/dmadev/version.map
> 

<snip>

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Trigger hardware to begin performing enqueued operations.
> + *
> + * This API is used to write the "doorbell" to the hardware to trigger it
> + * to begin the operations previously enqueued by rte_dmadev_copy/fill().
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + *
> + * @return
> + *   0 on success. Otherwise negative value is returned.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_submit(uint16_t dev_id, uint16_t vchan);
> +

Putting this out here for discussion:

Those developers here looking at how integration of dma acceleration into
vhost-virtio e.g. for OVS use, have come back with the request that we
provide a method for querying the amount of space in the descriptor ring,
or the size of the next burst, or similar. Basically, the reason for the
ask is to allow an app to determine if a set of jobs of size N can be
enqueued before the first one is, so that we don't get a half-offload of
copy of a multi-segment packet (for devices where scatter-gather is not
available).

In our "ioat" rawdev driver, we did this by providing a "burst_capacity"
API which returned the number of elements which could be enqueued in the
next burst without error (normally the available ring space). Looking at
the dmadev APIs, an alternative way to do this is to extend the "submit()"
function to allow a 3rd optional parameter to return this info. That is,
when submitting one burst of operations, you get info about how many more
you can enqueue in the next burst. [For submitting packets via the submit
flag, this info would not be available, as I feel extending all enqueue
operations would be excessive].

Therefore, I see a number of options for us to meet the ask for space
querying API:
1. provide a capacity API as done with ioat driver
2. provide (optional) capacity information from each submit() call
3. provide both #1 and #2 above as they are compatible
4. <some other idea>

For me, I think #3 is probably the most flexible approach. The benefit of
#2 is that the info can be provided to the application much more cheaply
than when the app has to call a separate API (which wouldn't be on the
fast-path). However, a way to provide the info apart from submitting a
burst would also be helpful, hence adding the extra function too (#1).
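
To make options #1 and #2 concrete, here is a minimal sketch against a
software model of a descriptor ring. Everything in it is an assumption for
illustration only: the demo_ names, the struct layout, and the rule that one
slot is kept unused are hypothetical and are not taken from this patch or
from the ioat driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one virtual channel's descriptor ring. */
struct demo_vchan {
	uint16_t ring_size; /* total descriptor slots */
	uint16_t enqueued;  /* free-running count of enqueued jobs (wraps) */
	uint16_t completed; /* free-running count of completed jobs (wraps) */
};

/* Option #1: standalone query of how many jobs the next burst can hold. */
static inline uint16_t
demo_burst_capacity(const struct demo_vchan *vc)
{
	/* Unsigned 16-bit subtraction stays correct across counter wrap. */
	uint16_t in_flight = (uint16_t)(vc->enqueued - vc->completed);

	/* Assumption: one slot is kept unused to tell full from empty. */
	return (uint16_t)(vc->ring_size - 1 - in_flight);
}

/* Option #2: submit() with an optional third parameter for the capacity. */
static inline int
demo_submit(struct demo_vchan *vc, uint16_t *capacity)
{
	/* A real driver would write the doorbell register here. */
	if (capacity != NULL)
		*capacity = demo_burst_capacity(vc);
	return 0;
}
```

Callers that do not care about the capacity would simply pass NULL as the
last argument, so the extra cost on the submit path is one pointer check.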

What are other people's thoughts or ideas on this?

Regards,
/Bruce
Kevin Laatz Sept. 3, 2021, 3:13 p.m. UTC | #3
On 02/09/2021 14:13, Chengwen Feng wrote:
> The 'dmadevice' is a generic type of DMA device.
>
> This patch introduce the 'dmadevice' public APIs which expose generic
> operations that can enable configuration and I/O with the DMA devices.
>
> Maintainers update is also included in this patch.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> ---
>   MAINTAINERS                            |   4 +
>   doc/api/doxy-api-index.md              |   1 +
>   doc/api/doxy-api.conf.in               |   1 +
>   doc/guides/rel_notes/release_21_11.rst |   5 +
>   lib/dmadev/meson.build                 |   4 +
>   lib/dmadev/rte_dmadev.h                | 949 +++++++++++++++++++++++++++++++++
>   lib/dmadev/version.map                 |  24 +
>   lib/meson.build                        |   1 +
>   8 files changed, 989 insertions(+)
>   create mode 100644 lib/dmadev/meson.build
>   create mode 100644 lib/dmadev/rte_dmadev.h
>   create mode 100644 lib/dmadev/version.map
>
<snip>
> +
> +/**
> + * rte_dma_direction - DMA transfer direction defines.
> + */
No need to have the struct name in the comment.
> +enum rte_dma_direction {
> +	RTE_DMA_DIR_MEM_TO_MEM,
> +	/**< DMA transfer direction - from memory to memory.
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +	RTE_DMA_DIR_MEM_TO_DEV,
> +	/**< DMA transfer direction - from memory to device.
> +	 * In a typical scenario, the SoCs are installed on host servers as
> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> +	 * EP(endpoint) mode, it could initiate a DMA move request from memory
> +	 * (which is SoCs memory) to device (which is host memory).
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +	RTE_DMA_DIR_DEV_TO_MEM,
> +	/**< DMA transfer direction - from device to memory.
> +	 * In a typical scenario, the SoCs are installed on host servers as
> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
> +	 * (which is host memory) to memory (which is SoCs memory).
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +	RTE_DMA_DIR_DEV_TO_DEV,
> +	/**< DMA transfer direction - from device to device.
> +	 * In a typical scenario, the SoCs are installed on host servers as
> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
> +	 * (which is host memory) to the device (which is another host memory).
> +	 *
> +	 * @see struct rte_dmadev_vchan_conf::direction
> +	 */
> +};
> +
> +/**
> + * enum rte_dmadev_port_type - DMA access port type defines.
> + *
> + * @see struct rte_dmadev_port_param::port_type
> + */
> +enum rte_dmadev_port_type {
> +	RTE_DMADEV_PORT_NONE,
> +	RTE_DMADEV_PORT_PCIE, /**< The DMA access port is PCIe. */
> +};
> +
<snip>
> +
> +/**
> + * rte_dmadev_stats - running statistics.
> + */

No need to have the struct name in the comment. Maybe "Operation 
statistic counters"?


> +struct rte_dmadev_stats {
> +	uint64_t submitted;
> +	/**< Count of operations which were submitted to hardware. */
> +	uint64_t completed;
> +	/**< Count of operations which were completed. */
> +	uint64_t errors;
> +	/**< Count of operations which failed to complete. */
> +};

The comments here are a little ambiguous; it would be better to 
explicitly state that "errors" is a subset of "completed" and not an 
independent statistic.
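
For example, under that reading the counters relate as in the sketch below.
The struct mirrors the three fields quoted above, but the demo_ names and the
two helper functions are hypothetical and only illustrate the intended
relationship between the counters.

```c
#include <stdint.h>

/* Mirror of the three counters under discussion (hypothetical name). */
struct demo_dma_stats {
	uint64_t submitted; /* operations submitted to hardware */
	uint64_t completed; /* operations completed, including failed ones */
	uint64_t errors;    /* subset of completed that failed */
};

/* Operations that completed without error. */
static inline uint64_t
demo_stats_successful(const struct demo_dma_stats *s)
{
	return s->completed - s->errors;
}

/* Operations submitted to hardware but not yet completed. */
static inline uint64_t
demo_stats_in_flight(const struct demo_dma_stats *s)
{
	return s->submitted - s->completed;
}
```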


<snip>
> +
> +/**
> + * rte_dmadev_sge - can hold scatter-gather DMA operation request entry.
> + */
No need to have the struct name in the comment.
> +struct rte_dmadev_sge {
> +	rte_iova_t addr; /**< The DMA operation address. */
> +	uint32_t length; /**< The DMA operation length. */
> +};
> +

Apart from the minor comments, LGTM.

Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Walsh, Conor Sept. 3, 2021, 3:35 p.m. UTC | #4
> The 'dmadevice' is a generic type of DMA device.
>
> This patch introduce the 'dmadevice' public APIs which expose generic
> operations that can enable configuration and I/O with the DMA devices.
>
> Maintainers update is also included in this patch.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> ---
<snip>
> +
> +/**
> + * rte_dmadev_stats - running statistics.
> + */
> +struct rte_dmadev_stats {
> +	uint64_t submitted;
> +	/**< Count of operations which were submitted to hardware. */
> +	uint64_t completed;
> +	/**< Count of operations which were completed. */
> +	uint64_t errors;
> +	/**< Count of operations which failed to complete. */
> +};

Please make it clear that completed is the total completed operations 
including any failures.

<snip>

Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Fengchengwen Sept. 4, 2021, 1:31 a.m. UTC | #5
On 2021/9/3 19:42, Gagandeep Singh wrote:
> Hi,
> 
> <snip>
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Close a DMA device.
>> + *
>> + * The device cannot be restarted after this call.
>> + *
>> + * @param dev_id
>> + *   The identifier of the device.
>> + *
>> + * @return
>> + *   0 on success. Otherwise negative value is returned.
>> + */
>> +__rte_experimental
>> +int
>> +rte_dmadev_close(uint16_t dev_id);
>> +
>> +/**
>> + * rte_dma_direction - DMA transfer direction defines.
>> + */
>> +enum rte_dma_direction {
>> +	RTE_DMA_DIR_MEM_TO_MEM,
>> +	/**< DMA transfer direction - from memory to memory.
>> +	 *
>> +	 * @see struct rte_dmadev_vchan_conf::direction
>> +	 */
>> +	RTE_DMA_DIR_MEM_TO_DEV,
>> +	/**< DMA transfer direction - from memory to device.
>> +	 * In a typical scenario, the SoCs are installed on host servers as
>> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
>> +	 * EP(endpoint) mode, it could initiate a DMA move request from memory
>> +	 * (which is SoCs memory) to device (which is host memory).
>> +	 *
>> +	 * @see struct rte_dmadev_vchan_conf::direction
>> +	 */
>> +	RTE_DMA_DIR_DEV_TO_MEM,
>> +	/**< DMA transfer direction - from device to memory.
>> +	 * In a typical scenario, the SoCs are installed on host servers as
>> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
>> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
>> +	 * (which is host memory) to memory (which is SoCs memory).
>> +	 *
>> +	 * @see struct rte_dmadev_vchan_conf::direction
>> +	 */
>> +	RTE_DMA_DIR_DEV_TO_DEV,
>> +	/**< DMA transfer direction - from device to device.
>> +	 * In a typical scenario, the SoCs are installed on host servers as
>> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
>> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
>> +	 * (which is host memory) to the device (which is another host memory).
>> +	 *
>> +	 * @see struct rte_dmadev_vchan_conf::direction
>> +	 */
>> +};
>> +
>> +/**
>> ..
> The enum rte_dma_direction must have a member RTE_DMA_DIR_ANY for a channel that supports all 4 directions.

We've discussed this issue before. The earliest solution was to set up channels that support multiple directions, but
no hardware/driver actually used this (at least at that time). They (like octeontx2_dma/dpaa) all set up one logical
channel to serve a single transfer direction.

So, do you have that requirement for your driver?


If you have a strong requirement, we can consider the following options:

Once the channel is set up, there are no other parameters to indicate the copy request's transfer direction,
so I think it is not enough to define RTE_DMA_DIR_ANY only.

Maybe we could add RTE_DMA_OP_xxx macros (RTE_DMA_OP_FLAG_M2M/M2D/D2M/D2D). These macros would be passed as the flags
parameter of the enqueue API, so the enqueue API knows which transfer direction the request corresponds to.

We can easily extend the existing framework as follows:
a. define capability RTE_DMADEV_CAPA_DIR_ANY, which devices that support it can declare.
b. define direction macro: RTE_DMA_DIR_ANY
c. define dma_op: RTE_DMA_OP_FLAG_DIR_M2M/M2D/D2M/D2D, which will be passed as the flags parameter.

A driver that doesn't support this feature simply doesn't declare the capability; the framework ensures that
RTE_DMA_DIR_ANY is not passed down, and the driver can ignore the RTE_DMA_OP_FLAG_DIR_xxx flags in the enqueue API.

For a driver that supports this feature, the application could create a channel with RTE_DMA_DIR_ANY or RTE_DMA_DIR_MEM_TO_MEM.
If created with RTE_DMA_DIR_ANY, the driver must honor the RTE_DMA_OP_FLAG_DIR_xxx flags.
If created with RTE_DMA_DIR_MEM_TO_MEM, the RTE_DMA_OP_FLAG_DIR_xxx flags can be ignored.
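
A rough sketch of how the per-request direction could be resolved under this
proposal follows. It is only an illustration: the DEMO_ names stand in for the
RTE_DMA_DIR_ANY and RTE_DMA_OP_FLAG_DIR_xxx names above, and the bit positions,
the helper function, and the choice to reject zero or multiple direction flags
are assumptions, not part of the patch. The sketch also applies the "flags are
ignored" rule to every fixed-direction channel, not only MEM_TO_MEM.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the proposed direction values. */
enum demo_dma_direction {
	DEMO_DMA_DIR_MEM_TO_MEM,
	DEMO_DMA_DIR_MEM_TO_DEV,
	DEMO_DMA_DIR_DEV_TO_MEM,
	DEMO_DMA_DIR_DEV_TO_DEV,
	DEMO_DMA_DIR_ANY, /* channel accepts any of the four directions */
};

/* Hypothetical per-request direction flags (bit positions are assumed). */
#define DEMO_DMA_OP_FLAG_DIR_M2M  (UINT64_C(1) << 0)
#define DEMO_DMA_OP_FLAG_DIR_M2D  (UINT64_C(1) << 1)
#define DEMO_DMA_OP_FLAG_DIR_D2M  (UINT64_C(1) << 2)
#define DEMO_DMA_OP_FLAG_DIR_D2D  (UINT64_C(1) << 3)
#define DEMO_DMA_OP_FLAG_DIR_MASK UINT64_C(0xf)

/*
 * Return the effective direction of one enqueue request, or -1 if the
 * channel is DIR_ANY and the flags do not name exactly one direction.
 */
static inline int
demo_resolve_direction(enum demo_dma_direction chan_dir, uint64_t flags)
{
	/* Fixed-direction channel: the direction flags are ignored. */
	if (chan_dir != DEMO_DMA_DIR_ANY)
		return (int)chan_dir;

	switch (flags & DEMO_DMA_OP_FLAG_DIR_MASK) {
	case DEMO_DMA_OP_FLAG_DIR_M2M:
		return DEMO_DMA_DIR_MEM_TO_MEM;
	case DEMO_DMA_OP_FLAG_DIR_M2D:
		return DEMO_DMA_DIR_MEM_TO_DEV;
	case DEMO_DMA_OP_FLAG_DIR_D2M:
		return DEMO_DMA_DIR_DEV_TO_MEM;
	case DEMO_DMA_OP_FLAG_DIR_D2D:
		return DEMO_DMA_DIR_DEV_TO_DEV;
	default:
		return -1; /* no direction flag, or more than one */
	}
}
```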


> <snip>
> 
> 
> Regards,
> Gagan
>
Fengchengwen Sept. 4, 2021, 3:05 a.m. UTC | #6
On 2021/9/3 21:03, Bruce Richardson wrote:
> On Thu, Sep 02, 2021 at 09:13:09PM +0800, Chengwen Feng wrote:
>> The 'dmadevice' is a generic type of DMA device.
>>
>> This patch introduce the 'dmadevice' public APIs which expose generic
>> operations that can enable configuration and I/O with the DMA devices.
>>
>> Maintainers update is also included in this patch.
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>> Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
>> ---
>>  MAINTAINERS                            |   4 +
>>  doc/api/doxy-api-index.md              |   1 +
>>  doc/api/doxy-api.conf.in               |   1 +
>>  doc/guides/rel_notes/release_21_11.rst |   5 +
>>  lib/dmadev/meson.build                 |   4 +
>>  lib/dmadev/rte_dmadev.h                | 949 +++++++++++++++++++++++++++++++++
>>  lib/dmadev/version.map                 |  24 +
>>  lib/meson.build                        |   1 +
>>  8 files changed, 989 insertions(+)
>>  create mode 100644 lib/dmadev/meson.build
>>  create mode 100644 lib/dmadev/rte_dmadev.h
>>  create mode 100644 lib/dmadev/version.map
>>
> 
> <snip>
> 
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Trigger hardware to begin performing enqueued operations.
>> + *
>> + * This API is used to write the "doorbell" to the hardware to trigger it
>> + * to begin the operations previously enqueued by rte_dmadev_copy/fill().
>> + *
>> + * @param dev_id
>> + *   The identifier of the device.
>> + * @param vchan
>> + *   The identifier of virtual DMA channel.
>> + *
>> + * @return
>> + *   0 on success. Otherwise negative value is returned.
>> + */
>> +__rte_experimental
>> +int
>> +rte_dmadev_submit(uint16_t dev_id, uint16_t vchan);
>> +
> 
> Putting this out here for discussion:
> 
> Those developers here looking at how integration of dma acceleration into
> vhost-virtio e.g. for OVS use, have come back with the request that we
> provide a method for querying the amount of space in the descriptor ring,
> or the size of the next burst, or similar. Basically, the reason for the
> ask is to allow an app to determine if a set of jobs of size N can be
> enqueued before the first one is, so that we don't get a half-offload of
> copy of a multi-segment packet (for devices where scatter-gather is not
> available).

Agree

> 
> In our "ioat" rawdev driver, we did this by providing a "burst_capacity"
> API which returned the number of elements which could be enqueued in the
> next burst without error (normally the available ring space). Looking at
> the dmadev APIs, an alternative way to do this is to extend the "submit()"
> function to allow a 3rd optional parameter to return this info. That is,
> when submitting one burst of operations, you get info about how many more
> you can enqueue in the next burst. [For submitting packets via the submit
> flag, this info would not be available, as I feel ending all enqueue
> operations would be excessive].
> 
> Therefore, I see a number of options for us to meet the ask for space
> querying API:
> 1. provide a capacity API as done with ioat driver
> 2. provide (optional) capacity information from each submit() call
> 3. provide both #1 and #2 above as they are compatible
> 4. <some other idea>

Maybe the available ring space could be calculated from the enqueue/completed
ring_idx and ring_size. In this way, we only need to provide the following
helper function, e.g.
  /* Assumed helper: gap between two wrapping 16-bit ring indexes. */
  static inline uint16_t distance(uint16_t enqueue_idx, uint16_t completed_idx)
  {
    return (uint16_t)(enqueue_idx - completed_idx);
  }

  uint16_t rte_dmadev_burst_capacity(uint16_t enqueue_idx, // ring_idx of the latest enqueue
                                     uint16_t completed_idx, // ring_idx of the latest completed
                                     uint16_t ring_size)
  {
    return ring_size - 1 - distance(enqueue_idx, completed_idx);
  }

However, this does not apply to the scatter-gather scenario, in which one
enqueue request may occupy multiple descriptors.

Alternatively, an sg_avg can be passed in:
  uint16_t rte_dmadev_burst_capacity(uint16_t enqueue_idx, // ring_idx of the latest enqueue
                                     uint16_t completed_idx, // ring_idx of the latest completed
                                     uint16_t ring_size,
                                     uint16_t sg_avg_descs) // average number of descriptors occupied by SG requests
  {
    return ring_size - 1 - (distance(enqueue_idx, completed_idx) * sg_avg_descs);
  }
But this is only an estimate; the result may be too big or too small.

> 
> For me, I think #3 is probably the most flexible approach. The benefit of
> #2 is that the info can be provided to the application much more cheaply
> than when the app has to call a separate API (which wouldn't be on the
> fast-path). However, a way to provide the info apart from submitting a
> burst would also be helpful, hence adding the extra function too (#1).
> 
> What are other people's thoughts or ideas on this?

In terms of the API definition, the two do not seem to be related, so I do not
recommend adding this to submit.

However, for data-plane APIs, I think certain compromises can be accepted, and the
following works well in burst enqueue mode:
    1. the application maintains a variable of available space, and initializes it with
       rte_dmadev_burst_capacity(), which is a new data-plane API.
    2. enqueue multiple copy requests without the submit flag.
       before enqueuing, the application can use step 1's available space to check whether
       all requests can be accommodated.
    3. submit (ring the doorbell) and update the available space info.
    4. do other work.
    5. call the completed API:
       if this API returns >= 0, the driver's actual available space may increase and
       become bigger than the application's available space.
    6. do other work.
    7. enqueue multiple copy requests without the submit flag.
       before enqueuing, the application can use step 3's available space to check whether
       all requests can be accommodated.
    8. submit (ring the doorbell) and update the available space again.
    ...

As long as ring_size is set properly (ring_size is at least 2*burst), this scheme
works.
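
The application-side bookkeeping in the steps above could look roughly like
the sketch below. This is one possible variant only: the demo_ names are
hypothetical, and crediting completed operations back to the local counter is
my assumption about how the "update available space" steps might be done.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application-side view of free descriptor space. */
struct demo_space_tracker {
	uint16_t avail; /* conservative count of free slots */
};

/* Step 1: seed the counter from a one-time capacity query. */
static inline void
demo_tracker_init(struct demo_space_tracker *t, uint16_t capacity)
{
	t->avail = capacity;
}

/*
 * Steps 2 and 7: check that all n requests fit before enqueuing any.
 * Reserves the space and returns true, or changes nothing and returns false.
 */
static inline bool
demo_tracker_reserve(struct demo_space_tracker *t, uint16_t n)
{
	if (n > t->avail)
		return false;
	t->avail = (uint16_t)(t->avail - n);
	return true;
}

/* Step 5: credit back the operations reported by the completed API. */
static inline void
demo_tracker_credit(struct demo_space_tracker *t, uint16_t n_completed)
{
	t->avail = (uint16_t)(t->avail + n_completed);
}
```

Because the counter is only credited when completions are polled, it never
exceeds the driver's real free space, so a successful reserve means the
whole burst can be enqueued.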

> 
> Regards,
> /Bruce
> .
>
Morten Brørup Sept. 4, 2021, 10:10 a.m. UTC | #7
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Friday, 3 September 2021 15.04
> 
> On Thu, Sep 02, 2021 at 09:13:09PM +0800, Chengwen Feng wrote:
> > The 'dmadevice' is a generic type of DMA device.
> >
> > This patch introduce the 'dmadevice' public APIs which expose generic
> > operations that can enable configuration and I/O with the DMA devices.
> >
> > Maintainers update is also included in this patch.
> >
> > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> > ---
> >  MAINTAINERS                            |   4 +
> >  doc/api/doxy-api-index.md              |   1 +
> >  doc/api/doxy-api.conf.in               |   1 +
> >  doc/guides/rel_notes/release_21_11.rst |   5 +
> >  lib/dmadev/meson.build                 |   4 +
> >  lib/dmadev/rte_dmadev.h                | 949 +++++++++++++++++++++++++++++++++
> >  lib/dmadev/version.map                 |  24 +
> >  lib/meson.build                        |   1 +
> >  8 files changed, 989 insertions(+)
> >  create mode 100644 lib/dmadev/meson.build
> >  create mode 100644 lib/dmadev/rte_dmadev.h
> >  create mode 100644 lib/dmadev/version.map
> >
> 
> <snip>
> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Trigger hardware to begin performing enqueued operations.
> > + *
> > + * This API is used to write the "doorbell" to the hardware to trigger it
> > + * to begin the operations previously enqueued by rte_dmadev_copy/fill().
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + *
> > + * @return
> > + *   0 on success. Otherwise negative value is returned.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_submit(uint16_t dev_id, uint16_t vchan);
> > +
> 
> Putting this out here for discussion:
> 
> Those developers here looking at how integration of dma acceleration into
> vhost-virtio e.g. for OVS use, have come back with the request that we
> provide a method for querying the amount of space in the descriptor ring,
> or the size of the next burst, or similar. Basically, the reason for the
> ask is to allow an app to determine if a set of jobs of size N can be
> enqueued before the first one is, so that we don't get a half-offload of
> copy of a multi-segment packet (for devices where scatter-gather is not
> available).
> 
> In our "ioat" rawdev driver, we did this by providing a "burst_capacity"
> API which returned the number of elements which could be enqueued in the
> next burst without error (normally the available ring space). Looking at
> the dmadev APIs, an alternative way to do this is to extend the "submit()"
> function to allow a 3rd optional parameter to return this info. That is,
> when submitting one burst of operations, you get info about how many more
> you can enqueue in the next burst. [For submitting packets via the submit
> flag, this info would not be available, as I feel ending all enqueue
> operations would be excessive].
> 
> Therefore, I see a number of options for us to meet the ask for space
> querying API:
> 1. provide a capacity API as done with ioat driver
> 2. provide (optional) capacity information from each submit() call
> 3. provide both #1 and #2 above as they are compatible
> 4. <some other idea>
> 
> For me, I think #3 is probably the most flexible approach. The benefit of
> #2 is that the info can be provided to the application much more cheaply
> than when the app has to call a separate API (which wouldn't be on the
> fast-path). However, a way to provide the info apart from submitting a
> burst would also be helpful, hence adding the extra function too (#1).
> 
> What are other people's thoughts or ideas on this?
> 

#2 is low cost. However, the information about the remaining capacity quickly becomes outdated if not used immediately, so we also need #1.

And #1 can also be used from the slow path, e.g. for telemetry purposes.

So I vote for providing #1 and optionally #2.

I also considered whether a _bulk function would be useful in addition to the _burst function. But I think that the fast path application's decision is not binary (i.e. use DMA or not); the fast path application would want to process as many as possible by DMA and then process the remainder in software.
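
As a small illustration of that non-binary decision, a fast path could split a burst as sketched below. The demo_ names are hypothetical, and the capacity value is assumed to come from whichever capacity query ends up being provided.

```c
#include <stdint.h>

/* Hypothetical result of splitting one burst between DMA and software. */
struct demo_split {
	uint16_t by_dma;      /* jobs to enqueue on the DMA device */
	uint16_t by_software; /* jobs left for a CPU copy */
};

/* Offload as many of the n jobs as the reported capacity allows. */
static inline struct demo_split
demo_split_burst(uint16_t n, uint16_t capacity)
{
	struct demo_split s;

	s.by_dma = n < capacity ? n : capacity;
	s.by_software = (uint16_t)(n - s.by_dma);
	return s;
}
```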

-Morten
Gagandeep Singh Sept. 6, 2021, 6:48 a.m. UTC | #8
> -----Original Message-----
> From: fengchengwen <fengchengwen@huawei.com>
> Sent: Saturday, September 4, 2021 7:02 AM
> To: Gagandeep Singh <G.Singh@nxp.com>; thomas@monjalon.net;
> ferruh.yigit@intel.com; bruce.richardson@intel.com; jerinj@marvell.com;
> jerinjacobk@gmail.com; andrew.rybchenko@oktetlabs.ru
> Cc: dev@dpdk.org; mb@smartsharesystems.com; Nipun Gupta
> <nipun.gupta@nxp.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> maxime.coquelin@redhat.com; honnappa.nagarahalli@arm.com;
> david.marchand@redhat.com; sburla@marvell.com; pkapoor@marvell.com;
> konstantin.ananyev@intel.com; conor.walsh@intel.com
> Subject: Re: [dpdk-dev] [PATCH v19 1/7] dmadev: introduce DMA device library
> public APIs
> 
> On 2021/9/3 19:42, Gagandeep Singh wrote:
> > Hi,
> >
> > <snip>
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Close a DMA device.
> >> + *
> >> + * The device cannot be restarted after this call.
> >> + *
> >> + * @param dev_id
> >> + *   The identifier of the device.
> >> + *
> >> + * @return
> >> + *   0 on success. Otherwise negative value is returned.
> >> + */
> >> +__rte_experimental
> >> +int
> >> +rte_dmadev_close(uint16_t dev_id);
> >> +
> >> +/**
> >> + * rte_dma_direction - DMA transfer direction defines.
> >> + */
> >> +enum rte_dma_direction {
> >> +	RTE_DMA_DIR_MEM_TO_MEM,
> >> +	/**< DMA transfer direction - from memory to memory.
> >> +	 *
> >> +	 * @see struct rte_dmadev_vchan_conf::direction
> >> +	 */
> >> +	RTE_DMA_DIR_MEM_TO_DEV,
> >> +	/**< DMA transfer direction - from memory to device.
> >> +	 * In a typical scenario, the SoCs are installed on host servers as
> >> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> >> +	 * EP(endpoint) mode, it could initiate a DMA move request from memory
> >> +	 * (which is SoCs memory) to device (which is host memory).
> >> +	 *
> >> +	 * @see struct rte_dmadev_vchan_conf::direction
> >> +	 */
> >> +	RTE_DMA_DIR_DEV_TO_MEM,
> >> +	/**< DMA transfer direction - from device to memory.
> >> +	 * In a typical scenario, the SoCs are installed on host servers as
> >> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> >> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
> >> +	 * (which is host memory) to memory (which is SoCs memory).
> >> +	 *
> >> +	 * @see struct rte_dmadev_vchan_conf::direction
> >> +	 */
> >> +	RTE_DMA_DIR_DEV_TO_DEV,
> >> +	/**< DMA transfer direction - from device to device.
> >> +	 * In a typical scenario, the SoCs are installed on host servers as
> >> +	 * iNICs through the PCIe interface. In this case, the SoCs works in
> >> +	 * EP(endpoint) mode, it could initiate a DMA move request from device
> >> +	 * (which is host memory) to the device (which is another host memory).
> >> +	 *
> >> +	 * @see struct rte_dmadev_vchan_conf::direction
> >> +	 */
> >> +};
> >> +
> >> +/**
> >> ..
> > The enum rte_dma_direction must have a member RTE_DMA_DIR_ANY for a channel that supports all 4 directions.
> 
> We've discussed this issue before. The earliest solution was to set up channels to support multiple DIRs, but
> no hardware/driver actually used this (at least at that time). they (like octeontx2_dma/dpaa) all setup one logic
> channel server single transfer direction.
> 
> So, do you have that kind of desire for your driver ?
> 
Both DPAA1 and DPAA2 drivers can support ANY direction on a channel, so we would like to have this option as well.

> 
> If you have a strong desire, we'll consider the following options:
> 
> Once the channel was setup, there are no other parameters to indicate the copy request's transfer direction.
> So I think it is not enough to define RTE_DMA_DIR_ANY only.
> 
> Maybe we could add RTE_DMA_OP_xxx marco (RTE_DMA_OP_FLAG_M2M/M2D/D2M/D2D), these macro will as the flags parameter
> passsed to enqueue API, so the enqueue API knows which transfer direction the request corresponding.
> 
> We can easily expand from the existing framework with following:
> a. define capability RTE_DMADEV_CAPA_DIR_ANY, for those device which support it could declare it.
> b. define direction macro: RTE_DMA_DIR_ANY
> c. define dma_op: RTE_DMA_OP_FLAG_DIR_M2M/M2D/D2M/D2D which will passed as the flags parameters.
> 
> For that driver which don't support this feature, just don't declare support it, and framework ensure that
> RTE_DMA_DIR_ANY is not passed down, and it can ignored RTE_DMA_OP_FLAG_DIR_xxx flag when enqueue API.
> 
> For that driver which support this feature, application could create one channel with RTE_DMA_DIR_ANY or RTE_DMA_DIR_MEM_TO_MEM.
> If created with RTE_DMA_DIR_ANY, the RTE_DMA_OP_FLAG_DIR_xxx should be sensed in the driver.
> If created with RTE_DMA_DIR_MEM_TO_MEM, the RTE_DMA_OP_FLAG_DIR_xxx could be ignored.
> 
Your design looks ok to me.

> 
> > <snip>
> >
> >
> > Regards,
> > Gagan
> >
Fengchengwen Sept. 6, 2021, 7:52 a.m. UTC | #9
I think we can add support for DIR_ANY.
@Bruce @Jerin Would you please take a look at my proposal?
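
For readers following the thread, the proposal (a capability bit, a DIR_ANY
channel direction, and per-operation direction flags) could be sketched roughly
as below. This is only an illustration under stated assumptions: every
SKETCH_* / sketch_* name and every numeric value is hypothetical, none of it is
part of the v19 patch, and the bit layout of the flags word is invented.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the names proposed in the thread.
 * Nothing here exists in the v19 patch; values are invented. */
enum sketch_dma_direction {
	SKETCH_DMA_DIR_MEM_TO_MEM,
	SKETCH_DMA_DIR_MEM_TO_DEV,
	SKETCH_DMA_DIR_DEV_TO_MEM,
	SKETCH_DMA_DIR_DEV_TO_DEV,
	SKETCH_DMA_DIR_ANY, /* item b: channel accepts all four directions */
};

/* Item a: capability bit a driver would declare if it supports DIR_ANY. */
#define SKETCH_DMADEV_CAPA_DIR_ANY (UINT64_C(1) << 0)

/* Item c: per-operation direction packed into two bits of the flags word. */
#define SKETCH_DMA_OP_FLAG_DIR_SHIFT 8
#define SKETCH_DMA_OP_FLAG_DIR_MASK (UINT64_C(3) << SKETCH_DMA_OP_FLAG_DIR_SHIFT)
#define SKETCH_DMA_OP_FLAG_DIR_M2M (UINT64_C(0) << SKETCH_DMA_OP_FLAG_DIR_SHIFT)
#define SKETCH_DMA_OP_FLAG_DIR_M2D (UINT64_C(1) << SKETCH_DMA_OP_FLAG_DIR_SHIFT)
#define SKETCH_DMA_OP_FLAG_DIR_D2M (UINT64_C(2) << SKETCH_DMA_OP_FLAG_DIR_SHIFT)
#define SKETCH_DMA_OP_FLAG_DIR_D2D (UINT64_C(3) << SKETCH_DMA_OP_FLAG_DIR_SHIFT)

/* Framework-side check at channel setup: DIR_ANY is rejected unless the
 * device declared the capability, so it is never passed down to the driver. */
int
sketch_vchan_check(uint64_t dev_capa, enum sketch_dma_direction dir)
{
	if (dir == SKETCH_DMA_DIR_ANY &&
	    !(dev_capa & SKETCH_DMADEV_CAPA_DIR_ANY))
		return -EINVAL;
	return 0;
}

/* Driver-side: effective direction of one enqueue request. */
enum sketch_dma_direction
sketch_op_direction(enum sketch_dma_direction vchan_dir, uint64_t op_flags)
{
	if (vchan_dir != SKETCH_DMA_DIR_ANY)
		return vchan_dir; /* fixed-direction channel: flag is ignored */
	return (enum sketch_dma_direction)
		((op_flags & SKETCH_DMA_OP_FLAG_DIR_MASK) >>
		 SKETCH_DMA_OP_FLAG_DIR_SHIFT);
}

/* Usage check for the two rules above. */
void
sketch_dir_any_selfcheck(void)
{
	assert(sketch_vchan_check(0, SKETCH_DMA_DIR_ANY) == -EINVAL);
	assert(sketch_op_direction(SKETCH_DMA_DIR_ANY,
		SKETCH_DMA_OP_FLAG_DIR_D2M) == SKETCH_DMA_DIR_DEV_TO_MEM);
}
```

The cost this sketch makes visible is the extra branch and flag decode on every
enqueue to a DIR_ANY channel, which is a fast-path concern.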

Jerin Jacob Sept. 6, 2021, 8:06 a.m. UTC | #10
On Mon, Sep 6, 2021 at 1:22 PM fengchengwen <fengchengwen@huawei.com> wrote:
>
> I think we can add support for DIR_ANY.
> @Bruce @Jerin Would you please take a look at my proposal?

Since the channel is virtual, it will be cheap to avoid any fast path
flags and keep the current scheme,
as at most we will have 4 channels for directions.
No strong opinion; if others think that is the better way,
it is okay too.


Bruce Richardson Sept. 6, 2021, 8:08 a.m. UTC | #11
On Mon, Sep 06, 2021 at 03:52:21PM +0800, fengchengwen wrote:
> I think we can add support for DIR_ANY.
> @Bruce @Jerin Would you please take a look at my proposal?
> 

I don't have a strong opinion on this. However, is one of the reasons we
have virtual-channels in the API rather than HW channels so that this
info can be encoded in the virtual channel setup? If a HW channel can
support multiple types of copy simultaneously, I thought the original
design was to create a vchan on this HW channel to support each copy type
needed?
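
As a rough illustration of the "one vchan per copy type" model described above,
the sketch below keeps a per-direction table of virtual channels on a single
device. It is self-contained and does not use the dmadev API: the sketch_*
types and functions are hypothetical, and a real application would call
rte_dmadev_vchan_setup() with a direction in its configuration instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical direction enum mirroring the four directions in the patch. */
enum sketch_dir {
	SKETCH_DIR_M2M,
	SKETCH_DIR_M2D,
	SKETCH_DIR_D2M,
	SKETCH_DIR_D2D,
	SKETCH_DIR_COUNT
};

/* Hypothetical model of one dmadev: which vchan serves each direction. */
struct sketch_dmadev {
	int vchan_of_dir[SKETCH_DIR_COUNT]; /* -1 when no vchan is set up */
	uint16_t nb_vchans;
};

void
sketch_dev_init(struct sketch_dmadev *dev)
{
	for (int d = 0; d < SKETCH_DIR_COUNT; d++)
		dev->vchan_of_dir[d] = -1;
	dev->nb_vchans = 0;
}

/* Set up (or reuse) the vchan dedicated to one direction; returns its id.
 * The direction is fixed at setup, so enqueue needs no direction flag. */
int
sketch_vchan_for_dir(struct sketch_dmadev *dev, enum sketch_dir dir)
{
	if (dev->vchan_of_dir[dir] < 0)
		dev->vchan_of_dir[dir] = dev->nb_vchans++;
	return dev->vchan_of_dir[dir];
}

/* Usage check: two directions in use means two vchans on one device. */
void
sketch_vchan_selfcheck(void)
{
	struct sketch_dmadev dev;

	sketch_dev_init(&dev);
	assert(sketch_vchan_for_dir(&dev, SKETCH_DIR_M2M) == 0);
	assert(sketch_vchan_for_dir(&dev, SKETCH_DIR_D2M) == 1);
	assert(sketch_vchan_for_dir(&dev, SKETCH_DIR_M2M) == 0); /* reused */
	assert(dev.nb_vchans == 2);
}
```

With this model a device needs at most four virtual channels to cover every
direction, and the per-request path only has to pick the right vchan id.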

Fengchengwen Sept. 7, 2021, 12:55 p.m. UTC | #12
Hi Gagandeep,

Based on the following considerations, it was decided not to support "ANY
direction on a channel".

As we previously analyzed [1], much hardware (like dpaa2/octeontx2/Kunpeng)
supports multiple directions on a hardware channel.

To allow smooth migration of existing drivers, we settled on the concept of
using a virtual queue to represent each transfer direction context, and that
concept has persisted to this day.

Although the API could be extended based on my proposal, the change would give
rise to a new interface model, which applications would have to take into account.
If we keep the current model, applications based on the original rawdev interface
can adapt quickly.

Also, Jerin has made some comments from a performance perspective, which I agree
with.

[1] https://lore.kernel.org/dpdk-dev/c4a0ee30-f7b8-f8a1-463c-8eedaec82aea@huawei.com/

BTW: @Jerin @Bruce thank you for your reply.

Thanks


Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 7be9658..22dcd12 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -496,6 +496,10 @@  F: drivers/raw/skeleton/
 F: app/test/test_rawdev.c
 F: doc/guides/prog_guide/rawdev.rst
 
+DMA device API - EXPERIMENTAL
+M: Chengwen Feng <fengchengwen@huawei.com>
+F: lib/dmadev/
+
 
 Memory Pool Drivers
 -------------------
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 1992107..ce08250 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -27,6 +27,7 @@  The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [dmadev]             (@ref rte_dmadev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 325a019..a44a92b 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -34,6 +34,7 @@  INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/cmdline \
                           @TOPDIR@/lib/compressdev \
                           @TOPDIR@/lib/cryptodev \
+                          @TOPDIR@/lib/dmadev \
                           @TOPDIR@/lib/distributor \
                           @TOPDIR@/lib/efd \
                           @TOPDIR@/lib/ethdev \
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index d707a55..78b9691 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,11 @@  New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added dmadev library support.**
+
+  The dmadev library provides a DMA device framework for management and
+  provision of hardware and software DMA devices.
+
 
 Removed Items
 -------------
diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build
new file mode 100644
index 0000000..6d5bd85
--- /dev/null
+++ b/lib/dmadev/meson.build
@@ -0,0 +1,4 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 HiSilicon Limited.
+
+headers = files('rte_dmadev.h')
diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
new file mode 100644
index 0000000..40ae3b1
--- /dev/null
+++ b/lib/dmadev/rte_dmadev.h
@@ -0,0 +1,949 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ * Copyright(c) 2021 Intel Corporation.
+ * Copyright(c) 2021 Marvell International Ltd.
+ * Copyright(c) 2021 SmartShare Systems.
+ */
+
+#ifndef _RTE_DMADEV_H_
+#define _RTE_DMADEV_H_
+
+/**
+ * @file rte_dmadev.h
+ *
+ * RTE DMA (Direct Memory Access) device APIs.
+ *
+ * The DMA framework is built on the following model:
+ *
+ *     ---------------   ---------------       ---------------
+ *     | virtual DMA |   | virtual DMA |       | virtual DMA |
+ *     | channel     |   | channel     |       | channel     |
+ *     ---------------   ---------------       ---------------
+ *            |                |                      |
+ *            ------------------                      |
+ *                     |                              |
+ *               ------------                    ------------
+ *               |  dmadev  |                    |  dmadev  |
+ *               ------------                    ------------
+ *                     |                              |
+ *            ------------------               ------------------
+ *            | HW-DMA-channel |               | HW-DMA-channel |
+ *            ------------------               ------------------
+ *                     |                              |
+ *                     --------------------------------
+ *                                     |
+ *                           ---------------------
+ *                           | HW-DMA-Controller |
+ *                           ---------------------
+ *
+ * The DMA controller could have multiple HW-DMA-channels (aka. HW-DMA-queues),
+ * each HW-DMA-channel should be represented by a dmadev.
+ *
+ * The dmadev could create multiple virtual DMA channels, each virtual DMA
+ * channel represents a different transfer context. The DMA operation request
+ * must be submitted to the virtual DMA channel. e.g. Application could create
+ * virtual DMA channel 0 for memory-to-memory transfer scenario, and create
+ * virtual DMA channel 1 for memory-to-device transfer scenario.
+ *
+ * A dmadev is dynamically allocated by rte_dmadev_pmd_allocate() during the
+ * PCI/SoC device probing phase performed at EAL initialization time, and can
+ * be released by rte_dmadev_pmd_release() during the PCI/SoC device removal
+ * phase.
+ *
+ * This framework uses 'uint16_t dev_id' as the device identifier of a dmadev,
+ * and 'uint16_t vchan' as the virtual DMA channel identifier in one dmadev.
+ *
+ * The functions exported by the dmadev API to setup a device designated by its
+ * device identifier must be invoked in the following order:
+ *     - rte_dmadev_configure()
+ *     - rte_dmadev_vchan_setup()
+ *     - rte_dmadev_start()
+ *
+ * Then, the application can invoke dataplane APIs to process jobs.
+ *
+ * If the application wants to change the configuration (i.e. invoke
+ * rte_dmadev_configure() or rte_dmadev_vchan_setup()), it must invoke
+ * rte_dmadev_stop() first to stop the device and then do the reconfiguration
+ * before invoking rte_dmadev_start() again. The dataplane APIs should not be
+ * invoked when the device is stopped.
+ *
+ * Finally, an application can close a dmadev by invoking the
+ * rte_dmadev_close() function.
+ *
+ * The dataplane APIs include two parts:
+ * The first part is the submission of operation requests:
+ *     - rte_dmadev_copy()
+ *     - rte_dmadev_copy_sg()
+ *     - rte_dmadev_fill()
+ *     - rte_dmadev_submit()
+ *
+ * These APIs could work with different virtual DMA channels which have
+ * different contexts.
+ *
+ * The first three APIs are used to submit an operation request to the virtual
+ * DMA channel. If the submission is successful, a uint16_t ring_idx is
+ * returned; otherwise a negative number is returned.
+ *
+ * The last API is used to ring the doorbell to hardware. The flags parameter
+ * (@see RTE_DMA_OP_FLAG_SUBMIT) of the first three APIs can do the
+ * same work.
+ *
+ * The second part is to obtain the result of requests:
+ *     - rte_dmadev_completed()
+ *         - return the number of operation requests completed successfully.
+ *     - rte_dmadev_completed_status()
+ *         - return the number of operation requests completed.
+ *
+ * @note The two completed APIs also return the last completed
+ * operation's ring_idx.
+ * @note If the dmadev works in silent mode (@see RTE_DMADEV_CAPA_SILENT), the
+ * application does not invoke the above two completed APIs.
+ *
+ * For the ring_idx returned by the enqueue APIs (e.g. rte_dmadev_copy(),
+ * rte_dmadev_fill()), the rules are as follows:
+ *     - The ring_idx of each virtual DMA channel is independent.
+ *     - For a virtual DMA channel, the ring_idx is monotonically incremented;
+ *       when it reaches UINT16_MAX, it wraps back to zero.
+ *     - This ring_idx can be used by applications to track per-operation
+ *       metadata in an application-defined circular ring.
+ *     - The initial ring_idx of a virtual DMA channel is zero. After the
+ *       device is stopped, the ring_idx needs to be reset to zero.
+ *
+ * One example:
+ *     - step-1: start one dmadev
+ *     - step-2: enqueue a copy operation, the ring_idx returned is 0
+ *     - step-3: enqueue a copy operation again, the ring_idx returned is 1
+ *     - ...
+ *     - step-101: stop the dmadev
+ *     - step-102: start the dmadev
+ *     - step-103: enqueue a copy operation, the ring_idx returned is 0
+ *     - ...
+ *     - step-x+0: enqueue a fill operation, the ring_idx returned is 65535
+ *     - step-x+1: enqueue a copy operation, the ring_idx returned is 0
+ *     - ...
+ *
+ * The DMA operation address used in enqueue APIs (i.e. rte_dmadev_copy(),
+ * rte_dmadev_copy_sg(), rte_dmadev_fill()) is defined as rte_iova_t type. The
+ * dmadev supports two types of address: memory address and device address.
+ *
+ * - memory address: the source and destination address of the memory-to-memory
+ * transfer type, or the source address of the memory-to-device transfer type,
+ * or the destination address of the device-to-memory transfer type.
+ * @note If the device supports SVA (@see RTE_DMADEV_CAPA_SVA), the memory
+ * address can be any VA address, otherwise it must be an IOVA address.
+ *
+ * - device address: the source and destination address of the device-to-device
+ * transfer type, or the source address of the device-to-memory transfer type,
+ * or the destination address of the memory-to-device transfer type.
+ *
+ * By default, all the functions of the dmadev API exported by a PMD are
+ * lock-free functions, which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target dmadev object.
+ * @note Different virtual DMA channels on the same dmadev *DO NOT* support
+ * parallel invocation because these virtual DMA channels share the same
+ * HW-DMA-channel.
+ *
+ */
+
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DMADEV_NAME_MAX_LEN	RTE_DEV_NAME_MAX_LEN
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the device identifier for the named DMA device.
+ *
+ * @param name
+ *   DMA device name.
+ *
+ * @return
+ *   Returns DMA device identifier on success.
+ *   - <0: Failure to find named DMA device.
+ */
+__rte_experimental
+int
+rte_dmadev_get_dev_id(const char *name);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param dev_id
+ *   DMA device index.
+ *
+ * @return
+ *   true if the device index is valid, false otherwise.
+ */
+__rte_experimental
+bool
+rte_dmadev_is_valid_dev(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of DMA devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable DMA devices.
+ */
+__rte_experimental
+uint16_t
+rte_dmadev_count(void);
+
+/* Enumerates DMA device capabilities. */
+#define RTE_DMADEV_CAPA_MEM_TO_MEM	(1ull << 0)
+/**< DMA device supports memory-to-memory transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMADEV_CAPA_MEM_TO_DEV	(1ull << 1)
+/**< DMA device supports memory-to-device transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ * @see struct rte_dmadev_port_param::port_type
+ */
+
+#define RTE_DMADEV_CAPA_DEV_TO_MEM	(1ull << 2)
+/**< DMA device supports device-to-memory transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ * @see struct rte_dmadev_port_param::port_type
+ */
+
+#define RTE_DMADEV_CAPA_DEV_TO_DEV	(1ull << 3)
+/**< DMA device supports device-to-device transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ * @see struct rte_dmadev_port_param::port_type
+ */
+
+#define RTE_DMADEV_CAPA_SVA		(1ull << 4)
+/**< DMA device supports SVA, which allows a VA to be used as the DMA address.
+ * If the device supports SVA, the application can pass any VA, such as memory
+ * from rte_malloc(), rte_memzone(), malloc() or the stack.
+ * If the device does not support SVA, the application must pass an IOVA
+ * address obtained from rte_malloc() or rte_memzone().
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMADEV_CAPA_SILENT		(1ull << 5)
+/**< DMA device supports working in silent mode.
+ * In this mode, the application is not required to invoke the
+ * rte_dmadev_completed*() APIs.
+ *
+ * @see struct rte_dmadev_conf::enable_silent
+ */
+
+#define RTE_DMADEV_CAPA_OPS_COPY	(1ull << 32)
+/**< DMA device supports copy ops.
+ * This capability starts at bit index 32, which leaves a gap between the
+ * normal capabilities and the ops capabilities.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMADEV_CAPA_OPS_COPY_SG	(1ull << 33)
+/**< DMA device supports scatter-gather list copy ops.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMADEV_CAPA_OPS_FILL	(1ull << 34)
+/**< DMA device supports fill ops.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+/**
+ * A structure used to retrieve the information of a DMA device.
+ */
+struct rte_dmadev_info {
+	struct rte_device *device; /**< Generic Device information. */
+	uint64_t dev_capa; /**< Device capabilities (RTE_DMADEV_CAPA_*). */
+	uint16_t max_vchans;
+	/**< Maximum number of virtual DMA channels supported. */
+	uint16_t max_desc;
+	/**< Maximum allowed number of virtual DMA channel descriptors. */
+	uint16_t min_desc;
+	/**< Minimum allowed number of virtual DMA channel descriptors. */
+	uint16_t max_sges;
+	/**< Maximum number of source or destination scatter-gather entries
+	 * supported.
+	 * If the device does not support COPY_SG capability, this value can be
+	 * zero.
+	 * If the device supports COPY_SG capability, then rte_dmadev_copy_sg()
+	 * parameters nb_src/nb_dst must not exceed this value.
+	 */
+	uint16_t nb_vchans; /**< Number of virtual DMA channels configured. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve information of a DMA device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_dmadev_info* to be filled with the
+ *   information of the device.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info);
+
+/**
+ * A structure used to configure a DMA device.
+ */
+struct rte_dmadev_conf {
+	uint16_t nb_vchans;
+	/**< The number of virtual DMA channels to set up for the DMA device.
+	 * This value cannot be greater than the field 'max_vchans' of struct
+	 * rte_dmadev_info obtained from rte_dmadev_info_get().
+	 */
+	bool enable_silent;
+	/**< Indicates whether to enable silent mode.
+	 * false-default mode, true-silent mode.
+	 * This value can be set to true only when the SILENT capability is
+	 * supported.
+	 *
+	 * @see RTE_DMADEV_CAPA_SILENT
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a DMA device.
+ *
+ * This function must be invoked before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param dev_conf
+ *   The DMA device configuration structure encapsulated into rte_dmadev_conf
+ *   object.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a DMA device.
+ *
+ * The device start step is the last one and consists of setting the DMA
+ * to start accepting jobs.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_start(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a DMA device.
+ *
+ * The device can be restarted with a call to rte_dmadev_start().
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_stop(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a DMA device.
+ *
+ * The device cannot be restarted after this call.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_close(uint16_t dev_id);
+
+/**
+ * rte_dma_direction - DMA transfer direction defines.
+ */
+enum rte_dma_direction {
+	RTE_DMA_DIR_MEM_TO_MEM,
+	/**< DMA transfer direction - from memory to memory.
+	 *
+	 * @see struct rte_dmadev_vchan_conf::direction
+	 */
+	RTE_DMA_DIR_MEM_TO_DEV,
+	/**< DMA transfer direction - from memory to device.
+	 * In a typical scenario, the SoCs are installed on host servers as
+	 * iNICs through the PCIe interface. In this case, the SoCs work in
+	 * EP (endpoint) mode and can initiate a DMA move request from memory
+	 * (which is the SoC's memory) to device (which is host memory).
+	 *
+	 * @see struct rte_dmadev_vchan_conf::direction
+	 */
+	RTE_DMA_DIR_DEV_TO_MEM,
+	/**< DMA transfer direction - from device to memory.
+	 * In a typical scenario, the SoCs are installed on host servers as
+	 * iNICs through the PCIe interface. In this case, the SoCs work in
+	 * EP (endpoint) mode and can initiate a DMA move request from device
+	 * (which is host memory) to memory (which is the SoC's memory).
+	 *
+	 * @see struct rte_dmadev_vchan_conf::direction
+	 */
+	RTE_DMA_DIR_DEV_TO_DEV,
+	/**< DMA transfer direction - from device to device.
+	 * In a typical scenario, the SoCs are installed on host servers as
+	 * iNICs through the PCIe interface. In this case, the SoCs work in
+	 * EP (endpoint) mode and can initiate a DMA move request from device
+	 * (which is host memory) to device (which is another host's memory).
+	 *
+	 * @see struct rte_dmadev_vchan_conf::direction
+	 */
+};
+
+/**
+ * enum rte_dmadev_port_type - DMA access port type defines.
+ *
+ * @see struct rte_dmadev_port_param::port_type
+ */
+enum rte_dmadev_port_type {
+	RTE_DMADEV_PORT_NONE,
+	RTE_DMADEV_PORT_PCIE, /**< The DMA access port is PCIe. */
+};
+
+/**
+ * A structure used to describe DMA access port parameters.
+ *
+ * @see struct rte_dmadev_vchan_conf::src_port
+ * @see struct rte_dmadev_vchan_conf::dst_port
+ */
+struct rte_dmadev_port_param {
+	enum rte_dmadev_port_type port_type;
+	/**< The device access port type.
+	 * @see enum rte_dmadev_port_type
+	 */
+	union {
+		/** PCIe access port parameters.
+		 *
+		 * The following model shows SoC's PCIe module connects to
+		 * multiple PCIe hosts and multiple endpoints. The PCIe module
+		 * has an integrated DMA controller.
+		 *
+		 * If the DMA wants to access the memory of host A, it can be
+		 * initiated by PF1 in core0, or by VF0 of PF0 in core0.
+		 *
+		 * \code{.unparsed}
+		 * System Bus
+		 *    |     ----------PCIe module----------
+		 *    |     Bus
+		 *    |     Interface
+		 *    |     -----        ------------------
+		 *    |     |   |        | PCIe Core0     |
+		 *    |     |   |        |                |        -----------
+		 *    |     |   |        |   PF-0 -- VF-0 |        | Host A  |
+		 *    |     |   |--------|        |- VF-1 |--------| Root    |
+		 *    |     |   |        |   PF-1         |        | Complex |
+		 *    |     |   |        |   PF-2         |        -----------
+		 *    |     |   |        ------------------
+		 *    |     |   |
+		 *    |     |   |        ------------------
+		 *    |     |   |        | PCIe Core1     |
+		 *    |     |   |        |                |        -----------
+		 *    |     |   |        |   PF-0 -- VF-0 |        | Host B  |
+		 *    |-----|   |--------|   PF-1 -- VF-0 |--------| Root    |
+		 *    |     |   |        |        |- VF-1 |        | Complex |
+		 *    |     |   |        |   PF-2         |        -----------
+		 *    |     |   |        ------------------
+		 *    |     |   |
+		 *    |     |   |        ------------------
+		 *    |     |DMA|        |                |        ------
+		 *    |     |   |        |                |--------| EP |
+		 *    |     |   |--------| PCIe Core2     |        ------
+		 *    |     |   |        |                |        ------
+		 *    |     |   |        |                |--------| EP |
+		 *    |     |   |        |                |        ------
+		 *    |     -----        ------------------
+		 *
+		 * \endcode
+		 *
+		 * @note If some fields cannot be supported by the
+		 * hardware/driver, then the driver ignores those fields.
+		 * Please check driver-specific documentation for limitations
+		 * and capabilities.
+		 */
+		struct {
+			uint64_t coreid : 4; /**< PCIe core id used. */
+			uint64_t pfid : 8; /**< PF id used. */
+			uint64_t vfen : 1; /**< VF enable bit. */
+			uint64_t vfid : 16; /**< VF id used. */
+			uint64_t pasid : 20;
+			/**< The pasid field in TLP packet. */
+			uint64_t attr : 3;
+			/**< The attributes field in TLP packet. */
+			uint64_t ph : 2;
+			/**< The processing hint field in TLP packet. */
+			uint64_t st : 16;
+			/**< The steering tag field in TLP packet. */
+		} pcie;
+	};
+	uint64_t reserved[2]; /**< Reserved for future fields. */
+};
+
+/**
+ * A structure used to configure a virtual DMA channel.
+ */
+struct rte_dmadev_vchan_conf {
+	enum rte_dma_direction direction;
+	/**< Transfer direction
+	 * @see enum rte_dma_direction
+	 */
+	uint16_t nb_desc;
+	/**< Number of descriptors for the virtual DMA channel */
+	struct rte_dmadev_port_param src_port;
+	/**< 1) Used to describe the device access port parameter in the
+	 * device-to-memory transfer scenario.
+	 * 2) Used to describe the source device access port parameter in the
+	 * device-to-device transfer scenario.
+	 * @see struct rte_dmadev_port_param
+	 */
+	struct rte_dmadev_port_param dst_port;
+	/**< 1) Used to describe the device access port parameter in the
+	 * memory-to-device transfer scenario.
+	 * 2) Used to describe the destination device access port parameter in
+	 * the device-to-device transfer scenario.
+	 * @see struct rte_dmadev_port_param
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a virtual DMA channel.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel. The value must be in the range
+ *   [0, nb_vchans - 1] previously supplied to rte_dmadev_configure().
+ * @param conf
+ *   The virtual DMA channel configuration structure encapsulated into
+ *   rte_dmadev_vchan_conf object.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_vchan_setup(uint16_t dev_id, uint16_t vchan,
+		       const struct rte_dmadev_vchan_conf *conf);
+
+/**
+ * rte_dmadev_stats - running statistics.
+ */
+struct rte_dmadev_stats {
+	uint64_t submitted;
+	/**< Count of operations which were submitted to hardware. */
+	uint64_t completed;
+	/**< Count of operations which were completed. */
+	uint64_t errors;
+	/**< Count of operations which failed to complete. */
+};
+
+#define RTE_DMADEV_ALL_VCHAN	0xFFFFu
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve basic statistics of one or all virtual DMA channels.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ *   If equal to RTE_DMADEV_ALL_VCHAN, all channels are selected.
+ * @param[out] stats
+ *   The basic statistics structure encapsulated into rte_dmadev_stats
+ *   object.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_stats_get(uint16_t dev_id, uint16_t vchan,
+		     struct rte_dmadev_stats *stats);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset basic statistics of one or all virtual DMA channels.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ *   If equal to RTE_DMADEV_ALL_VCHAN, all channels are selected.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_stats_reset(uint16_t dev_id, uint16_t vchan);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump DMA device info.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   The file to write the output to.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_dump(uint16_t dev_id, FILE *f);
+
+/**
+ * rte_dma_status_code - DMA transfer result status code defines.
+ */
+enum rte_dma_status_code {
+	RTE_DMA_STATUS_SUCCESSFUL,
+	/**< The operation completed successfully. */
+	RTE_DMA_STATUS_USER_ABORT,
+	/**< The operation failed to complete due to an abort by the user.
+	 * This is mainly used when processing dev_stop: the user could modify
+	 * the descriptors (e.g. change one bit to tell hardware to abort this
+	 * job), which allows outstanding requests to complete as far as
+	 * possible and so reduces the time needed to stop the device.
+	 */
+	RTE_DMA_STATUS_NOT_ATTEMPTED,
+	/**< The operation failed to complete due to the following scenario:
+	 * The jobs in a particular batch are not attempted because they
+	 * appeared after a fence where a previous job failed. In some HW
+	 * implementations it is possible for jobs from later batches to be
+	 * completed, though, so report the status of the not attempted jobs
+	 * before reporting those newer completed jobs.
+	 */
+	RTE_DMA_STATUS_INVALID_SRC_ADDR,
+	/**< The operation failed to complete due to invalid source address. */
+	RTE_DMA_STATUS_INVALID_DST_ADDR,
+	/**< The operation failed to complete due to invalid destination
+	 * address.
+	 */
+	RTE_DMA_STATUS_INVALID_ADDR,
+	/**< The operation failed to complete due to invalid source or
+	 * destination address. Covers the case where an address error is
+	 * known but it is not known which address is at fault.
+	 */
+	RTE_DMA_STATUS_INVALID_LENGTH,
+	/**< The operation failed to complete due to invalid length. */
+	RTE_DMA_STATUS_INVALID_OPCODE,
+	/**< The operation failed to complete due to invalid opcode.
+	 * The DMA descriptor could have multiple formats, which are
+	 * distinguished by the opcode field.
+	 */
+	RTE_DMA_STATUS_BUS_READ_ERROR,
+	/**< The operation failed to complete due to bus read error. */
+	RTE_DMA_STATUS_BUS_WRITE_ERROR,
+	/**< The operation failed to complete due to bus write error. */
+	RTE_DMA_STATUS_BUS_ERROR,
+	/**< The operation failed to complete due to bus error. Covers the case
+	 * where a bus error is known but its direction (read/write) is not.
+	 */
+	RTE_DMA_STATUS_DATA_POISION,
+	/**< The operation failed to complete due to data poison. */
+	RTE_DMA_STATUS_DESCRIPTOR_READ_ERROR,
+	/**< The operation failed to complete due to descriptor read error. */
+	RTE_DMA_STATUS_DEV_LINK_ERROR,
+	/**< The operation failed to complete due to device link error.
+	 * Used to indicate a link error in the memory-to-device/
+	 * device-to-memory/device-to-device transfer scenario.
+	 */
+	RTE_DMA_STATUS_PAGE_FAULT,
+	/**< The operation failed to complete due to lookup page fault. */
+	RTE_DMA_STATUS_ERROR_UNKNOWN = 0x100,
+	/**< The operation failed to complete due to unknown reason.
+	 * The initial value is 256, which reserves space for future errors.
+	 */
+};
+
+/**
+ * rte_dmadev_sge - can hold scatter-gather DMA operation request entry.
+ */
+struct rte_dmadev_sge {
+	rte_iova_t addr; /**< The DMA operation address. */
+	uint32_t length; /**< The DMA operation length. */
+};
+
+/* DMA flags to augment operation preparation. */
+#define RTE_DMA_OP_FLAG_FENCE	(1ull << 0)
+/**< DMA fence flag.
+ * It means the operation with this flag must be processed only after all
+ * previous operations are completed.
+ * If the specific DMA HW works in-order (i.e. it has a default fence between
+ * operations), this flag could be a NOP.
+ *
+ * @see rte_dmadev_copy()
+ * @see rte_dmadev_copy_sg()
+ * @see rte_dmadev_fill()
+ */
+
+#define RTE_DMA_OP_FLAG_SUBMIT	(1ull << 1)
+/**< DMA submit flag.
+ * It means the operation with this flag must ring the hardware doorbell after
+ * the job is enqueued.
+ */
+
+#define RTE_DMA_OP_FLAG_LLC	(1ull << 2)
+/**< Hint for DMA to write data to low level cache.
+ * Used for performance optimization. This is only a hint with no capability
+ * bit; the driver should not return an error if this flag is set.
+ */
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a copy operation onto the virtual DMA channel.
+ *
+ * This queues up a copy operation to be performed by hardware. If the 'flags'
+ * parameter contains RTE_DMA_OP_FLAG_SUBMIT, the doorbell is triggered to
+ * begin this operation; otherwise the doorbell is not triggered.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param src
+ *   The address of the source buffer.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the data to be copied.
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued job.
+ *   - -ENOSPC: if no space left to enqueue.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
+		uint32_t length, uint64_t flags);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a scatter-gather list copy operation onto the virtual DMA channel.
+ *
+ * This queues up a scatter-gather list copy operation to be performed by
+ * hardware. If the 'flags' parameter contains RTE_DMA_OP_FLAG_SUBMIT, the
+ * doorbell is triggered to begin this operation; otherwise it is not.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param src
+ *   Pointer to the source scatter-gather entry array.
+ * @param dst
+ *   Pointer to the destination scatter-gather entry array.
+ * @param nb_src
+ *   The number of source scatter-gather entries.
+ *   @see struct rte_dmadev_info::max_sges
+ * @param nb_dst
+ *   The number of destination scatter-gather entries.
+ *   @see struct rte_dmadev_info::max_sges
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued job.
+ *   - -ENOSPC: if no space left to enqueue.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, struct rte_dmadev_sge *src,
+		   struct rte_dmadev_sge *dst, uint16_t nb_src, uint16_t nb_dst,
+		   uint64_t flags);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a fill operation onto the virtual DMA channel.
+ *
+ * This queues up a fill operation to be performed by hardware. If the 'flags'
+ * parameter contains RTE_DMA_OP_FLAG_SUBMIT, the doorbell is triggered to
+ * begin this operation; otherwise the doorbell is not triggered.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param pattern
+ *   The pattern to populate the destination buffer with.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the destination buffer.
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued job.
+ *   - -ENOSPC: if no space left to enqueue.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
+		rte_iova_t dst, uint32_t length, uint64_t flags);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger hardware to begin performing enqueued operations.
+ *
+ * This API is used to write the "doorbell" to the hardware to trigger it
+ * to begin the operations previously enqueued by rte_dmadev_copy/fill().
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int
+rte_dmadev_submit(uint16_t dev_id, uint16_t vchan);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Returns the number of operations that have been successfully completed.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param nb_cpls
+ *   The maximum number of completed operations that can be processed.
+ * @param[out] last_idx
+ *   The last completed operation's ring_idx.
+ *   If not required, NULL can be passed in.
+ * @param[out] has_error
+ *   Indicates whether a transfer error occurred.
+ *   If not required, NULL can be passed in.
+ *
+ * @return
+ *   The number of operations that successfully completed. This return value
+ *   must be less than or equal to the value of nb_cpls.
+ */
+__rte_experimental
+uint16_t
+rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
+		     uint16_t *last_idx, bool *has_error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Returns the number of operations that have been completed, whether they
+ * succeeded or failed.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param nb_cpls
+ *   Indicates the size of status array.
+ * @param[out] last_idx
+ *   The last completed operation's ring_idx.
+ *   If not required, NULL can be passed in.
+ * @param[out] status
+ *   This is a pointer to an array of length 'nb_cpls' that holds the completion
+ *   status code of each operation.
+ *   @see enum rte_dma_status_code
+ *
+ * @return
+ *   The number of operations that completed. This return value must be less
+ *   than or equal to the value of nb_cpls.
+ *   If this number is greater than zero (assuming n), then n values in the
+ *   status array are also set.
+ */
+__rte_experimental
+uint16_t
+rte_dmadev_completed_status(uint16_t dev_id, uint16_t vchan,
+			    const uint16_t nb_cpls, uint16_t *last_idx,
+			    enum rte_dma_status_code *status);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DMADEV_H_ */
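
Note on the ring_idx rules documented in rte_dmadev.h above: the header says
the ring_idx wraps from UINT16_MAX back to zero and can index an
application-defined circular ring. A minimal sketch of what that could look
like on the application side follows. It is not part of this patch; the names
APP_RING_SIZE, struct app_op_meta, app_meta_slot() and app_idx_distance() are
hypothetical, and the ring size is assumed to be a power of two so that the
16-bit wrap maps onto the ring without a discontinuity.

```c
#include <stdint.h>

/* Hypothetical application-side ring; size assumed to be a power of two. */
#define APP_RING_SIZE 1024u

/* Hypothetical per-operation metadata kept by the application. */
struct app_op_meta {
	void *cookie; /* application-defined data for one enqueued operation */
};

static struct app_op_meta app_ring[APP_RING_SIZE];

/* Map a 16-bit ring_idx onto a slot of the application ring. */
static inline struct app_op_meta *
app_meta_slot(uint16_t ring_idx)
{
	return &app_ring[ring_idx & (APP_RING_SIZE - 1)];
}

/*
 * Number of operations in the range (first_idx, last_idx], computed
 * modulo 2^16 so the result stays correct across the UINT16_MAX wrap.
 */
static inline uint16_t
app_idx_distance(uint16_t first_idx, uint16_t last_idx)
{
	return (uint16_t)(last_idx - first_idx);
}
```

An application could store metadata in app_meta_slot(idx) for the idx returned
by an enqueue call, then walk the slots up to the last_idx reported by a
completed call. Because 65536 is a multiple of the assumed ring size, slot
numbering continues seamlessly when the ring_idx wraps.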
diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map
new file mode 100644
index 0000000..2e37882
--- /dev/null
+++ b/lib/dmadev/version.map
@@ -0,0 +1,24 @@ 
+EXPERIMENTAL {
+	global:
+
+	rte_dmadev_close;
+	rte_dmadev_completed;
+	rte_dmadev_completed_status;
+	rte_dmadev_configure;
+	rte_dmadev_copy;
+	rte_dmadev_copy_sg;
+	rte_dmadev_count;
+	rte_dmadev_dump;
+	rte_dmadev_fill;
+	rte_dmadev_get_dev_id;
+	rte_dmadev_info_get;
+	rte_dmadev_is_valid_dev;
+	rte_dmadev_start;
+	rte_dmadev_stats_get;
+	rte_dmadev_stats_reset;
+	rte_dmadev_stop;
+	rte_dmadev_submit;
+	rte_dmadev_vchan_setup;
+
+	local: *;
+};
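
Note on the capability bits defined in rte_dmadev.h above: the header ties
enable_silent to RTE_DMADEV_CAPA_SILENT and bounds the nb_src/nb_dst arguments
of rte_dmadev_copy_sg() by max_sges. A rough sketch of the corresponding
application-side checks is shown below. It is not part of this patch; the
macro values are copied from the header so the sketch builds on its own, and
struct app_dma_caps and both helper functions are hypothetical stand-ins for
the dev_capa and max_sges fields of struct rte_dmadev_info.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the values from the patch, so the sketch builds alone. */
#define RTE_DMADEV_CAPA_SILENT		(1ull << 5)
#define RTE_DMADEV_CAPA_OPS_COPY_SG	(1ull << 33)

/* Hypothetical subset of struct rte_dmadev_info used by this sketch. */
struct app_dma_caps {
	uint64_t dev_capa;
	uint16_t max_sges;
};

/* Silent mode may be enabled only if the device advertises the capability. */
static inline bool
app_can_enable_silent(const struct app_dma_caps *caps)
{
	return (caps->dev_capa & RTE_DMADEV_CAPA_SILENT) != 0;
}

/* A copy_sg request is acceptable if the op is supported and both lists fit. */
static inline bool
app_sg_request_ok(const struct app_dma_caps *caps, uint16_t nb_src,
		  uint16_t nb_dst)
{
	if ((caps->dev_capa & RTE_DMADEV_CAPA_OPS_COPY_SG) == 0)
		return false;
	return nb_src <= caps->max_sges && nb_dst <= caps->max_sges;
}
```

In a real application the two fields would presumably be filled from
rte_dmadev_info_get() before calling rte_dmadev_configure().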
diff --git a/lib/meson.build b/lib/meson.build
index 1673ca4..a542c23 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -44,6 +44,7 @@  libraries = [
         'power',
         'pdump',
         'rawdev',
+        'dmadev',
         'regexdev',
         'rib',
         'reorder',