[v4] dmadev: introduce DMA device library

Message ID 1626363661-51103-1-git-send-email-fengchengwen@huawei.com (mailing list archive)
State Superseded, archived
Series [v4] dmadev: introduce DMA device library

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/github-robot success github build: passed
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-abi-testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-intel-Performance fail Performance Testing issues
ci/intel-Testing success Testing PASS
ci/iol-testing success Testing PASS

Commit Message

Chengwen Feng July 15, 2021, 3:41 p.m. UTC
  This patch introduces 'dmadevice' which is a generic type of DMA
device.

The APIs of the dmadev library expose some generic operations which can
enable configuration and I/O with the DMA devices.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
v4:
* replace xxx_complete_fails with xxx_completed_status.
* add SILENT capability, also a silent_mode in rte_dmadev_conf.
* add op_flag_llc for performance.
* rename dmadev_xxx_t to rte_dmadev_xxx_t to avoid namespace conflict.
* delete field 'enqueued_count' from rte_dmadev_stats.
* make rte_dmadev hold 'dev_private' field.
* add RTE_DMA_STATUS_NOT_ATTEMPED status code.
* rename RTE_DMA_STATUS_ACTIVE_DROP to RTE_DMA_STATUS_USER_ABORT.
* rename rte_dma_sg(e) to rte_dmadev_sg(e) to make sure all structs are
  prefixed with rte_dmadev.
* put the comment afterwards.
* fix some doxygen problems.
* delete macro RTE_DMADEV_VALID_DEV_ID_OR_RET and
  RTE_DMADEV_PTR_OR_ERR_RET.
* replace strlcpy with rte_strscpy.
* other minor modifications from review comment.
v3:
* rm reset and fill_sg ops.
* rm MT-safe capabilities.
* add submit flag.
* redefine rte_dma_sg to implement asymmetric copy.
* delete some reserved field for future use.
* rearrange the rte_dmadev/rte_dmadev_data structs.
* refresh rte_dmadev.h copyright.
* update vchan setup parameter.
* modified some inappropriate descriptions.
* arrange version.map alphabetically.
* other minor modifications from review comment.
---
 MAINTAINERS                  |    4 +
 config/rte_config.h          |    3 +
 lib/dmadev/meson.build       |    7 +
 lib/dmadev/rte_dmadev.c      |  539 ++++++++++++++++++++++
 lib/dmadev/rte_dmadev.h      | 1009 ++++++++++++++++++++++++++++++++++++++++++
 lib/dmadev/rte_dmadev_core.h |  180 ++++++++
 lib/dmadev/rte_dmadev_pmd.h  |   72 +++
 lib/dmadev/version.map       |   37 ++
 lib/meson.build              |    1 +
 9 files changed, 1852 insertions(+)
 create mode 100644 lib/dmadev/meson.build
 create mode 100644 lib/dmadev/rte_dmadev.c
 create mode 100644 lib/dmadev/rte_dmadev.h
 create mode 100644 lib/dmadev/rte_dmadev_core.h
 create mode 100644 lib/dmadev/rte_dmadev_pmd.h
 create mode 100644 lib/dmadev/version.map
  

Comments

Chengwen Feng July 15, 2021, 4:04 p.m. UTC | #1
@bruce, @jerin: some review comments that were not adopted are answered here:

1.
COMMENT: > +			memset(dmadev_shared_data->data, 0,
> +			       sizeof(dmadev_shared_data->data));
I believe all memzones are zero on allocation anyway, so this memset is
unecessary and can be dropped.

REPLY: I didn't find a comment stating that this function clears the memory.

2.
COMMENT: > + * @see struct rte_dmadev_info::dev_capa
> + */
Drop this flag as unnecessary. All devices either always provide ordering
guarantee - in which case it's a no-op - or else support the flag.

REPLY: I prefer to define it; it lets the user know whether fence is supported.

3.
COMMENT: I would suggest v4 to split the patch like (so that we can review and
ack each patch)
1) Only public header file with Doxygen inclusion, (There is a lot of
Doxygen syntax issue in the patch)
2) 1 or more patches for implementation.

REPLY: the v4 is still one patch and with doxygen files; it's better for now
to have just one patch for the header file and implementation.
Later I will push the doxygen file patch and the skeleton file patch.

4.
COMMENT: > +        * @see RTE_DMA_DEV_TO_MEM
> +        * @see RTE_DMA_DEV_TO_DEV
Since we can set of only one direction per vchan . Should be we make
it as enum to
make it clear.

REPLY: Some devices may support it. I think it's OK to keep for future use.

5.
COMMENT: > +__rte_experimental
> +int
> +rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);
I would like to remove this to align with other device class in DPDK and use
configure and start again if there change in vchannel setup/

REPLY: I think we could have the ability to dynamically reconfigure a vchan without stopping the device.

6.
COMMENT: > +
> +#define RTE_DMADEV_ALL_VCHAN   0xFFFFu
RTE_DMADEV_VCHAN_ALL ??

REPLY: I don't like reserving a fixed-length array of queue stats as in 'struct rte_eth_stats',
which can run into the RTE_ETHDEV_QUEUE_STAT_CNTRS too-short problem.
So here I define the API so it can get the stats of one vchan or of ALL vchans.
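
For illustration, a minimal sketch of that usage, based only on the
rte_dmadev_stats_get() prototype and the RTE_DMADEV_ALL_VCHAN sentinel shown
in this patch (the stats fields themselves are not touched here):

#include <rte_dmadev.h>

static int
show_vchan_and_dev_stats(uint16_t dev_id)
{
	struct rte_dmadev_stats stats;
	int ret;

	/* stats of a single virtual DMA channel (vchan 0 here) */
	ret = rte_dmadev_stats_get(dev_id, 0, &stats);
	if (ret != 0)
		return ret;

	/* aggregate stats of all vchans, using the ALL-vchans sentinel */
	return rte_dmadev_stats_get(dev_id, RTE_DMADEV_ALL_VCHAN, &stats);
}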

7.
COMMENT: > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, const struct rte_dma_sg *sg,
> +                  uint64_t flags)
In order to avoid population of rte_dma_sg in stack (as it is
fastpath), I would like
to change the API as
rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, struct rte_dma_sge
*src,  struct rte_dma_sge *dst,   uint16_t nb_src, uint16_t nb_dst,
uint64_t flags)

REPLY: Too many (7) parameters if it separates. I prefer define a struct wrap it.

8.
COMMENT: change RTE_LOG_REGISTER(rte_dmadev_logtype, lib.dmadev, INFO); to
RTE_LOG_REGISTER_DEFAULT, and change rte_dmadev_logtype to logtype.

REPLY: Our CI version still don't support RTE_LOG_REGISTER_DEFAULT (have not sync newest version),
I think it could fix in next version or patch.
and because RTE_LOG_REGISTER define the rte_dmadev_logtype as a un-static variable, if we
change to logtype, there maybe namespace conflict, so I think it's ok to retain the original
variables.

thanks.

[snip]

On 2021/7/15 23:41, Chengwen Feng wrote:
> This patch introduce 'dmadevice' which is a generic type of DMA
> device.
> 
> The APIs of dmadev library exposes some generic operations which can
> enable configuration and I/O with the DMA devices.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
> v4:
> * replace xxx_complete_fails with xxx_completed_status.
> * add SILENT capability, also a silent_mode in rte_dmadev_conf.
> * add op_flag_llc for performance.
> * rename dmadev_xxx_t to rte_dmadev_xxx_t to avoid namespace conflict.
> * delete filed 'enqueued_count' from rte_dmadev_stats.
> * make rte_dmadev hold 'dev_private' filed.
> * add RTE_DMA_STATUS_NOT_ATTEMPED status code.
> * rename RTE_DMA_STATUS_ACTIVE_DROP to RTE_DMA_STATUS_USER_ABORT.
> * rename rte_dma_sg(e) to rte_dmadev_sg(e) to make sure all struct
>   prefix with rte_dmadev.
> * put the comment afterwards.
> * fix some doxgen problem.
> * delete macro RTE_DMADEV_VALID_DEV_ID_OR_RET and
>   RTE_DMADEV_PTR_OR_ERR_RET.
> * replace strlcpy with rte_strscpy.
> * other minor modifications from review comment.
> v3:

[snip]
  
Bruce Richardson July 15, 2021, 4:33 p.m. UTC | #2
On Fri, Jul 16, 2021 at 12:04:33AM +0800, fengchengwen wrote:
> @burce, jerin  Some unmodified review comments are returned here:
> 
> 1.
> COMMENT: > +			memset(dmadev_shared_data->data, 0,
> > +			       sizeof(dmadev_shared_data->data));
> I believe all memzones are zero on allocation anyway, so this memset is
> unecessary and can be dropped.
> 
> REPLY: I didn't find a comment to clear with this function.

Ok. No big deal either way.

> 2.  COMMENT: > + * @see struct rte_dmadev_info::dev_capa
> > + */
> Drop this flag as unnecessary. All devices either always provide ordering
> guarantee - in which case it's a no-op - or else support the flag.
> 
> REPLY: I prefer define it, it could let user know whether support fence.
> 
I don't see it that way. The flag is pointless because the application
can't use it to make any decisions. If two operations require ordering the
application must use the fence flag because if the device doesn't guarantee
ordering it's necessary, and if the device does guarantee ordering it's
better and easier to just specify the flag than to put in code branches.
Having this as a capability is just going to confuse the user - better to
just say that if you need ordering, put in a fence.
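
For illustration, a minimal sketch of that usage pattern; the exact
rte_dmadev_copy() prototype is not shown in this excerpt, so the
(dev_id, vchan, src, dst, length, flags) form and the flag names
RTE_DMA_OP_FLAG_FENCE / RTE_DMA_OP_FLAG_SUBMIT are assumed from this thread:

#include <rte_dmadev.h>

/* Enqueue two copies where the second must start only after the first
 * completes. On hardware that already executes jobs in order the fence
 * flag is simply a no-op; on out-of-order hardware it enforces ordering.
 */
static int
enqueue_ordered_pair(uint16_t dev_id, uint16_t vchan,
		     rte_iova_t src1, rte_iova_t dst1,
		     rte_iova_t src2, rte_iova_t dst2, uint32_t len)
{
	int ret;

	ret = rte_dmadev_copy(dev_id, vchan, src1, dst1, len, 0);
	if (ret < 0)
		return ret;

	/* fence: process this job only after all previous jobs complete;
	 * SUBMIT rings the doorbell for both jobs in one go.
	 */
	return rte_dmadev_copy(dev_id, vchan, src2, dst2, len,
			       RTE_DMA_OP_FLAG_FENCE | RTE_DMA_OP_FLAG_SUBMIT);
}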

> 3.
> COMMENT: I would suggest v4 to split the patch like (so that we can review and
> ack each patch)
> 1) Only public header file with Doxygen inclusion, (There is a lot of
> Doxygen syntax issue in the patch)
> 2) 1 or more patches for implementation.
> 
> REPLY: the V4 still one patch and with doxyen files, It's better now for just
> one patch of header file and implementation.
> Later I will push doxyen file patch and skelton file patch.
> 

It's fine for reviews and testing for now.

> 4.
> COMMENT: > +        * @see RTE_DMA_DEV_TO_MEM
> > +        * @see RTE_DMA_DEV_TO_DEV
> Since we can set of only one direction per vchan . Should be we make
> it as enum to
> make it clear.
> 
> REPLY: May some devices support it. I think it's OK for future use.
> 
+1
That may need a capability flag though, to indicate if a device supports
multi-direction in a single vchannel.

> 5.
> COMMENT: > +__rte_experimental
> > +int
> > +rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);
> I would like to remove this to align with other device class in DPDK and use
> configure and start again if there change in vchannel setup/
> 
> REPLY: I think we could have dynamic reconfig vchan without stop device ability.
> 
> 6.
> COMMENT: > +
> > +#define RTE_DMADEV_ALL_VCHAN   0xFFFFu
> RTE_DMADEV_VCHAN_ALL ??
> 
> REPLY: I don't like reserved a fix length queue stats in 'struct rte_eth_stats',
> which may counter the RTE_ETHDEV_QUEUE_STAT_CNTRS too short problem.
> So here I define the API could get one or ALL vchans stats.
> 
> 7.
> COMMENT: > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, const struct rte_dma_sg *sg,
> > +                  uint64_t flags)
> In order to avoid population of rte_dma_sg in stack (as it is
> fastpath), I would like
> to change the API as
> rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, struct rte_dma_sge
> *src,  struct rte_dma_sge *dst,   uint16_t nb_src, uint16_t nb_dst,
> uint64_t flags)
> 
> REPLY: Too many (7) parameters if it separates. I prefer define a struct wrap it.
> 
> 8.
> COMMENT: change RTE_LOG_REGISTER(rte_dmadev_logtype, lib.dmadev, INFO); to
> RTE_LOG_REGISTER_DEFAULT, and change rte_dmadev_logtype to logtype.
> 
> REPLY: Our CI version still don't support RTE_LOG_REGISTER_DEFAULT (have not sync newest version),
> I think it could fix in next version or patch.
> and because RTE_LOG_REGISTER define the rte_dmadev_logtype as a un-static variable, if we
> change to logtype, there maybe namespace conflict, so I think it's ok to retain the original
> variables.
> 
Ok on the variable naming. I think before merge, the macro will need to be
updated to REGISTER_DEFAULT, but it's something very minor.

> thanks.
> 
> [snip]

Thanks,
/Bruce
  
Chengwen Feng July 16, 2021, 3:04 a.m. UTC | #3
On 2021/7/16 0:33, Bruce Richardson wrote:
> On Fri, Jul 16, 2021 at 12:04:33AM +0800, fengchengwen wrote:
>> @burce, jerin  Some unmodified review comments are returned here:
>>

[snip]

> 
>> 2.  COMMENT: > + * @see struct rte_dmadev_info::dev_capa
>>> + */
>> Drop this flag as unnecessary. All devices either always provide ordering
>> guarantee - in which case it's a no-op - or else support the flag.
>>
>> REPLY: I prefer define it, it could let user know whether support fence.
>>
> I don't see it that way. The flag is pointless because the application
> can't use it to make any decisions. If two operations require ordering the
> application must use the fence flag because if the device doesn't guarantee
> ordering it's necessary, and if the device does guarantee ordering it's
> better and easier to just specify the flag than to put in code branches.
> Having this as a capability is just going to confuse the user - better to
> just say that if you need ordering, put in a fence.
> 

If the driver doesn't support fence and the application sets the fence flag, what
is the driver's behavior? Return an error, or implement the fence at the driver layer?

If the fence capability is exposed to the application, then the application can decide
which option to use, e.g. commit the operations before the fence and make sure they complete,
or use another collaborative approach.

I think in this manner, the driver implementation can be simplified.

[snip]

>> 4.
>> COMMENT: > +        * @see RTE_DMA_DEV_TO_MEM
>>> +        * @see RTE_DMA_DEV_TO_DEV
>> Since we can set of only one direction per vchan . Should be we make
>> it as enum to
>> make it clear.
>>
>> REPLY: May some devices support it. I think it's OK for future use.
>>
> +1
> That may need a capability flag though, to indicate if a device supports
> multi-direction in a single vchannel.

There are a lot of combinations, and I tend not to add a capability flag for multi-direction.
Currently, no device supports multiple directions, so can we delay that definition?

[snip]

thanks
  
Bruce Richardson July 16, 2021, 9:50 a.m. UTC | #4
On Fri, Jul 16, 2021 at 11:04:30AM +0800, fengchengwen wrote:
> On 2021/7/16 0:33, Bruce Richardson wrote:
> > On Fri, Jul 16, 2021 at 12:04:33AM +0800, fengchengwen wrote:
> >> @burce, jerin  Some unmodified review comments are returned here:
> >>
> 
> [snip]
> 
> > 
> >> 2.  COMMENT: > + * @see struct rte_dmadev_info::dev_capa
> >>> + */
> >> Drop this flag as unnecessary. All devices either always provide ordering
> >> guarantee - in which case it's a no-op - or else support the flag.
> >>
> >> REPLY: I prefer define it, it could let user know whether support fence.
> >>
> > I don't see it that way. The flag is pointless because the application
> > can't use it to make any decisions. If two operations require ordering the
> > application must use the fence flag because if the device doesn't guarantee
> > ordering it's necessary, and if the device does guarantee ordering it's
> > better and easier to just specify the flag than to put in code branches.
> > Having this as a capability is just going to confuse the user - better to
> > just say that if you need ordering, put in a fence.
> > 
> 
> If driver don't support fence, and application set the fence flag, What's
> driving behavior like? return error or implement fence at driver layer ?
> 
The simple matter is that all devices must support ordering. Therefore all
devices must support the fence flag. If a device does all jobs in-order
anyway, then fence flag can be ignored, while if jobs are done in parallel
or out-of-order, then fence flag must be respected. A driver should never
return error just because a fence flag is set.

> If expose the fence capability to application, then application could decide
> which option to use. e.g. commit the operations before the fence and make sure it completes,
> or use another collaborative approach.
> 
> I think in this manner, the driver implementation can be simplified.
> 
What you are describing is not a fence capability, but a limitation in the
hardware's ability to enforce ordering. Even if we don't have a fence, there is no
guarantee that one burst of jobs committed will complete before the next is
started as some HW can do batches in parallel in some circumstances.

As far as I know, all currently proposed HW using this API either works
in-order or supports fencing, so this capability flag is unnecessary. I
suggest we omit it until it is needed - at which point we can re-open the
discussion based on a concrete usecase.

If you *really* want to have the flag, I suggest:
* have one for fencing doesn't order, i.e. fence flag is ignored and the
  device does not work in-order
* make it clear in the documentation that even if fencing doesn't order
  it's still not an error to put in a fence flag - it will just be ignored.

/Bruce
  
Jerin Jacob July 16, 2021, 12:34 p.m. UTC | #5
On Fri, Jul 16, 2021 at 3:20 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Fri, Jul 16, 2021 at 11:04:30AM +0800, fengchengwen wrote:
> > On 2021/7/16 0:33, Bruce Richardson wrote:
> > > On Fri, Jul 16, 2021 at 12:04:33AM +0800, fengchengwen wrote:
> > >> @burce, jerin  Some unmodified review comments are returned here:
> > >>
> >
> > [snip]
> >
> > >
> > >> 2.  COMMENT: > + * @see struct rte_dmadev_info::dev_capa
> > >>> + */
> > >> Drop this flag as unnecessary. All devices either always provide ordering
> > >> guarantee - in which case it's a no-op - or else support the flag.
> > >>
> > >> REPLY: I prefer define it, it could let user know whether support fence.
> > >>
> > > I don't see it that way. The flag is pointless because the application
> > > can't use it to make any decisions. If two operations require ordering the
> > > application must use the fence flag because if the device doesn't guarantee
> > > ordering it's necessary, and if the device does guarantee ordering it's
> > > better and easier to just specify the flag than to put in code branches.
> > > Having this as a capability is just going to confuse the user - better to
> > > just say that if you need ordering, put in a fence.
> > >
> >
> > If driver don't support fence, and application set the fence flag, What's
> > driving behavior like? return error or implement fence at driver layer ?
> >
> The simple matter is that all devices must support ordering. Therefore all
> devices must support the fence flag. If a device does all jobs in-order
> anyway, then fence flag can be ignored, while if jobs are done in parallel
> or out-of-order, then fence flag must be respected. A driver should never
> return error just because a fence flag is set.
>
> > If expose the fence capability to application, then application could decide
> > which option to use. e.g. commit the operations before the fence and make sure it completes,
> > or use another collaborative approach.
> >
> > I think in this manner, the driver implementation can be simplified.
> >
> What you are describing is not a fence capability, but a limitation of the
> hardware to enforce ordering. Even if we don't have fence, there is no
> guarantee that one burst of jobs committed will complete before the next is
> started as some HW can do batches in parallel in some circumstances.
>
> As far as I know, all currently proposed HW using this API either works
> in-order or supports fencing, so this capability flag is unnecessary. I
> suggest we omit it until it is needed - at which point we can re-open the
> discussion based on a concrete usecase.
>
> If you *really* want to have the flag, I suggest:
> * have one for fencing doesn't order, i.e. fence flag is ignored and the
>   device does not work in-order
> * make it clear in the documentation that even if fencing doesn't order
>   it's still not an error to put in a fence flag - it will just be ignored.

Since we have only two classes of devices:
1) a fence is implicitly there between transfers by default
2) a fence needs to be explicitly added.

Since there is no device that does not support the fence, IMO, as
Bruce suggested, we don't need a fence capability flag.
In class (1) devices, for fastpath functions, RTE_DMA_OP_FLAG_FENCE will be a NOP.



>
> /Bruce
  
Jerin Jacob July 16, 2021, 12:40 p.m. UTC | #6
On Fri, Jul 16, 2021 at 8:34 AM fengchengwen <fengchengwen@huawei.com> wrote:
>
> On 2021/7/16 0:33, Bruce Richardson wrote:
> > On Fri, Jul 16, 2021 at 12:04:33AM +0800, fengchengwen wrote:
> >> @burce, jerin  Some unmodified review comments are returned here:
> >>
>
> [snip]
>
> >
> >> 2.  COMMENT: > + * @see struct rte_dmadev_info::dev_capa
> >>> + */
> >> Drop this flag as unnecessary. All devices either always provide ordering
> >> guarantee - in which case it's a no-op - or else support the flag.
> >>
> >> REPLY: I prefer define it, it could let user know whether support fence.
> >>
> > I don't see it that way. The flag is pointless because the application
> > can't use it to make any decisions. If two operations require ordering the
> > application must use the fence flag because if the device doesn't guarantee
> > ordering it's necessary, and if the device does guarantee ordering it's
> > better and easier to just specify the flag than to put in code branches.
> > Having this as a capability is just going to confuse the user - better to
> > just say that if you need ordering, put in a fence.
> >
>
> If driver don't support fence, and application set the fence flag, What's
> driving behavior like? return error or implement fence at driver layer ?
>
> If expose the fence capability to application, then application could decide
> which option to use. e.g. commit the operations before the fence and make sure it completes,
> or use another collaborative approach.
>
> I think in this manner, the driver implementation can be simplified.
>
> [snip]
>
> >> 4.
> >> COMMENT: > +        * @see RTE_DMA_DEV_TO_MEM
> >>> +        * @see RTE_DMA_DEV_TO_DEV
> >> Since we can set of only one direction per vchan . Should be we make
> >> it as enum to
> >> make it clear.
> >>
> >> REPLY: May some devices support it. I think it's OK for future use.
> >>
> > +1
> > That may need a capability flag though, to indicate if a device supports
> > multi-direction in a single vchannel.
>
> There are a lot of combinations, and I tend not to add a capability for multi-direction.
> Currently, no device supports multiple directions, So can we delay that definition?

Yes. IMO, we need to change the comment  from "Set of supported
transfer directions" to
"Transfer direction"

If channel supports multiple directions, then in fastpath we need
another flag to select
which direction to use.

IMO, Since none of the devices supports this mode, I think, we should
change the comment
to "Transfer direction"

>
> [snip]
>
> thanks
  
Bruce Richardson July 16, 2021, 12:48 p.m. UTC | #7
On Fri, Jul 16, 2021 at 06:10:48PM +0530, Jerin Jacob wrote:
> On Fri, Jul 16, 2021 at 8:34 AM fengchengwen <fengchengwen@huawei.com> wrote:
> >
> > On 2021/7/16 0:33, Bruce Richardson wrote:
> > > On Fri, Jul 16, 2021 at 12:04:33AM +0800, fengchengwen wrote:
> > >> @burce, jerin  Some unmodified review comments are returned here:
> > >>
> >
> > [snip]
> >
> > >
> > >> 2.  COMMENT: > + * @see struct rte_dmadev_info::dev_capa
> > >>> + */
> > >> Drop this flag as unnecessary. All devices either always provide ordering
> > >> guarantee - in which case it's a no-op - or else support the flag.
> > >>
> > >> REPLY: I prefer define it, it could let user know whether support fence.
> > >>
> > > I don't see it that way. The flag is pointless because the application
> > > can't use it to make any decisions. If two operations require ordering the
> > > application must use the fence flag because if the device doesn't guarantee
> > > ordering it's necessary, and if the device does guarantee ordering it's
> > > better and easier to just specify the flag than to put in code branches.
> > > Having this as a capability is just going to confuse the user - better to
> > > just say that if you need ordering, put in a fence.
> > >
> >
> > If driver don't support fence, and application set the fence flag, What's
> > driving behavior like? return error or implement fence at driver layer ?
> >
> > If expose the fence capability to application, then application could decide
> > which option to use. e.g. commit the operations before the fence and make sure it completes,
> > or use another collaborative approach.
> >
> > I think in this manner, the driver implementation can be simplified.
> >
> > [snip]
> >
> > >> 4.
> > >> COMMENT: > +        * @see RTE_DMA_DEV_TO_MEM
> > >>> +        * @see RTE_DMA_DEV_TO_DEV
> > >> Since we can set of only one direction per vchan . Should be we make
> > >> it as enum to
> > >> make it clear.
> > >>
> > >> REPLY: May some devices support it. I think it's OK for future use.
> > >>
> > > +1
> > > That may need a capability flag though, to indicate if a device supports
> > > multi-direction in a single vchannel.
> >
> > There are a lot of combinations, and I tend not to add a capability for multi-direction.
> > Currently, no device supports multiple directions, So can we delay that definition?
> 
> Yes. IMO, we need to change the comment  from "Set of supported
> transfer directions" to
> "Transfer direction"
> 
> If channel supports multiple directions, then in fastpath we need
> another flag to select
> which direction to use.
> 
> IMO, Since none of the devices supports this mode, I think, we should
> change the comment
> to "Transfer direction"
> 
Ok for me.
  
Jerin Jacob July 16, 2021, 12:54 p.m. UTC | #8
On Thu, Jul 15, 2021 at 9:34 PM fengchengwen <fengchengwen@huawei.com> wrote:
>
> @burce, jerin  Some unmodified review comments are returned here:
>
> 1.
> COMMENT: > +                    memset(dmadev_shared_data->data, 0,
> > +                            sizeof(dmadev_shared_data->data));
> I believe all memzones are zero on allocation anyway, so this memset is
> unecessary and can be dropped.
>
> REPLY: I didn't find a comment to clear with this function.
>
> 2.
> COMMENT: > + * @see struct rte_dmadev_info::dev_capa
> > + */
> Drop this flag as unnecessary. All devices either always provide ordering
> guarantee - in which case it's a no-op - or else support the flag.
>
> REPLY: I prefer define it, it could let user know whether support fence.
>
> 3.
> COMMENT: I would suggest v4 to split the patch like (so that we can review and
> ack each patch)
> 1) Only public header file with Doxygen inclusion, (There is a lot of
> Doxygen syntax issue in the patch)
> 2) 1 or more patches for implementation.
>
> REPLY: the V4 still one patch and with doxyen files, It's better now for just
> one patch of header file and implementation.
> Later I will push doxyen file patch and skelton file patch.

Makes sense. Whichever version you would like to have acked, please
split it in that version.

>
> 4.
> COMMENT: > +        * @see RTE_DMA_DEV_TO_MEM
> > +        * @see RTE_DMA_DEV_TO_DEV
> Since we can set of only one direction per vchan . Should be we make
> it as enum to
> make it clear.
>
> REPLY: May some devices support it. I think it's OK for future use.

Sent comment in another thread.

>
> 5.
> COMMENT: > +__rte_experimental
> > +int
> > +rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);
> I would like to remove this to align with other device class in DPDK and use
> configure and start again if there change in vchannel setup/
>
> REPLY: I think we could have dynamic reconfig vchan without stop device ability.

I think, in general, drivers do a lot of work in _start(); this gives the
driver enough flexibility (like getting a complete view of HW resources at
start()). IMO, there is no need to deviate from the other classes of DPDK
devices here. Also, release is a slow-path operation, so adding a few more
calls for reconfiguring is OK, as in other DPDK device classes.
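
To make that flow concrete, a rough sketch using only the control-path APIs
shown in this patch (rte_dmadev_vchan_conf fields other than 'direction' and
'nb_desc' are not visible here, so the initializer is deliberately minimal):

#include <rte_dmadev.h>

/* Change a vchan's setup the way other DPDK device classes do it:
 * stop, reconfigure, set the vchan up again, restart.
 */
static int
reconfigure_vchan(uint16_t dev_id, uint16_t nb_desc)
{
	struct rte_dmadev_conf conf = { .max_vchans = 1 };
	struct rte_dmadev_vchan_conf vconf = {
		.direction = RTE_DMA_DIR_MEM_TO_MEM,
		.nb_desc = nb_desc,
	};
	int ret;

	ret = rte_dmadev_stop(dev_id);		/* dataplane must be quiesced first */
	if (ret != 0)
		return ret;
	ret = rte_dmadev_configure(dev_id, &conf);
	if (ret != 0)
		return ret;
	ret = rte_dmadev_vchan_setup(dev_id, &vconf);
	if (ret < 0)
		return ret;
	return rte_dmadev_start(dev_id);	/* driver re-reads HW resources here */
}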



>
> 6.
> COMMENT: > +
> > +#define RTE_DMADEV_ALL_VCHAN   0xFFFFu
> RTE_DMADEV_VCHAN_ALL ??
>
> REPLY: I don't like reserved a fix length queue stats in 'struct rte_eth_stats',
> which may counter the RTE_ETHDEV_QUEUE_STAT_CNTRS too short problem.
> So here I define the API could get one or ALL vchans stats.

I meant the subsystem name should come as the third item, i.e. RTE_DMADEV_VCHAN_ALL
vs RTE_DMADEV_ALL_VCHAN, where ALL is not a subsystem;
i.e. RTE_DMADEV_<subsystem in DMA>_.._ACTION/VERB/etc.




>
> 7.
> COMMENT: > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, const struct rte_dma_sg *sg,
> > +                  uint64_t flags)
> In order to avoid population of rte_dma_sg in stack (as it is
> fastpath), I would like
> to change the API as
> rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, struct rte_dma_sge
> *src,  struct rte_dma_sge *dst,   uint16_t nb_src, uint16_t nb_dst,
> uint64_t flags)
>
> REPLY: Too many (7) parameters if it separates. I prefer define a struct wrap it.

I agree if it is a slow path function. However,

I strongly prefer not to have one more stack indirection in the fastpath,
and also we will not be adding new arguments to this function in the
future. So it is better as:

rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, struct rte_dma_sge
*src, struct rte_dma_sge *dst, uint16_t nb_src, uint16_t nb_dst,
uint64_t flags)
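
For reference, a hypothetical call using that flat-argument form; the body of
struct rte_dma_sge is not shown in this thread, so the 'addr' and 'length'
field names below are assumptions, not the actual definition:

static int
gather_two_into_one(uint16_t dev_id, uint16_t vchan,
		    rte_iova_t s0, rte_iova_t s1, rte_iova_t d, uint64_t flags)
{
	/* 'addr'/'length' field names are assumed, see note above */
	struct rte_dma_sge src[2] = {
		{ .addr = s0, .length = 64 },
		{ .addr = s1, .length = 64 },
	};
	struct rte_dma_sge dst[1] = {
		{ .addr = d, .length = 128 },
	};

	/* asymmetric copy: 2 source segments into 1 destination segment */
	return rte_dmadev_copy_sg(dev_id, vchan, src, dst, 2, 1, flags);
}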



>
> 8.
> COMMENT: change RTE_LOG_REGISTER(rte_dmadev_logtype, lib.dmadev, INFO); to
> RTE_LOG_REGISTER_DEFAULT, and change rte_dmadev_logtype to logtype.
>
> REPLY: Our CI version still don't support RTE_LOG_REGISTER_DEFAULT (have not sync newest version),
> I think it could fix in next version or patch.
> and because RTE_LOG_REGISTER define the rte_dmadev_logtype as a un-static variable, if we
> change to logtype, there maybe namespace conflict, so I think it's ok to retain the original
> variables.
>
> thanks.
>
> [snip]
>
> On 2021/7/15 23:41, Chengwen Feng wrote:
> > This patch introduce 'dmadevice' which is a generic type of DMA
> > device.
> >
> > The APIs of dmadev library exposes some generic operations which can
> > enable configuration and I/O with the DMA devices.
> >
> > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > ---
> > v4:
> > * replace xxx_complete_fails with xxx_completed_status.
> > * add SILENT capability, also a silent_mode in rte_dmadev_conf.
> > * add op_flag_llc for performance.
> > * rename dmadev_xxx_t to rte_dmadev_xxx_t to avoid namespace conflict.
> > * delete filed 'enqueued_count' from rte_dmadev_stats.
> > * make rte_dmadev hold 'dev_private' filed.
> > * add RTE_DMA_STATUS_NOT_ATTEMPED status code.
> > * rename RTE_DMA_STATUS_ACTIVE_DROP to RTE_DMA_STATUS_USER_ABORT.
> > * rename rte_dma_sg(e) to rte_dmadev_sg(e) to make sure all struct
> >   prefix with rte_dmadev.
> > * put the comment afterwards.
> > * fix some doxgen problem.
> > * delete macro RTE_DMADEV_VALID_DEV_ID_OR_RET and
> >   RTE_DMADEV_PTR_OR_ERR_RET.
> > * replace strlcpy with rte_strscpy.
> > * other minor modifications from review comment.
> > v3:
>
> [snip]
  

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index af2a91d..e01a07f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -495,6 +495,10 @@  F: drivers/raw/skeleton/
 F: app/test/test_rawdev.c
 F: doc/guides/prog_guide/rawdev.rst
 
+DMA device API - EXPERIMENTAL
+M: Chengwen Feng <fengchengwen@huawei.com>
+F: lib/dmadev/
+
 
 Memory Pool Drivers
 -------------------
diff --git a/config/rte_config.h b/config/rte_config.h
index 590903c..331a431 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -81,6 +81,9 @@ 
 /* rawdev defines */
 #define RTE_RAWDEV_MAX_DEVS 64
 
+/* dmadev defines */
+#define RTE_DMADEV_MAX_DEVS 64
+
 /* ip_fragmentation defines */
 #define RTE_LIBRTE_IP_FRAG_MAX_FRAG 4
 #undef RTE_LIBRTE_IP_FRAG_TBL_STAT
diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build
new file mode 100644
index 0000000..d2fc85e
--- /dev/null
+++ b/lib/dmadev/meson.build
@@ -0,0 +1,7 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 HiSilicon Limited.
+
+sources = files('rte_dmadev.c')
+headers = files('rte_dmadev.h')
+indirect_headers += files('rte_dmadev_core.h')
+driver_sdk_headers += files('rte_dmadev_pmd.h')
diff --git a/lib/dmadev/rte_dmadev.c b/lib/dmadev/rte_dmadev.c
new file mode 100644
index 0000000..e6ecb49
--- /dev/null
+++ b/lib/dmadev/rte_dmadev.c
@@ -0,0 +1,539 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ * Copyright(c) 2021 Intel Corporation.
+ */
+
+#include <ctype.h>
+#include <inttypes.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_debug.h>
+#include <rte_dev.h>
+#include <rte_eal.h>
+#include <rte_errno.h>
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memzone.h>
+#include <rte_malloc.h>
+#include <rte_string_fns.h>
+
+#include "rte_dmadev.h"
+#include "rte_dmadev_pmd.h"
+
+struct rte_dmadev rte_dmadevices[RTE_DMADEV_MAX_DEVS];
+
+static const char *MZ_RTE_DMADEV_DATA = "rte_dmadev_data";
+/* Shared memory between primary and secondary processes. */
+static struct {
+	struct rte_dmadev_data data[RTE_DMADEV_MAX_DEVS];
+} *dmadev_shared_data;
+
+RTE_LOG_REGISTER(rte_dmadev_logtype, lib.dmadev, INFO);
+#define RTE_DMADEV_LOG(level, ...) \
+	rte_log(RTE_LOG_ ## level, rte_dmadev_logtype, "" __VA_ARGS__)
+
+/* Macros to check for valid device id */
+#define RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, retval) do { \
+	if (!rte_dmadev_is_valid_dev(dev_id)) { \
+		RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
+		return retval; \
+	} \
+} while (0)
+
+static int
+dmadev_check_name(const char *name)
+{
+	size_t name_len;
+
+	if (name == NULL) {
+		RTE_DMADEV_LOG(ERR, "Name can't be NULL\n");
+		return -EINVAL;
+	}
+
+	name_len = strnlen(name, RTE_DMADEV_NAME_MAX_LEN);
+	if (name_len == 0) {
+		RTE_DMADEV_LOG(ERR, "Zero length DMA device name\n");
+		return -EINVAL;
+	}
+	if (name_len >= RTE_DMADEV_NAME_MAX_LEN) {
+		RTE_DMADEV_LOG(ERR, "DMA device name is too long\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static uint16_t
+dmadev_find_free_dev(void)
+{
+	uint16_t i;
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if (dmadev_shared_data->data[i].dev_name[0] == '\0') {
+			RTE_ASSERT(rte_dmadevices[i].state ==
+				   RTE_DMADEV_UNUSED);
+			return i;
+		}
+	}
+
+	return RTE_DMADEV_MAX_DEVS;
+}
+
+static struct rte_dmadev*
+dmadev_find(const char *name)
+{
+	uint16_t i;
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if ((rte_dmadevices[i].state == RTE_DMADEV_ATTACHED) &&
+		    (!strcmp(name, rte_dmadevices[i].data->dev_name)))
+			return &rte_dmadevices[i];
+	}
+
+	return NULL;
+}
+
+static int
+dmadev_shared_data_prepare(void)
+{
+	const struct rte_memzone *mz;
+
+	if (dmadev_shared_data == NULL) {
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			/* Allocate port data and ownership shared memory. */
+			mz = rte_memzone_reserve(MZ_RTE_DMADEV_DATA,
+					 sizeof(*dmadev_shared_data),
+					 rte_socket_id(), 0);
+		} else
+			mz = rte_memzone_lookup(MZ_RTE_DMADEV_DATA);
+		if (mz == NULL)
+			return -ENOMEM;
+
+		dmadev_shared_data = mz->addr;
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+			memset(dmadev_shared_data->data, 0,
+			       sizeof(dmadev_shared_data->data));
+	}
+
+	return 0;
+}
+
+static struct rte_dmadev *
+dmadev_allocate(const char *name)
+{
+	struct rte_dmadev *dev;
+	uint16_t dev_id;
+
+	dev = dmadev_find(name);
+	if (dev != NULL) {
+		RTE_DMADEV_LOG(ERR, "DMA device already allocated\n");
+		return NULL;
+	}
+
+	dev_id = dmadev_find_free_dev();
+	if (dev_id == RTE_DMADEV_MAX_DEVS) {
+		RTE_DMADEV_LOG(ERR, "Reached maximum number of DMA devices\n");
+		return NULL;
+	}
+
+	if (dmadev_shared_data_prepare() != 0) {
+		RTE_DMADEV_LOG(ERR, "Cannot allocate DMA shared data\n");
+		return NULL;
+	}
+
+	dev = &rte_dmadevices[dev_id];
+	dev->data = &dmadev_shared_data->data[dev_id];
+	dev->data->dev_id = dev_id;
+	rte_strscpy(dev->data->dev_name, name, sizeof(dev->data->dev_name));
+
+	return dev;
+}
+
+static struct rte_dmadev *
+dmadev_attach_secondary(const char *name)
+{
+	struct rte_dmadev *dev;
+	uint16_t i;
+
+	if (dmadev_shared_data_prepare() != 0) {
+		RTE_DMADEV_LOG(ERR, "Cannot allocate DMA shared data\n");
+		return NULL;
+	}
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if (!strcmp(dmadev_shared_data->data[i].dev_name, name))
+			break;
+	}
+	if (i == RTE_DMADEV_MAX_DEVS) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %s is not driven by the primary process\n",
+			name);
+		return NULL;
+	}
+
+	dev = &rte_dmadevices[i];
+	dev->data = &dmadev_shared_data->data[i];
+	RTE_ASSERT(dev->data->dev_id == i);
+	dev->dev_private = dev->data->dev_private;
+
+	return dev;
+}
+
+struct rte_dmadev *
+rte_dmadev_pmd_allocate(const char *name)
+{
+	struct rte_dmadev *dev;
+
+	if (dmadev_check_name(name) != 0)
+		return NULL;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		dev = dmadev_allocate(name);
+	else
+		dev = dmadev_attach_secondary(name);
+
+	if (dev == NULL)
+		return NULL;
+	dev->state = RTE_DMADEV_ATTACHED;
+
+	return dev;
+}
+
+int
+rte_dmadev_pmd_release(struct rte_dmadev *dev)
+{
+	void *dev_private_bak;
+
+	if (dev == NULL)
+		return -EINVAL;
+
+	if (dev->state == RTE_DMADEV_UNUSED)
+		return 0;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		memset(dev->data, 0, sizeof(struct rte_dmadev_data));
+
+	dev_private_bak = dev->dev_private;
+	memset(dev, 0, sizeof(struct rte_dmadev));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		dev->dev_private = dev_private_bak;
+	dev->state = RTE_DMADEV_UNUSED;
+
+	return 0;
+}
+
+struct rte_dmadev *
+rte_dmadev_get_device_by_name(const char *name)
+{
+	if (dmadev_check_name(name) != 0)
+		return NULL;
+	return dmadev_find(name);
+}
+
+bool
+rte_dmadev_is_valid_dev(uint16_t dev_id)
+{
+	return (dev_id < RTE_DMADEV_MAX_DEVS) &&
+		rte_dmadevices[dev_id].state == RTE_DMADEV_ATTACHED;
+}
+
+uint16_t
+rte_dmadev_count(void)
+{
+	uint16_t count = 0;
+	uint16_t i;
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if (rte_dmadevices[i].state == RTE_DMADEV_ATTACHED)
+			count++;
+	}
+
+	return count;
+}
+
+int
+rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info)
+{
+	const struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	if (dev_info == NULL)
+		return -EINVAL;
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_info_get, -ENOTSUP);
+	memset(dev_info, 0, sizeof(struct rte_dmadev_info));
+	ret = (*dev->dev_ops->dev_info_get)(dev, dev_info,
+					    sizeof(struct rte_dmadev_info));
+	if (ret != 0)
+		return ret;
+
+	dev_info->device = dev->device;
+	dev_info->nb_vchans = dev->data->dev_conf.max_vchans;
+
+	return 0;
+}
+
+int
+rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	struct rte_dmadev_info info;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	if (dev_conf == NULL)
+		return -EINVAL;
+
+	ret = rte_dmadev_info_get(dev_id, &info);
+	if (ret != 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u get device info fail\n", dev_id);
+		return -EINVAL;
+	}
+	if (dev_conf->max_vchans > info.max_vchans) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u configure too many vchans\n", dev_id);
+		return -EINVAL;
+	}
+
+	if (dev->data->dev_started != 0) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u must be stopped to allow configuration\n",
+			dev_id);
+		return -EBUSY;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_configure, -ENOTSUP);
+	ret = (*dev->dev_ops->dev_configure)(dev, dev_conf);
+	if (ret == 0)
+		memcpy(&dev->data->dev_conf, dev_conf, sizeof(*dev_conf));
+
+	return ret;
+}
+
+int
+rte_dmadev_start(uint16_t dev_id)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+
+	if (dev->data->dev_started != 0) {
+		RTE_DMADEV_LOG(WARNING, "Device %u already started\n", dev_id);
+		return 0;
+	}
+
+	if (dev->dev_ops->dev_start == NULL)
+		goto mark_started;
+
+	ret = (*dev->dev_ops->dev_start)(dev);
+	if (ret != 0)
+		return ret;
+
+mark_started:
+	dev->data->dev_started = 1;
+	return 0;
+}
+
+int
+rte_dmadev_stop(uint16_t dev_id)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+
+	if (dev->data->dev_started == 0) {
+		RTE_DMADEV_LOG(WARNING, "Device %u already stopped\n", dev_id);
+		return 0;
+	}
+
+	if (dev->dev_ops->dev_stop == NULL)
+		goto mark_stopped;
+
+	ret = (*dev->dev_ops->dev_stop)(dev);
+	if (ret != 0)
+		return ret;
+
+mark_stopped:
+	dev->data->dev_started = 0;
+	return 0;
+}
+
+int
+rte_dmadev_close(uint16_t dev_id)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+
+	/* Device must be stopped before it can be closed */
+	if (dev->data->dev_started == 1) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u must be stopped before closing\n", dev_id);
+		return -EBUSY;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_close, -ENOTSUP);
+	return (*dev->dev_ops->dev_close)(dev);
+}
+
+int
+rte_dmadev_vchan_setup(uint16_t dev_id,
+		       const struct rte_dmadev_vchan_conf *conf)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	struct rte_dmadev_info info;
+	uint8_t dir_all = RTE_DMA_DIR_MEM_TO_MEM | RTE_DMA_DIR_MEM_TO_DEV |
+			  RTE_DMA_DIR_DEV_TO_MEM | RTE_DMA_DIR_DEV_TO_DEV;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	if (conf == NULL)
+		return -EINVAL;
+
+	dev = &rte_dmadevices[dev_id];
+
+	ret = rte_dmadev_info_get(dev_id, &info);
+	if (ret != 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u get device info fail\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction == 0 ||
+	    conf->direction & ~dir_all) {
+		RTE_DMADEV_LOG(ERR, "Device %u direction invalid!\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_DIR_MEM_TO_MEM &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MEM_TO_MEM)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u don't support mem2mem transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_DIR_MEM_TO_DEV &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MEM_TO_DEV)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u don't support mem2dev transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_DIR_DEV_TO_MEM &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_DEV_TO_MEM)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u don't support dev2mem transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_DIR_DEV_TO_DEV &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_DEV_TO_DEV)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u don't support dev2dev transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->nb_desc < info.min_desc || conf->nb_desc > info.max_desc) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u number of descriptors invalid\n", dev_id);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vchan_setup, -ENOTSUP);
+	return (*dev->dev_ops->vchan_setup)(dev, conf);
+}
+
+int
+rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u vchan %u out of range\n", dev_id, vchan);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vchan_release, -ENOTSUP);
+	return (*dev->dev_ops->vchan_release)(dev, vchan);
+}
+
+int
+rte_dmadev_stats_get(uint16_t dev_id, uint16_t vchan,
+		     struct rte_dmadev_stats *stats)
+{
+	const struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	if (stats == NULL)
+		return -EINVAL;
+	if (vchan >= dev->data->dev_conf.max_vchans &&
+	    vchan != RTE_DMADEV_ALL_VCHAN) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u vchan %u out of range\n", dev_id, vchan);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
+	return (*dev->dev_ops->stats_get)(dev, vchan, stats,
+					  sizeof(struct rte_dmadev_stats));
+}
+
+int
+rte_dmadev_stats_reset(uint16_t dev_id, uint16_t vchan)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	if (vchan >= dev->data->dev_conf.max_vchans &&
+	    vchan != RTE_DMADEV_ALL_VCHAN) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u vchan %u out of range\n", dev_id, vchan);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_reset, -ENOTSUP);
+	return (*dev->dev_ops->stats_reset)(dev, vchan);
+}
+
+int
+rte_dmadev_dump(uint16_t dev_id, FILE *f)
+{
+	const struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	struct rte_dmadev_info info;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	if (f == NULL)
+		return -EINVAL;
+
+	ret = rte_dmadev_info_get(dev_id, &info);
+	if (ret != 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u get device info fail\n", dev_id);
+		return -EINVAL;
+	}
+
+	fprintf(f, "DMA Dev %u, '%s' [%s]\n",
+		dev->data->dev_id,
+		dev->data->dev_name,
+		dev->data->dev_started ? "started" : "stopped");
+	fprintf(f, "  dev_capa: 0x%" PRIx64 "\n", info.dev_capa);
+	fprintf(f, "  max_vchans_supported: %u\n", info.max_vchans);
+	fprintf(f, "  max_vchans_configured: %u\n", info.nb_vchans);
+
+	if (dev->dev_ops->dev_dump != NULL)
+		return (*dev->dev_ops->dev_dump)(dev, f);
+
+	return 0;
+}
+
+int
+rte_dmadev_selftest(uint16_t dev_id)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_selftest, -ENOTSUP);
+	return (*dev->dev_ops->dev_selftest)(dev_id);
+}
diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
new file mode 100644
index 0000000..5e6f85c
--- /dev/null
+++ b/lib/dmadev/rte_dmadev.h
@@ -0,0 +1,1009 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ * Copyright(c) 2021 Intel Corporation.
+ * Copyright(c) 2021 Marvell International Ltd.
+ * Copyright(c) 2021 SmartShare Systems.
+ */
+
+#ifndef _RTE_DMADEV_H_
+#define _RTE_DMADEV_H_
+
+/**
+ * @file rte_dmadev.h
+ *
+ * RTE DMA (Direct Memory Access) device APIs.
+ *
+ * The DMA framework is built on the following model:
+ *
+ *     ---------------   ---------------       ---------------
+ *     | virtual DMA |   | virtual DMA |       | virtual DMA |
+ *     | channel     |   | channel     |       | channel     |
+ *     ---------------   ---------------       ---------------
+ *            |                |                      |
+ *            ------------------                      |
+ *                     |                              |
+ *               ------------                    ------------
+ *               |  dmadev  |                    |  dmadev  |
+ *               ------------                    ------------
+ *                     |                              |
+ *            ------------------               ------------------
+ *            | HW-DMA-channel |               | HW-DMA-channel |
+ *            ------------------               ------------------
+ *                     |                              |
+ *                     --------------------------------
+ *                                     |
+ *                           ---------------------
+ *                           | HW-DMA-Controller |
+ *                           ---------------------
+ *
+ * The DMA controller could have multiple HW-DMA-channels (aka. HW-DMA-queues);
+ * each HW-DMA-channel should be represented by a dmadev.
+ *
+ * The dmadev could create multiple virtual DMA channels, each virtual DMA
+ * channel represents a different transfer context. A DMA operation request
+ * must be submitted to a virtual DMA channel.
+ * E.G. Application could create virtual DMA channel 0 for mem-to-mem transfer
+ *      scenario, and create virtual DMA channel 1 for mem-to-dev transfer
+ *      scenario.
+ *
+ * A dmadev is dynamically allocated by rte_dmadev_pmd_allocate() during the
+ * PCI/SoC device probing phase performed at EAL initialization time, and can
+ * be released by rte_dmadev_pmd_release() during the PCI/SoC device removal
+ * phase.
+ *
+ * This framework uses 'uint16_t dev_id' as the device identifier of a dmadev,
+ * and 'uint16_t vchan' as the virtual DMA channel identifier in one dmadev.
+ *
+ * The functions exported by the dmadev API to setup a device designated by its
+ * device identifier must be invoked in the following order:
+ *     - rte_dmadev_configure()
+ *     - rte_dmadev_vchan_setup()
+ *     - rte_dmadev_start()
+ *
+ * Then, the application can invoke dataplane APIs to process jobs.
+ *
+ * If the application wants to change the configuration (i.e. invoke
+ * rte_dmadev_configure()), it must invoke rte_dmadev_stop() first to stop the
+ * device and then do the reconfiguration before invoking rte_dmadev_start()
+ * again. The dataplane APIs should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a dmadev by invoking the
+ * rte_dmadev_close() function.
+ *
+ * The dataplane APIs include two parts:
+ *   a) The first part is the submission of operation requests:
+ *        - rte_dmadev_copy()
+ *        - rte_dmadev_copy_sg()
+ *        - rte_dmadev_fill()
+ *        - rte_dmadev_perform()
+ *      These APIs could work with different virtual DMA channels which have
+ *      different contexts.
+ *      The first three APIs are used to submit an operation request to the
+ *      virtual DMA channel; if the submission is successful, a uint16_t
+ *      ring_idx is returned, otherwise a negative number is returned.
+ *      The last API is used to issue a doorbell to the hardware; the flags
+ *      parameter (@see RTE_DMA_OP_FLAG_SUBMIT) of the first three APIs
+ *      can be used to do the same work.
+ *   b) The second part is to obtain the result of requests:
+ *        - rte_dmadev_completed()
+ *            - return the number of operation requests completed successfully.
+ *        - rte_dmadev_completed_fails()
+ *            - return the number of operation requests failed to complete.
+ *
+ * About the ring_idx which rte_dmadev_copy/copy_sg/fill() returned,
+ * the rules are as follows:
+ *   a) ring_idx for each virtual DMA channel are independent.
+ *   b) For a virtual DMA channel, the ring_idx is monotonically incremented;
+ *      when it reaches UINT16_MAX, it wraps back to zero.
+ *   c) This ring_idx can be used by applications to track per-operation
+ *      metadata in an application-defined circular ring.
+ *   d) The initial ring_idx of a virtual DMA channel is zero, after the device
+ *      is stopped, the ring_idx needs to be reset to zero.
+ *   Example:
+ *      step-1: start one dmadev
+ *      step-2: enqueue a copy operation, the ring_idx return is 0
+ *      step-3: enqueue a copy operation again, the ring_idx return is 1
+ *      ...
+ *      step-101: stop the dmadev
+ *      step-102: start the dmadev
+ *      step-103: enqueue a copy operation, the ring_idx return is 0
+ *      ...
+ *      step-x+0: enqueue a fill operation, the ring_idx return is 65535
+ *      step-x+1: enqueue a copy operation, the ring_idx return is 0
+ *      ...
+ *
+ * By default, all the functions of the dmadev API exported by a PMD are
+ * lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object.
+ *
+ */
+
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DMADEV_NAME_MAX_LEN	RTE_DEV_NAME_MAX_LEN
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param dev_id
+ *   DMA device index.
+ *
+ * @return
+ *   - If the device index is valid (true) or not (false).
+ */
+__rte_experimental
+bool
+rte_dmadev_is_valid_dev(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of DMA devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable DMA devices.
+ */
+__rte_experimental
+uint16_t
+rte_dmadev_count(void);
+
+/* Enumerates DMA device capabilities */
+#define RTE_DMA_DEV_CAPA_MEM_TO_MEM	(1ull << 0)
+/**< DMA device supports memory-to-memory transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMA_DEV_CAPA_MEM_TO_DEV	(1ull << 1)
+/**< DMA device supports memory-to-device transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ * @see struct rte_dmadev_port_param::port_type
+ */
+
+#define RTE_DMA_DEV_CAPA_DEV_TO_MEM	(1ull << 2)
+/**< DMA device supports device-to-memory transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ * @see struct rte_dmadev_port_param::port_type
+ */
+
+#define RTE_DMA_DEV_CAPA_DEV_TO_DEV	(1ull << 3)
+/**< DMA device supports device-to-device transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ * @see struct rte_dmadev_port_param::port_type
+ */
+
+#define RTE_DMA_DEV_CAPA_SVA		(1ull << 4)
+/**< DMA device supports SVA which could use VA as DMA address.
+ * If the device supports SVA then the application could pass any VA address,
+ * e.g. memory from rte_malloc(), rte_memzone(), malloc() or the stack.
+ * If the device doesn't support SVA, then the application should pass an IOVA
+ * address obtained from rte_malloc() or rte_memzone().
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMA_DEV_CAPA_SILENT		(1ull << 5)
+/**< DMA device supports working in silent mode.
+ * In this mode, the application is not required to invoke the
+ * rte_dmadev_completed*() APIs.
+ *
+ * @see struct rte_dmadev_conf::silent_mode
+ */
+
+#define RTE_DMA_DEV_CAPA_FENCE		(1ull << 6)
+/**< DMA device supports fence.
+ * If the device supports fence, then the application could set a fence flag
+ * when enqueuing an operation via rte_dma_copy/copy_sg/fill().
+ * If an operation has a fence flag, it means the operation must be processed
+ * only after all previous operations are completed.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMA_DEV_CAPA_OPS_COPY	(1ull << 32)
+/**< DMA device supports copy ops.
+ * This capability starts at bit index 32, so that a gap is left between
+ * the normal capabilities and the ops capabilities.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMA_DEV_CAPA_OPS_COPY_SG	(1ull << 33)
+/**< DMA device supports scatter-list copy ops.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+#define RTE_DMA_DEV_CAPA_OPS_FILL	(1ull << 34)
+/**< DMA device supports fill ops.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+/**
+ * A structure used to retrieve the information of a DMA device.
+ */
+struct rte_dmadev_info {
+	struct rte_device *device; /**< Generic Device information. */
+	uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_*). */
+	uint16_t max_vchans;
+	/**< Maximum number of virtual DMA channels supported. */
+	uint16_t max_desc;
+	/**< Maximum allowed number of virtual DMA channel descriptors. */
+	uint16_t min_desc;
+	/**< Minimum allowed number of virtual DMA channel descriptors. */
+	uint16_t nb_vchans; /**< Number of virtual DMA channel configured. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve information of a DMA device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_dmadev_info* to be filled with the
+ *   information of the device.
+ *
+ * @return
+ *   - =0: Success, driver updates the information of the DMA device.
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+__rte_experimental
+int
+rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info);
+
+/**
+ * A structure used to configure a DMA device.
+ */
+struct rte_dmadev_conf {
+	uint16_t max_vchans;
+	/**< Maximum number of virtual DMA channel to use.
+	 * This value cannot be greater than the field 'max_vchans' of struct
+	 * rte_dmadev_info obtained from rte_dmadev_info_get().
+	 */
+	uint8_t silent_mode;
+	/**< Indicates whether to work in silent mode.
+	 * 0-default mode, 1-silent mode.
+	 *
+	 * @see RTE_DMA_DEV_CAPA_SILENT
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a DMA device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param dev_conf
+ *   The DMA device configuration structure encapsulated into rte_dmadev_conf
+ *   object.
+ *
+ * @return
+ *   - =0: Success, device configured.
+ *   - <0: Error code returned by the driver configuration function.
+ */
+__rte_experimental
+int
+rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a DMA device.
+ *
+ * The device start step is the last one and consists of setting the DMA
+ * to start accepting jobs.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   - =0: Success, device started.
+ *   - <0: Error code returned by the driver start function.
+ */
+__rte_experimental
+int
+rte_dmadev_start(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a DMA device.
+ *
+ * The device can be restarted with a call to rte_dmadev_start().
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   - =0: Success, device stopped.
+ *   - <0: Error code returned by the driver stop function.
+ */
+__rte_experimental
+int
+rte_dmadev_stop(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a DMA device.
+ *
+ * The device cannot be restarted after this call.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *  - =0: Successfully closed the device.
+ *  - <0: Failure to close the device.
+ */
+__rte_experimental
+int
+rte_dmadev_close(uint16_t dev_id);
+
+/* DMA transfer direction defines. */
+#define RTE_DMA_DIR_MEM_TO_MEM	(1ull << 0)
+/**< DMA transfer direction - from memory to memory.
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+
+#define RTE_DMA_DIR_MEM_TO_DEV	(1ull << 1)
+/**< DMA transfer direction - from memory to device.
+ * In a typical scenario, an ARM SoC is installed on an x86 server as an iNIC
+ * through the PCIE interface. In this case, the ARM SoC works in EP (endpoint)
+ * mode and can initiate a DMA move request from memory (which is ARM memory)
+ * to device (which is x86 host memory).
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+
+#define RTE_DMA_DIR_DEV_TO_MEM	(1ull << 2)
+/**< DMA transfer direction - from device to memory.
+ * In a typical scenario, an ARM SoC is installed on an x86 server as an iNIC
+ * through the PCIE interface. In this case, the ARM SoC works in EP (endpoint)
+ * mode and can initiate a DMA move request from device (which is x86 host
+ * memory) to memory (which is ARM memory).
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+
+#define RTE_DMA_DIR_DEV_TO_DEV	(1ull << 3)
+/**< DMA transfer direction - from device to device.
+ * In a typical scenario, an ARM SoC is installed on an x86 server as an iNIC
+ * through the PCIE interface. In this case, the ARM SoC works in EP (endpoint)
+ * mode and can initiate a DMA move request from device (which is x86 host
+ * memory) to device (which is another x86 host memory).
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+
+/**
+ * enum rte_dmadev_port_type - DMA port type defines.
+ */
+enum rte_dmadev_port_type {
+	RTE_DMADEV_PORT_PCIE = 1,
+};
+
+/**
+ * A structure used to describe DMA port parameters.
+ */
+struct rte_dmadev_port_param {
+	enum rte_dmadev_port_type port_type; /**< The device port type. */
+	union {
+		/** For PCIE port:
+		 *
+		 * The following model shows how the SoC's PCIE module connects
+		 * to multiple PCIE hosts and multiple endpoints. The PCIE
+		 * module has an integrated DMA controller.
+		 * If the DMA wants to access the memory of host A, the request
+		 * can be initiated by PF1 in core0, or by VF0 of PF0 in core0.
+		 *
+		 * System Bus
+		 *    |     ----------PCIE module----------
+		 *    |     Bus
+		 *    |     Interface
+		 *    |     -----        ------------------
+		 *    |     |   |        | PCIE Core0     |
+		 *    |     |   |        |                |        -----------
+		 *    |     |   |        |   PF-0 -- VF-0 |        | Host A  |
+		 *    |     |   |--------|        |- VF-1 |--------| Root    |
+		 *    |     |   |        |   PF-1         |        | Complex |
+		 *    |     |   |        |   PF-2         |        -----------
+		 *    |     |   |        ------------------
+		 *    |     |   |
+		 *    |     |   |        ------------------
+		 *    |     |   |        | PCIE Core1     |
+		 *    |     |   |        |                |        -----------
+		 *    |     |   |        |   PF-0 -- VF-0 |        | Host B  |
+		 *    |-----|   |--------|   PF-1 -- VF-0 |--------| Root    |
+		 *    |     |   |        |        |- VF-1 |        | Complex |
+		 *    |     |   |        |   PF-2         |        -----------
+		 *    |     |   |        ------------------
+		 *    |     |   |
+		 *    |     |   |        ------------------
+		 *    |     |DMA|        |                |        ------
+		 *    |     |   |        |                |--------| EP |
+		 *    |     |   |--------| PCIE Core2     |        ------
+		 *    |     |   |        |                |        ------
+		 *    |     |   |        |                |--------| EP |
+		 *    |     |   |        |                |        ------
+		 *    |     -----        ------------------
+		 *
+		 * The following structure is used to describe the above access
+		 * port.
+		 * @note: If some fields are not supported by hardware, set
+		 *        these fields to zero. Also, no capabilities are
+		 *        defined for these fields, so it is the application's
+		 *        responsibility to set the correct parameters.
+		 */
+		struct {
+			uint64_t coreid : 4; /**< PCIE core id used. */
+			uint64_t pfid : 8; /**< PF id used. */
+			uint64_t vfen : 1; /**< VF enable bit. */
+			uint64_t vfid : 16; /**< VF id used. */
+			uint64_t pasid : 20;
+			/**< The pasid field in the TLP packet. */
+			uint64_t attr : 3;
+			/**< The attributes field in the TLP packet. */
+			uint64_t ph : 2;
+			/**< The processing hint field in the TLP packet. */
+			uint64_t st : 16;
+			/**< The steering tag field in the TLP packet. */
+		} pcie;
+	};
+	uint64_t reserved[2]; /**< Reserved for future fields. */
+};
+
+/**
+ * A structure used to configure a virtual DMA channel.
+ */
+struct rte_dmadev_vchan_conf {
+	uint8_t direction;
+	/**< Set of supported transfer directions.
+	 * @see RTE_DMA_DIR_MEM_TO_MEM
+	 * @see RTE_DMA_DIR_MEM_TO_DEV
+	 * @see RTE_DMA_DIR_DEV_TO_MEM
+	 * @see RTE_DMA_DIR_DEV_TO_DEV
+	 */
+
+	/** Number of descriptor for the virtual DMA channel */
+	uint16_t nb_desc;
+	/** 1) Used to describe the port parameter in the device-to-memory
+	 * transfer scenario.
+	 * 2) Used to describe the source port parameter in the
+	 * device-to-device transfer scenario.
+	 * @see struct rte_dmadev_port_param
+	 */
+	struct rte_dmadev_port_param src_port;
+	/** 1) Used to describe the port parameter in the memory-to-device
+	 * transfer scenario.
+	 * 2) Used to describe the destination port parameter in the
+	 * device-to-device transfer scenario.
+	 * @see struct rte_dmadev_port_param
+	 */
+	struct rte_dmadev_port_param dst_port;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a virtual DMA channel.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param conf
+ *   The virtual DMA channel configuration structure encapsulated into
+ *   rte_dmadev_vchan_conf object.
+ *
+ * @return
+ *   - >=0: Allocation succeeded; the value is the virtual DMA channel id.
+ *          This value is always less than the 'max_vchans' field of struct
+ *          rte_dmadev_conf configured by rte_dmadev_configure().
+ *   - <0: Error code returned by the driver virtual channel setup function.
+ */
+__rte_experimental
+int
+rte_dmadev_vchan_setup(uint16_t dev_id,
+		       const struct rte_dmadev_vchan_conf *conf);
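
A minimal mem-to-mem channel setup sketch follows; the descriptor count of 128 is only a placeholder and should be kept within the min_desc/max_desc range reported by rte_dmadev_info_get():

    #include <rte_dmadev.h>

    static int
    dma_vchan_setup_m2m(uint16_t dev_id)
    {
        struct rte_dmadev_vchan_conf vconf = {
            .direction = RTE_DMA_DIR_MEM_TO_MEM,
            .nb_desc = 128,
            /* src_port/dst_port stay zeroed for mem-to-mem. */
        };

        /* On success the return value is the vchan id to use later. */
        return rte_dmadev_vchan_setup(dev_id, &vconf);
    }
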
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Release a virtual DMA channel.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel returned by vchan setup.
+ *
+ * @return
+ *   - =0: Successfully release the virtual DMA channel.
+ *   - <0: Error code returned by the driver virtual channel release function.
+ */
+__rte_experimental
+int
+rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);
+
+/**
+ * rte_dmadev_stats - running statistics.
+ */
+struct rte_dmadev_stats {
+	uint64_t submitted_count;
+	/**< Count of operations which were submitted to hardware. */
+	uint64_t completed_fail_count;
+	/**< Count of operations which failed to complete. */
+	uint64_t completed_count;
+	/**< Count of operations which completed successfully. */
+};
+
+#define RTE_DMADEV_ALL_VCHAN	0xFFFFu
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve basic statistics of one or all virtual DMA channels.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ *   If equal to RTE_DMADEV_ALL_VCHAN, statistics of all channels are returned.
+ * @param[out] stats
+ *   The basic statistics structure encapsulated into rte_dmadev_stats
+ *   object.
+ *
+ * @return
+ *   - =0: Successfully retrieve stats.
+ *   - <0: Failure to retrieve stats.
+ */
+__rte_experimental
+int
+rte_dmadev_stats_get(uint16_t dev_id, uint16_t vchan,
+		     struct rte_dmadev_stats *stats);
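
A small usage sketch, reading the aggregate counters over all channels with RTE_DMADEV_ALL_VCHAN (the device id is assumed to be valid):

    #include <inttypes.h>
    #include <stdio.h>
    #include <rte_dmadev.h>

    static void
    dma_dump_stats(uint16_t dev_id)
    {
        struct rte_dmadev_stats stats;

        if (rte_dmadev_stats_get(dev_id, RTE_DMADEV_ALL_VCHAN, &stats) != 0)
            return;
        printf("submitted %" PRIu64 " completed %" PRIu64 " failed %" PRIu64 "\n",
               stats.submitted_count, stats.completed_count,
               stats.completed_fail_count);
    }
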
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset basic statistics of one or all virtual DMA channels.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ *   If equal to RTE_DMADEV_ALL_VCHAN, statistics of all channels are reset.
+ *
+ * @return
+ *   - =0: Successfully reset stats.
+ *   - <0: Failure to reset stats.
+ */
+__rte_experimental
+int
+rte_dmadev_stats_reset(uint16_t dev_id, uint16_t vchan);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump DMA device info.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   The file to write the output to.
+ *
+ * @return
+ *   0 on success. Non-zero otherwise.
+ */
+__rte_experimental
+int
+rte_dmadev_dump(uint16_t dev_id, FILE *f);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger the dmadev self test.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   - 0: Selftest successful.
+ *   - -ENOTSUP if the device doesn't support selftest
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_dmadev_selftest(uint16_t dev_id);
+
+/* DMA transfer result status code defines */
+enum rte_dma_status_code {
+	RTE_DMA_STATUS_SUCCESSFUL = 0,
+	/**< The operation completed successfully. */
+	RTE_DMA_STATUS_USER_ABORT,
+	/**< The operation failed to complete due to an abort by the user.
+	 * This is mainly used when processing dev_stop: the user could modify
+	 * the descriptors (e.g. change one bit to tell hardware to abort this
+	 * job), which allows outstanding requests to complete as much as
+	 * possible and so reduces the time needed to stop the device.
+	 */
+	RTE_DMA_STATUS_NOT_ATTEMPED,
+	/**< The operation failed to complete due to the following scenario:
+	 * the jobs in a particular batch were not attempted because they
+	 * appeared after a fence where a previous job failed. In some HW
+	 * implementations it is possible for jobs from later batches to have
+	 * completed, so report the status of the not-attempted jobs before
+	 * reporting those newer completed jobs.
+	 */
+	RTE_DMA_STATUS_INVALID_SRC_ADDR,
+	/**< The operation failed to complete due to an invalid source
+	 * address.
+	 */
+	RTE_DMA_STATUS_INVALID_DST_ADDR,
+	/**< The operation failed to complete due to an invalid destination
+	 * address.
+	 */
+	RTE_DMA_STATUS_INVALID_LENGTH,
+	/**< The operation failed to complete due to an invalid length. */
+	RTE_DMA_STATUS_INVALID_OPCODE,
+	/**< The operation failed to complete due to an invalid opcode.
+	 * The DMA descriptor may have multiple formats, which are
+	 * distinguished by the opcode field.
+	 */
+	RTE_DMA_STATUS_BUS_ERROR,
+	/**< The operation failed to complete due to a bus error. */
+	RTE_DMA_STATUS_DATA_POISION,
+	/**< The operation failed to complete due to data poisoning. */
+	RTE_DMA_STATUS_DESCRIPTOR_READ_ERROR,
+	/**< The operation failed to complete due to a descriptor read error. */
+	RTE_DMA_STATUS_DEV_LINK_ERROR,
+	/**< The operation failed to complete due to a device link error.
+	 * Indicates a link error in the mem-to-dev/dev-to-mem/dev-to-dev
+	 * transfer scenarios.
+	 */
+	RTE_DMA_STATUS_UNKNOWN,
+	/**< The operation failed to complete due to an unknown reason. */
+};
+
+/**
+ * rte_dmadev_sge - describes one entry of a scatter DMA operation request.
+ */
+struct rte_dmadev_sge {
+	rte_iova_t addr; /**< The DMA operation address. */
+	uint32_t length; /**< The DMA operation length. */
+};
+
+/**
+ * rte_dmadev_sg - holds a scatter DMA operation request.
+ */
+struct rte_dmadev_sg {
+	struct rte_dmadev_sge *src; /**< The source scatter entry array. */
+	struct rte_dmadev_sge *dst; /**< The destination scatter entry array. */
+	uint16_t nb_src; /**< The number of src entries. */
+	uint16_t nb_dst; /**< The number of dst entries. */
+};
+
+#include "rte_dmadev_core.h"
+
+/* DMA flags to augment operation preparation. */
+#define RTE_DMA_OP_FLAG_FENCE	(1ull << 0)
+/**< DMA fence flag.
+ * It means the operation with this flag must be processed only after all
+ * previous operations are completed.
+ *
+ * @see rte_dmadev_copy()
+ * @see rte_dmadev_copy_sg()
+ * @see rte_dmadev_fill()
+ */
+
+#define RTE_DMA_OP_FLAG_SUBMIT	(1ull << 1)
+/**< DMA submit flag.
+ * It means the operation with this flag must issue the doorbell to hardware
+ * after the job is enqueued.
+ */
+
+#define RTE_DMA_OP_FLAG_LLC	(1ull << 2)
+/**< DMA hint to write data to the low-level cache.
+ * This is just a hint; there is no capability bit for it, and a driver should
+ * not return an error if this flag is set.
+ */
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a copy operation onto the virtual DMA channel.
+ *
+ * This queues up a copy operation to be performed by hardware. If the 'flags'
+ * parameter contains RTE_DMA_OP_FLAG_SUBMIT, the hardware is triggered to
+ * begin this operation; otherwise the hardware is not triggered.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param src
+ *   The address of the source buffer.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the data to be copied.
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued copy job.
+ *   - <0: Error code returned by the driver copy function.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
+		uint32_t length, uint64_t flags)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dmadev_is_valid_dev(dev_id) ||
+	    vchan >= dev->data->dev_conf.max_vchans)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->copy, -ENOTSUP);
+#endif
+	return (*dev->copy)(dev, vchan, src, dst, length, flags);
+}
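
For illustration, a single copy can be enqueued and submitted in one call; src_iova, dst_iova and len are placeholders for IOVA addresses and a length obtained elsewhere (e.g. via rte_malloc_virt2iova()):

    #include <rte_dmadev.h>

    static int
    dma_copy_one(uint16_t dev_id, uint16_t vchan, rte_iova_t src_iova,
                 rte_iova_t dst_iova, uint32_t len)
    {
        /* Enqueue the copy and ring the doorbell in the same call. */
        int idx = rte_dmadev_copy(dev_id, vchan, src_iova, dst_iova,
                                  len, RTE_DMA_OP_FLAG_SUBMIT);

        return idx < 0 ? idx : 0;
    }
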
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a scatter list copy operation onto the virtual DMA channel.
+ *
+ * This queues up a scatter list copy operation to be performed by hardware.
+ * If the 'flags' parameter contains RTE_DMA_OP_FLAG_SUBMIT, the hardware is
+ * triggered to begin this operation; otherwise the hardware is not triggered.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param sg
+ *   A pointer to the scatter list.
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued copy scatterlist job.
+ *   - <0: Error code returned by the driver copy scatterlist function.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan,
+		   const struct rte_dmadev_sg *sg,
+		   uint64_t flags)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dmadev_is_valid_dev(dev_id) ||
+	    vchan >= dev->data->dev_conf.max_vchans ||
+	    sg == NULL || sg->src == NULL || sg->dst == NULL ||
+	    sg->nb_src == 0 || sg->nb_dst == 0)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->copy_sg, -ENOTSUP);
+#endif
+	return (*dev->copy_sg)(dev, vchan, sg, flags);
+}
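
A scatter-list sketch gathering two 64-byte source buffers into one destination; all IOVA values are placeholder parameters and the application is assumed to have checked RTE_DMA_DEV_CAPA_OPS_COPY_SG beforehand:

    #include <rte_dmadev.h>

    static int
    dma_copy_gather(uint16_t dev_id, uint16_t vchan, rte_iova_t src_a,
                    rte_iova_t src_b, rte_iova_t dst_iova)
    {
        struct rte_dmadev_sge src[2] = {
            { .addr = src_a, .length = 64 },
            { .addr = src_b, .length = 64 },
        };
        struct rte_dmadev_sge dst[1] = {
            { .addr = dst_iova, .length = 128 },
        };
        struct rte_dmadev_sg sg = {
            .src = src, .dst = dst, .nb_src = 2, .nb_dst = 1,
        };

        return rte_dmadev_copy_sg(dev_id, vchan, &sg, RTE_DMA_OP_FLAG_SUBMIT);
    }
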
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a fill operation onto the virtual DMA channel.
+ *
+ * This queues up a fill operation to be performed by hardware. If the 'flags'
+ * parameter contains RTE_DMA_OP_FLAG_SUBMIT, the hardware is triggered to
+ * begin this operation; otherwise the hardware is not triggered.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param pattern
+ *   The pattern to populate the destination buffer with.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the destination buffer.
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued fill job.
+ *   - <0: Error code returned by the driver fill function.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
+		rte_iova_t dst, uint32_t length, uint64_t flags)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dmadev_is_valid_dev(dev_id) ||
+	    vchan >= dev->data->dev_conf.max_vchans)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
+#endif
+	return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger hardware to begin performing enqueued operations.
+ *
+ * This API is used to write the "doorbell" to the hardware to trigger it
+ * to begin the operations previously enqueued by rte_dmadev_copy/fill().
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ *
+ * @return
+ *   - =0: Successfully trigger hardware.
+ *   - <0: Failure to trigger hardware.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_submit(uint16_t dev_id, uint16_t vchan)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dmadev_is_valid_dev(dev_id) ||
+	    vchan >= dev->data->dev_conf.max_vchans)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->submit, -ENOTSUP);
+#endif
+	return (*dev->submit)(dev, vchan);
+}
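
A batching sketch: several copies are enqueued without the per-operation submit flag and then kicked with a single doorbell; the srcs/dsts/lens arrays are placeholder inputs:

    #include <rte_dmadev.h>

    static int
    dma_copy_burst(uint16_t dev_id, uint16_t vchan, uint16_t nb_jobs,
                   const rte_iova_t *srcs, const rte_iova_t *dsts,
                   const uint32_t *lens)
    {
        uint16_t i;

        for (i = 0; i < nb_jobs; i++)
            if (rte_dmadev_copy(dev_id, vchan, srcs[i], dsts[i],
                                lens[i], 0) < 0)
                break;

        /* One doorbell covers everything enqueued above. */
        return rte_dmadev_submit(dev_id, vchan);
    }
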
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Returns the number of operations that have been successfully completed.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param nb_cpls
+ *   The maximum number of completed operations that can be processed.
+ * @param[out] last_idx
+ *   The last completed operation's index.
+ *   If not required, NULL can be passed in.
+ * @param[out] has_error
+ *   Indicates whether a transfer error has occurred.
+ *   If not required, NULL can be passed in.
+ *
+ * @return
+ *   The number of operations that successfully completed.
+ */
+__rte_experimental
+static inline uint16_t
+rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
+		     uint16_t *last_idx, bool *has_error)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	uint16_t idx;
+	bool err;
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dmadev_is_valid_dev(dev_id) ||
+	    vchan >= dev->data->dev_conf.max_vchans ||
+	    nb_cpls == 0)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->completed, -ENOTSUP);
+#endif
+
+	/* Ensure the pointer values are non-null to simplify drivers.
+	 * In most cases these should be compile time evaluated, since this is
+	 * an inline function.
+	 * - If NULL is explicitly passed as parameter, then compiler knows the
+	 *   value is NULL
+	 * - If address of local variable is passed as parameter, then compiler
+	 *   can know it's non-NULL.
+	 */
+	if (last_idx == NULL)
+		last_idx = &idx;
+	if (has_error == NULL)
+		has_error = &err;
+
+	*has_error = false;
+	return (*dev->completed)(dev, vchan, nb_cpls, last_idx, has_error);
+}
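
A polling sketch; the burst size of 32 is arbitrary and the index bookkeeping is left to the application:

    #include <rte_dmadev.h>

    static uint16_t
    dma_poll(uint16_t dev_id, uint16_t vchan)
    {
        uint16_t last_idx, nb_done;
        bool has_error;

        nb_done = rte_dmadev_completed(dev_id, vchan, 32, &last_idx,
                                       &has_error);
        /* nb_done jobs up to and including last_idx finished without error;
         * if has_error is set, rte_dmadev_completed_status() can be used to
         * find out which job failed and why.
         */
        return nb_done;
    }
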
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Returns the number of operations that have been completed, whether they
+ * succeeded or failed.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param nb_cpls
+ *   Indicates the size of the status array.
+ * @param[out] last_idx
+ *   The last completed operation's index.
+ *   If not required, NULL can be passed in.
+ * @param[out] status
+ *   Array to hold the status code of each completed operation.
+ *   @see enum rte_dma_status_code
+ *
+ * @return
+ *   The number of operations that completed.
+ */
+__rte_experimental
+static inline uint16_t
+rte_dmadev_completed_status(uint16_t dev_id, uint16_t vchan,
+			    const uint16_t nb_cpls, uint16_t *last_idx,
+			    enum rte_dma_status_code *status)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	uint16_t idx;
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dmadev_is_valid_dev(dev_id) ||
+	    vchan >= dev->data->dev_conf.max_vchans ||
+	    nb_cpls == 0 || status == NULL)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->completed_status, -ENOTSUP);
+#endif
+
+	if (last_idx == NULL)
+		last_idx = &idx;
+
+	return (*dev->completed_status)(dev, vchan, nb_cpls, last_idx, status);
+}
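
A sketch of draining completions together with their per-job status; the array size of 32 is arbitrary:

    #include <rte_dmadev.h>

    static void
    dma_poll_status(uint16_t dev_id, uint16_t vchan)
    {
        enum rte_dma_status_code status[32];
        uint16_t last_idx, nb_done, i;

        nb_done = rte_dmadev_completed_status(dev_id, vchan, 32,
                                              &last_idx, status);
        for (i = 0; i < nb_done; i++) {
            if (status[i] != RTE_DMA_STATUS_SUCCESSFUL) {
                /* Per-job recovery, e.g. log and resubmit. */
            }
        }
    }
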
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DMADEV_H_ */
diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
new file mode 100644
index 0000000..8b2da9c
--- /dev/null
+++ b/lib/dmadev/rte_dmadev_core.h
@@ -0,0 +1,180 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ * Copyright(c) 2021 Intel Corporation.
+ */
+
+#ifndef _RTE_DMADEV_CORE_H_
+#define _RTE_DMADEV_CORE_H_
+
+/**
+ * @file
+ *
+ * RTE DMA Device internal header.
+ *
+ * This header contains internal data types that are used by the DMA devices
+ * in order to expose their ops to the class.
+ *
+ * Applications should not use these APIs directly.
+ *
+ */
+
+struct rte_dmadev;
+
+typedef int (*rte_dmadev_info_get_t)(const struct rte_dmadev *dev,
+				     struct rte_dmadev_info *dev_info,
+				     uint32_t info_sz);
+/** @internal Used to get device information of a device. */
+
+typedef int (*rte_dmadev_configure_t)(struct rte_dmadev *dev,
+				      const struct rte_dmadev_conf *dev_conf);
+/** @internal Used to configure a device. */
+
+typedef int (*rte_dmadev_start_t)(struct rte_dmadev *dev);
+/** @internal Used to start a configured device. */
+
+typedef int (*rte_dmadev_stop_t)(struct rte_dmadev *dev);
+/** @internal Used to stop a configured device. */
+
+typedef int (*rte_dmadev_close_t)(struct rte_dmadev *dev);
+/** @internal Used to close a configured device. */
+
+typedef int (*rte_dmadev_vchan_setup_t)(struct rte_dmadev *dev,
+				const struct rte_dmadev_vchan_conf *conf);
+/** @internal Used to allocate and set up a virtual DMA channel. */
+
+typedef int (*rte_dmadev_vchan_release_t)(struct rte_dmadev *dev,
+					  uint16_t vchan);
+/** @internal Used to release a virtual DMA channel. */
+
+typedef int (*rte_dmadev_stats_get_t)(const struct rte_dmadev *dev,
+			uint16_t vchan, struct rte_dmadev_stats *stats,
+			uint32_t stats_sz);
+/** @internal Used to retrieve basic statistics. */
+
+typedef int (*rte_dmadev_stats_reset_t)(struct rte_dmadev *dev, uint16_t vchan);
+/** @internal Used to reset basic statistics. */
+
+typedef int (*rte_dmadev_dump_t)(const struct rte_dmadev *dev, FILE *f);
+/** @internal Used to dump internal information. */
+
+typedef int (*rte_dmadev_selftest_t)(uint16_t dev_id);
+/** @internal Used to start dmadev selftest. */
+
+typedef int (*rte_dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vchan,
+				 rte_iova_t src, rte_iova_t dst,
+				 uint32_t length, uint64_t flags);
+/** @internal Used to enqueue a copy operation. */
+
+typedef int (*rte_dmadev_copy_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
+				    const struct rte_dmadev_sg *sg,
+				    uint64_t flags);
+/** @internal Used to enqueue a scatter list copy operation. */
+
+typedef int (*rte_dmadev_fill_t)(struct rte_dmadev *dev, uint16_t vchan,
+				 uint64_t pattern, rte_iova_t dst,
+				 uint32_t length, uint64_t flags);
+/** @internal Used to enqueue a fill operation. */
+
+typedef int (*rte_dmadev_submit_t)(struct rte_dmadev *dev, uint16_t vchan);
+/** @internal Used to trigger hardware to begin working. */
+
+typedef uint16_t (*rte_dmadev_completed_t)(struct rte_dmadev *dev,
+				uint16_t vchan, const uint16_t nb_cpls,
+				uint16_t *last_idx, bool *has_error);
+/** @internal Used to return number of successfully completed operations. */
+
+typedef uint16_t (*rte_dmadev_completed_status_t)(struct rte_dmadev *dev,
+			uint16_t vchan, const uint16_t nb_cpls,
+			uint16_t *last_idx, enum rte_dma_status_code *status);
+/** @internal Used to return number of completed operations. */
+
+/**
+ * Possible states of a DMA device.
+ */
+enum rte_dmadev_state {
+	RTE_DMADEV_UNUSED = 0,
+	/**< Device is unused before being probed. */
+	RTE_DMADEV_ATTACHED,
+	/**< Device is attached when allocated in probing. */
+};
+
+/**
+ * DMA device operations function pointer table
+ */
+struct rte_dmadev_ops {
+	rte_dmadev_info_get_t dev_info_get;
+	rte_dmadev_configure_t dev_configure;
+	rte_dmadev_start_t dev_start;
+	rte_dmadev_stop_t dev_stop;
+	rte_dmadev_close_t dev_close;
+	rte_dmadev_vchan_setup_t vchan_setup;
+	rte_dmadev_vchan_release_t vchan_release;
+	rte_dmadev_stats_get_t stats_get;
+	rte_dmadev_stats_reset_t stats_reset;
+	rte_dmadev_dump_t dev_dump;
+	rte_dmadev_selftest_t dev_selftest;
+};
+
+/**
+ * @internal
+ * The data part, with no function pointers, associated with each DMA device.
+ *
+ * This structure is safe to place in shared memory to be common among different
+ * processes in a multi-process configuration.
+ */
+struct rte_dmadev_data {
+	void *dev_private;
+	/**< PMD-specific private data.
+	 * This is a copy of the 'dev_private' field in the 'struct rte_dmadev'
+	 * from the primary process; it is used by the secondary process to
+	 * get the dev_private information.
+	 */
+	uint16_t dev_id; /**< Device [external] identifier. */
+	char dev_name[RTE_DMADEV_NAME_MAX_LEN]; /**< Unique identifier name */
+	struct rte_dmadev_conf dev_conf; /**< DMA device configuration. */
+	uint8_t dev_started : 1; /**< Device state: STARTED(1)/STOPPED(0). */
+	uint64_t reserved[2]; /**< Reserved for future fields */
+} __rte_cache_aligned;
+
+/**
+ * @internal
+ * The generic data structure associated with each DMA device.
+ *
+ * The dataplane APIs are located at the beginning of the structure, along
+ * with the pointer to where all the data elements for the particular device
+ * are stored in shared memory. This split scheme allows the function pointers
+ * and driver data to be per-process, while the actual configuration data for
+ * the device is shared.
+ */
+struct rte_dmadev {
+	rte_dmadev_copy_t copy;
+	rte_dmadev_copy_sg_t copy_sg;
+	rte_dmadev_fill_t fill;
+	rte_dmadev_submit_t submit;
+	rte_dmadev_completed_t completed;
+	rte_dmadev_completed_status_t completed_status;
+	void *reserved_ptr; /**< Reserved for future IO function */
+	void *dev_private;
+	/**< PMD-specific private data.
+	 * In the primary process, after the dmadev is allocated by
+	 * rte_dmadev_pmd_allocate(), the PCI/SoC device probing should
+	 * initialize this field and copy its value to the 'dev_private'
+	 * field of 'struct rte_dmadev_data', which is pointed to by the
+	 * 'data' field.
+	 * In a secondary process, the dmadev framework initializes this field
+	 * by copying it from the 'dev_private' field of
+	 * 'struct rte_dmadev_data', which was initialized by the primary
+	 * process.
+	 * It is the primary process's responsibility to deinitialize this
+	 * field after invoking rte_dmadev_pmd_release() in the PCI/SoC device
+	 * removal stage.
+	 */
+	struct rte_dmadev_data *data; /**< Pointer to device data. */
+	const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD. */
+	struct rte_device *device;
+	/**< Device info supplied during device initialization. */
+	enum rte_dmadev_state state; /**< Flag indicating the device state */
+	uint64_t reserved[2]; /**< Reserved for future fields */
+} __rte_cache_aligned;
+
+extern struct rte_dmadev rte_dmadevices[];
+
+#endif /* _RTE_DMADEV_CORE_H_ */
diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
new file mode 100644
index 0000000..45141f9
--- /dev/null
+++ b/lib/dmadev/rte_dmadev_pmd.h
@@ -0,0 +1,72 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ */
+
+#ifndef _RTE_DMADEV_PMD_H_
+#define _RTE_DMADEV_PMD_H_
+
+/**
+ * @file
+ *
+ * RTE DMA Device PMD APIs
+ *
+ * Driver facing APIs for a DMA device. These are not to be called directly by
+ * any application.
+ */
+
+#include "rte_dmadev.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @internal
+ * Allocates a new dmadev slot for a DMA device and returns the pointer
+ * to that slot for the driver to use.
+ *
+ * @param name
+ *   DMA device name.
+ *
+ * @return
+ *   A pointer to the DMA device slot in case of success,
+ *   NULL otherwise.
+ */
+__rte_internal
+struct rte_dmadev *
+rte_dmadev_pmd_allocate(const char *name);
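
For a driver author, a probe-time sketch could look as follows; my_dmadev_ops, my_copy_fn, my_completed_fn and priv are hypothetical driver symbols, and the 'data' area is assumed to be attached by the allocation helper:

    #include <errno.h>
    #include <rte_dmadev_pmd.h>

    /* Hypothetical driver callbacks, defined elsewhere in the driver. */
    extern const struct rte_dmadev_ops my_dmadev_ops;
    extern rte_dmadev_copy_t my_copy_fn;
    extern rte_dmadev_completed_t my_completed_fn;

    static int
    my_dma_probe(const char *name, void *priv)
    {
        struct rte_dmadev *dev = rte_dmadev_pmd_allocate(name);

        if (dev == NULL)
            return -ENOMEM;
        dev->dev_ops = &my_dmadev_ops;    /* control-path callbacks */
        dev->copy = my_copy_fn;           /* fast-path callbacks */
        dev->completed = my_completed_fn;
        dev->dev_private = priv;
        dev->data->dev_private = priv;
        return 0;
    }
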
+
+/**
+ * @internal
+ * Release the specified dmadev.
+ *
+ * @param dev
+ *   Device to be released.
+ *
+ * @return
+ *   - 0 on success, negative on error
+ */
+__rte_internal
+int
+rte_dmadev_pmd_release(struct rte_dmadev *dev);
+
+/**
+ * @internal
+ * Return the DMA device based on the device name.
+ *
+ * @param name
+ *   DMA device name.
+ *
+ * @return
+ *   A pointer to the DMA device slot in case of success,
+ *   NULL otherwise.
+ */
+__rte_internal
+struct rte_dmadev *
+rte_dmadev_get_device_by_name(const char *name);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DMADEV_PMD_H_ */
diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map
new file mode 100644
index 0000000..2af78e4
--- /dev/null
+++ b/lib/dmadev/version.map
@@ -0,0 +1,37 @@ 
+EXPERIMENTAL {
+	global:
+
+	rte_dmadev_close;
+	rte_dmadev_completed;
+	rte_dmadev_completed_status;
+	rte_dmadev_configure;
+	rte_dmadev_copy;
+	rte_dmadev_copy_sg;
+	rte_dmadev_count;
+	rte_dmadev_dump;
+	rte_dmadev_fill;
+	rte_dmadev_info_get;
+	rte_dmadev_is_valid_dev;
+	rte_dmadev_selftest;
+	rte_dmadev_start;
+	rte_dmadev_stats_get;
+	rte_dmadev_stats_reset;
+	rte_dmadev_stop;
+	rte_dmadev_submit;
+	rte_dmadev_vchan_release;
+	rte_dmadev_vchan_setup;
+
+	local: *;
+};
+
+INTERNAL {
	global:
+
+	rte_dmadevices;
+	rte_dmadev_get_device_by_name;
+	rte_dmadev_pmd_allocate;
+	rte_dmadev_pmd_release;
+
+	local: *;
+};
+
diff --git a/lib/meson.build b/lib/meson.build
index 1673ca4..68d239f 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -60,6 +60,7 @@  libraries = [
         'bpf',
         'graph',
         'node',
+        'dmadev',
 ]
 
 if is_windows