[v9,4/4] doc: add cryptodev service APIs guide

Message ID 20200908084253.81022-5-roy.fan.zhang@intel.com
State Changes Requested
Delegated to: akhil goyal
Series
  • cryptodev: add data-path service APIs

Checks

Context Check Description
ci/Intel-compilation success Compilation OK
ci/checkpatch success coding style OK
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS

Commit Message

Zhang, Roy Fan Sept. 8, 2020, 8:42 a.m. UTC
This patch updates programmer's guide to demonstrate the usage
and limitations of cryptodev symmetric crypto data-path service
APIs.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 doc/guides/prog_guide/cryptodev_lib.rst | 90 +++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_11.rst  |  7 ++
 2 files changed, 97 insertions(+)

Comments

Akhil Goyal Sept. 18, 2020, 8:39 p.m. UTC | #1
Hi Fan,

> Subject: [dpdk-dev v9 4/4] doc: add cryptodev service APIs guide
> 
> This patch updates programmer's guide to demonstrate the usage
> and limitations of cryptodev symmetric crypto data-path service
> APIs.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  doc/guides/prog_guide/cryptodev_lib.rst | 90 +++++++++++++++++++++++++
>  doc/guides/rel_notes/release_20_11.rst  |  7 ++
>  2 files changed, 97 insertions(+)

We generally do not take separate patches for documentation. Squash it into the patches which
implement the feature.

> 
> diff --git a/doc/guides/prog_guide/cryptodev_lib.rst
> b/doc/guides/prog_guide/cryptodev_lib.rst
> index c14f750fa..1321e4c5d 100644
> --- a/doc/guides/prog_guide/cryptodev_lib.rst
> +++ b/doc/guides/prog_guide/cryptodev_lib.rst
> @@ -631,6 +631,96 @@ a call argument. Status different than zero must be
> treated as error.
>  For more details, e.g. how to convert an mbuf to an SGL, please refer to an
>  example usage in the IPsec library implementation.
> 
> +Cryptodev Direct Data-path Service API
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

What do you mean by Direct here? Shouldn't these be referred to as raw APIs?
Moreover, the service keyword can also be dropped. We normally use it for a
software implementation of a feature which is normally done in hardware.


> +
> +Direct crypto data-path service are a set of APIs that especially provided for
> +the external libraries/applications who want to take advantage of the rich
> +features provided by cryptodev, but not necessarily depend on cryptodev
> +operations, mempools, or mbufs in the their data-path implementations.

Raw crypto data path is a set of APIs which can be used by external
libraries/applications to take advantage of the rich features provided by cryptodev,
but not necessarily depend on cryptodev operations, mempools, or mbufs in the
data-path implementations.

> +
> +The direct crypto data-path service has the following advantages:
> +- Supports raw data pointer and physical addresses as input.
> +- Do not require specific data structure allocated from heap, such as
> +  cryptodev operation.
> +- Enqueue in a burst or single operation. The service allow enqueuing in
> +  a burst similar to ``rte_cryptodev_enqueue_burst`` operation, or only
> +  enqueue one job at a time but maintaining necessary context data locally for
> +  next single job enqueue operation. The latter method is especially helpful
> +  when the user application's crypto operations are clustered into a burst.
> +  Allowing enqueue one operation at a time helps reducing one additional loop
> +  and also reduced the cache misses during the double "looping" situation.
> +- Customerizable dequeue count. Instead of dequeue maximum possible
> operations
> +  as same as ``rte_cryptodev_dequeue_burst`` operation, the service allows the
> +  user to provide a callback function to decide how many operations to be
> +  dequeued. This is especially helpful when the expected dequeue count is
> +  hidden inside the opaque data stored during enqueue. The user can provide
> +  the callback function to parse the opaque data structure.
> +- Abandon enqueue and dequeue anytime. One of the drawbacks of
> +  ``rte_cryptodev_enqueue_burst`` and ``rte_cryptodev_dequeue_burst``
> +  operations are: once an operation is enqueued/dequeued there is no way to
> +  undo the operation. The service make the operation abandon possible by
> +  creating a local copy of the queue operation data in the service context
> +  data. The data will be written back to driver maintained operation data
> +  when enqueue or dequeue done function is called.
> +

The language in the above text needs to be rewritten. Some sentences do not
make complete sense and have grammatical errors.

I suggest having an internal review within Intel first before sending the next version.

> +The cryptodev data-path service uses
> +
> +Cryptodev PMDs who supports this feature will have
> +``RTE_CRYPTODEV_FF_SYM_HW_DIRECT_API`` feature flag presented. To use

RTE_CRYPTODEV_FF_SYM_HW_RAW_DP looks better.

> this
> +feature the function ``rte_cryptodev_get_dp_service_ctx_data_size`` should

Can be renamed as rte_cryptodev_get_raw_dp_ctx_size

> +be called to get the data path service context data size. The user should
> +creates a local buffer at least this size long and initialize it using

The user should create a local buffer of at least this size and initialize it using

> +``rte_cryptodev_dp_configure_service`` function call.

rte_cryptodev_raw_dp_configure or rte_cryptodev_configure_raw_dp can be used here.

> +
> +The ``rte_cryptodev_dp_configure_service`` function call initialize or
> +updates the ``struct rte_crypto_dp_service_ctx`` buffer, in which contains the
> +driver specific queue pair data pointer and service context buffer, and a
> +set of function pointers to enqueue and dequeue different algorithms'
> +operations. The ``rte_cryptodev_dp_configure_service`` should be called when:
> +
> +- Before enqueuing or dequeuing starts (set ``is_update`` parameter to 0).
> +- When different cryptodev session, security session, or session-less xform
> +  is used (set ``is_update`` parameter to 1).

The use of is_update is not clear from the above text. IMO, we do not need this flag.
Whenever an update is required, we change the session information and call the
same API again, and the driver can copy all information blindly without checking.

> +
> +Two different enqueue functions are provided.
> +
> +- ``rte_cryptodev_dp_sym_submit_vec``: submit a burst of operations stored in
> +  the ``rte_crypto_sym_vec`` structure.
> +- ``rte_cryptodev_dp_submit_single_job``: submit single operation.

What is the meaning of single job here? Can we use multiple buffers/vectors of the same session
in a single job? Or can we submit only a single buffer/vector in a job?

> +
> +Either enqueue functions will not command the crypto device to start
> processing
> +until ``rte_cryptodev_dp_submit_done`` function is called. Before then the user
> +shall expect the driver only stores the necessory context data in the
> +``rte_crypto_dp_service_ctx`` buffer for the next enqueue operation. If the
> user
> +wants to abandon the submitted operations, simply call
> +``rte_cryptodev_dp_configure_service`` function instead with the parameter
> +``is_update`` set to 0. The driver will recover the service context data to
> +the previous state.

Can you explain a use case where this is actually being used? This looks fancy, but
do we have this type of requirement in any protocol stacks/specifications?
I believe it to be an extra burden on the application writer if it is not a protocol requirement.

> +
> +To dequeue the operations the user also have two operations:
> +
> +- ``rte_cryptodev_dp_sym_dequeue``: fully customizable deuqueue operation.
> The
> +  user needs to provide the callback function for the driver to get the
> +  dequeue count and perform post processing such as write the status field.
> +- ``rte_cryptodev_dp_sym_dequeue_single_job``: dequeue single job.
> +
> +Same as enqueue, the function ``rte_cryptodev_dp_dequeue_done`` is used to
> +merge user's local service context data with the driver's queue operation
> +data. Also to abandon the dequeue operation (still keep the operations in the
> +queue), the user shall avoid ``rte_cryptodev_dp_dequeue_done`` function call
> +but calling ``rte_cryptodev_dp_configure_service`` function with the parameter
> +``is_update`` set to 0.
> +
> +There are a few limitations to the data path service:
> +
> +* Only support in-place operations.
> +* APIs are NOT thread-safe.
> +* CANNOT mix the direct API's enqueue with rte_cryptodev_enqueue_burst, or
> +  vice versa.
> +
> +See *DPDK API Reference* for details on each API definitions.
> +
>  Sample code
>  -----------
> 
> diff --git a/doc/guides/rel_notes/release_20_11.rst
> b/doc/guides/rel_notes/release_20_11.rst
> index df227a177..159823345 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,13 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +   * **Added data-path APIs for cryptodev library.**
> +
> +     Cryptodev is added data-path APIs to accelerate external libraries or
> +     applications those want to avail fast cryptodev enqueue/dequeue
> +     operations but does not necessarily depends on mbufs and cryptodev
> +     operation mempool.
> +
> 
>  Removed Items
>  -------------


Regards,
Akhil
Zhang, Roy Fan Sept. 21, 2020, 12:28 p.m. UTC | #2
Hi Akhil,

> -----Original Message-----
> From: Akhil Goyal <akhil.goyal@nxp.com>
> Sent: Friday, September 18, 2020 9:39 PM
> To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> Cc: Trahe, Fiona <fiona.trahe@intel.com>; Kusztal, ArkadiuszX
> <arkadiuszx.kusztal@intel.com>; Dybkowski, AdamX
> <adamx.dybkowski@intel.com>; Anoob Joseph <anoobj@marvell.com>
> Subject: RE: [dpdk-dev v9 4/4] doc: add cryptodev service APIs guide
> 
> Hi Fan,
> 
> > Subject: [dpdk-dev v9 4/4] doc: add cryptodev service APIs guide
> >
> > This patch updates programmer's guide to demonstrate the usage
> > and limitations of cryptodev symmetric crypto data-path service
> > APIs.
> >
> > Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > ---
> >  doc/guides/prog_guide/cryptodev_lib.rst | 90
> +++++++++++++++++++++++++
> >  doc/guides/rel_notes/release_20_11.rst  |  7 ++
> >  2 files changed, 97 insertions(+)
> 
> We generally do not take separate patches for documentation. Squash it the
> patches which
> Implement the feature.
> 
> >
> > diff --git a/doc/guides/prog_guide/cryptodev_lib.rst
> > b/doc/guides/prog_guide/cryptodev_lib.rst
> > index c14f750fa..1321e4c5d 100644
> > --- a/doc/guides/prog_guide/cryptodev_lib.rst
> > +++ b/doc/guides/prog_guide/cryptodev_lib.rst
> > @@ -631,6 +631,96 @@ a call argument. Status different than zero must be
> > treated as error.
> >  For more details, e.g. how to convert an mbuf to an SGL, please refer to an
> >  example usage in the IPsec library implementation.
> >
> > +Cryptodev Direct Data-path Service API
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> What do you mean by Direct here. It should be referenced as raw APIs?
> Moreover, service keyword can also be dropped. We normally use it for
> Software implementation of a feature which is normally done in hardware.
> 
You are right, raw API is a good name. Will remove service keyword.
> 
> > +
> > +Direct crypto data-path service are a set of APIs that especially provided
> for
> > +the external libraries/applications who want to take advantage of the rich
> > +features provided by cryptodev, but not necessarily depend on cryptodev
> > +operations, mempools, or mbufs in the their data-path implementations.
> 
> Raw crypto data path is a set of APIs which can be used by external
> libraries/applications to take advantage of the rich features provided by
> cryptodev,
> but not necessarily depend on cryptodev operations, mempools, or mbufs in
> the
> data-path implementations.
> 
Thanks.
> > +
> > +The direct crypto data-path service has the following advantages:
> > +- Supports raw data pointer and physical addresses as input.
> > +- Do not require specific data structure allocated from heap, such as
> > +  cryptodev operation.
> > +- Enqueue in a burst or single operation. The service allow enqueuing in
> > +  a burst similar to ``rte_cryptodev_enqueue_burst`` operation, or only
> > +  enqueue one job at a time but maintaining necessary context data locally
> for
> > +  next single job enqueue operation. The latter method is especially
> helpful
> > +  when the user application's crypto operations are clustered into a burst.
> > +  Allowing enqueue one operation at a time helps reducing one additional
> loop
> > +  and also reduced the cache misses during the double "looping" situation.
> > +- Customerizable dequeue count. Instead of dequeue maximum possible
> > operations
> > +  as same as ``rte_cryptodev_dequeue_burst`` operation, the service
> allows the
> > +  user to provide a callback function to decide how many operations to be
> > +  dequeued. This is especially helpful when the expected dequeue count
> is
> > +  hidden inside the opaque data stored during enqueue. The user can
> provide
> > +  the callback function to parse the opaque data structure.
> > +- Abandon enqueue and dequeue anytime. One of the drawbacks of
> > +  ``rte_cryptodev_enqueue_burst`` and ``rte_cryptodev_dequeue_burst``
> > +  operations are: once an operation is enqueued/dequeued there is no
> way to
> > +  undo the operation. The service make the operation abandon possible
> by
> > +  creating a local copy of the queue operation data in the service context
> > +  data. The data will be written back to driver maintained operation data
> > +  when enqueue or dequeue done function is called.
> > +
> 
> The language in the above test need to be re-written. Some sentences does
> not
> Make complete sense and has grammatical errors.
> 
> I suggest to have an internal review within Intel first before sending the next
> version.
> 
> > +The cryptodev data-path service uses
> > +
> > +Cryptodev PMDs who supports this feature will have
> > +``RTE_CRYPTODEV_FF_SYM_HW_DIRECT_API`` feature flag presented. To
> use
> 
> RTE_CRYPTODEV_FF_SYM_HW_RAW_DP looks better.
> 
> > this
> > +feature the function ``rte_cryptodev_get_dp_service_ctx_data_size``
> should
> 
> Can be renamed as rte_cryptodev_get_raw_dp_ctx_size
> 
> > +be called to get the data path service context data size. The user should
> > +creates a local buffer at least this size long and initialize it using
> 
> The user should create a local buffer of at least this size and initialize it using
> 
> > +``rte_cryptodev_dp_configure_service`` function call.
> 
> rte_cryptodev_raw_dp_configure or rte_cryptodev_configure _raw_dp can
> be used here.
> 
> > +
> > +The ``rte_cryptodev_dp_configure_service`` function call initialize or
> > +updates the ``struct rte_crypto_dp_service_ctx`` buffer, in which contains
> the
> > +driver specific queue pair data pointer and service context buffer, and a
> > +set of function pointers to enqueue and dequeue different algorithms'
> > +operations. The ``rte_cryptodev_dp_configure_service`` should be called
> when:
> > +
> > +- Before enqueuing or dequeuing starts (set ``is_update`` parameter to 0).
> > +- When different cryptodev session, security session, or session-less
> xform
> > +  is used (set ``is_update`` parameter to 1).
> 
> The use of is_update is not clear with above text. IMO, we do not need this
> flag.
> Whenever an update is required, we change the session information and call
> the
> Same API again and driver can copy all information blindly without checking.
> 
The flag is very important:
- When the flag is not set, much queue-specific data is written into the driver-specific
data field in the service data. This data is written back to the driver when the
"submit_done" or "dequeue_done" function is called.
- When the flag is set, the above step is not executed; the driver only updates the function
handlers attached to the service data for the algorithm in use. I will add this
to the description.

> > +
> > +Two different enqueue functions are provided.
> > +
> > +- ``rte_cryptodev_dp_sym_submit_vec``: submit a burst of operations
> stored in
> > +  the ``rte_crypto_sym_vec`` structure.
> > +- ``rte_cryptodev_dp_submit_single_job``: submit single operation.
> 
> What is the meaning of single job here? Can we use multiple buffers/vectors
> of same session
> In a single job? Or we can submit only a single buffer/vector in a job?
> 
Same as CPU crypto, a sym vec contains one or more jobs with the same session.
But it is not an inline function, as it assumes we are bursting multiple ops.
Single job: its sole purpose is to be an inline function that pushes one
job into the queue without kicking the HW to start processing.
When the raw APIs are used in an external application/lib that also works in bursts,
such as VPP, single job submission is very useful for reducing the cycle cost. Such an
application already translates from its own data structures and already
works in bursts, so you don't want the application to loop over a burst of 32
ops to translate them into a DPDK crypto sym vec first and pass it to the driver, only for the
driver to loop over the jobs a second time to write them to the HW one by one. Using the
inline "submit single" API helps reduce the cycles and cache misses, especially
when the burst size is 256.
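
The double-loop cost described above can be sketched with a toy model. This is not DPDK code: ``app_op``, ``toy_desc``, ``toy_ring`` and all function names are hypothetical stand-ins, and the ring does not wrap. It only contrasts translating a burst into an intermediate vector and then writing descriptors (two visits per op) with writing each descriptor directly through an inline single-job submit (one visit per op).

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256u

/* Hypothetical application-side op (e.g. one element of an application frame). */
struct app_op { uint8_t *data; uint32_t len; };

/* Hypothetical device descriptor and ring; not a real DPDK or QAT layout. */
struct toy_desc { uint8_t *data; uint32_t len; };
struct toy_ring { struct toy_desc desc[RING_SIZE]; uint32_t tail; };

/* Model of an inline "submit single": write one descriptor, do not kick HW. */
static inline void toy_submit_single(struct toy_ring *r, uint8_t *data,
                                     uint32_t len)
{
    assert(r->tail < RING_SIZE); /* toy ring: no wrap-around handling */
    r->desc[r->tail].data = data;
    r->desc[r->tail].len = len;
    r->tail++;
}

/* Two passes: translate into an intermediate vector, then loop again. */
static uint32_t enqueue_two_pass(struct toy_ring *r, const struct app_op *ops,
                                 uint32_t n)
{
    struct toy_desc vec[RING_SIZE];
    uint32_t i, visits = 0;

    assert(n <= RING_SIZE);
    for (i = 0; i < n; i++) { /* pass 1: application-side translation */
        vec[i].data = ops[i].data;
        vec[i].len = ops[i].len;
        visits++;
    }
    for (i = 0; i < n; i++) { /* pass 2: "driver" writes descriptors */
        toy_submit_single(r, vec[i].data, vec[i].len);
        visits++;
    }
    return visits;
}

/* One pass: the application writes descriptors while walking its own burst. */
static uint32_t enqueue_one_pass(struct toy_ring *r, const struct app_op *ops,
                                 uint32_t n)
{
    uint32_t i, visits = 0;

    assert(n <= RING_SIZE);
    for (i = 0; i < n; i++) {
        toy_submit_single(r, ops[i].data, ops[i].len);
        visits++;
    }
    return visits;
}
```

Both paths leave the same descriptors in the ring; the one-pass path simply touches each op once instead of twice, which is the saving argued for above.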

> > +
> > +Either enqueue functions will not command the crypto device to start
> > processing
> > +until ``rte_cryptodev_dp_submit_done`` function is called. Before then
> the user
> > +shall expect the driver only stores the necessory context data in the
> > +``rte_crypto_dp_service_ctx`` buffer for the next enqueue operation. If
> the
> > user
> > +wants to abandon the submitted operations, simply call
> > +``rte_cryptodev_dp_configure_service`` function instead with the
> parameter
> > +``is_update`` set to 0. The driver will recover the service context data to
> > +the previous state.
> 
> Can you explain a use case where this is actually being used? This looks fancy
> but
> Do we have this type of requirement in any protocol stacks/specifications?
> I believe it to be an extra burden on the application writer if it is not a
> protocol requirement.

It is not protocol stack specific.
For now, all crypto ops need to be translated into HW device/queue operation
data in a loop. The context between the last and current ops is stored in
the driver and is updated within the enqueue burst function (e.g. for QAT such
context is the shadow copy of the QAT queue tail/head).

As mentioned earlier, such a burst operation introduces more cycles and cache
misses. So we want to introduce single job enqueue in DPDK so that we don't loop over
the same burst twice. However, we have to take care of the context between the
last and current submit single calls. The idea is to let the user allocate the buffer
for that. The "is_update" param tells the driver whether it
needs to write the queue context into the driver-specific data in the dp_service
buffer (is_update == 0), or only update context-unrelated data inside the dp_service
buffer (updating the function handlers for the algo etc.).
When the burst is processed, the user can call the submit_done function so that the user
maintained context is written back to the driver. So for the next burst enqueue the user
should call ``rte_cryptodev_dp_configure_service`` with "is_update" = 0 before submitting any
jobs to the driver, or, when the session is changed, use "is_update" = 1 to only update
the function pointer.
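
The configure/submit/done sequence described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the patch's implementation: ``toy_qp``, ``toy_ctx`` and the ``toy_*``/``ctx_*`` functions are hypothetical, the queue state is reduced to a single tail index, and an integer ``algo`` stands in for the per-algorithm function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy driver-owned queue pair state (for QAT this would be the real tail). */
struct toy_qp {
    uint32_t tail;
};

/* Toy user-owned context, standing in for the service context buffer. */
struct toy_ctx {
    struct toy_qp *qp;
    uint32_t shadow_tail; /* local shadow copy of qp->tail */
    int algo;             /* stands in for the enqueue/dequeue handlers */
};

/*
 * is_update == 0: (re)load the shadow copy from the driver state; this also
 *                 discards anything submitted but not yet committed.
 * is_update == 1: only switch the handlers and keep the shadow copy.
 */
static void toy_configure(struct toy_ctx *ctx, struct toy_qp *qp, int algo,
                          int is_update)
{
    assert(ctx != NULL && qp != NULL);
    if (!is_update) {
        ctx->qp = qp;
        ctx->shadow_tail = qp->tail;
    }
    ctx->algo = algo;
}

/* Submit one job: only the user-owned shadow copy advances. */
static void ctx_submit_single(struct toy_ctx *ctx)
{
    ctx->shadow_tail++;
}

/* Commit: write the shadow copy back to the driver-owned state. */
static void toy_submit_done(struct toy_ctx *ctx)
{
    ctx->qp->tail = ctx->shadow_tail;
}
```

In this model, abandoning a partly submitted burst is just another ``toy_configure`` call with ``is_update`` set to 0, since that reloads the shadow copy from the untouched driver state.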

> 
> > +
> > +To dequeue the operations the user also have two operations:
> > +
> > +- ``rte_cryptodev_dp_sym_dequeue``: fully customizable deuqueue
> operation.
> > The
> > +  user needs to provide the callback function for the driver to get the
> > +  dequeue count and perform post processing such as write the status
> field.
> > +- ``rte_cryptodev_dp_sym_dequeue_single_job``: dequeue single job.
> > +
> > +Same as enqueue, the function ``rte_cryptodev_dp_dequeue_done`` is
> used to
> > +merge user's local service context data with the driver's queue operation
> > +data. Also to abandon the dequeue operation (still keep the operations in
> the
> > +queue), the user shall avoid ``rte_cryptodev_dp_dequeue_done``
> function call
> > +but calling ``rte_cryptodev_dp_configure_service`` function with the
> parameter
> > +``is_update`` set to 0.
> > +
> > +There are a few limitations to the data path service:
> > +
> > +* Only support in-place operations.
> > +* APIs are NOT thread-safe.
> > +* CANNOT mix the direct API's enqueue with
> rte_cryptodev_enqueue_burst, or
> > +  vice versa.
> > +
> > +See *DPDK API Reference* for details on each API definitions.
> > +
> >  Sample code
> >  -----------
> >
> > diff --git a/doc/guides/rel_notes/release_20_11.rst
> > b/doc/guides/rel_notes/release_20_11.rst
> > index df227a177..159823345 100644
> > --- a/doc/guides/rel_notes/release_20_11.rst
> > +++ b/doc/guides/rel_notes/release_20_11.rst
> > @@ -55,6 +55,13 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +   * **Added data-path APIs for cryptodev library.**
> > +
> > +     Cryptodev is added data-path APIs to accelerate external libraries or
> > +     applications those want to avail fast cryptodev enqueue/dequeue
> > +     operations but does not necessarily depends on mbufs and cryptodev
> > +     operation mempool.
> > +
> >
> >  Removed Items
> >  -------------
> 
> 
> Regards,
> Akhil
Zhang, Roy Fan Sept. 23, 2020, 1:37 p.m. UTC | #3
Hi Akhil

> -----Original Message-----
... 
> > +
> > +Either enqueue functions will not command the crypto device to start
> > processing
> > +until ``rte_cryptodev_dp_submit_done`` function is called. Before then
> the user
> > +shall expect the driver only stores the necessory context data in the
> > +``rte_crypto_dp_service_ctx`` buffer for the next enqueue operation. If
> the
> > user
> > +wants to abandon the submitted operations, simply call
> > +``rte_cryptodev_dp_configure_service`` function instead with the
> parameter
> > +``is_update`` set to 0. The driver will recover the service context data to
> > +the previous state.
> 
> Can you explain a use case where this is actually being used? This looks fancy
> but
> Do we have this type of requirement in any protocol stacks/specifications?
> I believe it to be an extra burden on the application writer if it is not a
> protocol requirement.
> 

I missed responding to this one.
The requirement comes from working with the VPP crypto framework.

The reason for this feature is to fill a gap in the cryptodev enqueue and dequeue operations.
If the user application/library uses an approach similar to "rte_crypto_sym_vec" (such as VPP's vnet_crypto_async_frame_t) that clusters multiple crypto ops as a burst, the application requires enqueuing and dequeuing all of the ops inside as a whole, or none of them.
It is very slow for rte_cryptodev_enqueue/dequeue_burst to achieve this today, as the user has no control over precisely how many ops to enqueue/dequeue. For example, say I want to enqueue a "rte_crypto_sym_vec" buffer that contains 32 descriptors and store the "rte_crypto_sym_vec" as opaque data on enqueue. If rte_cryptodev_enqueue_burst returns 31, I have no option but to cache the 1 remaining job for the next enqueue attempt (or manually check the inflight count on every enqueue). Also, during dequeue, since the number "32" is stored inside rte_crypto_sym_vec.num, I have no way to know how many ops to dequeue; I can only blindly dequeue them and store them in a software ring, parse the dequeue count from the retrieved opaque data, and check the ring count against the dequeue count.

With the new approach we can easily achieve the goal. For a HW crypto PMD such an implementation is relatively easy: we only need to create a shadow copy of the queue pair data in ``rte_crypto_dp_service_ctx`` and update it in enqueue/dequeue. When "enqueue/dequeue_done" is called, the queue is kicked to start processing the jobs already set in the queue, and the shadow copy of the queue data is merged into the driver-maintained one.
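
The all-or-nothing behaviour described above can be sketched as follows. This is a toy model, not DPDK or VPP code: ``frame``, ``toy_queue`` and every function name are hypothetical, and "completion" is just a counter set by the caller. It shows a frame being enqueued only if it fits entirely, and a user callback that reads the expected dequeue count from the opaque data so that a frame is dequeued only once all of its ops are done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64u

/* Hypothetical application frame: a burst of ops that must move as a unit. */
struct frame {
    uint32_t n_ops;
};

/* Toy device queue: every slot carries the opaque pointer given at enqueue. */
struct toy_queue {
    void *opaque[RING_SIZE];
    uint32_t head;      /* next slot to dequeue */
    uint32_t tail;      /* next slot to fill */
    uint32_t inflight;  /* enqueued, not yet dequeued */
    uint32_t completed; /* finished by the "device", not yet dequeued */
};

/* Enqueue the whole frame or nothing; returns n_ops on success, else 0. */
static uint32_t enqueue_frame(struct toy_queue *q, struct frame *f)
{
    uint32_t i;

    assert(f->n_ops > 0 && f->n_ops <= RING_SIZE);
    if (RING_SIZE - q->inflight < f->n_ops)
        return 0; /* not enough room: queue state is untouched */
    for (i = 0; i < f->n_ops; i++) {
        q->opaque[q->tail] = f;
        q->tail = (q->tail + 1) % RING_SIZE;
    }
    q->inflight += f->n_ops;
    return f->n_ops;
}

/* User callback type: derive the expected dequeue count from opaque data. */
typedef uint32_t (*get_count_fn)(void *opaque);

static uint32_t frame_count(void *opaque)
{
    return ((struct frame *)opaque)->n_ops;
}

/* Dequeue one whole frame, or leave every op in the queue. */
static struct frame *dequeue_frame(struct toy_queue *q, get_count_fn get_count)
{
    void *first;
    uint32_t want;

    if (q->completed == 0)
        return NULL;
    first = q->opaque[q->head];
    want = get_count(first);
    if (want > q->completed)
        return NULL; /* frame only partly done: dequeue nothing */
    q->head = (q->head + want) % RING_SIZE;
    q->completed -= want;
    q->inflight -= want;
    return first;
}
```

With this shape the application needs neither a cache for leftover jobs nor a software ring for partially dequeued frames, which are the two workarounds described above.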

Regards,
Fan

Patch

diff --git a/doc/guides/prog_guide/cryptodev_lib.rst b/doc/guides/prog_guide/cryptodev_lib.rst
index c14f750fa..1321e4c5d 100644
--- a/doc/guides/prog_guide/cryptodev_lib.rst
+++ b/doc/guides/prog_guide/cryptodev_lib.rst
@@ -631,6 +631,96 @@  a call argument. Status different than zero must be treated as error.
 For more details, e.g. how to convert an mbuf to an SGL, please refer to an
 example usage in the IPsec library implementation.
 
+Cryptodev Direct Data-path Service API
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Direct crypto data-path service are a set of APIs that especially provided for
+the external libraries/applications who want to take advantage of the rich
+features provided by cryptodev, but not necessarily depend on cryptodev
+operations, mempools, or mbufs in their data-path implementations.
+
+The direct crypto data-path service has the following advantages:
+- Supports raw data pointer and physical addresses as input.
+- Do not require specific data structure allocated from heap, such as
+  cryptodev operation.
+- Enqueue in a burst or single operation. The service allow enqueuing in
+  a burst similar to ``rte_cryptodev_enqueue_burst`` operation, or only
+  enqueue one job at a time but maintaining necessary context data locally for
+  next single job enqueue operation. The latter method is especially helpful
+  when the user application's crypto operations are clustered into a burst.
+  Allowing enqueue one operation at a time helps reducing one additional loop
+  and also reduced the cache misses during the double "looping" situation.
+- Customizable dequeue count. Instead of dequeue maximum possible operations
+  as same as ``rte_cryptodev_dequeue_burst`` operation, the service allows the
+  user to provide a callback function to decide how many operations to be
+  dequeued. This is especially helpful when the expected dequeue count is
+  hidden inside the opaque data stored during enqueue. The user can provide
+  the callback function to parse the opaque data structure.
+- Abandon enqueue and dequeue anytime. One of the drawbacks of
+  ``rte_cryptodev_enqueue_burst`` and ``rte_cryptodev_dequeue_burst``
+  operations are: once an operation is enqueued/dequeued there is no way to
+  undo the operation. The service make the operation abandon possible by
+  creating a local copy of the queue operation data in the service context
+  data. The data will be written back to driver maintained operation data
+  when enqueue or dequeue done function is called.
+
+The cryptodev data-path service uses
+
+Cryptodev PMDs who supports this feature will have
+``RTE_CRYPTODEV_FF_SYM_HW_DIRECT_API`` feature flag presented. To use this
+feature the function ``rte_cryptodev_get_dp_service_ctx_data_size`` should
+be called to get the data path service context data size. The user should
+creates a local buffer at least this size long and initialize it using
+``rte_cryptodev_dp_configure_service`` function call.
+
+The ``rte_cryptodev_dp_configure_service`` function call initialize or
+updates the ``struct rte_crypto_dp_service_ctx`` buffer, in which contains the
+driver specific queue pair data pointer and service context buffer, and a
+set of function pointers to enqueue and dequeue different algorithms'
+operations. The ``rte_cryptodev_dp_configure_service`` should be called when:
+
+- Before enqueuing or dequeuing starts (set ``is_update`` parameter to 0).
+- When different cryptodev session, security session, or session-less xform
+  is used (set ``is_update`` parameter to 1).
+
+Two different enqueue functions are provided.
+
+- ``rte_cryptodev_dp_sym_submit_vec``: submit a burst of operations stored in
+  the ``rte_crypto_sym_vec`` structure.
+- ``rte_cryptodev_dp_submit_single_job``: submit single operation.
+
+Either enqueue functions will not command the crypto device to start processing
+until ``rte_cryptodev_dp_submit_done`` function is called. Before then the user
+shall expect the driver only stores the necessary context data in the
+``rte_crypto_dp_service_ctx`` buffer for the next enqueue operation. If the user
+wants to abandon the submitted operations, simply call
+``rte_cryptodev_dp_configure_service`` function instead with the parameter
+``is_update`` set to 0. The driver will recover the service context data to
+the previous state.
+
+To dequeue the operations the user also have two operations:
+
+- ``rte_cryptodev_dp_sym_dequeue``: fully customizable dequeue operation. The
+  user needs to provide the callback function for the driver to get the
+  dequeue count and perform post processing such as write the status field.
+- ``rte_cryptodev_dp_sym_dequeue_single_job``: dequeue single job.
+
+Same as enqueue, the function ``rte_cryptodev_dp_dequeue_done`` is used to
+merge user's local service context data with the driver's queue operation
+data. Also to abandon the dequeue operation (still keep the operations in the
+queue), the user shall avoid ``rte_cryptodev_dp_dequeue_done`` function call
+but calling ``rte_cryptodev_dp_configure_service`` function with the parameter
+``is_update`` set to 0.
+
+There are a few limitations to the data path service:
+
+* Only support in-place operations.
+* APIs are NOT thread-safe.
+* CANNOT mix the direct API's enqueue with rte_cryptodev_enqueue_burst, or
+  vice versa.
+
+See *DPDK API Reference* for details on each API definitions.
+
 Sample code
 -----------
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a177..159823345 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,13 @@  New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+   * **Added data-path APIs for cryptodev library.**
+
+     Cryptodev is added data-path APIs to accelerate external libraries or
+     applications those want to avail fast cryptodev enqueue/dequeue
+     operations but does not necessarily depends on mbufs and cryptodev
+     operation mempool.
+
 
 Removed Items
 -------------