[v10,2/4] cryptodev: add raw crypto data-path APIs

Message ID 20200924163417.49983-3-roy.fan.zhang@intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: akhil goyal
Series cryptodev: add raw data-path APIs

Checks

Context        Check    Description
ci/checkpatch  warning  coding style issues

Commit Message

Fan Zhang Sept. 24, 2020, 4:34 p.m. UTC
This patch adds raw data-path APIs for enqueue and dequeue
operations to cryptodev. The APIs support flexible user-defined
enqueue and dequeue behaviors.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
---
 doc/guides/prog_guide/cryptodev_lib.rst       |  93 +++++
 doc/guides/rel_notes/release_20_11.rst        |   7 +
 lib/librte_cryptodev/rte_crypto_sym.h         |   2 +-
 lib/librte_cryptodev/rte_cryptodev.c          | 104 +++++
 lib/librte_cryptodev/rte_cryptodev.h          | 354 +++++++++++++++++-
 lib/librte_cryptodev/rte_cryptodev_pmd.h      |  47 ++-
 .../rte_cryptodev_version.map                 |  11 +
 7 files changed, 614 insertions(+), 4 deletions(-)
  

Comments

Dybkowski, AdamX Sept. 25, 2020, 8:04 a.m. UTC | #1
> This patch adds raw data-path APIs for enqueue and dequeue operations to
> cryptodev. The APIs support flexible user-define enqueue and dequeue
> behaviors.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>

Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
  
Akhil Goyal Oct. 8, 2020, 2:26 p.m. UTC | #2
Hi Fan,

> 
> This patch adds raw data-path APIs for enqueue and dequeue
> operations to cryptodev. The APIs support flexible user-define
> enqueue and dequeue behaviors.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
> ---
>  doc/guides/prog_guide/cryptodev_lib.rst       |  93 +++++
>  doc/guides/rel_notes/release_20_11.rst        |   7 +
>  lib/librte_cryptodev/rte_crypto_sym.h         |   2 +-
>  lib/librte_cryptodev/rte_cryptodev.c          | 104 +++++
>  lib/librte_cryptodev/rte_cryptodev.h          | 354 +++++++++++++++++-
>  lib/librte_cryptodev/rte_cryptodev_pmd.h      |  47 ++-
>  .../rte_cryptodev_version.map                 |  11 +
>  7 files changed, 614 insertions(+), 4 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/cryptodev_lib.rst
> b/doc/guides/prog_guide/cryptodev_lib.rst
> index e7ba35c2d..5fe6c3c24 100644
> --- a/doc/guides/prog_guide/cryptodev_lib.rst
> +++ b/doc/guides/prog_guide/cryptodev_lib.rst
> @@ -632,6 +632,99 @@ a call argument. Status different than zero must be
> treated as error.
>  For more details, e.g. how to convert an mbuf to an SGL, please refer to an
>  example usage in the IPsec library implementation.
> 
> +Cryptodev Raw Data-path APIs
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The Crypto Raw data-path APIs are a set of APIs are designed to enable

The Crypto Raw data-path APIs are a set of APIs designed to enable
external libraries/applications to leverage the cryptographic 

> +externel libraries/applications which want to leverage the cryptographic
> +processing provided by DPDK crypto PMDs through the cryptodev API but in a
> +manner that is not dependent on native DPDK data structures (eg. rte_mbuf,
> +rte_crypto_op, ... etc) in their data-path implementation.
> +
> +The raw data-path APIs have the following advantages:
> +- External data structure friendly design. The new APIs uses the operation
> +  descriptor ``struct rte_crypto_sym_vec`` that supports raw data pointer and
> +  IOVA addresses as input. Moreover, the APIs does not require the user to
> +  allocate the descriptor from mempool, nor requiring mbufs to describe input
> +  data's virtual and IOVA addresses. All these features made the translation
> +  from user's own data structure into the descriptor easier and more efficient.
> +- Flexible enqueue and dequeue operation. The raw data-path APIs gives the
> +  user more control to the enqueue and dequeue operations, including the
> +  capability of precious enqueue/dequeue count, abandoning enqueue or
> dequeue
> +  at any time, and operation status translation and set on the fly.
> +
> +Cryptodev PMDs who supports the raw data-path APIs will have

Cryptodev PMDs which support raw data-path APIs will have

> +``RTE_CRYPTODEV_FF_SYM_HW_RAW_DP`` feature flag presented. To use this
> +feature, the user should create a local ``struct rte_crypto_raw_dp_ctx``
> +buffer and extend to at least the length returned by
> +``rte_cryptodev_raw_get_dp_context_size`` function call. The created buffer

``rte_cryptodev_get_raw_dp_ctx_size``

> +is then configured using ``rte_cryptodev_raw_configure_dp_context`` function.
``rte_cryptodev_configure_raw_dp_ctx``

> +The library and the crypto device driver will then configure the buffer and
> +write necessary temporary data into the buffer for later enqueue and dequeue
> +operations. The temporary data may be treated as the shadow copy of the
> +driver's private queue pair data.
> +
> +After the ``struct rte_crypto_raw_dp_ctx`` buffer is initialized, it is then
> +attached either the cryptodev sym session, the rte_security session, or the

attached either with the cryptodev sym session, the rte_security session, or the

> +cryptodev xform for session-less operation by
> +``rte_cryptodev_raw_attach_session`` function. With the session or xform

``rte_cryptodev_attach_raw_session`` API

> +information the driver will set the corresponding enqueue and dequeue
> function
> +handlers to the ``struct rte_crypto_raw_dp_ctx`` buffer.
> +
> +After the session is attached, the ``struct rte_crypto_raw_dp_ctx`` buffer is
> +now ready for enqueue and dequeue operation. There are two different
> enqueue
> +functions: ``rte_cryptodev_raw_enqueue`` to enqueue single descriptor,
> +and ``rte_cryptodev_raw_enqueue_burst`` to enqueue multiple descriptors.
> +In case of the application uses similar approach to
> +``struct rte_crypto_sym_vec`` to manage its data burst but with different
> +data structure, using the ``rte_cryptodev_raw_enqueue_burst`` function may
> be
> +less efficient as this is a situation where the application has to loop over
> +all crypto descriptors to assemble the ``struct rte_crypto_sym_vec`` buffer
> +from its own data structure, and then the driver will loop over them again to
> +translate every crypto job to the driver's specific queue data. The
> +``rte_cryptodev_raw_enqueue`` should be used to save one loop for each data
> +burst instead.
> +
> +During the enqueue, the cryptodev driver only sets the enqueued descriptors
> +into the device queue but not initiates the device to start processing them.
> +The temporary queue pair data changes in relation to the enqueued descriptors
> +may be recorded in the ``struct rte_crypto_raw_dp_ctx`` buffer as the
> reference
> +to the next enqueue function call. When ``rte_cryptodev_raw_enqueue_done``
> is
> +called, the driver will initiate the processing of all enqueued descriptors and
> +merge the temporary queue pair data changes into the driver's private queue
> +pair data. Calling ``rte_cryptodev_raw_configure_dp_context`` twice without
> +``rte_cryptodev_dp_enqueue_done`` call in between will invalidate the
> temporary
> +data stored in ``struct rte_crypto_raw_dp_ctx`` buffer. This feature is useful
> +when the user wants to abandon partially enqueued data for a failed enqueue
> +burst operation and try enqueuing in a whole later.

This feature may not be supported by all the HW PMDs. Can there be a way to bypass
this done API?
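
The deferred-commit behaviour described in the quoted paragraph can be modelled with a toy ring, shown here only as a rough sketch. Every name (toy_qp, toy_ctx, toy_configure, toy_enqueue, toy_enqueue_done) is hypothetical and none of this is DPDK or PMD code. It assumes a ring with no consumer, where the context holds a shadow tail that is only published to the queue pair by the "done" call, and where reconfiguring the context drops anything not yet published.

```c
#include <stdint.h>

#define TOY_RING_SIZE 8u

/* Hypothetical driver-private queue pair: the "device" only sees `tail`. */
struct toy_qp {
	uint32_t tail;                  /* committed (published) tail */
	uint32_t ring[TOY_RING_SIZE];   /* descriptor slots */
};

/* Hypothetical stand-in for the raw data-path context: a shadow of the tail. */
struct toy_ctx {
	struct toy_qp *qp;
	uint32_t shadow_tail;           /* temporary, not yet visible to the device */
};

/* (Re)configure: reload the committed tail, abandoning uncommitted entries. */
static void toy_configure(struct toy_ctx *ctx, struct toy_qp *qp)
{
	ctx->qp = qp;
	ctx->shadow_tail = qp->tail;
}

/* Write one descriptor at the shadow tail. Returns 0, or -1 if the ring is full
 * (this toy has no consumer, so slots are never recycled). */
static int toy_enqueue(struct toy_ctx *ctx, uint32_t desc)
{
	if (ctx->shadow_tail >= TOY_RING_SIZE)
		return -1;
	ctx->qp->ring[ctx->shadow_tail++] = desc;
	return 0;
}

/* Publish n pending descriptors. Returns -1 if fewer than n are pending. */
static int toy_enqueue_done(struct toy_ctx *ctx, uint32_t n)
{
	if (n > ctx->shadow_tail - ctx->qp->tail)
		return -1;
	ctx->qp->tail += n;
	return 0;
}
```

In this model a failed burst is abandoned by calling toy_configure() again before toy_enqueue_done(). A PMD that cannot defer the doorbell this way would have no equivalent of the shadow tail, which is the concern raised above.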

> +
> +Similar as enqueue, there are two dequeue functions:
> +``rte_cryptodev_raw_dequeue`` for dequeing single descriptor, and
> +``rte_cryptodev_raw_dequeue_burst`` for dequeuing a burst of descriptor. The
> +dequeue functions only writes back the user data that was passed to the driver
> +during inqueue, and inform the application the operation status.

during enqueue, and inform the operation status to the application.

> +Different than ``rte_cryptodev_dequeue_burst`` which the user can only
> +set an expected dequeue count and needs to read from dequeued cryptodev
> +operations' status field, the raw data-path dequeue burst function allows
> +the user to provide callback functions to retrieve dequeue
> +count from the enqueued user data, and write the expected status value to the
> +user data on the fly.
> +
> +Same as enqueue, both ``rte_cryptodev_raw_dequeue`` and
> +``rte_cryptodev_raw_dequeue_burst`` will not wipe the dequeued descriptors
> +from cryptodev queue unless ``rte_cryptodev_dp_dequeue_done`` is called.
> The
> +dequeue related temporary queue data will be merged into the driver's private
> +queue data in the function call.
> +
> +There are a few limitations to the data path service:
> +
> +* Only support in-place operations.
> +* APIs are NOT thread-safe.
> +* CANNOT mix the direct API's enqueue with rte_cryptodev_enqueue_burst, or
> +  vice versa.
> +
> +See *DPDK API Reference* for details on each API definitions.
> +
>  Sample code
>  -----------
> 
> diff --git a/doc/guides/rel_notes/release_20_11.rst
> b/doc/guides/rel_notes/release_20_11.rst
> index 20ebaef5b..d3d9f82f7 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,13 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +   * **Added raw data-path APIs for cryptodev library.**
> +
> +     Cryptodev is added raw data-path APIs to accelerate external libraries or
> +     applications those want to avail fast cryptodev enqueue/dequeue
> +     operations but does not necessarily depends on mbufs and cryptodev
> +     operation mempool.

Raw crypto data path APIs are added to accelerate external libraries or
applications which need to perform crypto processing on raw buffers and
are not dependent on rte_mbufs or rte_crypto_op mempools.

> +
> 
>  Removed Items
>  -------------
> diff --git a/lib/librte_cryptodev/rte_crypto_sym.h
> b/lib/librte_cryptodev/rte_crypto_sym.h
> index 8201189e0..e1f23d303 100644
> --- a/lib/librte_cryptodev/rte_crypto_sym.h
> +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> @@ -57,7 +57,7 @@ struct rte_crypto_sgl {
>   */
>  struct rte_crypto_va_iova_ptr {
>  	void *va;
> -	rte_iova_t *iova;
> +	rte_iova_t iova;
>  };

This should be part of 1/4 of this patchset.

> 
>  /**
> diff --git a/lib/librte_cryptodev/rte_cryptodev.c
> b/lib/librte_cryptodev/rte_cryptodev.c
> index 1dd795bcb..daeb5f504 100644
> --- a/lib/librte_cryptodev/rte_cryptodev.c
> +++ b/lib/librte_cryptodev/rte_cryptodev.c
> @@ -1914,6 +1914,110 @@ rte_cryptodev_sym_cpu_crypto_process(uint8_t
> dev_id,
>  	return dev->dev_ops->sym_cpu_process(dev, sess, ofs, vec);
>  }
> 
> +int
> +rte_cryptodev_raw_get_dp_context_size(uint8_t dev_id)

As suggested above, raw_dp should be used as a keyword in all the APIs.
Hence it should be rte_cryptodev_get_raw_dp_ctx_size.

> +{
> +	struct rte_cryptodev *dev;
> +	int32_t size = sizeof(struct rte_crypto_raw_dp_ctx);
> +	int32_t priv_size;
> +
> +	if (!rte_cryptodev_pmd_is_valid_dev(dev_id))
> +		return -EINVAL;
> +
> +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> +
> +	if (*dev->dev_ops->get_drv_ctx_size == NULL ||
> +		!(dev->feature_flags &
> RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)) {
> +		return -ENOTSUP;
> +	}
> +
> +	priv_size = (*dev->dev_ops->get_drv_ctx_size)(dev);
> +	if (priv_size < 0)
> +		return -ENOTSUP;
> +
> +	return RTE_ALIGN_CEIL((size + priv_size), 8);
> +}
> +
> +int
> +rte_cryptodev_raw_configure_dp_context(uint8_t dev_id, uint16_t qp_id,
> +	struct rte_crypto_raw_dp_ctx *ctx)

rte_cryptodev_configure_raw_dp_ctx

> +{
> +	struct rte_cryptodev *dev;
> +	union rte_cryptodev_session_ctx sess_ctx = {NULL};
> +
> +	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
> +		return -EINVAL;
> +
> +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> +	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
> +			|| dev->dev_ops->configure_dp_ctx == NULL)
> +		return -ENOTSUP;
> +
> +	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id,
> +			RTE_CRYPTO_OP_WITH_SESSION, sess_ctx, ctx);
> +}
> +
> +int
> +rte_cryptodev_raw_attach_session(uint8_t dev_id, uint16_t qp_id,
> +	struct rte_crypto_raw_dp_ctx *ctx,
> +	enum rte_crypto_op_sess_type sess_type,
> +	union rte_cryptodev_session_ctx session_ctx)
> +{
> +	struct rte_cryptodev *dev;
> +
> +	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
> +		return -EINVAL;
> +
> +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> +	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
> +			|| dev->dev_ops->configure_dp_ctx == NULL)
> +		return -ENOTSUP;
> +	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id, sess_type,
> +			session_ctx, ctx);

What is the difference between rte_cryptodev_raw_configure_dp_context and
rte_cryptodev_raw_attach_session?
If it is needed at all, it should be named rte_cryptodev_attach_raw_dp_session.
IMO attach is not needed; its purpose is not clear to me.

You are calling the same dev_ops for both: one with an explicit session type and
the other with the session type taken from an argument.
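
The shared-callback pattern in question can be sketched in isolation as follows. This is a toy model with hypothetical names (toy_raw_ctx, toy_drv_configure, toy_configure_ctx, toy_attach_session), not the patch's code. It assumes the behaviour stated in the patch's PMD typedef comment: a NULL session context means "only set up the driver data", and a non-NULL one means "only set the function pointers". Rejecting an attach before configure is an extra assumption made for the sketch.

```c
#include <stddef.h>

enum toy_sess_type { TOY_WITH_SESSION, TOY_SESSIONLESS };

/* Hypothetical stand-in for the raw data-path context. */
struct toy_raw_ctx {
	int drv_data_ready;          /* set by the "configure" path */
	int (*enqueue)(int job);     /* set by the "attach" path */
};

/* Hypothetical per-algorithm enqueue handler. */
static int toy_aead_enqueue(int job)
{
	return job + 1;
}

/* One driver callback that serves both public entry points. */
static int toy_drv_configure(enum toy_sess_type type, const void *sess,
		struct toy_raw_ctx *ctx)
{
	(void)type; /* a real driver would use this to interpret `sess` */

	if (sess == NULL) {
		/* configure path: initialise driver data only */
		ctx->drv_data_ready = 1;
		ctx->enqueue = NULL;
		return 0;
	}
	if (!ctx->drv_data_ready)
		return -1; /* assumption: attach requires a configured ctx */

	/* attach path: pick handlers from the session, leave driver data alone */
	ctx->enqueue = toy_aead_enqueue;
	return 0;
}

/* Public wrapper 1: fixed session type, no session. */
static int toy_configure_ctx(struct toy_raw_ctx *ctx)
{
	return toy_drv_configure(TOY_WITH_SESSION, NULL, ctx);
}

/* Public wrapper 2: caller-supplied session type and session. */
static int toy_attach_session(struct toy_raw_ctx *ctx,
		enum toy_sess_type type, const void *sess)
{
	return toy_drv_configure(type, sess, ctx);
}
```

Under these assumptions the two wrappers differ only in their arguments, so a single entry point taking an optional session would give the same result.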

> +}
> +
> +uint32_t
> +rte_cryptodev_raw_enqueue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> +	struct rte_crypto_sym_vec *vec, union rte_crypto_sym_ofs ofs,
> +	void **user_data)
> +{
> +	if (vec->num == 1) {

Why do we need this check? I think the user is aware that for enqueuing 1 vector,
he should use the other API. The driver will be doing the enqueue operation only one time.

> +		vec->status[0] = rte_cryptodev_raw_enqueue(ctx, vec->sgl-
> >vec,
> +			vec->sgl->num, ofs, vec->iv, vec->digest, vec->aad,
> +			user_data[0]);
> +		return (vec->status[0] == 0) ? 1 : 0;
> +	}
> +
> +	return (*ctx->enqueue_burst)(ctx->qp_data, ctx->drv_ctx_data, vec,
> +			ofs, user_data);
> +}

Where are rte_cryptodev_raw_enqueue and rte_cryptodev_raw_dequeue?


> +
> +int
> +rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
> +		uint32_t n)
> +{
> +	return (*ctx->enqueue_done)(ctx->qp_data, ctx->drv_ctx_data, n);
> +}
> +
> +int
> +rte_cryptodev_raw_dequeue_done(struct rte_crypto_raw_dp_ctx *ctx,
> +		uint32_t n)
> +{
> +	return (*ctx->dequeue_done)(ctx->qp_data, ctx->drv_ctx_data, n);
> +}
> +
> +uint32_t
> +rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> +	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
> +	rte_cryptodev_raw_post_dequeue_t post_dequeue,
> +	void **out_user_data, uint8_t is_user_data_array,
> +	uint32_t *n_success_jobs)
> +{
> +	return (*ctx->dequeue_burst)(ctx->qp_data, ctx->drv_ctx_data,
> +		get_dequeue_count, post_dequeue, out_user_data,
> +		is_user_data_array, n_success_jobs);
> +}
> +
>  /** Initialise rte_crypto_op mempool element */
>  static void
>  rte_crypto_op_init(struct rte_mempool *mempool,
> diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> b/lib/librte_cryptodev/rte_cryptodev.h
> index 7b3ebc20f..3579ab66e 100644
> --- a/lib/librte_cryptodev/rte_cryptodev.h
> +++ b/lib/librte_cryptodev/rte_cryptodev.h
> @@ -466,7 +466,8 @@ rte_cryptodev_asym_get_xform_enum(enum
> rte_crypto_asym_xform_type *xform_enum,
>  /**< Support symmetric session-less operations */
>  #define RTE_CRYPTODEV_FF_NON_BYTE_ALIGNED_DATA		(1ULL
> << 23)
>  /**< Support operations on data which is not byte aligned */
> -
> +#define RTE_CRYPTODEV_FF_SYM_HW_RAW_DP			(1ULL
> << 24)

RTE_CRYPTODEV_FF_SYM_RAW_DP would be better.

Please also add this flag to doc/guides/cryptodevs/features/default.ini in this patch.

> +/**< Support accelerated specific raw data-path APIs */
> 
>  /**
>   * Get the name of a crypto device feature flag
> @@ -1351,6 +1352,357 @@ rte_cryptodev_sym_cpu_crypto_process(uint8_t
> dev_id,
>  	struct rte_cryptodev_sym_session *sess, union rte_crypto_sym_ofs ofs,
>  	struct rte_crypto_sym_vec *vec);
> 
> +/**
> + * Get the size of the raw data-path context buffer.
> + *
> + * @param	dev_id		The device identifier.
> + *
> + * @return
> + *   - If the device supports raw data-path APIs, return the context size.
> + *   - If the device does not support the APIs, return -1.
> + */
> +__rte_experimental
> +int
> +rte_cryptodev_raw_get_dp_context_size(uint8_t dev_id);
> +
> +/**
> + * Union of different crypto session types, including session-less xform
> + * pointer.
> + */
> +union rte_cryptodev_session_ctx {
> +	struct rte_cryptodev_sym_session *crypto_sess;
> +	struct rte_crypto_sym_xform *xform;
> +	struct rte_security_session *sec_sess;
> +};
> +
> +/**
> + * Enqueue a data vector into device queue but the driver will not start
> + * processing until rte_cryptodev_raw_enqueue_done() is called.
> + *
> + * @param	qp		Driver specific queue pair data.
> + * @param	drv_ctx		Driver specific context data.
> + * @param	vec		The array of descriptor vectors.
> + * @param	ofs		Start and stop offsets for auth and cipher
> + *				operations.
> + * @param	user_data	The array of user data for dequeue later.
> + * @return
> + *   - The number of descriptors successfully submitted.
> + */
> +typedef uint32_t (*cryptodev_dp_sym_enqueue_burst_t)(
> +	void *qp, uint8_t *drv_ctx, struct rte_crypto_sym_vec *vec,
> +	union rte_crypto_sym_ofs ofs, void *user_data[]);
> +
> +/**
> + * Enqueue single descriptor into device queue but the driver will not start
> + * processing until rte_cryptodev_raw_enqueue_done() is called.
> + *
> + * @param	qp		Driver specific queue pair data.
> + * @param	drv_ctx		Driver specific context data.
> + * @param	data_vec	The buffer data vector.
> + * @param	n_data_vecs	Number of buffer data vectors.
> + * @param	ofs		Start and stop offsets for auth and cipher
> + *				operations.
> + * @param	iv		IV virtual and IOVA addresses
> + * @param	digest		digest virtual and IOVA addresses
> + * @param	aad_or_auth_iv	AAD or auth IV virtual and IOVA addresses,
> + *				depends on the algorithm used.
> + * @param	user_data	The user data.
> + * @return
> + *   - On success return 0.
> + *   - On failure return negative integer.
> + */
> +typedef int (*cryptodev_dp_sym_enqueue_t)(
> +	void *qp, uint8_t *drv_ctx, struct rte_crypto_vec *data_vec,
> +	uint16_t n_data_vecs, union rte_crypto_sym_ofs ofs,
> +	struct rte_crypto_va_iova_ptr *iv,
> +	struct rte_crypto_va_iova_ptr *digest,
> +	struct rte_crypto_va_iova_ptr *aad_or_auth_iv,
> +	void *user_data);
> +
> +/**
> + * Inform the cryptodev queue pair to start processing or finish dequeuing all
> + * enqueued/dequeued descriptors.
> + *
> + * @param	qp		Driver specific queue pair data.
> + * @param	drv_ctx		Driver specific context data.
> + * @param	n		The total number of processed descriptors.
> + * @return
> + *   - On success return 0.
> + *   - On failure return negative integer.
> + */
> +typedef int (*cryptodev_dp_sym_operation_done_t)(void *qp, uint8_t *drv_ctx,
> +	uint32_t n);
> +
> +/**
> + * Typedef that the user provided for the driver to get the dequeue count.
> + * The function may return a fixed number or the number parsed from the user
> + * data stored in the first processed descriptor.
> + *
> + * @param	user_data	Dequeued user data.
> + **/
> +typedef uint32_t (*rte_cryptodev_raw_get_dequeue_count_t)(void
> *user_data);
> +
> +/**
> + * Typedef that the user provided to deal with post dequeue operation, such
> + * as filling status.
> + *
> + * @param	user_data	Dequeued user data.
> + * @param	index		Index number of the processed descriptor.
> + * @param	is_op_success	Operation status provided by the driver.
> + **/
> +typedef void (*rte_cryptodev_raw_post_dequeue_t)(void *user_data,
> +	uint32_t index, uint8_t is_op_success);
> +
> +/**
> + * Dequeue symmetric crypto processing of user provided data.
> + *
> + * @param	qp			Driver specific queue pair data.
> + * @param	drv_ctx			Driver specific context data.
> + * @param	get_dequeue_count	User provided callback function to
> + *					obtain dequeue count.
> + * @param	post_dequeue		User provided callback function to
> + *					post-process a dequeued operation.
> + * @param	out_user_data		User data pointer array to be retrieve
> + *					from device queue. In case of
> + *					*is_user_data_array* is set there
> + *					should be enough room to store all
> + *					user data.
> + * @param	is_user_data_array	Set 1 if every dequeued user data will
> + *					be written into out_user_data* array.
> + * @param	n_success		Driver written value to specific the
> + *					total successful operations count.
> + *
> + * @return
> + *  - Returns number of dequeued packets.
> + */
> +typedef uint32_t (*cryptodev_dp_sym_dequeue_burst_t)(void *qp,
> +	uint8_t *drv_ctx,
> +	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
> +	rte_cryptodev_raw_post_dequeue_t post_dequeue,
> +	void **out_user_data, uint8_t is_user_data_array,
> +	uint32_t *n_success);
> +
> +/**
> + * Dequeue symmetric crypto processing of user provided data.
> + *
> + * @param	qp			Driver specific queue pair data.
> + * @param	drv_ctx			Driver specific context data.
> + * @param	out_user_data		User data pointer to be retrieve from
> + *					device queue.
> + *
> + * @return
> + *   - 1 if the user_data is dequeued and the operation is a success.
> + *   - 0 if the user_data is dequeued but the operation is failed.
> + *   - -1 if no operation is dequeued.
> + */
> +typedef int (*cryptodev_dp_sym_dequeue_t)(
> +		void *qp, uint8_t *drv_ctx, void **out_user_data);
> +
> +/**
> + * Context data for raw data-path API crypto process. The buffer of this
> + * structure is to be allocated by the user application with the size equal
> + * or bigger than rte_cryptodev_raw_get_dp_context_size() returned value.
> + *
> + * NOTE: the buffer is to be used and maintained by the cryptodev driver, the
> + * user should NOT alter the buffer content to avoid application or system
> + * crash.
> + */
> +struct rte_crypto_raw_dp_ctx {
> +	void *qp_data;
> +
> +	cryptodev_dp_sym_enqueue_t enqueue;
> +	cryptodev_dp_sym_enqueue_burst_t enqueue_burst;
> +	cryptodev_dp_sym_operation_done_t enqueue_done;
> +	cryptodev_dp_sym_dequeue_t dequeue;
> +	cryptodev_dp_sym_dequeue_burst_t dequeue_burst;
> +	cryptodev_dp_sym_operation_done_t dequeue_done;

These function pointers are data path only. Why do we need to add an explicit dp in each one of them?
These should be cryptodev_sym_raw_**

> +
> +	/* Driver specific context data */
> +	__extension__ uint8_t drv_ctx_data[];
> +};
> +
> +/**
> + * Configure raw data-path context data.
> + *
> + * NOTE:
> + * After the context data is configured, the user should call
> + * rte_cryptodev_raw_attach_session() before using it in
> + * rte_cryptodev_raw_enqueue/dequeue function call.

The purpose of the attach API is not clear to me. It looks like an overhead.

> + *
> + * @param	dev_id		The device identifier.
> + * @param	qp_id		The index of the queue pair from which to
> + *				retrieve processed packets. The value must be
> + *				in the range [0, nb_queue_pair - 1] previously
> + *				supplied to rte_cryptodev_configure().
> + * @param	ctx		The raw data-path context data.
> + * @return
> + *   - On success return 0.
> + *   - On failure return negative integer.
> + */
> +__rte_experimental
> +int
> +rte_cryptodev_raw_configure_dp_context(uint8_t dev_id, uint16_t qp_id,
> +	struct rte_crypto_raw_dp_ctx *ctx);
> +
> +/**
> + * Attach a cryptodev session to a initialized raw data path context.
> + *
> + * @param	dev_id		The device identifier.
> + * @param	qp_id		The index of the queue pair from which to
> + *				retrieve processed packets. The value must be
> + *				in the range [0, nb_queue_pair - 1] previously
> + *				supplied to rte_cryptodev_configure().
> + * @param	ctx		The raw data-path context data.
> + * @param	sess_type	session type.
> + * @param	session_ctx	Session context data.
> + * @return
> + *   - On success return 0.
> + *   - On failure return negative integer.
> + */
> +__rte_experimental
> +int
> +rte_cryptodev_raw_attach_session(uint8_t dev_id, uint16_t qp_id,
> +	struct rte_crypto_raw_dp_ctx *ctx,
> +	enum rte_crypto_op_sess_type sess_type,
> +	union rte_cryptodev_session_ctx session_ctx);
> +
> +/**
> + * Enqueue single raw data-path descriptor.
> + *
> + * The enqueued descriptor will not be started processing until
> + * rte_cryptodev_raw_enqueue_done() is called.
> + *
> + * @param	ctx		The initialized raw data-path context data.
> + * @param	data_vec	The buffer vector.
> + * @param	n_data_vecs	Number of buffer vectors.
> + * @param	ofs		Start and stop offsets for auth and cipher
> + *				operations.
> + * @param	iv		IV virtual and IOVA addresses
> + * @param	digest		digest virtual and IOVA addresses
> + * @param	aad_or_auth_iv	AAD or auth IV virtual and IOVA addresses,
> + *				depends on the algorithm used.
> + * @param	user_data	The user data.
> + * @return
> + *   - The number of descriptors successfully enqueued.
> + */
> +__rte_experimental
> +static __rte_always_inline int
> +rte_cryptodev_raw_enqueue(struct rte_crypto_raw_dp_ctx *ctx,
> +	struct rte_crypto_vec *data_vec, uint16_t n_data_vecs,
> +	union rte_crypto_sym_ofs ofs,
> +	struct rte_crypto_va_iova_ptr *iv,
> +	struct rte_crypto_va_iova_ptr *digest,
> +	struct rte_crypto_va_iova_ptr *aad_or_auth_iv,
> +	void *user_data)
> +{
> +	return (*ctx->enqueue)(ctx->qp_data, ctx->drv_ctx_data, data_vec,
> +		n_data_vecs, ofs, iv, digest, aad_or_auth_iv, user_data);
> +}
> +
> +/**
> + * Enqueue a data vector of raw data-path descriptors.
> + *
> + * The enqueued descriptors will not be started processing until
> + * rte_cryptodev_raw_enqueue_done() is called.
> + *
> + * @param	ctx		The initialized raw data-path context data.
> + * @param	vec		The array of descriptor vectors.
> + * @param	ofs		Start and stop offsets for auth and cipher
> + *				operations.
> + * @param	user_data	The array of opaque data for dequeue.
> + * @return
> + *   - The number of descriptors successfully enqueued.
> + */
> +__rte_experimental
> +uint32_t
> +rte_cryptodev_raw_enqueue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> +	struct rte_crypto_sym_vec *vec, union rte_crypto_sym_ofs ofs,
> +	void **user_data);
> +
> +/**
> + * Start processing all enqueued descriptors from last
> + * rte_cryptodev_raw_configure_dp_context() call.
> + *
> + * @param	ctx	The initialized raw data-path context data.
> + * @param	n	The total number of submitted descriptors.
> + */
> +__rte_experimental
> +int
> +rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
> +		uint32_t n);
> +
> +/**
> + * Dequeue a burst of raw crypto data-path operations and write the previously
> + * enqueued user data into the array provided.
> + *
> + * The dequeued operations, including the user data stored, will not be
> + * wiped out from the device queue until rte_cryptodev_raw_dequeue_done()
> is
> + * called.
> + *
> + * @param	ctx			The initialized raw data-path context
> + *					data.
> + * @param	get_dequeue_count	User provided callback function to
> + *					obtain dequeue count.
> + * @param	post_dequeue		User provided callback function to
> + *					post-process a dequeued operation.
> + * @param	out_user_data		User data pointer array to be retrieve
> + *					from device queue. In case of
> + *					*is_user_data_array* is set there
> + *					should be enough room to store all
> + *					user data.
> + * @param	is_user_data_array	Set 1 if every dequeued user data will
> + *					be written into *out_user_data* array.
> + * @param	n_success		Driver written value to specific the
> + *					total successful operations count.

// to specify the

> + *
> + * @return
> + *   - Returns number of dequeued packets.
> + */
> +__rte_experimental
> +uint32_t
> +rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> +	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
> +	rte_cryptodev_raw_post_dequeue_t post_dequeue,
> +	void **out_user_data, uint8_t is_user_data_array,
> +	uint32_t *n_success);
> +
> +/**
> + * Dequeue a raw crypto data-path operation and write the previously
> + * enqueued user data.
> + *
> + * The dequeued operation, including the user data stored, will not be wiped
> + * out from the device queue until rte_cryptodev_raw_dequeue_done() is
> called.
> + *
> + * @param	ctx			The initialized raw data-path context
> + *					data.
> + * @param	out_user_data		User data pointer to be retrieve from
> + *					device queue. The driver shall support
> + *					NULL input of this parameter.
> + *
> + * @return
> + *   - 1 if the user data is dequeued and the operation is a success.
> + *   - 0 if the user data is dequeued but the operation is failed.
> + *   - -1 if no operation is ready to be dequeued.
> + */
> +__rte_experimental
> +static __rte_always_inline int

Why is this function specifically inline and not others?

> +rte_cryptodev_raw_dequeue(struct rte_crypto_raw_dp_ctx *ctx,
> +	void **out_user_data)
> +{
> +	return (*ctx->dequeue)(ctx->qp_data, ctx->drv_ctx_data,
> out_user_data);
> +}
> +
> +/**
> + * Inform the queue pair dequeue operations finished.
> + *
> + * @param	ctx	The initialized raw data-path context data.
> + * @param	n	The total number of jobs already dequeued.
> + */
> +__rte_experimental
> +int
> +rte_cryptodev_raw_dequeue_done(struct rte_crypto_raw_dp_ctx *ctx,
> +		uint32_t n);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> index 81975d72b..69a2a6d64 100644
> --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> @@ -316,6 +316,40 @@ typedef uint32_t
> (*cryptodev_sym_cpu_crypto_process_t)
>  	(struct rte_cryptodev *dev, struct rte_cryptodev_sym_session *sess,
>  	union rte_crypto_sym_ofs ofs, struct rte_crypto_sym_vec *vec);
> 
> +/**
> + * Typedef that the driver provided to get service context private date size.
> + *
> + * @param	dev	Crypto device pointer.
> + *
> + * @return
> + *   - On success return the size of the device's service context private data.
> + *   - On failure return negative integer.
> + */
> +typedef int (*cryptodev_dp_get_service_ctx_size_t)(
> +	struct rte_cryptodev *dev);
> +
> +/**
> + * Typedef that the driver provided to configure data-path context.
> + *
> + * @param	dev		Crypto device pointer.
> + * @param	qp_id		Crypto device queue pair index.
> + * @param	service_type	Type of the service requested.
> + * @param	sess_type	session type.
> + * @param	session_ctx	Session context data. If NULL the driver
> + *				shall only configure the drv_ctx_data in
> + *				ctx buffer. Otherwise the driver shall only
> + *				parse the session_ctx to set appropriate
> + *				function pointers in ctx.
> + * @param	ctx		The raw data-path context data.
> + * @return
> + *   - On success return 0.
> + *   - On failure return negative integer.
> + */
> +typedef int (*cryptodev_dp_configure_ctx_t)(
> +	struct rte_cryptodev *dev, uint16_t qp_id,
> +	enum rte_crypto_op_sess_type sess_type,
> +	union rte_cryptodev_session_ctx session_ctx,
> +	struct rte_crypto_raw_dp_ctx *ctx);

These typedefs names are not matching with the corresponding API names.
Can you fix it for all of them?

> 
>  /** Crypto device operations function pointer table */
>  struct rte_cryptodev_ops {
> @@ -348,8 +382,17 @@ struct rte_cryptodev_ops {
>  	/**< Clear a Crypto sessions private data. */
>  	cryptodev_asym_free_session_t asym_session_clear;
>  	/**< Clear a Crypto sessions private data. */
> -	cryptodev_sym_cpu_crypto_process_t sym_cpu_process;
> -	/**< process input data synchronously (cpu-crypto). */
> +	union {
> +		cryptodev_sym_cpu_crypto_process_t sym_cpu_process;
> +		/**< process input data synchronously (cpu-crypto). */
> +		__extension__
> +		struct {
> +			cryptodev_dp_get_service_ctx_size_t get_drv_ctx_size;
> +			/**< Get data path service context data size. */
> +			cryptodev_dp_configure_ctx_t configure_dp_ctx;
> +			/**< Initialize crypto service ctx data. */
> +		};
> +	};
>  };
> 
> 
> diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map
> b/lib/librte_cryptodev/rte_cryptodev_version.map
> index 02f6dcf72..bc4cd1ea5 100644
> --- a/lib/librte_cryptodev/rte_cryptodev_version.map
> +++ b/lib/librte_cryptodev/rte_cryptodev_version.map
> @@ -105,4 +105,15 @@ EXPERIMENTAL {
> 
>  	# added in 20.08
>  	rte_cryptodev_get_qp_status;
> +
> +	# added in 20.11
> +	rte_cryptodev_raw_attach_session;
> +	rte_cryptodev_raw_configure_dp_context;
> +	rte_cryptodev_raw_get_dp_context_size;
> +	rte_cryptodev_raw_dequeue;
> +	rte_cryptodev_raw_dequeue_burst;
> +	rte_cryptodev_raw_dequeue_done;
> +	rte_cryptodev_raw_enqueue;
> +	rte_cryptodev_raw_enqueue_burst;
> +	rte_cryptodev_raw_enqueue_done;
>  };
> --
> 2.20.1
  
Akhil Goyal Oct. 8, 2020, 2:37 p.m. UTC | #3
> +/**
> + * Start processing all enqueued descriptors from last
> + * rte_cryptodev_raw_configure_dp_context() call.
> + *
> + * @param	ctx	The initialized raw data-path context data.
> + * @param	n	The total number of submitted descriptors.
> +

What does this API return?
Please check the doc comments of the other APIs as well.


 */
> +__rte_experimental
> +int
> +rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
> +		uint32_t n);
> +
  
Fan Zhang Oct. 8, 2020, 3:29 p.m. UTC | #4
Hi Akhil,

Thanks a lot for the review. Comments inline.

Regards,
Fan

> -----Original Message-----
> From: Akhil Goyal <akhil.goyal@nxp.com>
> Sent: Thursday, October 8, 2020 3:26 PM
> To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> Cc: Trahe, Fiona <fiona.trahe@intel.com>; Kusztal, ArkadiuszX
> <arkadiuszx.kusztal@intel.com>; Dybkowski, AdamX
> <adamx.dybkowski@intel.com>; anoobj@marvell.com; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Bronowski, PiotrX
> <piotrx.bronowski@intel.com>
> Subject: RE: [dpdk-dev v10 2/4] cryptodev: add raw crypto data-path APIs
> 
> Hi Fan,
> 
> >
> > This patch adds raw data-path APIs for enqueue and dequeue
> > operations to cryptodev. The APIs support flexible user-define
> > enqueue and dequeue behaviors.
> >
> > Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
> > ---
> >  doc/guides/prog_guide/cryptodev_lib.rst       |  93 +++++
> >  doc/guides/rel_notes/release_20_11.rst        |   7 +
> >  lib/librte_cryptodev/rte_crypto_sym.h         |   2 +-
> >  lib/librte_cryptodev/rte_cryptodev.c          | 104 +++++
> >  lib/librte_cryptodev/rte_cryptodev.h          | 354 +++++++++++++++++-
> >  lib/librte_cryptodev/rte_cryptodev_pmd.h      |  47 ++-
> >  .../rte_cryptodev_version.map                 |  11 +
> >  7 files changed, 614 insertions(+), 4 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/cryptodev_lib.rst
> > b/doc/guides/prog_guide/cryptodev_lib.rst
> > index e7ba35c2d..5fe6c3c24 100644
> > --- a/doc/guides/prog_guide/cryptodev_lib.rst
> > +++ b/doc/guides/prog_guide/cryptodev_lib.rst
> > @@ -632,6 +632,99 @@ a call argument. Status different than zero must be
> > treated as error.
> >  For more details, e.g. how to convert an mbuf to an SGL, please refer to an
> >  example usage in the IPsec library implementation.
> >
> > +Cryptodev Raw Data-path APIs
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The Crypto Raw data-path APIs are a set of APIs are designed to enable
> 
> The Crypto Raw data-path APIs are a set of APIs designed to enable
> external libraries/applications to leverage the cryptographic
> 
> > +externel libraries/applications which want to leverage the cryptographic
> > +processing provided by DPDK crypto PMDs through the cryptodev API but
> in a
> > +manner that is not dependent on native DPDK data structures (eg.
> rte_mbuf,
> > +rte_crypto_op, ... etc) in their data-path implementation.
> > +
> > +The raw data-path APIs have the following advantages:
> > +- External data structure friendly design. The new APIs uses the operation
> > +  descriptor ``struct rte_crypto_sym_vec`` that supports raw data pointer
> and
> > +  IOVA addresses as input. Moreover, the APIs does not require the user
> to
> > +  allocate the descriptor from mempool, nor requiring mbufs to describe
> input
> > +  data's virtual and IOVA addresses. All these features made the
> translation
> > +  from user's own data structure into the descriptor easier and more
> efficient.
> > +- Flexible enqueue and dequeue operation. The raw data-path APIs gives
> the
> > +  user more control to the enqueue and dequeue operations, including
> the
> > +  capability of precious enqueue/dequeue count, abandoning enqueue or
> > dequeue
> > +  at any time, and operation status translation and set on the fly.
> > +
> > +Cryptodev PMDs who supports the raw data-path APIs will have
> 
> Cryptodev PMDs which support raw data-path APIs will have
> 
> > +``RTE_CRYPTODEV_FF_SYM_HW_RAW_DP`` feature flag presented. To
> use this
> > +feature, the user should create a local ``struct rte_crypto_raw_dp_ctx``
> > +buffer and extend to at least the length returned by
> > +``rte_cryptodev_raw_get_dp_context_size`` function call. The created
> buffer
> 
> ``rte_cryptodev_get_raw_dp_ctx_size``
> 
> > +is then configured using ``rte_cryptodev_raw_configure_dp_context``
> function.
> ``rte_cryptodev_configure_raw_dp_ctx``
> 
> > +The library and the crypto device driver will then configure the buffer and
> > +write necessary temporary data into the buffer for later enqueue and
> dequeue
> > +operations. The temporary data may be treated as the shadow copy of
> the
> > +driver's private queue pair data.
> > +
> > +After the ``struct rte_crypto_raw_dp_ctx`` buffer is initialized, it is then
> > +attached either the cryptodev sym session, the rte_security session, or
> the
> 
> attached either with the cryptodev sym session, the rte_security session, or
> the
> 
> > +cryptodev xform for session-less operation by
> > +``rte_cryptodev_raw_attach_session`` function. With the session or
> xform
> 
> ``rte_cryptodev_attach_raw_session`` API
> 
> > +information the driver will set the corresponding enqueue and dequeue
> > function
> > +handlers to the ``struct rte_crypto_raw_dp_ctx`` buffer.
> > +
> > +After the session is attached, the ``struct rte_crypto_raw_dp_ctx`` buffer
> is
> > +now ready for enqueue and dequeue operation. There are two different
> > enqueue
> > +functions: ``rte_cryptodev_raw_enqueue`` to enqueue single descriptor,
> > +and ``rte_cryptodev_raw_enqueue_burst`` to enqueue multiple
> descriptors.
> > +In case of the application uses similar approach to
> > +``struct rte_crypto_sym_vec`` to manage its data burst but with different
> > +data structure, using the ``rte_cryptodev_raw_enqueue_burst`` function
> may
> > be
> > +less efficient as this is a situation where the application has to loop over
> > +all crypto descriptors to assemble the ``struct rte_crypto_sym_vec``
> buffer
> > +from its own data structure, and then the driver will loop over them again
> to
> > +translate every crypto job to the driver's specific queue data. The
> > +``rte_cryptodev_raw_enqueue`` should be used to save one loop for
> each data
> > +burst instead.
> > +
> > +During the enqueue, the cryptodev driver only sets the enqueued
> descriptors
> > +into the device queue but not initiates the device to start processing them.
> > +The temporary queue pair data changes in relation to the enqueued
> descriptors
> > +may be recorded in the ``struct rte_crypto_raw_dp_ctx`` buffer as the
> > reference
> > +to the next enqueue function call. When
> ``rte_cryptodev_raw_enqueue_done``
> > is
> > +called, the driver will initiate the processing of all enqueued descriptors
> and
> > +merge the temporary queue pair data changes into the driver's private
> queue
> > +pair data. Calling ``rte_cryptodev_raw_configure_dp_context`` twice
> without
> > +``rte_cryptodev_dp_enqueue_done`` call in between will invalidate the
> > temporary
> > +data stored in ``struct rte_crypto_raw_dp_ctx`` buffer. This feature is
> useful
> > +when the user wants to abandon partially enqueued data for a failed
> enqueue
> > +burst operation and try enqueuing in a whole later.
> 
> This feature may not be supported by all the HW PMDs. Can there be a way
> to bypass this done API?

We can add another feature flag,
"RTE_CRYPTODEV_FF_SYM_HW_RAW_DP_ALLOW_CACHE". PMDs that do not
support this feature can simply return -ENOTSUP from the
enqueue_done and dequeue_done functions. What do you think?
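
To make the proposal concrete, here is a minimal sketch of how an application could use such a flag to bypass the done call. Everything prefixed demo_/DEMO_ is hypothetical: the flag's bit position is invented, the handlers are stand-ins with the same shape as cryptodev_dp_sym_operation_done_t, and the sketch assumes a PMD without the flag starts processing at enqueue time, which the thread has not confirmed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position for the proposed flag; not taken from the patch. */
#define DEMO_FF_RAW_DP_ALLOW_CACHE (1ULL << 25)

/* Same shape as the patch's cryptodev_dp_sym_operation_done_t. */
typedef int (*demo_operation_done_t)(void *qp, uint8_t *drv_ctx, uint32_t n);

/* Stand-in for a PMD that caches enqueues: "done" commits n descriptors. */
static int demo_done_cached(void *qp, uint8_t *drv_ctx, uint32_t n)
{
	uint32_t *hw_tail = qp;

	(void)drv_ctx;
	*hw_tail += n;
	return 0;
}

/* Stand-in for a PMD without caching, as proposed: always -ENOTSUP. */
static int demo_done_unsupported(void *qp, uint8_t *drv_ctx, uint32_t n)
{
	(void)qp;
	(void)drv_ctx;
	(void)n;
	return -ENOTSUP;
}

/*
 * Application-side helper: call "done" only when the device advertises
 * the flag. Assumption: without the flag there is nothing left to commit.
 */
static int demo_commit_enqueue(uint64_t feature_flags,
	demo_operation_done_t done, void *qp, uint8_t *drv_ctx, uint32_t n)
{
	if (!(feature_flags & DEMO_FF_RAW_DP_ALLOW_CACHE))
		return 0;

	return done(qp, drv_ctx, n);
}
```

Checking the flag once at setup time keeps the per-burst cost to a single branch; an application that calls the done function unconditionally would have to treat -ENOTSUP as a non-error instead.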

> 
> > +
> > +Similar as enqueue, there are two dequeue functions:
> > +``rte_cryptodev_raw_dequeue`` for dequeing single descriptor, and
> > +``rte_cryptodev_raw_dequeue_burst`` for dequeuing a burst of
> descriptor. The
> > +dequeue functions only writes back the user data that was passed to the
> driver
> > +during inqueue, and inform the application the operation status.
> 
> during enqueue, and inform the operation status to the application.
> 
> > +Different than ``rte_cryptodev_dequeue_burst`` which the user can only
> > +set an expected dequeue count and needs to read from dequeued
> cryptodev
> > +operations' status field, the raw data-path dequeue burst function allows
> > +the user to provide callback functions to retrieve dequeue
> > +count from the enqueued user data, and write the expected status value
> to the
> > +user data on the fly.
> > +
> > +Same as enqueue, both ``rte_cryptodev_raw_dequeue`` and
> > +``rte_cryptodev_raw_dequeue_burst`` will not wipe the dequeued
> descriptors
> > +from cryptodev queue unless ``rte_cryptodev_dp_dequeue_done`` is
> called.
> > The
> > +dequeue related temporary queue data will be merged into the driver's
> private
> > +queue data in the function call.
> > +
> > +There are a few limitations to the data path service:
> > +
> > +* Only support in-place operations.
> > +* APIs are NOT thread-safe.
> > +* CANNOT mix the direct API's enqueue with
> rte_cryptodev_enqueue_burst, or
> > +  vice versa.
> > +
> > +See *DPDK API Reference* for details on each API definitions.
> > +
> >  Sample code
> >  -----------
> >
> > diff --git a/doc/guides/rel_notes/release_20_11.rst
> > b/doc/guides/rel_notes/release_20_11.rst
> > index 20ebaef5b..d3d9f82f7 100644
> > --- a/doc/guides/rel_notes/release_20_11.rst
> > +++ b/doc/guides/rel_notes/release_20_11.rst
> > @@ -55,6 +55,13 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +   * **Added raw data-path APIs for cryptodev library.**
> > +
> > +     Cryptodev is added raw data-path APIs to accelerate external libraries
> or
> > +     applications those want to avail fast cryptodev enqueue/dequeue
> > +     operations but does not necessarily depends on mbufs and cryptodev
> > +     operation mempool.
> 
> Raw crypto data-path APIs are added to accelerate external libraries or
> applications which need to perform crypto processing on raw buffers and
> are not dependent on rte_mbufs or rte_crypto_op mempools.
> 
> > +
> >
> >  Removed Items
> >  -------------
> > diff --git a/lib/librte_cryptodev/rte_crypto_sym.h
> > b/lib/librte_cryptodev/rte_crypto_sym.h
> > index 8201189e0..e1f23d303 100644
> > --- a/lib/librte_cryptodev/rte_crypto_sym.h
> > +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> > @@ -57,7 +57,7 @@ struct rte_crypto_sgl {
> >   */
> >  struct rte_crypto_va_iova_ptr {
> >  	void *va;
> > -	rte_iova_t *iova;
> > +	rte_iova_t iova;
> >  };
> 
> This should be part of 1/4 of this patchset.
> 

Sorry, I missed that. Will change.
 
> >
> >  /**
> > diff --git a/lib/librte_cryptodev/rte_cryptodev.c
> > b/lib/librte_cryptodev/rte_cryptodev.c
> > index 1dd795bcb..daeb5f504 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev.c
> > +++ b/lib/librte_cryptodev/rte_cryptodev.c
> > @@ -1914,6 +1914,110 @@
> rte_cryptodev_sym_cpu_crypto_process(uint8_t
> > dev_id,
> >  	return dev->dev_ops->sym_cpu_process(dev, sess, ofs, vec);
> >  }
> >
> > +int
> > +rte_cryptodev_raw_get_dp_context_size(uint8_t dev_id)
> 
> As suggested above, raw_dp should be used as the keyword in all the APIs.
> Hence it should be rte_cryptodev_get_raw_dp_ctx_size.

Will change

> 
> > +{
> > +	struct rte_cryptodev *dev;
> > +	int32_t size = sizeof(struct rte_crypto_raw_dp_ctx);
> > +	int32_t priv_size;
> > +
> > +	if (!rte_cryptodev_pmd_is_valid_dev(dev_id))
> > +		return -EINVAL;
> > +
> > +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> > +
> > +	if (*dev->dev_ops->get_drv_ctx_size == NULL ||
> > +		!(dev->feature_flags &
> > RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)) {
> > +		return -ENOTSUP;
> > +	}
> > +
> > +	priv_size = (*dev->dev_ops->get_drv_ctx_size)(dev);
> > +	if (priv_size < 0)
> > +		return -ENOTSUP;
> > +
> > +	return RTE_ALIGN_CEIL((size + priv_size), 8);
> > +}
> > +
> > +int
> > +rte_cryptodev_raw_configure_dp_context(uint8_t dev_id, uint16_t
> qp_id,
> > +	struct rte_crypto_raw_dp_ctx *ctx)
> 
> rte_cryptodev_configure_raw_dp_ctx
> 
> > +{
> > +	struct rte_cryptodev *dev;
> > +	union rte_cryptodev_session_ctx sess_ctx = {NULL};
> > +
> > +	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
> > +		return -EINVAL;
> > +
> > +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> > +	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
> > +			|| dev->dev_ops->configure_dp_ctx == NULL)
> > +		return -ENOTSUP;
> > +
> > +	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id,
> > +			RTE_CRYPTO_OP_WITH_SESSION, sess_ctx, ctx);
> > +}
> > +
> > +int
> > +rte_cryptodev_raw_attach_session(uint8_t dev_id, uint16_t qp_id,
> > +	struct rte_crypto_raw_dp_ctx *ctx,
> > +	enum rte_crypto_op_sess_type sess_type,
> > +	union rte_cryptodev_session_ctx session_ctx)
> > +{
> > +	struct rte_cryptodev *dev;
> > +
> > +	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
> > +		return -EINVAL;
> > +
> > +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> > +	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
> > +			|| dev->dev_ops->configure_dp_ctx == NULL)
> > +		return -ENOTSUP;
> > +	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id, sess_type,
> > +			session_ctx, ctx);
> 
> What is the difference between rte_cryptodev_raw_configure_dp_context
> and rte_cryptodev_raw_attach_session?
> If attach is needed at all, then it should be named
> rte_cryptodev_attach_raw_dp_session.
> IMO attach is not needed; its purpose is not clear to me.
>
> You are calling the same dev_ops for both: one with an explicit session
> type and the other with the type taken from an argument.

rte_cryptodev_raw_configure_dp_context creates a shadow copy of the queue
pair data within ctx, whereas rte_cryptodev_raw_attach_session sets the function
handlers based on the session data. Both use the same PMD callback function to
save one function pointer in the dev_ops. If you don't like that, I can create
two callback functions, no problem.
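
The split described above can be sketched as follows. This is a simplified illustration, not the patch's code: all demo_ names are hypothetical, the context uses a fixed-size drv_ctx_data instead of the flexible array in struct rte_crypto_raw_dp_ctx, and the "shadow copy" is reduced to a single tail counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the patch's types; demo_ names are hypothetical. */
struct demo_qp {
	uint32_t tail;
};

union demo_session_ctx {
	void *crypto_sess;
	void *xform;
	void *sec_sess;
};

struct demo_ctx {
	void *qp_data;
	int (*enqueue)(void *qp_data, uint8_t *drv_ctx);
	uint8_t drv_ctx_data[sizeof(uint32_t)]; /* flexible array in the patch */
};

/* Handler chosen at attach time; it only advances the shadow tail. */
static int demo_enqueue(void *qp_data, uint8_t *drv_ctx)
{
	uint32_t shadow_tail;

	(void)qp_data;
	memcpy(&shadow_tail, drv_ctx, sizeof(shadow_tail));
	shadow_tail++;
	memcpy(drv_ctx, &shadow_tail, sizeof(shadow_tail));
	return 0;
}

/* One driver callback serving both API calls, keyed on a NULL session. */
static int demo_configure_dp_ctx(struct demo_qp *qp,
	union demo_session_ctx session_ctx, struct demo_ctx *ctx)
{
	if (session_ctx.crypto_sess == NULL) {
		/* configure path: take a shadow copy of the queue pair data */
		ctx->qp_data = qp;
		memcpy(ctx->drv_ctx_data, &qp->tail, sizeof(qp->tail));
		return 0;
	}

	/* attach path: only select handlers based on the session */
	ctx->enqueue = demo_enqueue;
	return 0;
}

static int demo_configure(struct demo_qp *qp, struct demo_ctx *ctx)
{
	union demo_session_ctx none = { NULL };

	return demo_configure_dp_ctx(qp, none, ctx);
}

static int demo_attach(struct demo_qp *qp, struct demo_ctx *ctx, void *sess)
{
	union demo_session_ctx s = { .crypto_sess = sess };

	return demo_configure_dp_ctx(qp, s, ctx);
}
```

With two separate callbacks the NULL test would disappear from the driver, at the cost of one more pointer in the dev_ops table; the observable behaviour would be the same.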

> 
> > +}
> > +
> > +uint32_t
> > +rte_cryptodev_raw_enqueue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> > +	struct rte_crypto_sym_vec *vec, union rte_crypto_sym_ofs ofs,
> > +	void **user_data)
> > +{
> > +	if (vec->num == 1) {
> 
> Why do we need this check? I think the user is aware that for enqueuing 1 vector,
> the other API should be used. The driver will be doing the enqueue operation only
> one time.

Will remove that.

> 
> > +		vec->status[0] = rte_cryptodev_raw_enqueue(ctx, vec->sgl-
> > >vec,
> > +			vec->sgl->num, ofs, vec->iv, vec->digest, vec->aad,
> > +			user_data[0]);
> > +		return (vec->status[0] == 0) ? 1 : 0;
> > +	}
> > +
> > +	return (*ctx->enqueue_burst)(ctx->qp_data, ctx->drv_ctx_data, vec,
> > +			ofs, user_data);
> > +}
> 
> Where are rte_cryptodev_raw_enqueue and
> rte_cryptodev_raw_dequeue?

They are defined as inline functions in the header file.

> 
> 
> > +
> > +int
> > +rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
> > +		uint32_t n)
> > +{
> > +	return (*ctx->enqueue_done)(ctx->qp_data, ctx->drv_ctx_data, n);
> > +}
> > +
> > +int
> > +rte_cryptodev_raw_dequeue_done(struct rte_crypto_raw_dp_ctx *ctx,
> > +		uint32_t n)
> > +{
> > +	return (*ctx->dequeue_done)(ctx->qp_data, ctx->drv_ctx_data, n);
> > +}
> > +
> > +uint32_t
> > +rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> > +	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
> > +	rte_cryptodev_raw_post_dequeue_t post_dequeue,
> > +	void **out_user_data, uint8_t is_user_data_array,
> > +	uint32_t *n_success_jobs)
> > +{
> > +	return (*ctx->dequeue_burst)(ctx->qp_data, ctx->drv_ctx_data,
> > +		get_dequeue_count, post_dequeue, out_user_data,
> > +		is_user_data_array, n_success_jobs);
> > +}
> > +
> >  /** Initialise rte_crypto_op mempool element */
> >  static void
> >  rte_crypto_op_init(struct rte_mempool *mempool,
> > diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > b/lib/librte_cryptodev/rte_cryptodev.h
> > index 7b3ebc20f..3579ab66e 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev.h
> > +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > @@ -466,7 +466,8 @@ rte_cryptodev_asym_get_xform_enum(enum
> > rte_crypto_asym_xform_type *xform_enum,
> >  /**< Support symmetric session-less operations */
> >  #define RTE_CRYPTODEV_FF_NON_BYTE_ALIGNED_DATA		(1ULL
> > << 23)
> >  /**< Support operations on data which is not byte aligned */
> > -
> > +#define RTE_CRYPTODEV_FF_SYM_HW_RAW_DP			(1ULL
> > << 24)
> 
> RTE_CRYPTODEV_FF_SYM_RAW_DP would be better.
> 
> Add this in doc/guides/cryptodevs/features/default.ini as well in this patch.

Will change.

> 
> > +/**< Support accelerated specific raw data-path APIs */
> >
> >  /**
> >   * Get the name of a crypto device feature flag
> > @@ -1351,6 +1352,357 @@
> rte_cryptodev_sym_cpu_crypto_process(uint8_t
> > dev_id,
> >  	struct rte_cryptodev_sym_session *sess, union rte_crypto_sym_ofs
> ofs,
> >  	struct rte_crypto_sym_vec *vec);
> >
> > +/**
> > + * Get the size of the raw data-path context buffer.
> > + *
> > + * @param	dev_id		The device identifier.
> > + *
> > + * @return
> > + *   - If the device supports raw data-path APIs, return the context size.
> > + *   - If the device does not support the APIs, return -1.
> > + */
> > +__rte_experimental
> > +int
> > +rte_cryptodev_raw_get_dp_context_size(uint8_t dev_id);
> > +
> > +/**
> > + * Union of different crypto session types, including session-less xform
> > + * pointer.
> > + */
> > +union rte_cryptodev_session_ctx {
> > +	struct rte_cryptodev_sym_session *crypto_sess;
> > +	struct rte_crypto_sym_xform *xform;
> > +	struct rte_security_session *sec_sess;
> > +};
> > +
> > +/**
> > + * Enqueue a data vector into device queue but the driver will not start
> > + * processing until rte_cryptodev_raw_enqueue_done() is called.
> > + *
> > + * @param	qp		Driver specific queue pair data.
> > + * @param	drv_ctx		Driver specific context data.
> > + * @param	vec		The array of descriptor vectors.
> > + * @param	ofs		Start and stop offsets for auth and cipher
> > + *				operations.
> > + * @param	user_data	The array of user data for dequeue later.
> > + * @return
> > + *   - The number of descriptors successfully submitted.
> > + */
> > +typedef uint32_t (*cryptodev_dp_sym_enqueue_burst_t)(
> > +	void *qp, uint8_t *drv_ctx, struct rte_crypto_sym_vec *vec,
> > +	union rte_crypto_sym_ofs ofs, void *user_data[]);
> > +
> > +/**
> > + * Enqueue single descriptor into device queue but the driver will not start
> > + * processing until rte_cryptodev_raw_enqueue_done() is called.
> > + *
> > + * @param	qp		Driver specific queue pair data.
> > + * @param	drv_ctx		Driver specific context data.
> > + * @param	data_vec	The buffer data vector.
> > + * @param	n_data_vecs	Number of buffer data vectors.
> > + * @param	ofs		Start and stop offsets for auth and cipher
> > + *				operations.
> > + * @param	iv		IV virtual and IOVA addresses
> > + * @param	digest		digest virtual and IOVA addresses
> > + * @param	aad_or_auth_iv	AAD or auth IV virtual and IOVA
> addresses,
> > + *				depends on the algorithm used.
> > + * @param	user_data	The user data.
> > + * @return
> > + *   - On success return 0.
> > + *   - On failure return negative integer.
> > + */
> > +typedef int (*cryptodev_dp_sym_enqueue_t)(
> > +	void *qp, uint8_t *drv_ctx, struct rte_crypto_vec *data_vec,
> > +	uint16_t n_data_vecs, union rte_crypto_sym_ofs ofs,
> > +	struct rte_crypto_va_iova_ptr *iv,
> > +	struct rte_crypto_va_iova_ptr *digest,
> > +	struct rte_crypto_va_iova_ptr *aad_or_auth_iv,
> > +	void *user_data);
> > +
> > +/**
> > + * Inform the cryptodev queue pair to start processing or finish dequeuing
> all
> > + * enqueued/dequeued descriptors.
> > + *
> > + * @param	qp		Driver specific queue pair data.
> > + * @param	drv_ctx		Driver specific context data.
> > + * @param	n		The total number of processed descriptors.
> > + * @return
> > + *   - On success return 0.
> > + *   - On failure return negative integer.
> > + */
> > +typedef int (*cryptodev_dp_sym_operation_done_t)(void *qp, uint8_t
> *drv_ctx,
> > +	uint32_t n);
> > +
> > +/**
> > + * Typedef that the user provided for the driver to get the dequeue count.
> > + * The function may return a fixed number or the number parsed from
> the user
> > + * data stored in the first processed descriptor.
> > + *
> > + * @param	user_data	Dequeued user data.
> > + **/
> > +typedef uint32_t (*rte_cryptodev_raw_get_dequeue_count_t)(void
> > *user_data);
> > +
> > +/**
> > + * Typedef that the user provided to deal with post dequeue operation,
> such
> > + * as filling status.
> > + *
> > + * @param	user_data	Dequeued user data.
> > + * @param	index		Index number of the processed descriptor.
> > + * @param	is_op_success	Operation status provided by the driver.
> > + **/
> > +typedef void (*rte_cryptodev_raw_post_dequeue_t)(void *user_data,
> > +	uint32_t index, uint8_t is_op_success);
> > +
> > +/**
> > + * Dequeue symmetric crypto processing of user provided data.
> > + *
> > + * @param	qp			Driver specific queue pair data.
> > + * @param	drv_ctx			Driver specific context data.
> > + * @param	get_dequeue_count	User provided callback function to
> > + *					obtain dequeue count.
> > + * @param	post_dequeue		User provided callback function to
> > + *					post-process a dequeued operation.
> > + * @param	out_user_data		User data pointer array to be retrieve
> > + *					from device queue. In case of
> > + *					*is_user_data_array* is set there
> > + *					should be enough room to store all
> > + *					user data.
> > + * @param	is_user_data_array	Set 1 if every dequeued user data will
> > + *					be written into out_user_data* array.
> > + * @param	n_success		Driver written value to specific the
> > + *					total successful operations count.
> > + *
> > + * @return
> > + *  - Returns number of dequeued packets.
> > + */
> > +typedef uint32_t (*cryptodev_dp_sym_dequeue_burst_t)(void *qp,
> > +	uint8_t *drv_ctx,
> > +	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
> > +	rte_cryptodev_raw_post_dequeue_t post_dequeue,
> > +	void **out_user_data, uint8_t is_user_data_array,
> > +	uint32_t *n_success);
> > +
> > +/**
> > + * Dequeue symmetric crypto processing of user provided data.
> > + *
> > + * @param	qp			Driver specific queue pair data.
> > + * @param	drv_ctx			Driver specific context data.
> > + * @param	out_user_data		User data pointer to be retrieve from
> > + *					device queue.
> > + *
> > + * @return
> > + *   - 1 if the user_data is dequeued and the operation is a success.
> > + *   - 0 if the user_data is dequeued but the operation is failed.
> > + *   - -1 if no operation is dequeued.
> > + */
> > +typedef int (*cryptodev_dp_sym_dequeue_t)(
> > +		void *qp, uint8_t *drv_ctx, void **out_user_data);
> > +
> > +/**
> > + * Context data for raw data-path API crypto process. The buffer of this
> > + * structure is to be allocated by the user application with the size equal
> > + * or bigger than rte_cryptodev_raw_get_dp_context_size() returned
> value.
> > + *
> > + * NOTE: the buffer is to be used and maintained by the cryptodev driver,
> the
> > + * user should NOT alter the buffer content to avoid application or system
> > + * crash.
> > + */
> > +struct rte_crypto_raw_dp_ctx {
> > +	void *qp_data;
> > +
> > +	cryptodev_dp_sym_enqueue_t enqueue;
> > +	cryptodev_dp_sym_enqueue_burst_t enqueue_burst;
> > +	cryptodev_dp_sym_operation_done_t enqueue_done;
> > +	cryptodev_dp_sym_dequeue_t dequeue;
> > +	cryptodev_dp_sym_dequeue_burst_t dequeue_burst;
> > +	cryptodev_dp_sym_operation_done_t dequeue_done;
> 
> These function pointers are data-path only. Why do we need to add an explicit
> dp in each one of them?
> These should be cryptodev_sym_raw_**
> 

Good idea.

> > +
> > +	/* Driver specific context data */
> > +	__extension__ uint8_t drv_ctx_data[];
> > +};
> > +
> > +/**
> > + * Configure raw data-path context data.
> > + *
> > + * NOTE:
> > + * After the context data is configured, the user should call
> > + * rte_cryptodev_raw_attach_session() before using it in
> > + * rte_cryptodev_raw_enqueue/dequeue function call.
> 
> The purpose of the attach API is not clear to me. It looks like an overhead.
> 
> > + *
> > + * @param	dev_id		The device identifier.
> > + * @param	qp_id		The index of the queue pair from which to
> > + *				retrieve processed packets. The value must
> be
> > + *				in the range [0, nb_queue_pair - 1]
> previously
> > + *				supplied to rte_cryptodev_configure().
> > + * @param	ctx		The raw data-path context data.
> > + * @return
> > + *   - On success return 0.
> > + *   - On failure return negative integer.
> > + */
> > +__rte_experimental
> > +int
> > +rte_cryptodev_raw_configure_dp_context(uint8_t dev_id, uint16_t
> qp_id,
> > +	struct rte_crypto_raw_dp_ctx *ctx);
> > +
> > +/**
> > + * Attach a cryptodev session to a initialized raw data path context.
> > + *
> > + * @param	dev_id		The device identifier.
> > + * @param	qp_id		The index of the queue pair from which to
> > + *				retrieve processed packets. The value must
> be
> > + *				in the range [0, nb_queue_pair - 1]
> previously
> > + *				supplied to rte_cryptodev_configure().
> > + * @param	ctx		The raw data-path context data.
> > + * @param	sess_type	session type.
> > + * @param	session_ctx	Session context data.
> > + * @return
> > + *   - On success return 0.
> > + *   - On failure return negative integer.
> > + */
> > +__rte_experimental
> > +int
> > +rte_cryptodev_raw_attach_session(uint8_t dev_id, uint16_t qp_id,
> > +	struct rte_crypto_raw_dp_ctx *ctx,
> > +	enum rte_crypto_op_sess_type sess_type,
> > +	union rte_cryptodev_session_ctx session_ctx);
> > +
> > +/**
> > + * Enqueue single raw data-path descriptor.
> > + *
> > + * The enqueued descriptor will not be started processing until
> > + * rte_cryptodev_raw_enqueue_done() is called.
> > + *
> > + * @param	ctx		The initialized raw data-path context data.
> > + * @param	data_vec	The buffer vector.
> > + * @param	n_data_vecs	Number of buffer vectors.
> > + * @param	ofs		Start and stop offsets for auth and cipher
> > + *				operations.
> > + * @param	iv		IV virtual and IOVA addresses
> > + * @param	digest		digest virtual and IOVA addresses
> > + * @param	aad_or_auth_iv	AAD or auth IV virtual and IOVA
> addresses,
> > + *				depends on the algorithm used.
> > + * @param	user_data	The user data.
> > + * @return
> > + *   - The number of descriptors successfully enqueued.
> > + */
> > +__rte_experimental
> > +static __rte_always_inline int
> > +rte_cryptodev_raw_enqueue(struct rte_crypto_raw_dp_ctx *ctx,
> > +	struct rte_crypto_vec *data_vec, uint16_t n_data_vecs,
> > +	union rte_crypto_sym_ofs ofs,
> > +	struct rte_crypto_va_iova_ptr *iv,
> > +	struct rte_crypto_va_iova_ptr *digest,
> > +	struct rte_crypto_va_iova_ptr *aad_or_auth_iv,
> > +	void *user_data)
> > +{
> > +	return (*ctx->enqueue)(ctx->qp_data, ctx->drv_ctx_data, data_vec,
> > +		n_data_vecs, ofs, iv, digest, aad_or_auth_iv, user_data);
> > +}
> > +
> > +/**
> > + * Enqueue a data vector of raw data-path descriptors.
> > + *
> > + * The enqueued descriptors will not be started processing until
> > + * rte_cryptodev_raw_enqueue_done() is called.
> > + *
> > + * @param	ctx		The initialized raw data-path context data.
> > + * @param	vec		The array of descriptor vectors.
> > + * @param	ofs		Start and stop offsets for auth and cipher
> > + *				operations.
> > + * @param	user_data	The array of opaque data for dequeue.
> > + * @return
> > + *   - The number of descriptors successfully enqueued.
> > + */
> > +__rte_experimental
> > +uint32_t
> > +rte_cryptodev_raw_enqueue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> > +	struct rte_crypto_sym_vec *vec, union rte_crypto_sym_ofs ofs,
> > +	void **user_data);
> > +
> > +/**
> > + * Start processing all enqueued descriptors from last
> > + * rte_cryptodev_raw_configure_dp_context() call.
> > + *
> > + * @param	ctx	The initialized raw data-path context data.
> > + * @param	n	The total number of submitted descriptors.
> > + */
> > +__rte_experimental
> > +int
> > +rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
> > +		uint32_t n);
> > +
> > +/**
> > + * Dequeue a burst of raw crypto data-path operations and write the
> previously
> > + * enqueued user data into the array provided.
> > + *
> > + * The dequeued operations, including the user data stored, will not be
> > + * wiped out from the device queue until
> rte_cryptodev_raw_dequeue_done()
> > is
> > + * called.
> > + *
> > + * @param	ctx			The initialized raw data-path context
> > + *					data.
> > + * @param	get_dequeue_count	User provided callback function to
> > + *					obtain dequeue count.
> > + * @param	post_dequeue		User provided callback function to
> > + *					post-process a dequeued operation.
> > + * @param	out_user_data		User data pointer array to be retrieve
> > + *					from device queue. In case of
> > + *					*is_user_data_array* is set there
> > + *					should be enough room to store all
> > + *					user data.
> > + * @param	is_user_data_array	Set 1 if every dequeued user data will
> > + *					be written into *out_user_data*
> array.
> > + * @param	n_success		Driver written value to specific the
> > + *					total successful operations count.
> 
> // to specify the
> 
> > + *
> > + * @return
> > + *   - Returns number of dequeued packets.
> > + */
> > +__rte_experimental
> > +uint32_t
> > +rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
> > +	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
> > +	rte_cryptodev_raw_post_dequeue_t post_dequeue,
> > +	void **out_user_data, uint8_t is_user_data_array,
> > +	uint32_t *n_success);
> > +
> > +/**
> > + * Dequeue a raw crypto data-path operation and write the previously
> > + * enqueued user data.
> > + *
> > + * The dequeued operation, including the user data stored, will not be
> wiped
> > + * out from the device queue until rte_cryptodev_raw_dequeue_done()
> is
> > called.
> > + *
> > + * @param	ctx			The initialized raw data-path context
> > + *					data.
> > + * @param	out_user_data		User data pointer to be retrieve from
> > + *					device queue. The driver shall
> support
> > + *					NULL input of this parameter.
> > + *
> > + * @return
> > + *   - 1 if the user data is dequeued and the operation is a success.
> > + *   - 0 if the user data is dequeued but the operation is failed.
> > + *   - -1 if no operation is ready to be dequeued.
> > + */
> > +__rte_experimental
> > +static __rte_always_inline int
> 
> Why is this function specifically inline and not others?

Single-op enqueue and dequeue help ease the double-looping performance
degradation explained in the prog_guide, but they may have to be inline
because they are called once for each packet. Bulk enqueue/dequeue and
enqueue/dequeue_done are called only once per burst, so I don't think
we need to inline them.
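
A minimal sketch of the per-packet path described above, assuming a simplified context with a single function pointer. All names here (mock_ctx, mock_raw_enqueue, app_job, mock_drv_enqueue, app_enqueue_jobs) are hypothetical stand-ins for illustration and are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_crypto_raw_dp_ctx. */
struct mock_ctx {
	int (*enqueue)(void *qp, void *user_data);
	void *qp;
};

/* Per-packet wrapper: called once per job, so it is a candidate for inlining. */
static inline int
mock_raw_enqueue(struct mock_ctx *ctx, void *user_data)
{
	return (*ctx->enqueue)(ctx->qp, user_data);
}

/* Hypothetical application job type, unrelated to rte_crypto_sym_vec. */
struct app_job {
	uint32_t id;
};

/* Toy driver handler: only counts the descriptors it was asked to write. */
static int
mock_drv_enqueue(void *qp, void *user_data)
{
	(void)user_data;
	*(uint32_t *)qp += 1;
	return 0;
}

/* Single loop: each job is translated and enqueued in the same pass,
 * without first assembling an intermediate vector. */
static uint32_t
app_enqueue_jobs(struct mock_ctx *ctx, struct app_job *jobs, uint32_t n)
{
	uint32_t i;

	for (i = 0; i < n; i++)
		if (mock_raw_enqueue(ctx, &jobs[i]) < 0)
			break;
	return i;
}
```

The wrapper costs one call per job, which is why inlining matters there, while a burst function amortizes its call over the whole burst.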

> 
> > +rte_cryptodev_raw_dequeue(struct rte_crypto_raw_dp_ctx *ctx,
> > +	void **out_user_data)
> > +{
> > +	return (*ctx->dequeue)(ctx->qp_data, ctx->drv_ctx_data,
> > out_user_data);
> > +}
> > +
> > +/**
> > + * Inform the queue pair dequeue operations finished.
> > + *
> > + * @param	ctx	The initialized raw data-path context data.
> > + * @param	n	The total number of jobs already dequeued.
> > + */
> > +__rte_experimental
> > +int
> > +rte_cryptodev_raw_dequeue_done(struct rte_crypto_raw_dp_ctx *ctx,
> > +		uint32_t n);
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > index 81975d72b..69a2a6d64 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > @@ -316,6 +316,40 @@ typedef uint32_t
> > (*cryptodev_sym_cpu_crypto_process_t)
> >  	(struct rte_cryptodev *dev, struct rte_cryptodev_sym_session *sess,
> >  	union rte_crypto_sym_ofs ofs, struct rte_crypto_sym_vec *vec);
> >
> > +/**
> > + * Typedef that the driver provided to get service context private date
> size.
> > + *
> > + * @param	dev	Crypto device pointer.
> > + *
> > + * @return
> > + *   - On success return the size of the device's service context private data.
> > + *   - On failure return negative integer.
> > + */
> > +typedef int (*cryptodev_dp_get_service_ctx_size_t)(
> > +	struct rte_cryptodev *dev);
> > +
> > +/**
> > + * Typedef that the driver provided to configure data-path context.
> > + *
> > + * @param	dev		Crypto device pointer.
> > + * @param	qp_id		Crypto device queue pair index.
> > + * @param	service_type	Type of the service requested.
> > + * @param	sess_type	session type.
> > + * @param	session_ctx	Session context data. If NULL the driver
> > + *				shall only configure the drv_ctx_data in
> > + *				ctx buffer. Otherwise the driver shall only
> > + *				parse the session_ctx to set appropriate
> > + *				function pointers in ctx.
> > + * @param	ctx		The raw data-path context data.
> > + * @return
> > + *   - On success return 0.
> > + *   - On failure return negative integer.
> > + */
> > +typedef int (*cryptodev_dp_configure_ctx_t)(
> > +	struct rte_cryptodev *dev, uint16_t qp_id,
> > +	enum rte_crypto_op_sess_type sess_type,
> > +	union rte_cryptodev_session_ctx session_ctx,
> > +	struct rte_crypto_raw_dp_ctx *ctx);
> 
> These typedefs names are not matching with the corresponding API names.
> Can you fix it for all of them?

Will do.

> 
> >
> >  /** Crypto device operations function pointer table */
> >  struct rte_cryptodev_ops {
> > @@ -348,8 +382,17 @@ struct rte_cryptodev_ops {
> >  	/**< Clear a Crypto sessions private data. */
> >  	cryptodev_asym_free_session_t asym_session_clear;
> >  	/**< Clear a Crypto sessions private data. */
> > -	cryptodev_sym_cpu_crypto_process_t sym_cpu_process;
> > -	/**< process input data synchronously (cpu-crypto). */
> > +	union {
> > +		cryptodev_sym_cpu_crypto_process_t sym_cpu_process;
> > +		/**< process input data synchronously (cpu-crypto). */
> > +		__extension__
> > +		struct {
> > +			cryptodev_dp_get_service_ctx_size_t
> get_drv_ctx_size;
> > +			/**< Get data path service context data size. */
> > +			cryptodev_dp_configure_ctx_t configure_dp_ctx;
> > +			/**< Initialize crypto service ctx data. */
> > +		};
> > +	};
> >  };
> >
> >
> > diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map
> > b/lib/librte_cryptodev/rte_cryptodev_version.map
> > index 02f6dcf72..bc4cd1ea5 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev_version.map
> > +++ b/lib/librte_cryptodev/rte_cryptodev_version.map
> > @@ -105,4 +105,15 @@ EXPERIMENTAL {
> >
> >  	# added in 20.08
> >  	rte_cryptodev_get_qp_status;
> > +
> > +	# added in 20.11
> > +	rte_cryptodev_raw_attach_session;
> > +	rte_cryptodev_raw_configure_dp_context;
> > +	rte_cryptodev_raw_get_dp_context_size;
> > +	rte_cryptodev_raw_dequeue;
> > +	rte_cryptodev_raw_dequeue_burst;
> > +	rte_cryptodev_raw_dequeue_done;
> > +	rte_cryptodev_raw_enqueue;
> > +	rte_cryptodev_raw_enqueue_burst;
> > +	rte_cryptodev_raw_enqueue_done;
> >  };
> > --
> > 2.20.1
  
Akhil Goyal Oct. 8, 2020, 4:07 p.m. UTC | #5
> > > +During the enqueue, the cryptodev driver only sets the enqueued
> > descriptors
> > > +into the device queue but not initiates the device to start processing them.
> > > +The temporary queue pair data changes in relation to the enqueued
> > descriptors
> > > +may be recorded in the ``struct rte_crypto_raw_dp_ctx`` buffer as the
> > > reference
> > > +to the next enqueue function call. When
> > ``rte_cryptodev_raw_enqueue_done``
> > > is
> > > +called, the driver will initiate the processing of all enqueued descriptors
> > and
> > > +merge the temporary queue pair data changes into the driver's private
> > queue
> > > +pair data. Calling ``rte_cryptodev_raw_configure_dp_context`` twice
> > without
> > > +``rte_cryptodev_dp_enqueue_done`` call in between will invalidate the
> > > temporary
> > > +data stored in ``struct rte_crypto_raw_dp_ctx`` buffer. This feature is
> > useful
> > > +when the user wants to abandon partially enqueued data for a failed
> > enqueue
> > > +burst operation and try enqueuing in a whole later.
> >
> > This feature may not be supported by all the HW PMDs, Can there be a way
> > to bypass
> > this done API?
> 
> We can add another feature flag
> "RTE_CRYPTODEV_FF_SYM_HW_RAW_DP_ALLOW_CACHE". The PMDs who
> do not support this feature can simply return "- ENOTSUP" when calling
> enqueue_done and dequeue_done function. What do you think?

Can the enqueue/dequeue API return a flag that decides whether
the done API needs to be called or not?
Returning ENOTSUP will break the execution.
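
One possible shape for such a flag, as a rough sketch only. The status values, the mock_pmd structure and every function below are hypothetical and do not come from the patch; the point is just that the caller branches on a status written by enqueue instead of handling an error from the done call:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status written by the enqueue call. */
enum mock_enq_status {
	MOCK_ENQ_STARTED = 0,	/* device already processing, no done call needed */
	MOCK_ENQ_CACHED = 1	/* descriptors cached, done call required */
};

/* Hypothetical PMD state: counts cached and started descriptors. */
struct mock_pmd {
	int supports_cache;
	uint32_t cached;
	uint32_t started;
};

static uint32_t
mock_enqueue_burst(struct mock_pmd *pmd, uint32_t n, int *status)
{
	if (pmd->supports_cache) {
		pmd->cached += n;
		*status = MOCK_ENQ_CACHED;
	} else {
		pmd->started += n;
		*status = MOCK_ENQ_STARTED;
	}
	return n;
}

static void
mock_enqueue_done(struct mock_pmd *pmd, uint32_t n)
{
	assert(n <= pmd->cached);
	pmd->cached -= n;
	pmd->started += n;
}

/* The application calls the done function only when told to. */
static uint32_t
app_submit(struct mock_pmd *pmd, uint32_t n)
{
	int status;
	uint32_t enq = mock_enqueue_burst(pmd, n, &status);

	if (status == MOCK_ENQ_CACHED)
		mock_enqueue_done(pmd, enq);
	return enq;
}
```

With this shape a PMD that cannot cache never sees the done call at all, so no error path is involved.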


> > > +int
> > > +rte_cryptodev_raw_configure_dp_context(uint8_t dev_id, uint16_t
> > qp_id,
> > > +	struct rte_crypto_raw_dp_ctx *ctx)
> >
> > rte_cryptodev_configure_raw_dp_ctx
> >
> > > +{
> > > +	struct rte_cryptodev *dev;
> > > +	union rte_cryptodev_session_ctx sess_ctx = {NULL};
> > > +
> > > +	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
> > > +		return -EINVAL;
> > > +
> > > +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> > > +	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
> > > +			|| dev->dev_ops->configure_dp_ctx == NULL)
> > > +		return -ENOTSUP;
> > > +
> > > +	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id,
> > > +			RTE_CRYPTO_OP_WITH_SESSION, sess_ctx, ctx);
> > > +}
> > > +
> > > +int
> > > +rte_cryptodev_raw_attach_session(uint8_t dev_id, uint16_t qp_id,
> > > +	struct rte_crypto_raw_dp_ctx *ctx,
> > > +	enum rte_crypto_op_sess_type sess_type,
> > > +	union rte_cryptodev_session_ctx session_ctx)
> > > +{
> > > +	struct rte_cryptodev *dev;
> > > +
> > > +	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
> > > +		return -EINVAL;
> > > +
> > > +	dev = rte_cryptodev_pmd_get_dev(dev_id);
> > > +	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
> > > +			|| dev->dev_ops->configure_dp_ctx == NULL)
> > > +		return -ENOTSUP;
> > > +	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id, sess_type,
> > > +			session_ctx, ctx);
> >
> > What is the difference between rte_cryptodev_raw_configure_dp_context
> > and
> > rte_cryptodev_raw_attach_session?
> > And if at all it is needed, then it should be
> > rte_cryptodev_attach_raw_dp_session.
> > IMO attach is not needed, I am not clear about it.
> >
> > You are calling the same dev_ops for both - one with explicit session time
> > and other
> > From an argument.
> 
> rte_cryptodev_raw_configure_dp_context creates a shadow copy of the queue
> pair data with in ctx, where rte_cryptodev_raw_attach_session sets the function
> handler based on the session data. Using of the same PMD callback function is
> to
> save one function pointer stored in the dev_ops. If you don't like it I can create
> 2 callback functions no problem.

I don't like the idea of having 2 APIs.

Why do you need to create a shadow copy of the queue data? Why can't it be
done in the attach API by the driver? In v9 it was doing that, why was it changed?
  
Fan Zhang Oct. 8, 2020, 4:24 p.m. UTC | #6
Hi, Good idea. Will do.

  
Fan Zhang Oct. 9, 2020, 8:32 a.m. UTC | #7
Hi Akhil,

> > rte_cryptodev_raw_configure_dp_context creates a shadow copy of the
> queue
> > pair data with in ctx, where rte_cryptodev_raw_attach_session sets the
> function
> > handler based on the session data. Using of the same PMD callback
> function is
> > to
> > save one function pointer stored in the dev_ops. If you don't like it I can
> create
> > 2 callback functions no problem.
> 
> I don't like the idea of having 2 APIs.
> 
> Why do you need to create a shadow copy of the queue data? Why it can't
> be
> Done in the attach API by the driver? In v9 it was doing that, why is it changed?
> 

The reason for creating the shadow copy is enqueue_done and dequeue_done.
As explained, if an external application uses a data structure similar to
rte_crypto_sym_vec and expects either all ops or no ops to be enqueued/dequeued,
it is impossible to do so with rte_cryptodev_enqueue/dequeue_burst. The
local queue pair shadow copy temporarily caches what is already pushed
into the HW queue while the driver has yet to issue the "start processing"
command to the device. Once the application finds out that not all ops can be
enqueued or dequeued, the temporary shadow copy can be reset by calling
rte_cryptodev_raw_configure_dp_context again.

In v9 rte_cryptodev_raw_configure_dp_context had another job: to write the
function pointers to ctx. So if we are to use the same ctx for AES-CBC and
AES-GCM without erasing the shadow copy data, we need the "is_update" flag
to let the driver know that it should not erase the queue pair shadow data
and should update the function pointers only. As you suggested in v9,
"is_update" is not needed. To avoid using it I used 2 APIs instead: one for
initializing the queue pair shadow copy, and one for writing the function
pointers by parsing the session.
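
The all-or-nothing behaviour can be sketched with a toy ring. This is only an illustration under assumptions: mock_qp, mock_dp_ctx and the mock_* functions are hypothetical stand-ins for the driver's private queue pair data, the ctx shadow data, and the configure/enqueue/enqueue_done calls, and do not reflect any real PMD:

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_RING_SIZE 8u

/* Hypothetical driver-private queue pair data. */
struct mock_qp {
	uint32_t head;			/* next slot the device will complete */
	uint32_t tail;			/* committed tail, visible to the device */
	uint32_t ring[MOCK_RING_SIZE];
};

/* Hypothetical shadow data kept in the raw data-path context. */
struct mock_dp_ctx {
	struct mock_qp *qp;
	uint32_t shadow_tail;		/* tail including uncommitted descriptors */
};

/* (Re)configure: the shadow copy is reset to the committed state. */
static void
mock_configure_ctx(struct mock_dp_ctx *ctx, struct mock_qp *qp)
{
	ctx->qp = qp;
	ctx->shadow_tail = qp->tail;
}

/* Write one descriptor into the ring, advancing only the shadow tail. */
static int
mock_enqueue(struct mock_dp_ctx *ctx, uint32_t desc)
{
	if (ctx->shadow_tail - ctx->qp->head >= MOCK_RING_SIZE)
		return -1;		/* ring full */
	ctx->qp->ring[ctx->shadow_tail % MOCK_RING_SIZE] = desc;
	ctx->shadow_tail++;
	return 0;
}

/* Commit n cached descriptors: merge the shadow state into the queue pair. */
static int
mock_enqueue_done(struct mock_dp_ctx *ctx, uint32_t n)
{
	assert(ctx->shadow_tail - ctx->qp->tail >= n);
	ctx->qp->tail += n;
	return 0;
}

/* Enqueue the whole burst or nothing at all. */
static uint32_t
mock_enqueue_all_or_nothing(struct mock_dp_ctx *ctx, const uint32_t *descs,
	uint32_t n)
{
	uint32_t i;

	for (i = 0; i < n; i++) {
		if (mock_enqueue(ctx, descs[i]) < 0) {
			/* Abandon the partial burst by resetting the shadow copy. */
			mock_configure_ctx(ctx, ctx->qp);
			return 0;
		}
	}
	mock_enqueue_done(ctx, n);
	return n;
}
```

In this sketch the committed tail moves only in mock_enqueue_done, so a failed burst leaves the device-visible state untouched.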

Regards,
Fan
  

Patch

diff --git a/doc/guides/prog_guide/cryptodev_lib.rst b/doc/guides/prog_guide/cryptodev_lib.rst
index e7ba35c2d..5fe6c3c24 100644
--- a/doc/guides/prog_guide/cryptodev_lib.rst
+++ b/doc/guides/prog_guide/cryptodev_lib.rst
@@ -632,6 +632,99 @@  a call argument. Status different than zero must be treated as error.
 For more details, e.g. how to convert an mbuf to an SGL, please refer to an
 example usage in the IPsec library implementation.
 
+Cryptodev Raw Data-path APIs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The Crypto Raw data-path APIs are a set of APIs designed to enable
+external libraries/applications to leverage the cryptographic
+processing provided by DPDK crypto PMDs through the cryptodev API but in a
+manner that is not dependent on native DPDK data structures (e.g. rte_mbuf,
+rte_crypto_op, etc.) in their data-path implementation.
+
+The raw data-path APIs have the following advantages:
+- External data structure friendly design. The new APIs use the operation
+  descriptor ``struct rte_crypto_sym_vec`` that supports raw data pointers and
+  IOVA addresses as input. Moreover, the APIs do not require the user to
+  allocate the descriptor from a mempool, nor do they require mbufs to describe
+  input data's virtual and IOVA addresses. All these features make translation
+  from user data structures into the descriptor easier and more efficient.
+- Flexible enqueue and dequeue operation. The raw data-path APIs give the
+  user more control over the enqueue and dequeue operations, including the
+  capability of a precise enqueue/dequeue count, abandoning enqueue or dequeue
+  at any time, and translating and setting the operation status on the fly.
+
+Cryptodev PMDs that support the raw data-path APIs will have the
+``RTE_CRYPTODEV_FF_SYM_HW_RAW_DP`` feature flag set. To use this
+feature, the user should create a local ``struct rte_crypto_raw_dp_ctx``
+buffer of at least the length returned by the
+``rte_cryptodev_raw_get_dp_context_size`` function call. The created buffer
+is then configured using ``rte_cryptodev_raw_configure_dp_context`` function.
+The library and the crypto device driver will then configure the buffer and
+write necessary temporary data into the buffer for later enqueue and dequeue
+operations. The temporary data may be treated as the shadow copy of the
+driver's private queue pair data.
+
+After the ``struct rte_crypto_raw_dp_ctx`` buffer is initialized, it is then
+attached to either the cryptodev sym session, the rte_security session, or the
+cryptodev xform for session-less operation by the
+``rte_cryptodev_raw_attach_session`` function. With the session or xform
+information the driver will set the corresponding enqueue and dequeue function
+handlers in the ``struct rte_crypto_raw_dp_ctx`` buffer.
+
+After the session is attached, the ``struct rte_crypto_raw_dp_ctx`` buffer is
+ready for enqueue and dequeue operations. There are two different enqueue
+functions: ``rte_cryptodev_raw_enqueue`` to enqueue a single descriptor,
+and ``rte_cryptodev_raw_enqueue_burst`` to enqueue multiple descriptors.
+If the application uses an approach similar to
+``struct rte_crypto_sym_vec`` to manage its data burst but with a different
+data structure, using the ``rte_cryptodev_raw_enqueue_burst`` function may be
+less efficient, as the application has to loop over
+all crypto descriptors to assemble the ``struct rte_crypto_sym_vec`` buffer
+from its own data structure, and then the driver will loop over them again to
+translate every crypto job to the driver's specific queue data. In that case
+``rte_cryptodev_raw_enqueue`` should be used instead to save one loop for
+each data burst.
+
+During the enqueue, the cryptodev driver only sets the enqueued descriptors
+into the device queue but does not trigger the device to process them.
+The temporary queue pair data changes in relation to the enqueued descriptors
+may be recorded in the ``struct rte_crypto_raw_dp_ctx`` buffer as the reference
+for the next enqueue function call. When ``rte_cryptodev_raw_enqueue_done`` is
+called, the driver will initiate the processing of all enqueued descriptors and
+merge the temporary queue pair data changes into the driver's private queue
+pair data. Calling ``rte_cryptodev_raw_configure_dp_context`` twice without a
+``rte_cryptodev_raw_enqueue_done`` call in between will invalidate the temporary
+data stored in ``struct rte_crypto_raw_dp_ctx`` buffer. This feature is useful
+when the user wants to abandon partially enqueued data for a failed enqueue
+burst operation and retry enqueuing the whole burst later.
+
+Similar to enqueue, there are two dequeue functions:
+``rte_cryptodev_raw_dequeue`` for dequeuing a single descriptor, and
+``rte_cryptodev_raw_dequeue_burst`` for dequeuing a burst of descriptors. The
+dequeue functions only write back the user data that was passed to the driver
+during enqueue, and inform the application of the operation status.
+Unlike ``rte_cryptodev_dequeue_burst``, where the user can only
+set an expected dequeue count and needs to read the dequeued cryptodev
+operations' status field, the raw data-path dequeue burst function allows
+the user to provide callback functions to retrieve the dequeue
+count from the enqueued user data, and write the expected status value to the
+user data on the fly.
+
+Same as enqueue, both ``rte_cryptodev_raw_dequeue`` and
+``rte_cryptodev_raw_dequeue_burst`` will not wipe the dequeued descriptors
+from cryptodev queue unless ``rte_cryptodev_raw_dequeue_done`` is called. The
+dequeue related temporary queue data will be merged into the driver's private
+queue data in that function call.
+
+There are a few limitations to the raw data-path APIs:
+
+* Only in-place operations are supported.
+* APIs are NOT thread-safe.
+* CANNOT mix the raw API's enqueue with rte_cryptodev_enqueue_burst, or
+  vice versa.
+
+See *DPDK API Reference* for details on each API definition.
+
 Sample code
 -----------
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 20ebaef5b..d3d9f82f7 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,13 @@  New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+   * **Added raw data-path APIs for cryptodev library.**
+
+     Added raw data-path APIs to cryptodev to accelerate external libraries or
+     applications that need fast cryptodev enqueue/dequeue
+     operations but do not necessarily depend on mbufs and the cryptodev
+     operation mempool.
+
 
 Removed Items
 -------------
diff --git a/lib/librte_cryptodev/rte_crypto_sym.h b/lib/librte_cryptodev/rte_crypto_sym.h
index 8201189e0..e1f23d303 100644
--- a/lib/librte_cryptodev/rte_crypto_sym.h
+++ b/lib/librte_cryptodev/rte_crypto_sym.h
@@ -57,7 +57,7 @@  struct rte_crypto_sgl {
  */
 struct rte_crypto_va_iova_ptr {
 	void *va;
-	rte_iova_t *iova;
+	rte_iova_t iova;
 };
 
 /**
diff --git a/lib/librte_cryptodev/rte_cryptodev.c b/lib/librte_cryptodev/rte_cryptodev.c
index 1dd795bcb..daeb5f504 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -1914,6 +1914,110 @@  rte_cryptodev_sym_cpu_crypto_process(uint8_t dev_id,
 	return dev->dev_ops->sym_cpu_process(dev, sess, ofs, vec);
 }
 
+int
+rte_cryptodev_raw_get_dp_context_size(uint8_t dev_id)
+{
+	struct rte_cryptodev *dev;
+	int32_t size = sizeof(struct rte_crypto_raw_dp_ctx);
+	int32_t priv_size;
+
+	if (!rte_cryptodev_pmd_is_valid_dev(dev_id))
+		return -EINVAL;
+
+	dev = rte_cryptodev_pmd_get_dev(dev_id);
+
+	if (*dev->dev_ops->get_drv_ctx_size == NULL ||
+		!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)) {
+		return -ENOTSUP;
+	}
+
+	priv_size = (*dev->dev_ops->get_drv_ctx_size)(dev);
+	if (priv_size < 0)
+		return -ENOTSUP;
+
+	return RTE_ALIGN_CEIL((size + priv_size), 8);
+}
+
+int
+rte_cryptodev_raw_configure_dp_context(uint8_t dev_id, uint16_t qp_id,
+	struct rte_crypto_raw_dp_ctx *ctx)
+{
+	struct rte_cryptodev *dev;
+	union rte_cryptodev_session_ctx sess_ctx = {NULL};
+
+	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
+		return -EINVAL;
+
+	dev = rte_cryptodev_pmd_get_dev(dev_id);
+	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
+			|| dev->dev_ops->configure_dp_ctx == NULL)
+		return -ENOTSUP;
+
+	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id,
+			RTE_CRYPTO_OP_WITH_SESSION, sess_ctx, ctx);
+}
+
+int
+rte_cryptodev_raw_attach_session(uint8_t dev_id, uint16_t qp_id,
+	struct rte_crypto_raw_dp_ctx *ctx,
+	enum rte_crypto_op_sess_type sess_type,
+	union rte_cryptodev_session_ctx session_ctx)
+{
+	struct rte_cryptodev *dev;
+
+	if (!rte_cryptodev_get_qp_status(dev_id, qp_id))
+		return -EINVAL;
+
+	dev = rte_cryptodev_pmd_get_dev(dev_id);
+	if (!(dev->feature_flags & RTE_CRYPTODEV_FF_SYM_HW_RAW_DP)
+			|| dev->dev_ops->configure_dp_ctx == NULL)
+		return -ENOTSUP;
+	return (*dev->dev_ops->configure_dp_ctx)(dev, qp_id, sess_type,
+			session_ctx, ctx);
+}
+
+uint32_t
+rte_cryptodev_raw_enqueue_burst(struct rte_crypto_raw_dp_ctx *ctx,
+	struct rte_crypto_sym_vec *vec, union rte_crypto_sym_ofs ofs,
+	void **user_data)
+{
+	if (vec->num == 1) {
+		vec->status[0] = rte_cryptodev_raw_enqueue(ctx, vec->sgl->vec,
+			vec->sgl->num, ofs, vec->iv, vec->digest, vec->aad,
+			user_data[0]);
+		return (vec->status[0] == 0) ? 1 : 0;
+	}
+
+	return (*ctx->enqueue_burst)(ctx->qp_data, ctx->drv_ctx_data, vec,
+			ofs, user_data);
+}
+
+int
+rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
+		uint32_t n)
+{
+	return (*ctx->enqueue_done)(ctx->qp_data, ctx->drv_ctx_data, n);
+}
+
+int
+rte_cryptodev_raw_dequeue_done(struct rte_crypto_raw_dp_ctx *ctx,
+		uint32_t n)
+{
+	return (*ctx->dequeue_done)(ctx->qp_data, ctx->drv_ctx_data, n);
+}
+
+uint32_t
+rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
+	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
+	rte_cryptodev_raw_post_dequeue_t post_dequeue,
+	void **out_user_data, uint8_t is_user_data_array,
+	uint32_t *n_success_jobs)
+{
+	return (*ctx->dequeue_burst)(ctx->qp_data, ctx->drv_ctx_data,
+		get_dequeue_count, post_dequeue, out_user_data,
+		is_user_data_array, n_success_jobs);
+}
+
 /** Initialise rte_crypto_op mempool element */
 static void
 rte_crypto_op_init(struct rte_mempool *mempool,
diff --git a/lib/librte_cryptodev/rte_cryptodev.h b/lib/librte_cryptodev/rte_cryptodev.h
index 7b3ebc20f..3579ab66e 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -466,7 +466,8 @@  rte_cryptodev_asym_get_xform_enum(enum rte_crypto_asym_xform_type *xform_enum,
 /**< Support symmetric session-less operations */
 #define RTE_CRYPTODEV_FF_NON_BYTE_ALIGNED_DATA		(1ULL << 23)
 /**< Support operations on data which is not byte aligned */
-
+#define RTE_CRYPTODEV_FF_SYM_HW_RAW_DP			(1ULL << 24)
+/**< Support accelerated specific raw data-path APIs */
 
 /**
  * Get the name of a crypto device feature flag
@@ -1351,6 +1352,357 @@  rte_cryptodev_sym_cpu_crypto_process(uint8_t dev_id,
 	struct rte_cryptodev_sym_session *sess, union rte_crypto_sym_ofs ofs,
 	struct rte_crypto_sym_vec *vec);
 
+/**
+ * Get the size of the raw data-path context buffer.
+ *
+ * @param	dev_id		The device identifier.
+ *
+ * @return
+ *   - If the device supports raw data-path APIs, return the context size.
+ *   - If the device does not support the APIs, return a negative errno value.
+ */
+__rte_experimental
+int
+rte_cryptodev_raw_get_dp_context_size(uint8_t dev_id);
+
+/**
+ * Union of different crypto session types, including session-less xform
+ * pointer.
+ */
+union rte_cryptodev_session_ctx {
+	struct rte_cryptodev_sym_session *crypto_sess;
+	struct rte_crypto_sym_xform *xform;
+	struct rte_security_session *sec_sess;
+};
+
+/**
+ * Enqueue a data vector into device queue but the driver will not start
+ * processing until rte_cryptodev_raw_enqueue_done() is called.
+ *
+ * @param	qp		Driver specific queue pair data.
+ * @param	drv_ctx		Driver specific context data.
+ * @param	vec		The array of descriptor vectors.
+ * @param	ofs		Start and stop offsets for auth and cipher
+ *				operations.
+ * @param	user_data	The array of user data for dequeue later.
+ * @return
+ *   - The number of descriptors successfully submitted.
+ */
+typedef uint32_t (*cryptodev_dp_sym_enqueue_burst_t)(
+	void *qp, uint8_t *drv_ctx, struct rte_crypto_sym_vec *vec,
+	union rte_crypto_sym_ofs ofs, void *user_data[]);
+
+/**
+ * Enqueue single descriptor into device queue but the driver will not start
+ * processing until rte_cryptodev_raw_enqueue_done() is called.
+ *
+ * @param	qp		Driver specific queue pair data.
+ * @param	drv_ctx		Driver specific context data.
+ * @param	data_vec	The buffer data vector.
+ * @param	n_data_vecs	Number of buffer data vectors.
+ * @param	ofs		Start and stop offsets for auth and cipher
+ *				operations.
+ * @param	iv		IV virtual and IOVA addresses
+ * @param	digest		digest virtual and IOVA addresses
+ * @param	aad_or_auth_iv	AAD or auth IV virtual and IOVA addresses,
+ *				depends on the algorithm used.
+ * @param	user_data	The user data.
+ * @return
+ *   - On success return 0.
+ *   - On failure return negative integer.
+ */
+typedef int (*cryptodev_dp_sym_enqueue_t)(
+	void *qp, uint8_t *drv_ctx, struct rte_crypto_vec *data_vec,
+	uint16_t n_data_vecs, union rte_crypto_sym_ofs ofs,
+	struct rte_crypto_va_iova_ptr *iv,
+	struct rte_crypto_va_iova_ptr *digest,
+	struct rte_crypto_va_iova_ptr *aad_or_auth_iv,
+	void *user_data);
+
+/**
+ * Inform the cryptodev queue pair to start processing or finish dequeuing all
+ * enqueued/dequeued descriptors.
+ *
+ * @param	qp		Driver specific queue pair data.
+ * @param	drv_ctx		Driver specific context data.
+ * @param	n		The total number of processed descriptors.
+ * @return
+ *   - On success return 0.
+ *   - On failure return negative integer.
+ */
+typedef int (*cryptodev_dp_sym_operation_done_t)(void *qp, uint8_t *drv_ctx,
+	uint32_t n);
+
+/**
+ * Typedef that the user provided for the driver to get the dequeue count.
+ * The function may return a fixed number or the number parsed from the user
+ * data stored in the first processed descriptor.
+ *
+ * @param	user_data	Dequeued user data.
+ **/
+typedef uint32_t (*rte_cryptodev_raw_get_dequeue_count_t)(void *user_data);
+
+/**
+ * Typedef that the user provided to deal with post dequeue operation, such
+ * as filling status.
+ *
+ * @param	user_data	Dequeued user data.
+ * @param	index		Index number of the processed descriptor.
+ * @param	is_op_success	Operation status provided by the driver.
+ **/
+typedef void (*rte_cryptodev_raw_post_dequeue_t)(void *user_data,
+	uint32_t index, uint8_t is_op_success);
+
+/**
+ * Dequeue symmetric crypto processing of user provided data.
+ *
+ * @param	qp			Driver specific queue pair data.
+ * @param	drv_ctx			Driver specific context data.
+ * @param	get_dequeue_count	User provided callback function to
+ *					obtain dequeue count.
+ * @param	post_dequeue		User provided callback function to
+ *					post-process a dequeued operation.
+ * @param	out_user_data		User data pointer array to be retrieved
+ *					from device queue. In case of
+ *					*is_user_data_array* is set there
+ *					should be enough room to store all
+ *					user data.
+ * @param	is_user_data_array	Set 1 if every dequeued user data will
+ *					be written into *out_user_data* array.
+ * @param	n_success		Driver written value to specify the
+ *					total successful operations count.
+ *
+ * @return
+ *  - Returns number of dequeued packets.
+ */
+typedef uint32_t (*cryptodev_dp_sym_dequeue_burst_t)(void *qp,
+	uint8_t *drv_ctx,
+	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
+	rte_cryptodev_raw_post_dequeue_t post_dequeue,
+	void **out_user_data, uint8_t is_user_data_array,
+	uint32_t *n_success);
+
+/**
+ * Dequeue symmetric crypto processing of user provided data.
+ *
+ * @param	qp			Driver specific queue pair data.
+ * @param	drv_ctx			Driver specific context data.
+ * @param	out_user_data		User data pointer to be retrieved from
+ *					device queue.
+ *
+ * @return
+ *   - 1 if the user_data is dequeued and the operation is a success.
+ *   - 0 if the user_data is dequeued but the operation failed.
+ *   - -1 if no operation is dequeued.
+ */
+typedef int (*cryptodev_dp_sym_dequeue_t)(
+		void *qp, uint8_t *drv_ctx, void **out_user_data);
+
+/**
+ * Context data for raw data-path API crypto process. The buffer of this
+ * structure is to be allocated by the user application with the size equal
+ * or bigger than rte_cryptodev_raw_get_dp_context_size() returned value.
+ *
+ * NOTE: the buffer is to be used and maintained by the cryptodev driver, the
+ * user should NOT alter the buffer content to avoid application or system
+ * crash.
+ */
+struct rte_crypto_raw_dp_ctx {
+	void *qp_data;
+
+	cryptodev_dp_sym_enqueue_t enqueue;
+	cryptodev_dp_sym_enqueue_burst_t enqueue_burst;
+	cryptodev_dp_sym_operation_done_t enqueue_done;
+	cryptodev_dp_sym_dequeue_t dequeue;
+	cryptodev_dp_sym_dequeue_burst_t dequeue_burst;
+	cryptodev_dp_sym_operation_done_t dequeue_done;
+
+	/* Driver specific context data */
+	__extension__ uint8_t drv_ctx_data[];
+};
+
+/**
+ * Configure raw data-path context data.
+ *
+ * NOTE:
+ * After the context data is configured, the user should call
+ * rte_cryptodev_raw_attach_session() before using it in
+ * rte_cryptodev_raw_enqueue/dequeue function call.
+ *
+ * @param	dev_id		The device identifier.
+ * @param	qp_id		The index of the queue pair from which to
+ *				retrieve processed packets. The value must be
+ *				in the range [0, nb_queue_pair - 1] previously
+ *				supplied to rte_cryptodev_configure().
+ * @param	ctx		The raw data-path context data.
+ * @return
+ *   - On success return 0.
+ *   - On failure return negative integer.
+ */
+__rte_experimental
+int
+rte_cryptodev_raw_configure_dp_context(uint8_t dev_id, uint16_t qp_id,
+	struct rte_crypto_raw_dp_ctx *ctx);
+
+/**
+ * Attach a cryptodev session to an initialized raw data-path context.
+ *
+ * @param	dev_id		The device identifier.
+ * @param	qp_id		The index of the queue pair from which to
+ *				retrieve processed packets. The value must be
+ *				in the range [0, nb_queue_pair - 1] previously
+ *				supplied to rte_cryptodev_configure().
+ * @param	ctx		The raw data-path context data.
+ * @param	sess_type	session type.
+ * @param	session_ctx	Session context data.
+ * @return
+ *   - On success return 0.
+ *   - On failure return negative integer.
+ */
+__rte_experimental
+int
+rte_cryptodev_raw_attach_session(uint8_t dev_id, uint16_t qp_id,
+	struct rte_crypto_raw_dp_ctx *ctx,
+	enum rte_crypto_op_sess_type sess_type,
+	union rte_cryptodev_session_ctx session_ctx);
+
+/**
+ * Enqueue single raw data-path descriptor.
+ *
+ * Processing of the enqueued descriptor will not start until
+ * rte_cryptodev_raw_enqueue_done() is called.
+ *
+ * @param	ctx		The initialized raw data-path context data.
+ * @param	data_vec	The buffer vector.
+ * @param	n_data_vecs	Number of buffer vectors.
+ * @param	ofs		Start and stop offsets for auth and cipher
+ *				operations.
+ * @param	iv		IV virtual and IOVA addresses
+ * @param	digest		digest virtual and IOVA addresses
+ * @param	aad_or_auth_iv	AAD or auth IV virtual and IOVA addresses,
+ *				depends on the algorithm used.
+ * @param	user_data	The user data.
+ * @return
+ *   - The number of descriptors successfully enqueued.
+ */
+__rte_experimental
+static __rte_always_inline int
+rte_cryptodev_raw_enqueue(struct rte_crypto_raw_dp_ctx *ctx,
+	struct rte_crypto_vec *data_vec, uint16_t n_data_vecs,
+	union rte_crypto_sym_ofs ofs,
+	struct rte_crypto_va_iova_ptr *iv,
+	struct rte_crypto_va_iova_ptr *digest,
+	struct rte_crypto_va_iova_ptr *aad_or_auth_iv,
+	void *user_data)
+{
+	return (*ctx->enqueue)(ctx->qp_data, ctx->drv_ctx_data, data_vec,
+		n_data_vecs, ofs, iv, digest, aad_or_auth_iv, user_data);
+}
+
+/**
+ * Enqueue a data vector of raw data-path descriptors.
+ *
+ * Processing of the enqueued descriptors will not start until
+ * rte_cryptodev_raw_enqueue_done() is called.
+ *
+ * @param	ctx		The initialized raw data-path context data.
+ * @param	vec		The array of descriptor vectors.
+ * @param	ofs		Start and stop offsets for auth and cipher
+ *				operations.
+ * @param	user_data	The array of opaque data for dequeue.
+ * @return
+ *   - The number of descriptors successfully enqueued.
+ */
+__rte_experimental
+uint32_t
+rte_cryptodev_raw_enqueue_burst(struct rte_crypto_raw_dp_ctx *ctx,
+	struct rte_crypto_sym_vec *vec, union rte_crypto_sym_ofs ofs,
+	void **user_data);
+
+/**
+ * Start processing all descriptors enqueued since the last
+ * rte_cryptodev_raw_configure_dp_context() call.
+ *
+ * @param	ctx	The initialized raw data-path context data.
+ * @param	n	The total number of submitted descriptors.
+ */
+__rte_experimental
+int
+rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
+		uint32_t n);
+
+/**
+ * Dequeue a burst of raw crypto data-path operations and write the previously
+ * enqueued user data into the array provided.
+ *
+ * The dequeued operations, including the user data stored, will not be
+ * wiped out from the device queue until rte_cryptodev_raw_dequeue_done() is
+ * called.
+ *
+ * @param	ctx			The initialized raw data-path context
+ *					data.
+ * @param	get_dequeue_count	User provided callback function to
+ *					obtain dequeue count.
+ * @param	post_dequeue		User provided callback function to
+ *					post-process a dequeued operation.
+ * @param	out_user_data		User data pointer array to be retrieved
+ *					from device queue. If
+ *					*is_user_data_array* is set, there
+ *					should be enough room to store all
+ *					user data.
+ * @param	is_user_data_array	Set 1 if every dequeued user data will
+ *					be written into *out_user_data* array.
+ * @param	n_success		Driver-written value specifying the
+ *					total successful operations count.
+ *
+ * @return
+ *   - Returns number of dequeued packets.
+ */
+__rte_experimental
+uint32_t
+rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
+	rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
+	rte_cryptodev_raw_post_dequeue_t post_dequeue,
+	void **out_user_data, uint8_t is_user_data_array,
+	uint32_t *n_success);
+
+/**
+ * Dequeue a raw crypto data-path operation and write the previously
+ * enqueued user data.
+ *
+ * The dequeued operation, including the user data stored, will not be wiped
+ * out from the device queue until rte_cryptodev_raw_dequeue_done() is called.
+ *
+ * @param	ctx			The initialized raw data-path context
+ *					data.
+ * @param	out_user_data		User data pointer to be retrieved from
+ *					device queue. The driver shall support
+ *					NULL input of this parameter.
+ *
+ * @return
+ *   - 1 if the user data is dequeued and the operation is a success.
+ *   - 0 if the user data is dequeued but the operation failed.
+ *   - -1 if no operation is ready to be dequeued.
+ */
+__rte_experimental
+static __rte_always_inline int
+rte_cryptodev_raw_dequeue(struct rte_crypto_raw_dp_ctx *ctx,
+	void **out_user_data)
+{
+	return (*ctx->dequeue)(ctx->qp_data, ctx->drv_ctx_data, out_user_data);
+}
+
+/**
+ * Inform the queue pair that dequeue operations are finished.
+ *
+ * @param	ctx	The initialized raw data-path context data.
+ * @param	n	The total number of jobs already dequeued.
+ */
+__rte_experimental
+int
+rte_cryptodev_raw_dequeue_done(struct rte_crypto_raw_dp_ctx *ctx,
+		uint32_t n);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index 81975d72b..69a2a6d64 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -316,6 +316,40 @@  typedef uint32_t (*cryptodev_sym_cpu_crypto_process_t)
 	(struct rte_cryptodev *dev, struct rte_cryptodev_sym_session *sess,
 	union rte_crypto_sym_ofs ofs, struct rte_crypto_sym_vec *vec);
 
+/**
+ * Typedef of the driver-provided function to get the service context private
+ * data size.
+ *
+ * @param	dev	Crypto device pointer.
+ *
+ * @return
+ *   - On success return the size of the device's service context private data.
+ *   - On failure return negative integer.
+ */
+typedef int (*cryptodev_dp_get_service_ctx_size_t)(
+	struct rte_cryptodev *dev);
+
+/**
+ * Typedef of the driver-provided function to configure data-path context.
+ *
+ * @param	dev		Crypto device pointer.
+ * @param	qp_id		Crypto device queue pair index.
+ * @param	sess_type	Session type.
+ * @param	session_ctx	Session context data. If NULL, the driver
+ *				shall only configure the drv_ctx_data in
+ *				the ctx buffer. Otherwise the driver shall
+ *				only parse the session_ctx to set appropriate
+ *				function pointers in ctx.
+ * @param	ctx		The raw data-path context data.
+ *
+ * @return
+ *   - On success return 0.
+ *   - On failure return negative integer.
+ */
+typedef int (*cryptodev_dp_configure_ctx_t)(
+	struct rte_cryptodev *dev, uint16_t qp_id,
+	enum rte_crypto_op_sess_type sess_type,
+	union rte_cryptodev_session_ctx session_ctx,
+	struct rte_crypto_raw_dp_ctx *ctx);
 
 /** Crypto device operations function pointer table */
 struct rte_cryptodev_ops {
@@ -348,8 +382,17 @@  struct rte_cryptodev_ops {
 	/**< Clear a Crypto sessions private data. */
 	cryptodev_asym_free_session_t asym_session_clear;
 	/**< Clear a Crypto sessions private data. */
-	cryptodev_sym_cpu_crypto_process_t sym_cpu_process;
-	/**< process input data synchronously (cpu-crypto). */
+	union {
+		cryptodev_sym_cpu_crypto_process_t sym_cpu_process;
+		/**< process input data synchronously (cpu-crypto). */
+		__extension__
+		struct {
+			cryptodev_dp_get_service_ctx_size_t get_drv_ctx_size;
+			/**< Get data path service context data size. */
+			cryptodev_dp_configure_ctx_t configure_dp_ctx;
+			/**< Initialize crypto service ctx data. */
+		};
+	};
 };
 
 
diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map b/lib/librte_cryptodev/rte_cryptodev_version.map
index 02f6dcf72..bc4cd1ea5 100644
--- a/lib/librte_cryptodev/rte_cryptodev_version.map
+++ b/lib/librte_cryptodev/rte_cryptodev_version.map
@@ -105,4 +105,15 @@  EXPERIMENTAL {
 
 	# added in 20.08
 	rte_cryptodev_get_qp_status;
+
+	# added in 20.11
+	rte_cryptodev_raw_attach_session;
+	rte_cryptodev_raw_configure_dp_context;
+	rte_cryptodev_raw_get_dp_context_size;
+	rte_cryptodev_raw_dequeue;
+	rte_cryptodev_raw_dequeue_burst;
+	rte_cryptodev_raw_dequeue_done;
+	rte_cryptodev_raw_enqueue;
+	rte_cryptodev_raw_enqueue_burst;
+	rte_cryptodev_raw_enqueue_done;
 };
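
For readers following the header changes above, here is a minimal,
self-contained sketch of the dispatch pattern the patch relies on: a context
buffer holding driver-installed function pointers followed by a
driver-private flexible array, and a single-operation dequeue that returns
1 (success), 0 (dequeued but failed) or -1 (nothing to dequeue). Every name
in it (sketch_dp_ctx, toy_drv_ctx, sketch_enqueue, and so on) is a
hypothetical stand-in invented for illustration. It does not use DPDK, the
enqueue signature is reduced to a single user_data pointer, and the toy
"driver" is a one-slot queue, so it should not be read as how any real PMD
behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the patch's typedefs; NOT the DPDK definitions. */
typedef int (*sketch_enqueue_t)(void *qp, uint8_t *drv_ctx, void *user_data);
typedef int (*sketch_dequeue_t)(void *qp, uint8_t *drv_ctx,
		void **out_user_data);

/* Mirrors the shape of the context: pointers first, driver data last. */
struct sketch_dp_ctx {
	void *qp_data;
	sketch_enqueue_t enqueue;
	sketch_dequeue_t dequeue;
	uint8_t drv_ctx_data[];	/* driver-private area, sized by the driver */
};

/* Toy driver state stored in drv_ctx_data: a one-slot queue (assumption). */
struct toy_drv_ctx {
	void *slot;
	int full;
	int fail_next;	/* when set, the next dequeue reports a failed op */
};

/* Returns 1 if the descriptor was queued, 0 if the slot was occupied. */
static int toy_enqueue(void *qp, uint8_t *drv_ctx, void *user_data)
{
	struct toy_drv_ctx *t = (struct toy_drv_ctx *)drv_ctx;

	(void)qp;
	if (t->full)
		return 0;
	t->slot = user_data;
	t->full = 1;
	return 1;
}

/* Returns 1: dequeued, success; 0: dequeued, failed; -1: nothing ready. */
static int toy_dequeue(void *qp, uint8_t *drv_ctx, void **out_user_data)
{
	struct toy_drv_ctx *t = (struct toy_drv_ctx *)drv_ctx;

	(void)qp;
	if (!t->full)
		return -1;
	if (out_user_data != NULL)	/* NULL is accepted, as documented */
		*out_user_data = t->slot;
	t->full = 0;
	return t->fail_next ? 0 : 1;
}

/* The caller must allocate at least this many bytes for the context. */
size_t sketch_get_ctx_size(void)
{
	return sizeof(struct sketch_dp_ctx) + sizeof(struct toy_drv_ctx);
}

/* Allocate a zeroed context and let the toy driver install its handlers. */
struct sketch_dp_ctx *sketch_ctx_create(void)
{
	struct sketch_dp_ctx *ctx = calloc(1, sketch_get_ctx_size());

	if (ctx == NULL)
		return NULL;
	ctx->enqueue = toy_enqueue;
	ctx->dequeue = toy_dequeue;
	return ctx;
}

/* Thin wrappers: the data path is one indirect call through the context. */
static inline int sketch_enqueue(struct sketch_dp_ctx *ctx, void *user_data)
{
	assert(ctx != NULL);
	return (*ctx->enqueue)(ctx->qp_data, ctx->drv_ctx_data, user_data);
}

static inline int sketch_dequeue(struct sketch_dp_ctx *ctx,
		void **out_user_data)
{
	assert(ctx != NULL);
	return (*ctx->dequeue)(ctx->qp_data, ctx->drv_ctx_data, out_user_data);
}
```

A caller of such an interface has to treat the three dequeue results
separately: -1 means "retry later", while 0 still hands back the user data so
the application can release or report the failed operation.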