[v2] cryptodev: change raw data path dequeue API
Commit Message
This patch changes the experimental raw data path dequeue burst API.
Originally the API forced the user to provide a callback function
to get the maximum dequeue count. This change gives the user one more
option: to pass the expected dequeue count directly.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
app/test/test_cryptodev.c | 8 +-------
doc/guides/rel_notes/release_21_05.rst | 3 +++
drivers/crypto/qat/qat_sym_hw_dp.c | 21 ++++++++++++++++++---
lib/librte_cryptodev/rte_cryptodev.c | 5 +++--
lib/librte_cryptodev/rte_cryptodev.h | 8 ++++++++
5 files changed, 33 insertions(+), 12 deletions(-)
Comments
Hi Fan,
> This patch changes the experimental raw data path dequeue burst API.
> Originally the API enforces the user to provide callback function
> to get maximum dequeue count. This change gives the user one more
> option to pass directly the expected dequeue count.
>
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
> app/test/test_cryptodev.c | 8 +-------
> doc/guides/rel_notes/release_21_05.rst | 3 +++
> drivers/crypto/qat/qat_sym_hw_dp.c | 21 ++++++++++++++++++---
> lib/librte_cryptodev/rte_cryptodev.c | 5 +++--
> lib/librte_cryptodev/rte_cryptodev.h | 8 ++++++++
> 5 files changed, 33 insertions(+), 12 deletions(-)
>
> diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
> index f91debc168..a910547423 100644
> --- a/app/test/test_cryptodev.c
> +++ b/app/test/test_cryptodev.c
> @@ -162,12 +162,6 @@ ceil_byte_length(uint32_t num_bits)
> return (num_bits >> 3);
> }
>
> -static uint32_t
> -get_raw_dp_dequeue_count(void *user_data __rte_unused)
> -{
> - return 1;
> -}
> -
> static void
> post_process_raw_dp_op(void *user_data, uint32_t index __rte_unused,
> uint8_t is_op_success)
> @@ -345,7 +339,7 @@ process_sym_raw_dp_op(uint8_t dev_id, uint16_t qp_id,
> n = n_success = 0;
> while (count++ < MAX_RAW_DEQUEUE_COUNT && n == 0) {
> n = rte_cryptodev_raw_dequeue_burst(ctx,
> - get_raw_dp_dequeue_count, post_process_raw_dp_op,
> + NULL, 1, post_process_raw_dp_op,
> (void **)&ret_op, 0, &n_success,
> &dequeue_status);
> if (dequeue_status < 0) {
> diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst
> index 8e686cc627..943f1596c5 100644
> --- a/doc/guides/rel_notes/release_21_05.rst
> +++ b/doc/guides/rel_notes/release_21_05.rst
> @@ -130,6 +130,9 @@ API Changes
> Also, make sure to start the actual text at the margin.
> =======================================================
>
> +* cryptodev: the function ``rte_cryptodev_raw_dequeue_burst`` is added a
> + parameter ``max_nb_to_dequeue`` to give user a more flexible dequeue control.
> +
Shouldn't we remove the callback completely?
What is the use case for having two different methods of passing a
simple dequeue count?
Why do we need such flexibility?
Regards,
Akhil
Hi Akhil,
It is possible that the user doesn't know how many ops to dequeue.
For example, in VPP crypto, up to 64 buffers (vnet_crypto_async_frame_elt_t) are wrapped into the following data structure:
typedef struct
{
CLIB_CACHE_LINE_ALIGN_MARK (cacheline0);
vnet_crypto_async_frame_state_t state;
vnet_crypto_async_op_id_t op:8;
u16 n_elts;
vnet_crypto_async_frame_elt_t elts[VNET_CRYPTO_FRAME_SIZE];
u32 buffer_indices[VNET_CRYPTO_FRAME_SIZE];
u16 next_node_index[VNET_CRYPTO_FRAME_SIZE];
u32 enqueue_thread_index;
} vnet_crypto_async_frame_t;
Instead of passing a vnet_crypto_async_frame_elt_t pointer as metadata to cryptodev, we have to pass the vnet_crypto_async_frame_t pointer.
The callback function parses the first dequeued metadata to get n_elts, and that many ops are then dequeued.
But if we cannot dequeue the whole frame in one call, passing the number of ops not yet dequeued to the next dequeue_burst call lets us finish the frame. In that case we only have to cache at most one frame pointer, for the half-dequeued frame.
As the patch states, this covers both cases: the user can either dequeue a wrapped data structure holding multiple buffers, or dequeue a burst of packets, which gives people more flexibility.
Regards,
Fan
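The two dequeue modes described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not VPP or DPDK code: `struct frame`, its `n_dequeued` field, `struct dequeue_args` and `pick_dequeue_args()` are hypothetical names, and the frame type is a trimmed-down stand-in for `vnet_crypto_async_frame_t`. Only the callback shape (`uint32_t (*)(void *user_data)`) and the `max_nb_to_dequeue` parameter come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for vnet_crypto_async_frame_t. */
struct frame {
	uint16_t n_elts;     /* ops enqueued for this frame */
	uint16_t n_dequeued; /* hypothetical bookkeeping: ops already dequeued */
};

/*
 * Callback in the shape of rte_cryptodev_raw_get_dequeue_count_t:
 * read the op count from the user data of the first dequeued op.
 */
static uint32_t
frame_get_dequeue_count(void *user_data)
{
	const struct frame *f = user_data;

	return f->n_elts;
}

/* Hypothetical pair of arguments for the next dequeue burst call. */
struct dequeue_args {
	uint32_t (*get_count)(void *user_data);
	uint32_t max_nb_to_dequeue;
};

/*
 * No half-dequeued frame cached: let the callback discover the count.
 * Frame cached: pass NULL as the callback and the remaining op count.
 */
static struct dequeue_args
pick_dequeue_args(const struct frame *cached)
{
	if (cached == NULL)
		return (struct dequeue_args){
			.get_count = frame_get_dequeue_count,
			.max_nb_to_dequeue = 0,
		};

	return (struct dequeue_args){
		.get_count = NULL,
		.max_nb_to_dequeue =
			(uint32_t)(cached->n_elts - cached->n_dequeued),
	};
}
```

In a real application the two fields would presumably be passed as the `get_dequeue_count` and `max_nb_to_dequeue` arguments of `rte_cryptodev_raw_dequeue_burst()`.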
Ok.
Acked-by: Akhil Goyal <gakhil@marvell.com>
Applied to dpdk-next-crypto
Thanks.
@@ -162,12 +162,6 @@ ceil_byte_length(uint32_t num_bits)
return (num_bits >> 3);
}
-static uint32_t
-get_raw_dp_dequeue_count(void *user_data __rte_unused)
-{
- return 1;
-}
-
static void
post_process_raw_dp_op(void *user_data, uint32_t index __rte_unused,
uint8_t is_op_success)
@@ -345,7 +339,7 @@ process_sym_raw_dp_op(uint8_t dev_id, uint16_t qp_id,
n = n_success = 0;
while (count++ < MAX_RAW_DEQUEUE_COUNT && n == 0) {
n = rte_cryptodev_raw_dequeue_burst(ctx,
- get_raw_dp_dequeue_count, post_process_raw_dp_op,
+ NULL, 1, post_process_raw_dp_op,
(void **)&ret_op, 0, &n_success,
&dequeue_status);
if (dequeue_status < 0) {
@@ -130,6 +130,9 @@ API Changes
Also, make sure to start the actual text at the margin.
=======================================================
+* cryptodev: the function ``rte_cryptodev_raw_dequeue_burst`` is added a
+ parameter ``max_nb_to_dequeue`` to give user a more flexible dequeue control.
+
ABI Changes
-----------
@@ -707,6 +707,7 @@ qat_sym_dp_enqueue_chain_jobs(void *qp_data, uint8_t *drv_ctx,
static __rte_always_inline uint32_t
qat_sym_dp_dequeue_burst(void *qp_data, uint8_t *drv_ctx,
rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
+ uint32_t max_nb_to_dequeue,
rte_cryptodev_raw_post_dequeue_t post_dequeue,
void **out_user_data, uint8_t is_user_data_array,
uint32_t *n_success_jobs, int *return_status)
@@ -736,9 +737,23 @@ qat_sym_dp_dequeue_burst(void *qp_data, uint8_t *drv_ctx,
resp_opaque = (void *)(uintptr_t)resp->opaque_data;
/* get the dequeue count */
- n = get_dequeue_count(resp_opaque);
- if (unlikely(n == 0))
- return 0;
+ if (get_dequeue_count) {
+ n = get_dequeue_count(resp_opaque);
+ if (unlikely(n == 0))
+ return 0;
+ else if (n > 1) {
+ head = (head + rx_queue->msg_size * (n - 1)) &
+ rx_queue->modulo_mask;
+ resp = (struct icp_qat_fw_comn_resp *)(
+ (uint8_t *)rx_queue->base_addr + head);
+ if (*(uint32_t *)resp == ADF_RING_EMPTY_SIG)
+ return 0;
+ }
+ } else {
+ if (unlikely(max_nb_to_dequeue == 0))
+ return 0;
+ n = max_nb_to_dequeue;
+ }
out_user_data[0] = resp_opaque;
status = QAT_SYM_DP_IS_RESP_SUCCESS(resp);
@@ -2232,13 +2232,14 @@ rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
uint32_t
rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
+ uint32_t max_nb_to_dequeue,
rte_cryptodev_raw_post_dequeue_t post_dequeue,
void **out_user_data, uint8_t is_user_data_array,
uint32_t *n_success_jobs, int *status)
{
return (*ctx->dequeue_burst)(ctx->qp_data, ctx->drv_ctx_data,
- get_dequeue_count, post_dequeue, out_user_data,
- is_user_data_array, n_success_jobs, status);
+ get_dequeue_count, max_nb_to_dequeue, post_dequeue,
+ out_user_data, is_user_data_array, n_success_jobs, status);
}
int
@@ -1546,6 +1546,9 @@ typedef void (*rte_cryptodev_raw_post_dequeue_t)(void *user_data,
* @param drv_ctx Driver specific context data.
* @param get_dequeue_count User provided callback function to
* obtain dequeue operation count.
+ * @param max_nb_to_dequeue When get_dequeue_count is NULL this
+ * value is used to pass the maximum
+ * number of operations to be dequeued.
* @param post_dequeue User provided callback function to
* post-process a dequeued operation.
* @param out_user_data User data pointer array to be retrieve
@@ -1580,6 +1583,7 @@ typedef void (*rte_cryptodev_raw_post_dequeue_t)(void *user_data,
typedef uint32_t (*cryptodev_sym_raw_dequeue_burst_t)(void *qp,
uint8_t *drv_ctx,
rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
+ uint32_t max_nb_to_dequeue,
rte_cryptodev_raw_post_dequeue_t post_dequeue,
void **out_user_data, uint8_t is_user_data_array,
uint32_t *n_success, int *dequeue_status);
@@ -1747,6 +1751,9 @@ rte_cryptodev_raw_enqueue_done(struct rte_crypto_raw_dp_ctx *ctx,
* data.
* @param get_dequeue_count User provided callback function to
* obtain dequeue operation count.
+ * @param max_nb_to_dequeue When get_dequeue_count is NULL this
+ * value is used to pass the maximum
+ * number of operations to be dequeued.
* @param post_dequeue User provided callback function to
* post-process a dequeued operation.
* @param out_user_data User data pointer array to be retrieve
@@ -1782,6 +1789,7 @@ __rte_experimental
uint32_t
rte_cryptodev_raw_dequeue_burst(struct rte_crypto_raw_dp_ctx *ctx,
rte_cryptodev_raw_get_dequeue_count_t get_dequeue_count,
+ uint32_t max_nb_to_dequeue,
rte_cryptodev_raw_post_dequeue_t post_dequeue,
void **out_user_data, uint8_t is_user_data_array,
uint32_t *n_success, int *dequeue_status);
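As a reading aid for the driver-side change, the count selection documented for `max_nb_to_dequeue` can be sketched in isolation. This is a simplified model, not the QAT implementation: `resolve_dequeue_count()` and `always_three()` are hypothetical names, and the sketch leaves out the ring check that the QAT hunk performs in the callback case when the count is greater than 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as rte_cryptodev_raw_get_dequeue_count_t in the patch. */
typedef uint32_t (*get_dequeue_count_t)(void *user_data);

/*
 * Hypothetical helper: a non-NULL callback decides the count from the
 * first response's user data; otherwise max_nb_to_dequeue is used.
 * A result of 0 means "dequeue nothing", as in the driver hunk.
 */
static uint32_t
resolve_dequeue_count(get_dequeue_count_t get_dequeue_count,
		uint32_t max_nb_to_dequeue, void *first_user_data)
{
	if (get_dequeue_count != NULL)
		return get_dequeue_count(first_user_data);

	return max_nb_to_dequeue;
}

/* Hypothetical callback used only to exercise the helper. */
static uint32_t
always_three(void *user_data)
{
	(void)user_data;
	return 3;
}
```

Note that when a callback is supplied, the value passed as `max_nb_to_dequeue` is ignored in this model, which matches the parameter description added to the header.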