[3/3] ethdev: add async queue-based flow rules operations

Message ID 20211006044835.3936226-4-akozyrev@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series ethdev: datapath-focused flow rules management |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation fail Compilation issues
ci/intel-Testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/github-robot: build fail github build: failed
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-mellanox-Performance fail Performance Testing issues

Commit Message

Alexander Kozyrev Oct. 6, 2021, 4:48 a.m. UTC
  A new, faster, queue-based flow rules management mechanism is needed for
applications offloading rules inside the datapath. This asynchronous
and lockless mechanism frees the CPU for further packet processing and
reduces the performance impact of the flow rules creation/destruction
on the datapath. Note that queues are not thread-safe and queue-based
operations can be safely invoked without any locks from a single thread.

The rte_flow_q_flow_create() function enqueues a flow creation to the
requested queue. It benefits from already configured resources and sets
unique values on top of item and action templates. A flow rule is enqueued
on the specified flow queue and offloaded asynchronously to the hardware.
The function returns immediately to spare CPU for further packet
processing. The application must invoke the rte_flow_q_dequeue() function
to complete the flow rule operation offloading, to clear the queue, and to
receive the operation status. The rte_flow_q_flow_destroy() function
enqueues a flow destruction to the requested queue.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Suggested-by: Ori Kam <orika@nvidia.com>
---
 lib/ethdev/rte_flow.h | 288 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 288 insertions(+)
  

Comments

Ivan Malov Oct. 6, 2021, 4:24 p.m. UTC | #1
Hi,

On 06/10/2021 07:48, Alexander Kozyrev wrote:
> A new, faster, queue-based flow rules management mechanism is needed for
> applications offloading rules inside the datapath. This asynchronous
> and lockless mechanism frees the CPU for further packet processing and
> reduces the performance impact of the flow rules creation/destruction
> on the datapath. Note that queues are not thread-safe and queue-based
> operations can be safely invoked without any locks from a single thread.
> 
> The rte_flow_q_flow_create() function enqueues a flow creation to the
> requested queue. It benefits from already configured resources and sets
> unique values on top of item and action templates. A flow rule is enqueued
> on the specified flow queue and offloaded asynchronously to the hardware.
> The function returns immediately to spare CPU for further packet
> processing. The application must invoke the rte_flow_q_dequeue() function
> to complete the flow rule operation offloading, to clear the queue, and to
> receive the operation status. The rte_flow_q_flow_destroy() function
> enqueues a flow destruction to the requested queue.
> 
> Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> Suggested-by: Ori Kam <orika@nvidia.com>
> ---
>   lib/ethdev/rte_flow.h | 288 ++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 288 insertions(+)
> 
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index ba3204b17e..8cdffd8d2e 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -4298,6 +4298,13 @@ struct rte_flow_port_attr {
>   	 * Version of the struct layout, should be 0.
>   	 */
>   	uint32_t version;
> +	/**
> +	 * Number of flow queues to be configured.
> +	 * Flow queues are used for asyncronous flow rule creation/destruction.

Typo: asyncronous --> asynchronous

> +	 * The order of operations is not guaranteed inside a queue.
> +	 * Flow queues are not thread-safe.
> +	 */
> +	uint16_t nb_queues;
>   	/**
>   	 * Memory size allocated for the flow rules management.
>   	 * If set to 0, memory is allocated dynamically.
> @@ -4330,6 +4337,21 @@ struct rte_flow_port_attr {
>   	uint32_t fixed_resource_size:1;
>   };
>   
> +/**
> + * Flow engine queue configuration.
> + */
> +__extension__
> +struct rte_flow_queue_attr {

Perhaps "struct rte_flow_queue_mode" or "struct rte_flow_queue_conf". I 
don't insist.

> +	/**
> +	 * Version of the struct layout, should be 0.
> +	 */
> +	uint32_t version;
> +	/**
> +	 * Number of flow rule operations a queue can hold.
> +	 */
> +	uint32_t size;
> +};
> +
>   /**
>    * @warning
>    * @b EXPERIMENTAL: this API may change without prior notice.
> @@ -4346,6 +4368,8 @@ struct rte_flow_port_attr {
>    *   Port identifier of Ethernet device.
>    * @param[in] port_attr
>    *   Port configuration attributes.
> + * @param[in] queue_attr
> + *   Array that holds attributes for each queue.

This should probably say that the number of queues / array size is taken 
from port_attr->nb_queues.

Also, consider "... for each flow queue".

>    * @param[out] error
>    *   Perform verbose error reporting if not NULL.
>    *   PMDs initialize this structure in case of error only.
> @@ -4357,6 +4381,7 @@ __rte_experimental
>   int
>   rte_flow_configure(uint16_t port_id,
>   		   const struct rte_flow_port_attr *port_attr,
> +		   const struct rte_flow_queue_attr *queue_attr[],
>   		   struct rte_flow_error *error);
>   
>   /**
> @@ -4626,6 +4651,269 @@ __rte_experimental
>   int
>   rte_flow_table_destroy(uint16_t port_id, struct rte_flow_table *table,
>   		       struct rte_flow_error *error);
> +
> +/**
> + * Queue operation attributes
> + */
> +__extension__
> +struct rte_flow_q_ops_attr {
> +	/**
> +	 * Version of the struct layout, should be 0.
> +	 */
> +	uint32_t version;
> +	/**
> +	 * The user data that will be returned on the completion.

Maybe "on completion events"?

> +	 */
> +	void *user_data;
> +	/**
> +	 * When set, the requested action must be sent to the HW without
> +	 * any delay. Any prior requests must be also sent to the HW.
> +	 * If this bit is cleared, the application must call the
> +	 * rte_flow_queue_flush API to actually send the request to the HW.

Not sure that I understand the "Any prior requests ..." part. If this 
structure configures operation mode for the whole queue and not for each 
enqueue request, then no "prior requests" can exist in the first place 
because each submission is meant to be immediately sent to the HW.

But if this structure can vary across enqueue requests, then this 
documentation should be improved to say this clearly.


> +	 */
> +	uint32_t flush:1;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue rule creation operation.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue used to insert the rule.
> + * @param[in] attr
> + *   Rule creation operation attributes.

At a bare minimum, please consider renaming this to "queue_attr". More 
variants (see above): "queue_mode", "queue_conf". I suggest doing so to 
avoid confusion with "struct rte_flow_attr" which sits in "struct 
rte_flow_table_attr" in fact.

If this structure must be exactly the same as the one used in 
rte_flow_configure(), please say so. If, however, this structure can be 
completely different on enqueue operations, then the argument name 
should indicate it somehow. Maybe, "override_queue_conf". Otherwise, 
it's unclear.

There are similar occurrences below.

> + * @param[in] table
> + *   Table to select templates from.

Perhaps "template_table"?

> + * @param[in] items
> + *   List of pattern items to be used.
> + *   The list order should match the order in the item template.
> + *   The spec is the only relevant member of the item that is being used.
> + * @param[in] item_template_index
> + *   Item template index in the table.
> + * @param[in] actions
> + *   List of actions to be used.
> + *   The list order should match the order in the action template.
> + * @param[in] action_template_index
> + *   Action template index in the table.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   Handle on success, NULL otherwise and rte_errno is set.
> + *   The rule handle doesn't mean that the rule was offloaded.
> + *   Only completion result indicates that the rule was offloaded.
> + */
> +__rte_experimental
> +struct rte_flow *
> +rte_flow_q_flow_create(uint16_t port_id, uint32_t queue,
> +		       const struct rte_flow_q_ops_attr *attr,
> +		       const struct rte_flow_table *table,
> +			   const struct rte_flow_item items[],
> +		       uint8_t item_template_index,
> +			   const struct rte_flow_action actions[],
> +		       uint8_t action_template_index,
> +		       struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue rule destruction operation.
> + *
> + * This function enqueues a destruction operation on the queue.
> + * Application should assume that after calling this function
> + * the rule handle is not valid anymore.
> + * Completion indicates the full removal of the rule from the HW.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue which is used to destroy the rule.
> + *   This must match the queue on which the rule was created.
> + * @param[in] attr
> + *   Rule destroy operation attributes.
> + * @param[in] flow
> + *   Flow handle to be destroyed.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_flow_destroy(uint16_t port_id, uint32_t queue,
> +			struct rte_flow_q_ops_attr *attr,
> +			struct rte_flow *flow,
> +			struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue indirect rule creation operation.
> + * @see rte_flow_action_handle_create

Why "indirect rule"? This API seems to enqueue an "indirect action"...

> + *
> + * @param[in] port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] queue
> + *   Flow queue which is used to create the rule.
> + * @param[in] attr
> + *   Queue operation attributes.
> + * @param[in] conf
> + *   Action configuration for the indirect action object creation.

Perhaps "indir_conf" or "indir_action_conf"?

> + * @param[in] action
> + *   Specific configuration of the indirect action object.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   - (0) if success.
> + *   - (-ENODEV) if *port_id* invalid.
> + *   - (-ENOSYS) if underlying device does not support this functionality.
> + *   - (-EIO) if underlying device is removed.
> + *   - (-ENOENT) if action pointed by *action* handle was not found.
> + *   - (-EBUSY) if action pointed by *action* handle still used by some rules
> + *   rte_errno is also set.
> + */
> +__rte_experimental
> +struct rte_flow_action_handle *
> +rte_flow_q_action_handle_create(uint16_t port_id, uint32_t queue,
> +				const struct rte_flow_q_ops_attr *attr,
> +				const struct rte_flow_indir_action_conf *conf,
> +				const struct rte_flow_action *action,
> +				struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue indirect rule destruction operation.

Please see above. Did you mean "indirect action"?

> + * The destroy queue must be the same
> + * as the queue on which the action was created.
> + *
> + * @param[in] port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] queue
> + *   Flow queue which is used to destroy the rule.
> + * @param[in] attr
> + *   Queue operation attributes.
> + * @param[in] handle
> + *   Handle for the indirect action object to be destroyed.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   - (0) if success.
> + *   - (-ENODEV) if *port_id* invalid.
> + *   - (-ENOSYS) if underlying device does not support this functionality.
> + *   - (-EIO) if underlying device is removed.
> + *   - (-ENOENT) if action pointed by *action* handle was not found.
> + *   - (-EBUSY) if action pointed by *action* handle still used by some rules
> + *   rte_errno is also set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_action_handle_destroy(uint16_t port_id, uint32_t queue,
> +				struct rte_flow_q_ops_attr *attr,
> +				struct rte_flow_action_handle *handle,
> +				struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Flush all internally stored rules to the HW.
> + * Non-flushed rules are rules that were inserted without the flush flag set.
> + * Can be used to notify the HW about batch of rules prepared by the SW to
> + * reduce the number of communications between the HW and SW.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue to be flushed.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *    0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_flush(uint16_t port_id, uint32_t queue,
> +		 struct rte_flow_error *error);
> +
> +/**
> + * Dequeue operation status > + */
> +enum rte_flow_q_op_status {
> +	/**
> +	 * The operation was completed successfully.
> +	 */
> +	RTE_FLOW_Q_OP_SUCCESS,
> +	/**
> +	 * The operation was not completed successfully.
> +	 */
> +	RTE_FLOW_Q_OP_ERROR,
> +};
> +
> +/**
> + * Dequeue operation result.
> + */
> +struct rte_flow_q_op_res {
> +	/**
> +	 * Version of the struct layout, should be 0.
> +	 */
> +	uint32_t version;
> +	/**
> +	 * Returns the status of the operation that this completion signals.
> +	 */
> +	enum rte_flow_q_op_status status;
> +	/**
> +	 * User data that was supplied during operation submission.
> +	 */
> +	void *user_data;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Dequeue a rte flow operation.
> + * The application must invoke this function in order to complete
> + * the flow rule offloading and to receive the flow rule operation status.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue which is used to dequeue the operation.
> + * @param[out] res
> + *   Array of results that will be set.
> + * @param[in] n_res
> + *   Maximum number of results that can be returned.
> + *   This value is equal to the size of the res array.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   Number of results that were dequeued,
> + *   a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_dequeue(uint16_t port_id, uint32_t queue,
> +		   struct rte_flow_q_op_res res[], uint16_t n_res,
> +		   struct rte_flow_error *error);
>   #ifdef __cplusplus
>   }
>   #endif
>
  
Alexander Kozyrev Oct. 13, 2021, 1:10 a.m. UTC | #2
> From: Ivan Malov <Ivan.Malov@oktetlabs.ru> on Wednesday, October 6, 2021 12:25
> On 06/10/2021 07:48, Alexander Kozyrev wrote:
> > A new, faster, queue-based flow rules management mechanism is needed for
> > applications offloading rules inside the datapath. This asynchronous
> > and lockless mechanism frees the CPU for further packet processing and
> > reduces the performance impact of the flow rules creation/destruction
> > on the datapath. Note that queues are not thread-safe and queue-based
> > operations can be safely invoked without any locks from a single thread.
> >
> > The rte_flow_q_flow_create() function enqueues a flow creation to the
> > requested queue. It benefits from already configured resources and sets
> > unique values on top of item and action templates. A flow rule is enqueued
> > on the specified flow queue and offloaded asynchronously to the hardware.
> > The function returns immediately to spare CPU for further packet
> > processing. The application must invoke the rte_flow_q_dequeue() function
> > to complete the flow rule operation offloading, to clear the queue, and to
> > receive the operation status. The rte_flow_q_flow_destroy() function
> > enqueues a flow destruction to the requested queue.
> >
> > Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> > Suggested-by: Ori Kam <orika@nvidia.com>
> > ---
> >   lib/ethdev/rte_flow.h | 288
> ++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 288 insertions(+)
> >
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > index ba3204b17e..8cdffd8d2e 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -4298,6 +4298,13 @@ struct rte_flow_port_attr {
> >   	 * Version of the struct layout, should be 0.
> >   	 */
> >   	uint32_t version;
> > +	/**
> > +	 * Number of flow queues to be configured.
> > +	 * Flow queues are used for asyncronous flow rule creation/destruction.
> 
> Typo: asyncronous --> asynchronous
Thanks for noticing, will correct all the typos.

> > +	 * The order of operations is not guaranteed inside a queue.
> > +	 * Flow queues are not thread-safe.
> > +	 */
> > +	uint16_t nb_queues;
> >   	/**
> >   	 * Memory size allocated for the flow rules management.
> >   	 * If set to 0, memory is allocated dynamically.
> > @@ -4330,6 +4337,21 @@ struct rte_flow_port_attr {
> >   	uint32_t fixed_resource_size:1;
> >   };
> >
> > +/**
> > + * Flow engine queue configuration.
> > + */
> > +__extension__
> > +struct rte_flow_queue_attr {
> 
> Perhaps "struct rte_flow_queue_mode" or "struct rte_flow_queue_conf". I
> don't insist.
I would prefer sticking to attributes for consistency. We have them elsewhere.

> > +	/**
> > +	 * Version of the struct layout, should be 0.
> > +	 */
> > +	uint32_t version;
> > +	/**
> > +	 * Number of flow rule operations a queue can hold.
> > +	 */
> > +	uint32_t size;
> > +};
> > +
> >   /**
> >    * @warning
> >    * @b EXPERIMENTAL: this API may change without prior notice.
> > @@ -4346,6 +4368,8 @@ struct rte_flow_port_attr {
> >    *   Port identifier of Ethernet device.
> >    * @param[in] port_attr
> >    *   Port configuration attributes.
> > + * @param[in] queue_attr
> > + *   Array that holds attributes for each queue.
> 
> This should probably say that the number of queues / array size is taken
> from port_attr->nb_queues.
Good idea, will mention this.
> Also, consider "... for each flow queue".
Sounds good.

> >    * @param[out] error
> >    *   Perform verbose error reporting if not NULL.
> >    *   PMDs initialize this structure in case of error only.
> > @@ -4357,6 +4381,7 @@ __rte_experimental
> >   int
> >   rte_flow_configure(uint16_t port_id,
> >   		   const struct rte_flow_port_attr *port_attr,
> > +		   const struct rte_flow_queue_attr *queue_attr[],
> >   		   struct rte_flow_error *error);
> >
> >   /**
> > @@ -4626,6 +4651,269 @@ __rte_experimental
> >   int
> >   rte_flow_table_destroy(uint16_t port_id, struct rte_flow_table *table,
> >   		       struct rte_flow_error *error);
> > +
> > +/**
> > + * Queue operation attributes
> > + */
> > +__extension__
> > +struct rte_flow_q_ops_attr {
> > +	/**
> > +	 * Version of the struct layout, should be 0.
> > +	 */
> > +	uint32_t version;
> > +	/**
> > +	 * The user data that will be returned on the completion.
> 
> Maybe "on completion events"?
Agree.

> > +	 */
> > +	void *user_data;
> > +	/**
> > +	 * When set, the requested action must be sent to the HW without
> > +	 * any delay. Any prior requests must be also sent to the HW.
> > +	 * If this bit is cleared, the application must call the
> > +	 * rte_flow_queue_flush API to actually send the request to the HW.
> 
> Not sure that I understand the "Any prior requests ..." part. If this
> structure configures operation mode for the whole queue and not for each
> enqueue request, then no "prior requests" can exist in the first place
> because each submission is meant to be immediately sent to the HW.
This structure configures attributes per single operation. 
User can send multiple operations without  "flush" attribute set to keep them in a queue.
Then it is possible to issue one operation with "flush" attribute set to purge the whole queue.
> But if this structure can vary across enqueue requests, then this
> documentation should be improved to say this clearly.
I'll try to make it clear that it is per-operation attributes indeed.


> > +	 */
> > +	uint32_t flush:1;
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue rule creation operation.
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param queue
> > + *   Flow queue used to insert the rule.
> > + * @param[in] attr
> > + *   Rule creation operation attributes.
> 
> At a bare minimum, please consider renaming this to "queue_attr". More
> variants (see above): "queue_mode", "queue_conf". I suggest doing so to
> avoid confusion with "struct rte_flow_attr" which sits in "struct
> rte_flow_table_attr" in fact.
> If this structure must be exactly the same as the one used in
> rte_flow_configure(), please say so. If, however, this structure can be
> completely different on enqueue operations, then the argument name
> should indicate it somehow. Maybe, "override_queue_conf". Otherwise,
> it's unclear.
Sounds reasonable, will do. Although it is operation attributes, not the queue's ones.

> There are similar occurrences below.
> > + * @param[in] table
> > + *   Table to select templates from.
> 
> Perhaps "template_table"?
Noted.

> > + * @param[in] items
> > + *   List of pattern items to be used.
> > + *   The list order should match the order in the item template.
> > + *   The spec is the only relevant member of the item that is being used.
> > + * @param[in] item_template_index
> > + *   Item template index in the table.
> > + * @param[in] actions
> > + *   List of actions to be used.
> > + *   The list order should match the order in the action template.
> > + * @param[in] action_template_index
> > + *   Action template index in the table.
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *   PMDs initialize this structure in case of error only.
> > + *
> > + * @return
> > + *   Handle on success, NULL otherwise and rte_errno is set.
> > + *   The rule handle doesn't mean that the rule was offloaded.
> > + *   Only completion result indicates that the rule was offloaded.
> > + */
> > +__rte_experimental
> > +struct rte_flow *
> > +rte_flow_q_flow_create(uint16_t port_id, uint32_t queue,
> > +		       const struct rte_flow_q_ops_attr *attr,
> > +		       const struct rte_flow_table *table,
> > +			   const struct rte_flow_item items[],
> > +		       uint8_t item_template_index,
> > +			   const struct rte_flow_action actions[],
> > +		       uint8_t action_template_index,
> > +		       struct rte_flow_error *error);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue rule destruction operation.
> > + *
> > + * This function enqueues a destruction operation on the queue.
> > + * Application should assume that after calling this function
> > + * the rule handle is not valid anymore.
> > + * Completion indicates the full removal of the rule from the HW.
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param queue
> > + *   Flow queue which is used to destroy the rule.
> > + *   This must match the queue on which the rule was created.
> > + * @param[in] attr
> > + *   Rule destroy operation attributes.
> > + * @param[in] flow
> > + *   Flow handle to be destroyed.
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *   PMDs initialize this structure in case of error only.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_q_flow_destroy(uint16_t port_id, uint32_t queue,
> > +			struct rte_flow_q_ops_attr *attr,
> > +			struct rte_flow *flow,
> > +			struct rte_flow_error *error);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue indirect rule creation operation.
> > + * @see rte_flow_action_handle_create
> 
> Why "indirect rule"? This API seems to enqueue an "indirect action"...
Thanks, will change to action.

> > + *
> > + * @param[in] port_id
> > + *   Port identifier of Ethernet device.
> > + * @param[in] queue
> > + *   Flow queue which is used to create the rule.
> > + * @param[in] attr
> > + *   Queue operation attributes.
> > + * @param[in] conf
> > + *   Action configuration for the indirect action object creation.
> 
> Perhaps "indir_conf" or "indir_action_conf"?
Ok.

> > + * @param[in] action
> > + *   Specific configuration of the indirect action object.
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *   PMDs initialize this structure in case of error only.
> > + *
> > + * @return
> > + *   - (0) if success.
> > + *   - (-ENODEV) if *port_id* invalid.
> > + *   - (-ENOSYS) if underlying device does not support this functionality.
> > + *   - (-EIO) if underlying device is removed.
> > + *   - (-ENOENT) if action pointed by *action* handle was not found.
> > + *   - (-EBUSY) if action pointed by *action* handle still used by some rules
> > + *   rte_errno is also set.
> > + */
> > +__rte_experimental
> > +struct rte_flow_action_handle *
> > +rte_flow_q_action_handle_create(uint16_t port_id, uint32_t queue,
> > +				const struct rte_flow_q_ops_attr *attr,
> > +				const struct rte_flow_indir_action_conf *conf,
> > +				const struct rte_flow_action *action,
> > +				struct rte_flow_error *error);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue indirect rule destruction operation.
> 
> Please see above. Did you mean "indirect action"?
Again, you are right, action is a better word here.
  
Ajit Khaparde Oct. 13, 2021, 4:57 a.m. UTC | #3
On Tue, Oct 5, 2021 at 9:49 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
>
> A new, faster, queue-based flow rules management mechanism is needed for
> applications offloading rules inside the datapath. This asynchronous
> and lockless mechanism frees the CPU for further packet processing and
> reduces the performance impact of the flow rules creation/destruction
> on the datapath. Note that queues are not thread-safe and queue-based
> operations can be safely invoked without any locks from a single thread.
>
> The rte_flow_q_flow_create() function enqueues a flow creation to the
> requested queue. It benefits from already configured resources and sets
> unique values on top of item and action templates. A flow rule is enqueued
> on the specified flow queue and offloaded asynchronously to the hardware.
> The function returns immediately to spare CPU for further packet
> processing. The application must invoke the rte_flow_q_dequeue() function
> to complete the flow rule operation offloading, to clear the queue, and to
> receive the operation status. The rte_flow_q_flow_destroy() function
> enqueues a flow destruction to the requested queue.
>
> Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> Suggested-by: Ori Kam <orika@nvidia.com>
> ---
>  lib/ethdev/rte_flow.h | 288 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 288 insertions(+)
>
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index ba3204b17e..8cdffd8d2e 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -4298,6 +4298,13 @@ struct rte_flow_port_attr {
>          * Version of the struct layout, should be 0.
>          */
>         uint32_t version;
> +       /**
> +        * Number of flow queues to be configured.
> +        * Flow queues are used for asyncronous flow rule creation/destruction.
> +        * The order of operations is not guaranteed inside a queue.
> +        * Flow queues are not thread-safe.
> +        */
> +       uint16_t nb_queues;
Will it matter if PMD can create a smaller set of queues? Or may be just one?
Should the application set this based on get_infos_get() or some other
mechanism?

::::
  
Ori Kam Oct. 13, 2021, 1:17 p.m. UTC | #4
Hi Ajit,

> -----Original Message-----
> From: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Sent: Wednesday, October 13, 2021 7:58 AM
> Subject: Re: [PATCH 3/3] ethdev: add async queue-based flow rules operations
> 
> On Tue, Oct 5, 2021 at 9:49 PM Alexander Kozyrev <akozyrev@nvidia.com> wrote:
> >
> > A new, faster, queue-based flow rules management mechanism is needed
> > for applications offloading rules inside the datapath. This
> > asynchronous and lockless mechanism frees the CPU for further packet
> > processing and reduces the performance impact of the flow rules
> > creation/destruction on the datapath. Note that queues are not
> > thread-safe and queue-based operations can be safely invoked without any locks from a single
> thread.
> >
> > The rte_flow_q_flow_create() function enqueues a flow creation to the
> > requested queue. It benefits from already configured resources and
> > sets unique values on top of item and action templates. A flow rule is
> > enqueued on the specified flow queue and offloaded asynchronously to the hardware.
> > The function returns immediately to spare CPU for further packet
> > processing. The application must invoke the rte_flow_q_dequeue()
> > function to complete the flow rule operation offloading, to clear the
> > queue, and to receive the operation status. The
> > rte_flow_q_flow_destroy() function enqueues a flow destruction to the requested queue.
> >
> > Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
> > Suggested-by: Ori Kam <orika@nvidia.com>
> > ---
> >  lib/ethdev/rte_flow.h | 288
> > ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 288 insertions(+)
> >
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> > ba3204b17e..8cdffd8d2e 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -4298,6 +4298,13 @@ struct rte_flow_port_attr {
> >          * Version of the struct layout, should be 0.
> >          */
> >         uint32_t version;
> > +       /**
> > +        * Number of flow queues to be configured.
> > +        * Flow queues are used for asyncronous flow rule creation/destruction.
> > +        * The order of operations is not guaranteed inside a queue.
> > +        * Flow queues are not thread-safe.
> > +        */
> > +       uint16_t nb_queues;
> Will it matter if PMD can create a smaller set of queues? Or may be just one?
> Should the application set this based on get_infos_get() or some other mechanism?
> 
This is the number of queues from application point of view.
PMD can implement just one queue using locks.

Best,
Ori
> ::::
  

Patch

diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index ba3204b17e..8cdffd8d2e 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -4298,6 +4298,13 @@  struct rte_flow_port_attr {
 	 * Version of the struct layout, should be 0.
 	 */
 	uint32_t version;
+	/**
+	 * Number of flow queues to be configured.
+	 * Flow queues are used for asyncronous flow rule creation/destruction.
+	 * The order of operations is not guaranteed inside a queue.
+	 * Flow queues are not thread-safe.
+	 */
+	uint16_t nb_queues;
 	/**
 	 * Memory size allocated for the flow rules management.
 	 * If set to 0, memory is allocated dynamically.
@@ -4330,6 +4337,21 @@  struct rte_flow_port_attr {
 	uint32_t fixed_resource_size:1;
 };
 
+/**
+ * Flow engine queue configuration.
+ */
+__extension__
+struct rte_flow_queue_attr {
+	/**
+	 * Version of the struct layout, should be 0.
+	 */
+	uint32_t version;
+	/**
+	 * Number of flow rule operations a queue can hold.
+	 */
+	uint32_t size;
+};
+
 /**
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice.
@@ -4346,6 +4368,8 @@  struct rte_flow_port_attr {
  *   Port identifier of Ethernet device.
  * @param[in] port_attr
  *   Port configuration attributes.
+ * @param[in] queue_attr
+ *   Array that holds attributes for each queue.
  * @param[out] error
  *   Perform verbose error reporting if not NULL.
  *   PMDs initialize this structure in case of error only.
@@ -4357,6 +4381,7 @@  __rte_experimental
 int
 rte_flow_configure(uint16_t port_id,
 		   const struct rte_flow_port_attr *port_attr,
+		   const struct rte_flow_queue_attr *queue_attr[],
 		   struct rte_flow_error *error);
 
 /**
@@ -4626,6 +4651,269 @@  __rte_experimental
 int
 rte_flow_table_destroy(uint16_t port_id, struct rte_flow_table *table,
 		       struct rte_flow_error *error);
+
+/**
+ * Queue operation attributes
+ */
+__extension__
+struct rte_flow_q_ops_attr {
+	/**
+	 * Version of the struct layout, should be 0.
+	 */
+	uint32_t version;
+	/**
+	 * The user data that will be returned on the completion.
+	 */
+	void *user_data;
+	/**
+	 * When set, the requested action must be sent to the HW without
+	 * any delay. Any prior requests must be also sent to the HW.
+	 * If this bit is cleared, the application must call the
+	 * rte_flow_queue_flush API to actually send the request to the HW.
+	 */
+	uint32_t flush:1;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue rule creation operation.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param queue
+ *   Flow queue used to insert the rule.
+ * @param[in] attr
+ *   Rule creation operation attributes.
+ * @param[in] table
+ *   Table to select templates from.
+ * @param[in] items
+ *   List of pattern items to be used.
+ *   The list order should match the order in the item template.
+ *   The spec is the only relevant member of the item that is being used.
+ * @param[in] item_template_index
+ *   Item template index in the table.
+ * @param[in] actions
+ *   List of actions to be used.
+ *   The list order should match the order in the action template.
+ * @param[in] action_template_index
+ *   Action template index in the table.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *   PMDs initialize this structure in case of error only.
+ *
+ * @return
+ *   Handle on success, NULL otherwise and rte_errno is set.
+ *   The rule handle doesn't mean that the rule was offloaded.
+ *   Only completion result indicates that the rule was offloaded.
+ */
+__rte_experimental
+struct rte_flow *
+rte_flow_q_flow_create(uint16_t port_id, uint32_t queue,
+		       const struct rte_flow_q_ops_attr *attr,
+		       const struct rte_flow_table *table,
+			   const struct rte_flow_item items[],
+		       uint8_t item_template_index,
+			   const struct rte_flow_action actions[],
+		       uint8_t action_template_index,
+		       struct rte_flow_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue rule destruction operation.
+ *
+ * This function enqueues a destruction operation on the queue.
+ * Application should assume that after calling this function
+ * the rule handle is not valid anymore.
+ * Completion indicates the full removal of the rule from the HW.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param queue
+ *   Flow queue which is used to destroy the rule.
+ *   This must match the queue on which the rule was created.
+ * @param[in] attr
+ *   Rule destroy operation attributes.
+ * @param[in] flow
+ *   Flow handle to be destroyed.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *   PMDs initialize this structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_q_flow_destroy(uint16_t port_id, uint32_t queue,
+			struct rte_flow_q_ops_attr *attr,
+			struct rte_flow *flow,
+			struct rte_flow_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue indirect rule creation operation.
+ * @see rte_flow_action_handle_create
+ *
+ * @param[in] port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] queue
+ *   Flow queue which is used to create the rule.
+ * @param[in] attr
+ *   Queue operation attributes.
+ * @param[in] conf
+ *   Action configuration for the indirect action object creation.
+ * @param[in] action
+ *   Specific configuration of the indirect action object.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *   PMDs initialize this structure in case of error only.
+ *
+ * @return
+ *   - (0) if success.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-ENOSYS) if underlying device does not support this functionality.
+ *   - (-EIO) if underlying device is removed.
+ *   - (-ENOENT) if action pointed by *action* handle was not found.
+ *   - (-EBUSY) if action pointed by *action* handle still used by some rules
+ *   rte_errno is also set.
+ */
+__rte_experimental
+struct rte_flow_action_handle *
+rte_flow_q_action_handle_create(uint16_t port_id, uint32_t queue,
+				const struct rte_flow_q_ops_attr *attr,
+				const struct rte_flow_indir_action_conf *conf,
+				const struct rte_flow_action *action,
+				struct rte_flow_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue indirect rule destruction operation.
+ * The destroy queue must be the same
+ * as the queue on which the action was created.
+ *
+ * @param[in] port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] queue
+ *   Flow queue which is used to destroy the rule.
+ * @param[in] attr
+ *   Queue operation attributes.
+ * @param[in] handle
+ *   Handle for the indirect action object to be destroyed.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *   PMDs initialize this structure in case of error only.
+ *
+ * @return
+ *   - (0) if success.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-ENOSYS) if underlying device does not support this functionality.
+ *   - (-EIO) if underlying device is removed.
+ *   - (-ENOENT) if action pointed by *action* handle was not found.
+ *   - (-EBUSY) if action pointed by *action* handle still used by some rules
+ *   rte_errno is also set.
+ */
+__rte_experimental
+int
+rte_flow_q_action_handle_destroy(uint16_t port_id, uint32_t queue,
+				struct rte_flow_q_ops_attr *attr,
+				struct rte_flow_action_handle *handle,
+				struct rte_flow_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Flush all internally stored rules to the HW.
+ * Non-flushed rules are rules that were inserted without the flush flag set.
+ * Can be used to notify the HW about batch of rules prepared by the SW to
+ * reduce the number of communications between the HW and SW.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param queue
+ *   Flow queue to be flushed.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *   PMDs initialize this structure in case of error only.
+ *
+ * @return
+ *    0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_q_flush(uint16_t port_id, uint32_t queue,
+		 struct rte_flow_error *error);
+
+/**
+ * Dequeue operation status.
+ */
+enum rte_flow_q_op_status {
+	/**
+	 * The operation was completed successfully.
+	 */
+	RTE_FLOW_Q_OP_SUCCESS,
+	/**
+	 * The operation was not completed successfully.
+	 */
+	RTE_FLOW_Q_OP_ERROR,
+};
+
+/**
+ * Dequeue operation result.
+ */
+struct rte_flow_q_op_res {
+	/**
+	 * Version of the struct layout, should be 0.
+	 */
+	uint32_t version;
+	/**
+	 * Returns the status of the operation that this completion signals.
+	 */
+	enum rte_flow_q_op_status status;
+	/**
+	 * User data that was supplied during operation submission.
+	 */
+	void *user_data;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dequeue a rte flow operation.
+ * The application must invoke this function in order to complete
+ * the flow rule offloading and to receive the flow rule operation status.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param queue
+ *   Flow queue which is used to dequeue the operation.
+ * @param[out] res
+ *   Array of results that will be set.
+ * @param[in] n_res
+ *   Maximum number of results that can be returned.
+ *   This value is equal to the size of the res array.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *   PMDs initialize this structure in case of error only.
+ *
+ * @return
+ *   Number of results that were dequeued,
+ *   a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_q_dequeue(uint16_t port_id, uint32_t queue,
+		   struct rte_flow_q_op_res res[], uint16_t n_res,
+		   struct rte_flow_error *error);
 #ifdef __cplusplus
 }
 #endif