[v1] ethdev: support Tx queue used count

Message ID 20240111151745.3800170-1-jerinj@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series [v1] ethdev: support Tx queue used count |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/iol-testing warning apply patch failure
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/github-robot: build fail github build: failed

Commit Message

Jerin Jacob Kollanukkaran Jan. 11, 2024, 3:17 p.m. UTC
  From: Jerin Jacob <jerinj@marvell.com>

Introduce a new API to retrieve the number of used descriptors
in a Tx queue. Applications can leverage this API in the fast path to
inspect the Tx queue occupancy and take appropriate actions based on the
available free descriptors.

A notable use case could be implementing Random Early Discard (RED)
in software based on Tx queue occupancy.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 doc/guides/nics/features.rst         | 10 ++++
 doc/guides/nics/features/default.ini |  1 +
 lib/ethdev/ethdev_driver.h           |  2 +
 lib/ethdev/ethdev_private.c          |  1 +
 lib/ethdev/ethdev_trace_points.c     |  3 ++
 lib/ethdev/rte_ethdev.h              | 74 ++++++++++++++++++++++++++++
 lib/ethdev/rte_ethdev_core.h         |  7 ++-
 lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
 lib/ethdev/version.map               |  3 ++
 9 files changed, 108 insertions(+), 1 deletion(-)

rfc..v1:
- Updated API similar to rte_eth_rx_queue_count() where it returns
"used" count instead of "free" count
  

Comments

Andrew Rybchenko Jan. 11, 2024, 4:17 p.m. UTC | #1
On 1/11/24 18:17, jerinj@marvell.com wrote:
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Introduce a new API to retrieve the number of used descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>

with few nits below
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

[snip]

> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index f7d9980849..0d5a8733fc 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for more details).
>   
>   * **[implements] eth_dev_ops**: ``get_monitor_addr``
>   
> +.. _nic_features_tx_queue_used_count:

I'd stick to shorter version _nic_features_tx_queue_count to match API
naming and feature title below.

> +
> +Tx queue count
> +--------------
> +
> +Supports to get the number of used descriptors of a Tx queue.
> +
> +* **[implements] eth_dev_ops**: ``tx_queue_count``.
> +* **[related] API**: ``rte_eth_tx_queue_count()``.
> +
>   .. _nic_features_other:
>   
>   Other dev ops not represented by a Feature

[snip]

> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903..af59da9652 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
>   __rte_experimental
>   int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
>   
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Get the number of used descriptors of a Tx queue
> + *
> + * This function retrieves the number of used descriptors of a transmit queue.
> + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> + * appropriate actions based on the available free descriptors.
> + * An example action could be implementing the Random Early Discard (RED).
> + *
> + * Since it's a fast-path function, no check is performed on port_id and
> + * tx_queue_id. The caller must therefore ensure that the port is enabled
> + * and the queue is configured and running.
> + *
> + * @param port_id
> + *   The port identifier of the device.
> + * @param tx_queue_id
> + *   The index of the transmit queue.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @return
> + *  The number of used descriptors in the specific queue, or:
> + *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> + *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> + *   - (-ENOTSUP) if the device does not support this function.
> + *
> + * @note This function is designed for fast-path use.
> + */
> +__rte_experimental
> +static inline int
> +rte_eth_tx_queue_count(uint16_t port_id, uint16_t tx_queue_id)
> +{
> +	struct rte_eth_fp_ops *fops;
> +	void *qd;
> +	int rc;
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
> +		rc = -ENODEV;
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +
> +	rc = -EINVAL;

Since it is a debug, IMHO it is better to keep the code simple and init
rc in two below if bodies separately rather than relying on shared init
here.

> +	if (tx_queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> +				    tx_queue_id, port_id);
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +#endif
> +
> +	/* Fetch pointer to Tx queue data */
> +	fops = &rte_eth_fp_ops[port_id];
> +	qd = fops->txq.data[tx_queue_id];
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	if (qd == NULL) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> +				    tx_queue_id, port_id);
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +#endif
> +	if (fops->tx_queue_count == NULL)

Don't we need trace here? (no strong opinion, just trying to understand
why it is missing here)

> +		return -ENOTSUP;
> +
> +	rc = fops->tx_queue_count(qd);
> +	rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +
> +	return rc;
> +}
>   #ifdef __cplusplus
>   }
>   #endif

[snip]
  
Morten Brørup Jan. 11, 2024, 4:20 p.m. UTC | #2
> From: jerinj@marvell.com [mailto:jerinj@marvell.com]
> Sent: Thursday, 11 January 2024 16.18
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Introduce a new API to retrieve the number of used descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on
> the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> ---

Generally looks good.

Only some minor suggestions/comments regarding rte_eth_tx_queue_count():

uint16_t tx_queue_id -> uint16_t queue_id
Everywhere, also in rte_ethdev_trace_fp.h.

Similarly in the log messages:
"Invalid Tx queue_id" -> "Invalid queue_id"

I don't like that rc = -EINVAL is reused instead of set explicitly for each error case.

Also, the return value -ENOTSUP is not traced.

Consider using an "out" label for a common return path to include trace, e.g.:

/**
 * @warning
 * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
 *
 * Get the number of used descriptors of a Tx queue
 *
 * This function retrieves the number of used descriptors of a transmit queue.
 * Applications can use this API in the fast path to inspect Tx queue occupancy and take
 * appropriate actions based on the available free descriptors.
 * An example action could be implementing the Random Early Discard (RED).
 *
 * Since it's a fast-path function, no check is performed on port_id and
 * queue_id. The caller must therefore ensure that the port is enabled
 * and the queue is configured and running.
 *
 * @param port_id
 *   The port identifier of the device.
 * @param queue_id
 *   The index of the transmit queue.
 *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
 *   to rte_eth_dev_configure().
 * @return
 *  The number of used descriptors in the specific queue, or:
 *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
 *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
 *   - (-ENOTSUP) if the device does not support this function.
 *
 * @note This function is designed for fast-path use.
 */
__rte_experimental
static inline int
rte_eth_tx_queue_count(uint16_t port_id, uint16_t queue_id)
{
	struct rte_eth_fp_ops *fops;
	void *qd;
	int rc;

#ifdef RTE_ETHDEV_DEBUG_TX
	if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
		RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
		rc = -ENODEV;
		goto out;
	}

	if (queue_id >= RTE_MAX_QUEUES_PER_PORT) {
		RTE_ETHDEV_LOG_LINE(ERR, "Invalid queue_id=%u for port_id=%u",
				    queue_id, port_id);
		rc = -EINVAL;
		goto out;
	}
#endif

	/* Fetch pointer to Tx queue data */
	fops = &rte_eth_fp_ops[port_id];
	qd = fops->txq.data[queue_id];

#ifdef RTE_ETHDEV_DEBUG_TX
	if (qd == NULL) {
		RTE_ETHDEV_LOG_LINE(ERR, "Invalid queue_id=%u for port_id=%u",
				    queue_id, port_id);
		rc = -EINVAL;
		goto out;
	}
#endif
	if (fops->tx_queue_count == NULL) {
		rc = -ENOTSUP;
		goto out;
	}

	rc = fops->tx_queue_count(qd);

out:
	rte_eth_trace_tx_queue_count(port_id, queue_id, rc);

	return rc;
}


With or without suggested changes:

Acked-by: Morten Brørup <mb@smartsharesystems.com>
  
Stephen Hemminger Jan. 11, 2024, 5 p.m. UTC | #3
On Thu, 11 Jan 2024 20:47:44 +0530
<jerinj@marvell.com> wrote:

> From: Jerin Jacob <jerinj@marvell.com>
> 
> Introduce a new API to retrieve the number of used descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>

Has anyone investigated implementing dynamic tx queue limits like
Linux BQL?
  
Jerin Jacob Jan. 12, 2024, 6:56 a.m. UTC | #4
On Thu, Jan 11, 2024 at 9:47 PM Andrew Rybchenko
<andrew.rybchenko@oktetlabs.ru> wrote:
>
> On 1/11/24 18:17, jerinj@marvell.com wrote:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of used descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>
> with few nits below
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>
> [snip]
>
> > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > index f7d9980849..0d5a8733fc 100644
> > --- a/doc/guides/nics/features.rst
> > +++ b/doc/guides/nics/features.rst
> > @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for more details).
> >
> >   * **[implements] eth_dev_ops**: ``get_monitor_addr``
> >
> > +.. _nic_features_tx_queue_used_count:
>
> I'd stick to shorter version _nic_features_tx_queue_count to match API
> naming and feature title below.

Ack and change in v2.

> > +
> > +Tx queue count
> > +--------------
> > +
> > +Supports to get the number of used descriptors of a Tx queue.
> > +
> > +* **[implements] eth_dev_ops**: ``tx_queue_count``.
> > +* **[related] API**: ``rte_eth_tx_queue_count()``.
> > +
> >   .. _nic_features_other:
> >
> >   Other dev ops not represented by a Feature
>
> [snip]
>
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index 21e3a21903..af59da9652 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
> >   __rte_experimental
> >   int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Get the number of used descriptors of a Tx queue
> > + *
> > + * This function retrieves the number of used descriptors of a transmit queue.
> > + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> > + * appropriate actions based on the available free descriptors.
> > + * An example action could be implementing the Random Early Discard (RED).
> > + *
> > + * Since it's a fast-path function, no check is performed on port_id and
> > + * tx_queue_id. The caller must therefore ensure that the port is enabled
> > + * and the queue is configured and running.
> > + *
> > + * @param port_id
> > + *   The port identifier of the device.
> > + * @param tx_queue_id
> > + *   The index of the transmit queue.
> > + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > + *   to rte_eth_dev_configure().
> > + * @return
> > + *  The number of used descriptors in the specific queue, or:
> > + *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> > + *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> > + *   - (-ENOTSUP) if the device does not support this function.
> > + *
> > + * @note This function is designed for fast-path use.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_eth_tx_queue_count(uint16_t port_id, uint16_t tx_queue_id)
> > +{
> > +     struct rte_eth_fp_ops *fops;
> > +     void *qd;
> > +     int rc;
> > +
> > +#ifdef RTE_ETHDEV_DEBUG_TX
> > +     if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
> > +             RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
> > +             rc = -ENODEV;
> > +             rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +             return rc;
> > +     }
> > +
> > +     rc = -EINVAL;
>
> Since it is a debug, IMHO it is better to keep the code simple and init
> rc in two below if bodies separately rather than relying on shared init
> here.

Ack and change in v2.


>
> > +     if (tx_queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> > +             RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> > +                                 tx_queue_id, port_id);
> > +             rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +             return rc;
> > +     }
> > +#endif
> > +
> > +     /* Fetch pointer to Tx queue data */
> > +     fops = &rte_eth_fp_ops[port_id];
> > +     qd = fops->txq.data[tx_queue_id];
> > +
> > +#ifdef RTE_ETHDEV_DEBUG_TX
> > +     if (qd == NULL) {
> > +             RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> > +                                 tx_queue_id, port_id);
> > +             rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +             return rc;
> > +     }
> > +#endif
> > +     if (fops->tx_queue_count == NULL)
>
> Don't we need trace here? (no strong opinion, just trying to understand
> why it is missing here)

It is missed by mistake. will fix in v2.

>
> > +             return -ENOTSUP;
> > +
> > +     rc = fops->tx_queue_count(qd);
> > +     rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +
> > +     return rc;
> > +}
> >   #ifdef __cplusplus
> >   }
> >   #endif
>
> [snip]
>
  
Jerin Jacob Jan. 12, 2024, 6:59 a.m. UTC | #5
On Thu, Jan 11, 2024 at 9:50 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: jerinj@marvell.com [mailto:jerinj@marvell.com]
> > Sent: Thursday, 11 January 2024 16.18
> >
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of used descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on
> > the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > ---
>
> Generally looks good.
>
> Only some minor suggestions/comments regarding rte_eth_tx_queue_count():
>
> uint16_t tx_queue_id -> uint16_t queue_id
> Everywhere, also in rte_ethdev_trace_fp.h.
>
> Similarly in the log messages:
> "Invalid Tx queue_id" -> "Invalid queue_id"
>
> I don't like that rc = -EINVAL is reused instead of set explicitly for each error case.
>
> Also, the return value -ENOTSUP is not traced.
>
> Consider using an "out" label for a common return path to include trace, e.g.:
I actually like "out" label scheme. General not seen ethdev
implementation functions.

I will include above suggestion in v2.

>
> /**
>  * @warning
>  * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
>  *
>  * Get the number of used descriptors of a Tx queue
>  *
>  * This function retrieves the number of used descriptors of a transmit queue.
>  * Applications can use this API in the fast path to inspect Tx queue occupancy and take
>  * appropriate actions based on the available free descriptors.
>  * An example action could be implementing the Random Early Discard (RED).
>  *
>  * Since it's a fast-path function, no check is performed on port_id and
>  * queue_id. The caller must therefore ensure that the port is enabled
>  * and the queue is configured and running.
>  *
>  * @param port_id
>  *   The port identifier of the device.
>  * @param queue_id
>  *   The index of the transmit queue.
>  *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
>  *   to rte_eth_dev_configure().
>  * @return
>  *  The number of used descriptors in the specific queue, or:
>  *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
>  *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
>  *   - (-ENOTSUP) if the device does not support this function.
>  *
>  * @note This function is designed for fast-path use.
>  */
> __rte_experimental
> static inline int
> rte_eth_tx_queue_count(uint16_t port_id, uint16_t queue_id)
> {
>         struct rte_eth_fp_ops *fops;
>         void *qd;
>         int rc;
>
> #ifdef RTE_ETHDEV_DEBUG_TX
>         if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
>                 RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
>                 rc = -ENODEV;
>                 goto out;
>         }
>
>         if (queue_id >= RTE_MAX_QUEUES_PER_PORT) {
>                 RTE_ETHDEV_LOG_LINE(ERR, "Invalid queue_id=%u for port_id=%u",
>                                     queue_id, port_id);
>                 rc = -EINVAL;
>                 goto out;
>         }
> #endif
>
>         /* Fetch pointer to Tx queue data */
>         fops = &rte_eth_fp_ops[port_id];
>         qd = fops->txq.data[queue_id];
>
> #ifdef RTE_ETHDEV_DEBUG_TX
>         if (qd == NULL) {
>                 RTE_ETHDEV_LOG_LINE(ERR, "Invalid queue_id=%u for port_id=%u",
>                                     queue_id, port_id);
>                 rc = -EINVAL;
>                 goto out;
>         }
> #endif
>         if (fops->tx_queue_count == NULL) {
>                 rc = -ENOTSUP;
>                 goto out;
>         }
>
>         rc = fops->tx_queue_count(qd);
>
> out:
>         rte_eth_trace_tx_queue_count(port_id, queue_id, rc);
>
>         return rc;
> }
>
>
> With or without suggested changes:
>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>
  
Jerin Jacob Jan. 12, 2024, 7:01 a.m. UTC | #6
On Thu, Jan 11, 2024 at 10:30 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu, 11 Jan 2024 20:47:44 +0530
> <jerinj@marvell.com> wrote:
>
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of used descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>
> Has anyone investigated implementing dynamic tx queue limits like
> Linux BQL?

In DPDK APIs, it can expressed through creating correct TM
topology(shaping or rate limiting) via rte_tm API
  
David Marchand Jan. 12, 2024, 8:02 a.m. UTC | #7
Hi Jerin,

On Thu, Jan 11, 2024 at 4:32 PM <jerinj@marvell.com> wrote:
>
> From: Jerin Jacob <jerinj@marvell.com>
>
> Introduce a new API to retrieve the number of used descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
>
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
>
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> ---
>  doc/guides/nics/features.rst         | 10 ++++
>  doc/guides/nics/features/default.ini |  1 +
>  lib/ethdev/ethdev_driver.h           |  2 +
>  lib/ethdev/ethdev_private.c          |  1 +
>  lib/ethdev/ethdev_trace_points.c     |  3 ++
>  lib/ethdev/rte_ethdev.h              | 74 ++++++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev_core.h         |  7 ++-
>  lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
>  lib/ethdev/version.map               |  3 ++
>  9 files changed, 108 insertions(+), 1 deletion(-)

We need some libabigail suppression rule for the reserved2 field update.
Looking at some previous rules, something like below can do the trick.

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..ba240f74d5 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -33,3 +33,6 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+        name = rte_eth_fp_ops
+        has_data_member_inserted_between = {offset_of(reserved2), end}
  
Jerin Jacob Jan. 12, 2024, 9:29 a.m. UTC | #8
On Fri, Jan 12, 2024 at 1:33 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> Hi Jerin,
>
> On Thu, Jan 11, 2024 at 4:32 PM <jerinj@marvell.com> wrote:
> >
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of used descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > ---
> >  doc/guides/nics/features.rst         | 10 ++++
> >  doc/guides/nics/features/default.ini |  1 +
> >  lib/ethdev/ethdev_driver.h           |  2 +
> >  lib/ethdev/ethdev_private.c          |  1 +
> >  lib/ethdev/ethdev_trace_points.c     |  3 ++
> >  lib/ethdev/rte_ethdev.h              | 74 ++++++++++++++++++++++++++++
> >  lib/ethdev/rte_ethdev_core.h         |  7 ++-
> >  lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
> >  lib/ethdev/version.map               |  3 ++
> >  9 files changed, 108 insertions(+), 1 deletion(-)
>
> We need some libabigail suppression rule for the reserved2 field update.
> Looking at some previous rules, something like below can do the trick.
>
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 21b8cd6113..ba240f74d5 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -33,3 +33,6 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ; Temporary exceptions till next major ABI version ;
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +[suppress_type]
> +        name = rte_eth_fp_ops
> +        has_data_member_inserted_between = {offset_of(reserved2), end}

Thanks David. will fix in v2.


>
> --
> David Marchand
>
  
Ferruh Yigit Jan. 12, 2024, 11:34 a.m. UTC | #9
On 1/11/2024 3:17 PM, jerinj@marvell.com wrote:
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Introduce a new API to retrieve the number of used descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> ---
>  doc/guides/nics/features.rst         | 10 ++++
>  doc/guides/nics/features/default.ini |  1 +
>  lib/ethdev/ethdev_driver.h           |  2 +
>  lib/ethdev/ethdev_private.c          |  1 +
>  lib/ethdev/ethdev_trace_points.c     |  3 ++
>  lib/ethdev/rte_ethdev.h              | 74 ++++++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev_core.h         |  7 ++-
>  lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
>  lib/ethdev/version.map               |  3 ++
>  9 files changed, 108 insertions(+), 1 deletion(-)
> 

As we are adding a new API and dev_ops, is a driver implementation and
testpmd/example implementation planned for this release?


> rfc..v1:
> - Updated API similar to rte_eth_rx_queue_count() where it returns
> "used" count instead of "free" count
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index f7d9980849..0d5a8733fc 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for more details).
>  
>  * **[implements] eth_dev_ops**: ``get_monitor_addr``
>  
> +.. _nic_features_tx_queue_used_count:
> +
> +Tx queue count
> +--------------
> +
> +Supports to get the number of used descriptors of a Tx queue.
> +
> +* **[implements] eth_dev_ops**: ``tx_queue_count``.
> +* **[related] API**: ``rte_eth_tx_queue_count()``.
> +
>

Can you please keep the order same with 'default.ini' file,
I recognized there is already some mismatch in order but we can fix them
later.

>  .. _nic_features_other:
>  
>  Other dev ops not represented by a Feature
> diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> index 806cb033ff..3ef6d45c0e 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -59,6 +59,7 @@ Packet type parsing  =
>  Timesync             =
>  Rx descriptor status =
>  Tx descriptor status =
> +Tx queue count       =
>

Existing Rx queue count is not documented, if we are documenting this,
can you please add "Rx queue count" in a separate patch?

>  Basic stats          =
>  Extended stats       =
>  Stats per queue      =
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index b482cd12bb..f05f68a67c 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -58,6 +58,8 @@ struct rte_eth_dev {
>  	eth_rx_queue_count_t rx_queue_count;
>  	/** Check the status of a Rx descriptor */
>  	eth_rx_descriptor_status_t rx_descriptor_status;
> +	/** Get the number of used Tx descriptors */
> +	eth_tx_queue_count_t tx_queue_count;
>  	/** Check the status of a Tx descriptor */
>  	eth_tx_descriptor_status_t tx_descriptor_status;
>  	/** Pointer to PMD transmit mbufs reuse function */
> diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> index a656df293c..626524558a 100644
> --- a/lib/ethdev/ethdev_private.c
> +++ b/lib/ethdev/ethdev_private.c
> @@ -273,6 +273,7 @@ eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
>  	fpo->tx_pkt_prepare = dev->tx_pkt_prepare;
>  	fpo->rx_queue_count = dev->rx_queue_count;
>  	fpo->rx_descriptor_status = dev->rx_descriptor_status;
> +	fpo->tx_queue_count = dev->tx_queue_count;
>  	fpo->tx_descriptor_status = dev->tx_descriptor_status;
>  	fpo->recycle_tx_mbufs_reuse = dev->recycle_tx_mbufs_reuse;
>  	fpo->recycle_rx_descriptors_refill = dev->recycle_rx_descriptors_refill;
> diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
> index 91f71d868b..e618414392 100644
> --- a/lib/ethdev/ethdev_trace_points.c
> +++ b/lib/ethdev/ethdev_trace_points.c
> @@ -481,6 +481,9 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
>  RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
>  	lib.ethdev.map_aggr_tx_affinity)
>  
> +RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_count,
> +	lib.ethdev.tx_queue_count)
> +

Can you please group this with 'tx_burst' & 'call_tx_callbacks' above?

>  RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
>  	lib.ethdev.flow.copy)
>  
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903..af59da9652 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
>  __rte_experimental
>  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Get the number of used descriptors of a Tx queue
> + *
> + * This function retrieves the number of used descriptors of a transmit queue.
> + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> + * appropriate actions based on the available free descriptors.
> + * An example action could be implementing the Random Early Discard (RED).
>

Above is good, but do you think does it help to mention that intended
usacase is QoS, to address the risk Bruce mentioned?


> + *
> + * Since it's a fast-path function, no check is performed on port_id and
> + * tx_queue_id. The caller must therefore ensure that the port is enabled
> + * and the queue is configured and running.
> + *
> + * @param port_id
> + *   The port identifier of the device.
> + * @param tx_queue_id
> + *   The index of the transmit queue.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @return
> + *  The number of used descriptors in the specific queue, or:
> + *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> + *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> + *   - (-ENOTSUP) if the device does not support this function.
> + *
> + * @note This function is designed for fast-path use.
> + */
> +__rte_experimental
> +static inline int
> +rte_eth_tx_queue_count(uint16_t port_id, uint16_t tx_queue_id)
> +{
> +	struct rte_eth_fp_ops *fops;
> +	void *qd;
> +	int rc;
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
> +		rc = -ENODEV;
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +
> +	rc = -EINVAL;
> +	if (tx_queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> +				    tx_queue_id, port_id);
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +#endif
> +
> +	/* Fetch pointer to Tx queue data */
> +	fops = &rte_eth_fp_ops[port_id];
> +	qd = fops->txq.data[tx_queue_id];
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	if (qd == NULL) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> +				    tx_queue_id, port_id);
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +#endif
> +	if (fops->tx_queue_count == NULL)
> +		return -ENOTSUP;
> +
> +	rc = fops->tx_queue_count(qd);
>

Can you please put an empty line here to have a separate block for trace
code?

> +	rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +
> +	return rc;
> +}
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
> index 4bfaf79c6c..d3f09f390d 100644
> --- a/lib/ethdev/rte_ethdev_core.h
> +++ b/lib/ethdev/rte_ethdev_core.h
> @@ -60,6 +60,9 @@ typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void *txq,
>  /** @internal Refill Rx descriptors with the recycling mbufs */
>  typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
>  
> +/** @internal Get number of used descriptors on a transmit queue. */
> +typedef int (*eth_tx_queue_count_t)(void *txq);
> +

Can you please move it above 'tx_descriptor_status', to keep same order
kept in many other locations:
rx_queue_count
rx_descriptor_status
tx_queue_count
tx_descriptor_status


>  /**
>   * @internal
>   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
> @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
>  	eth_tx_descriptor_status_t tx_descriptor_status;
>  	/** Copy used mbufs from Tx mbuf ring into Rx. */
>  	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> -	uintptr_t reserved2[2];
> +	/** Get the number of used Tx descriptors. */
> +	eth_tx_queue_count_t tx_queue_count;
>

Similarly, can you please move it above 'tx_descriptor_status'?


> +	uintptr_t reserved2[1];
>  	/**@}*/
>  
>  } __rte_cache_aligned;
> diff --git a/lib/ethdev/rte_ethdev_trace_fp.h b/lib/ethdev/rte_ethdev_trace_fp.h
> index 186271c9ff..c98c488433 100644
> --- a/lib/ethdev/rte_ethdev_trace_fp.h
> +++ b/lib/ethdev/rte_ethdev_trace_fp.h
> @@ -73,6 +73,14 @@ RTE_TRACE_POINT_FP(
>  	rte_trace_point_emit_u64(count);
>  )
>  
> +RTE_TRACE_POINT_FP(
> +	rte_eth_trace_tx_queue_count,
> +	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id, int rc),
> +	rte_trace_point_emit_u16(port_id);
> +	rte_trace_point_emit_u16(tx_queue_id);
> +	rte_trace_point_emit_int(rc);
> +)
> +

Existing 'rx_queue_count' is missing tracepoints, for consistency, can
you please add that too in a separate patch?


>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 5c4917c020..e03830902a 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -316,6 +316,9 @@ EXPERIMENTAL {
>  	rte_eth_recycle_rx_queue_info_get;
>  	rte_flow_group_set_miss_actions;
>  	rte_flow_calc_table_hash;
> +
> +	# added in 24.03
> +	rte_eth_tx_queue_count;
>  };
>  
>  INTERNAL {
  
David Marchand Jan. 12, 2024, 12:11 p.m. UTC | #10
On Fri, Jan 12, 2024 at 12:34 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
> > diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
> > index 4bfaf79c6c..d3f09f390d 100644
> > --- a/lib/ethdev/rte_ethdev_core.h
> > +++ b/lib/ethdev/rte_ethdev_core.h
> > @@ -60,6 +60,9 @@ typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void *txq,
> >  /** @internal Refill Rx descriptors with the recycling mbufs */
> >  typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
> >
> > +/** @internal Get number of used descriptors on a transmit queue. */
> > +typedef int (*eth_tx_queue_count_t)(void *txq);
> > +
>
> Can you please move it above 'tx_descriptor_status', to keep same order
> kept in many other locations:
> rx_queue_count
> rx_descriptor_status
> tx_queue_count
> tx_descriptor_status
>
>
> >  /**
> >   * @internal
> >   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
> > @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
> >       eth_tx_descriptor_status_t tx_descriptor_status;
> >       /** Copy used mbufs from Tx mbuf ring into Rx. */
> >       eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> > -     uintptr_t reserved2[2];
> > +     /** Get the number of used Tx descriptors. */
> > +     eth_tx_queue_count_t tx_queue_count;
> >
>
> Similarly, can you please move it above 'tx_descriptor_status'?

Moving fields in rte_eth_fp_ops (where fast-path API ops are) breaks
the applications ABI.
  
Morten Brørup Jan. 12, 2024, 12:29 p.m. UTC | #11
> From: Ferruh Yigit [mailto:ferruh.yigit@amd.com]
> Sent: Friday, 12 January 2024 12.34
> 
> On 1/11/2024 3:17 PM, jerinj@marvell.com wrote:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of used descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on
> the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.

[...]

> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without
> prior notice
> > + *
> > + * Get the number of used descriptors of a Tx queue
> > + *
> > + * This function retrieves the number of used descriptors of a
> transmit queue.
> > + * Applications can use this API in the fast path to inspect Tx
> queue occupancy and take
> > + * appropriate actions based on the available free descriptors.
> > + * An example action could be implementing the Random Early Discard
> (RED).
> >
> 
> Above is good, but do you think does it help to mention that intended
> usacase is QoS, to address the risk Bruce mentioned?

A better way to address the risk is to mention that:
1. there is no need to call this function before rte_eth_tx_burst(), and
2. tx_descriptor_status() has better performance for checking a threshold than this function.

> >  /**
> >   * @internal
> >   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
> > @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
> >  	eth_tx_descriptor_status_t tx_descriptor_status;
> >  	/** Copy used mbufs from Tx mbuf ring into Rx. */
> >  	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> > -	uintptr_t reserved2[2];
> > +	/** Get the number of used Tx descriptors. */
> > +	eth_tx_queue_count_t tx_queue_count;
> >
> 
> Similarly, can you please move it above 'tx_descriptor_status'?

No. I think struct rte_eth_fp_ops is part of the public API, so moving down tx_descriptor_status and recycle_tx_mbufs_reuse would break the API.

> 
> 
> > +	uintptr_t reserved2[1];
> >  	/**@}*/
> >
> >  } __rte_cache_aligned;
  
Konstantin Ananyev Jan. 12, 2024, 12:33 p.m. UTC | #12
Hi Jerin,

> Introduce a new API to retrieve the number of used descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> ---
>  doc/guides/nics/features.rst         | 10 ++++
>  doc/guides/nics/features/default.ini |  1 +
>  lib/ethdev/ethdev_driver.h           |  2 +
>  lib/ethdev/ethdev_private.c          |  1 +
>  lib/ethdev/ethdev_trace_points.c     |  3 ++
>  lib/ethdev/rte_ethdev.h              | 74 ++++++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev_core.h         |  7 ++-
>  lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
>  lib/ethdev/version.map               |  3 ++
>  9 files changed, 108 insertions(+), 1 deletion(-)
> 
> rfc..v1:
> - Updated API similar to rte_eth_rx_queue_count() where it returns
> "used" count instead of "free" count
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index f7d9980849..0d5a8733fc 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for more details).
> 
>  * **[implements] eth_dev_ops**: ``get_monitor_addr``
> 
> +.. _nic_features_tx_queue_used_count:
> +
> +Tx queue count
> +--------------
> +
> +Supports to get the number of used descriptors of a Tx queue.
> +
> +* **[implements] eth_dev_ops**: ``tx_queue_count``.
> +* **[related] API**: ``rte_eth_tx_queue_count()``.
> +
>  .. _nic_features_other:
> 
>  Other dev ops not represented by a Feature
> diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> index 806cb033ff..3ef6d45c0e 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -59,6 +59,7 @@ Packet type parsing  =
>  Timesync             =
>  Rx descriptor status =
>  Tx descriptor status =
> +Tx queue count       =
>  Basic stats          =
>  Extended stats       =
>  Stats per queue      =
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index b482cd12bb..f05f68a67c 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -58,6 +58,8 @@ struct rte_eth_dev {
>  	eth_rx_queue_count_t rx_queue_count;
>  	/** Check the status of a Rx descriptor */
>  	eth_rx_descriptor_status_t rx_descriptor_status;
> +	/** Get the number of used Tx descriptors */
> +	eth_tx_queue_count_t tx_queue_count;
>  	/** Check the status of a Tx descriptor */
>  	eth_tx_descriptor_status_t tx_descriptor_status;
>  	/** Pointer to PMD transmit mbufs reuse function */
> diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> index a656df293c..626524558a 100644
> --- a/lib/ethdev/ethdev_private.c
> +++ b/lib/ethdev/ethdev_private.c
> @@ -273,6 +273,7 @@ eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
>  	fpo->tx_pkt_prepare = dev->tx_pkt_prepare;
>  	fpo->rx_queue_count = dev->rx_queue_count;
>  	fpo->rx_descriptor_status = dev->rx_descriptor_status;
> +	fpo->tx_queue_count = dev->tx_queue_count;
>  	fpo->tx_descriptor_status = dev->tx_descriptor_status;
>  	fpo->recycle_tx_mbufs_reuse = dev->recycle_tx_mbufs_reuse;
>  	fpo->recycle_rx_descriptors_refill = dev->recycle_rx_descriptors_refill;
> diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
> index 91f71d868b..e618414392 100644
> --- a/lib/ethdev/ethdev_trace_points.c
> +++ b/lib/ethdev/ethdev_trace_points.c
> @@ -481,6 +481,9 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
>  RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
>  	lib.ethdev.map_aggr_tx_affinity)
> 
> +RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_count,
> +	lib.ethdev.tx_queue_count)
> +
>  RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
>  	lib.ethdev.flow.copy)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903..af59da9652 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
>  __rte_experimental
>  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Get the number of used descriptors of a Tx queue
> + *
> + * This function retrieves the number of used descriptors of a transmit queue.
> + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> + * appropriate actions based on the available free descriptors.
> + * An example action could be implementing the Random Early Discard (RED).
 
Sorry, I probably misunderstood your previous mails, but wouldn't it be more convenient
for user to have rte_eth_tx_queue_free_count(...) as fast-op, and
have rte_eth_tx_queue_count(...) {  queue_txd_num - rte_eth_tx_queue_free_count(...);}
as a slow-path function in rte_ethdev.c?
Konstantin 

> + *
> + * Since it's a fast-path function, no check is performed on port_id and
> + * tx_queue_id. The caller must therefore ensure that the port is enabled
> + * and the queue is configured and running.
> + *
> + * @param port_id
> + *   The port identifier of the device.
> + * @param tx_queue_id
> + *   The index of the transmit queue.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @return
> + *  The number of used descriptors in the specific queue, or:
> + *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> + *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> + *   - (-ENOTSUP) if the device does not support this function.
> + *
> + * @note This function is designed for fast-path use.
> + */
> +__rte_experimental
> +static inline int
> +rte_eth_tx_queue_count(uint16_t port_id, uint16_t tx_queue_id)
> +{
> +	struct rte_eth_fp_ops *fops;
> +	void *qd;
> +	int rc;
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
> +		rc = -ENODEV;
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +
> +	rc = -EINVAL;
> +	if (tx_queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> +				    tx_queue_id, port_id);
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +#endif
> +
> +	/* Fetch pointer to Tx queue data */
> +	fops = &rte_eth_fp_ops[port_id];
> +	qd = fops->txq.data[tx_queue_id];
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	if (qd == NULL) {
> +		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> +				    tx_queue_id, port_id);
> +		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +		return rc;
> +	}
> +#endif
> +	if (fops->tx_queue_count == NULL)
> +		return -ENOTSUP;
> +
> +	rc = fops->tx_queue_count(qd);
> +	rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> +
> +	return rc;
> +}
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
> index 4bfaf79c6c..d3f09f390d 100644
> --- a/lib/ethdev/rte_ethdev_core.h
> +++ b/lib/ethdev/rte_ethdev_core.h
> @@ -60,6 +60,9 @@ typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void *txq,
>  /** @internal Refill Rx descriptors with the recycling mbufs */
>  typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
> 
> +/** @internal Get number of used descriptors on a transmit queue. */
> +typedef int (*eth_tx_queue_count_t)(void *txq);
> +
>  /**
>   * @internal
>   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
> @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
>  	eth_tx_descriptor_status_t tx_descriptor_status;
>  	/** Copy used mbufs from Tx mbuf ring into Rx. */
>  	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> -	uintptr_t reserved2[2];
> +	/** Get the number of used Tx descriptors. */
> +	eth_tx_queue_count_t tx_queue_count;
> +	uintptr_t reserved2[1];
>  	/**@}*/
> 
>  } __rte_cache_aligned;
> diff --git a/lib/ethdev/rte_ethdev_trace_fp.h b/lib/ethdev/rte_ethdev_trace_fp.h
> index 186271c9ff..c98c488433 100644
> --- a/lib/ethdev/rte_ethdev_trace_fp.h
> +++ b/lib/ethdev/rte_ethdev_trace_fp.h
> @@ -73,6 +73,14 @@ RTE_TRACE_POINT_FP(
>  	rte_trace_point_emit_u64(count);
>  )
> 
> +RTE_TRACE_POINT_FP(
> +	rte_eth_trace_tx_queue_count,
> +	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id, int rc),
> +	rte_trace_point_emit_u16(port_id);
> +	rte_trace_point_emit_u16(tx_queue_id);
> +	rte_trace_point_emit_int(rc);
> +)
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 5c4917c020..e03830902a 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -316,6 +316,9 @@ EXPERIMENTAL {
>  	rte_eth_recycle_rx_queue_info_get;
>  	rte_flow_group_set_miss_actions;
>  	rte_flow_calc_table_hash;
> +
> +	# added in 24.03
> +	rte_eth_tx_queue_count;
>  };
> 
>  INTERNAL {
> --
> 2.43.0
  
Ferruh Yigit Jan. 12, 2024, 2:25 p.m. UTC | #13
On 1/12/2024 12:11 PM, David Marchand wrote:
> CAUTION: This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
> 
> 
> On Fri, Jan 12, 2024 at 12:34 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
>>> diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
>>> index 4bfaf79c6c..d3f09f390d 100644
>>> --- a/lib/ethdev/rte_ethdev_core.h
>>> +++ b/lib/ethdev/rte_ethdev_core.h
>>> @@ -60,6 +60,9 @@ typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void *txq,
>>>  /** @internal Refill Rx descriptors with the recycling mbufs */
>>>  typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
>>>
>>> +/** @internal Get number of used descriptors on a transmit queue. */
>>> +typedef int (*eth_tx_queue_count_t)(void *txq);
>>> +
>>
>> Can you please move it above 'tx_descriptor_status', to keep same order
>> kept in many other locations:
>> rx_queue_count
>> rx_descriptor_status
>> tx_queue_count
>> tx_descriptor_status
>>
>>
>>>  /**
>>>   * @internal
>>>   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
>>> @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
>>>       eth_tx_descriptor_status_t tx_descriptor_status;
>>>       /** Copy used mbufs from Tx mbuf ring into Rx. */
>>>       eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
>>> -     uintptr_t reserved2[2];
>>> +     /** Get the number of used Tx descriptors. */
>>> +     eth_tx_queue_count_t tx_queue_count;
>>>
>>
>> Similarly, can you please move it above 'tx_descriptor_status'?
> 
> Moving fields in rte_eth_fp_ops (where fast-path API ops are) breaks
> the applications ABI.
> 

You are right, what about put a TODO note or deprecation notice to move
it in next ABI break release?
  
Ferruh Yigit Jan. 12, 2024, 2:29 p.m. UTC | #14
On 1/12/2024 12:29 PM, Morten Brørup wrote:
>> From: Ferruh Yigit [mailto:ferruh.yigit@amd.com]
>> Sent: Friday, 12 January 2024 12.34
>>
>> On 1/11/2024 3:17 PM, jerinj@marvell.com wrote:
>>> From: Jerin Jacob <jerinj@marvell.com>
>>>
>>> Introduce a new API to retrieve the number of used descriptors
>>> in a Tx queue. Applications can leverage this API in the fast path to
>>> inspect the Tx queue occupancy and take appropriate actions based on
>> the
>>> available free descriptors.
>>>
>>> A notable use case could be implementing Random Early Discard (RED)
>>> in software based on Tx queue occupancy.

<...>

>>>  /**
>>>   * @internal
>>>   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
>>> @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
>>>  	eth_tx_descriptor_status_t tx_descriptor_status;
>>>  	/** Copy used mbufs from Tx mbuf ring into Rx. */
>>>  	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
>>> -	uintptr_t reserved2[2];
>>> +	/** Get the number of used Tx descriptors. */
>>> +	eth_tx_queue_count_t tx_queue_count;
>>>
>>
>> Similarly, can you please move it above 'tx_descriptor_status'?
> 
> No. I think struct rte_eth_fp_ops is part of the public API, so moving down tx_descriptor_status and recycle_tx_mbufs_reuse would break the API.
> 

ack (Dave highlighted the same)
  
Stephen Hemminger Jan. 12, 2024, 4:30 p.m. UTC | #15
On Fri, 12 Jan 2024 12:31:27 +0530
Jerin Jacob <jerinjacobk@gmail.com> wrote:

> >
> > Has anyone investigated implementing dynamic tx queue limits like
> > Linux BQL?  
> 
> In DPDK APIs, it can expressed through creating correct TM
> topology(shaping or rate limiting) via rte_tm API

That won't work for BQL.
Byte Queue Limits measures the latency for transmit queue.
A counter is started when packets are queued to hardware and
decremented when they are done transmitting. This is then fed
back into the transmit logic; the concept is too eliminate bufferbloat
in the transmit side by providing early feedback up the stack.

Doing BQL needs a mechanism to know when packets are sent,
either in ethdev or in every device driver.

Right now it is common for applications to have very large
worse case transmit queue length (descriptors) and that introduces
latency.
  
Stephen Hemminger Jan. 12, 2024, 4:52 p.m. UTC | #16
On Thu, 11 Jan 2024 20:47:44 +0530
<jerinj@marvell.com> wrote:

> @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
>  	eth_tx_descriptor_status_t tx_descriptor_status;
>  	/** Copy used mbufs from Tx mbuf ring into Rx. */
>  	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> -	uintptr_t reserved2[2];
> +	/** Get the number of used Tx descriptors. */
> +	eth_tx_queue_count_t tx_queue_count;
> +	uintptr_t reserved2[1];
>  	/**@}*/

This does introduce the question of were the reserved fields checked
in earlier versions. (ie. must be NULL).  But since the DPDK only expects rte_eth_fp_ops
to come from driver, and we don't guarantee ABI for driver API's, it is not a problem.
  
Jerin Jacob Jan. 16, 2024, 6:37 a.m. UTC | #17
On Fri, Jan 12, 2024 at 6:03 PM Konstantin Ananyev
<konstantin.ananyev@huawei.com> wrote:
>
> Hi Jerin,

Hi Konstantin,


>
> > Introduce a new API to retrieve the number of used descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>

> > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
> >  __rte_experimental
> >  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Get the number of used descriptors of a Tx queue
> > + *
> > + * This function retrieves the number of used descriptors of a transmit queue.
> > + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> > + * appropriate actions based on the available free descriptors.
> > + * An example action could be implementing the Random Early Discard (RED).
>
> Sorry, I probably misunderstood your previous mails, but wouldn't it be more convenient
> for user to have rte_eth_tx_queue_free_count(...) as fast-op, and
> have rte_eth_tx_queue_count(...) {  queue_txd_num - rte_eth_tx_queue_free_count(...);}
> as a slow-path function in rte_ethdev.c?

The general feedback is to align with the Rx queue API, specifically
rte_eth_rx_queue_count,
and it's noted that there is no equivalent rte_eth_rx_queue_free_count.

Given that the free count can be obtained by subtracting the used
count from queue_txd_num,
it is considered that either approach is acceptable.

The application configures queue_txd_num with tx_queue_setup(), and
the application can store that value in its structure.
This would enable fast-path usage for both base cases (whether the
application needs information about free or used descriptors)
with just one API(rte_eth_tx_queue_count())


> Konstantin
  
Jerin Jacob Jan. 18, 2024, 9:06 a.m. UTC | #18
On Fri, Jan 12, 2024 at 5:04 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
>
> On 1/11/2024 3:17 PM, jerinj@marvell.com wrote:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of used descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > ---

> >
>
> As we are adding a new API and dev_ops, is a driver implementation and
> testpmd/example implementation planned for this release?

Yes.

>
>
> > rfc..v1:
> > - Updated API similar to rte_eth_rx_queue_count() where it returns
> > "used" count instead of "free" count
> >
> > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > index f7d9980849..0d5a8733fc 100644
> > --- a/doc/guides/nics/features.rst
> > +++ b/doc/guides/nics/features.rst
> > @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for more details).
> >
> >  * **[implements] eth_dev_ops**: ``get_monitor_addr``
> >
> > +.. _nic_features_tx_queue_used_count:
> > +
> > +Tx queue count
> > +--------------
> > +
> > +Supports to get the number of used descriptors of a Tx queue.
> > +
> > +* **[implements] eth_dev_ops**: ``tx_queue_count``.
> > +* **[related] API**: ``rte_eth_tx_queue_count()``.
> > +
> >
>
> Can you please keep the order same with 'default.ini' file,
> I recognized there is already some mismatch in order but we can fix them
> later.

Ack

>
> >  .. _nic_features_other:
> >
> >  Other dev ops not represented by a Feature
> > diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> > index 806cb033ff..3ef6d45c0e 100644
> > --- a/doc/guides/nics/features/default.ini
> > +++ b/doc/guides/nics/features/default.ini
> > @@ -59,6 +59,7 @@ Packet type parsing  =
> >  Timesync             =
> >  Rx descriptor status =
> >  Tx descriptor status =
> > +Tx queue count       =
> >
>
> Existing Rx queue count is not documented, if we are documenting this,
> can you please add "Rx queue count" in a separate patch?

I will do.


>
> >  Basic stats          =
> >  Extended stats       =
> >  Stats per queue      =
> > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > index b482cd12bb..f05f68a67c 100644
> > --- a/lib/ethdev/ethdev_driver.h
> > +++ b/lib/ethdev/ethdev_driver.h
> > @@ -58,6 +58,8 @@ struct rte_eth_dev {
> >       eth_rx_queue_count_t rx_queue_count;
> >       /** Check the status of a Rx descriptor */
> >       eth_rx_descriptor_status_t rx_descriptor_status;
> > +     /** Get the number of used Tx descriptors */
> > +     eth_tx_queue_count_t tx_queue_count;
> >       /** Check the status of a Tx descriptor */
> >       eth_tx_descriptor_status_t tx_descriptor_status;
> >       /** Pointer to PMD transmit mbufs reuse function */
> > diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> > index a656df293c..626524558a 100644
> > --- a/lib/ethdev/ethdev_private.c
> > +++ b/lib/ethdev/ethdev_private.c
> > @@ -273,6 +273,7 @@ eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
> >       fpo->tx_pkt_prepare = dev->tx_pkt_prepare;
> >       fpo->rx_queue_count = dev->rx_queue_count;
> >       fpo->rx_descriptor_status = dev->rx_descriptor_status;
> > +     fpo->tx_queue_count = dev->tx_queue_count;
> >       fpo->tx_descriptor_status = dev->tx_descriptor_status;
> >       fpo->recycle_tx_mbufs_reuse = dev->recycle_tx_mbufs_reuse;
> >       fpo->recycle_rx_descriptors_refill = dev->recycle_rx_descriptors_refill;
> > diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
> > index 91f71d868b..e618414392 100644
> > --- a/lib/ethdev/ethdev_trace_points.c
> > +++ b/lib/ethdev/ethdev_trace_points.c
> > @@ -481,6 +481,9 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
> >  RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
> >       lib.ethdev.map_aggr_tx_affinity)
> >
> > +RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_count,
> > +     lib.ethdev.tx_queue_count)
> > +
>
> Can you please group this with 'tx_burst' & 'call_tx_callbacks' above?

Ack


>
> >  RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
> >       lib.ethdev.flow.copy)
> >
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index 21e3a21903..af59da9652 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
> >  __rte_experimental
> >  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Get the number of used descriptors of a Tx queue
> > + *
> > + * This function retrieves the number of used descriptors of a transmit queue.
> > + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> > + * appropriate actions based on the available free descriptors.
> > + * An example action could be implementing the Random Early Discard (RED).
> >
>
> Above is good, but do you think does it help to mention that intended
> usacase is QoS, to address the risk Bruce mentioned?

I will add following notes in next version.

 * @note There is no requirement to call this function before
rte_eth_tx_burst() invocation.
 * @note Utilize this function exclusively when the caller needs to
determine the used queue count
 * across all descriptors of a Tx queue. If the use case only involves
checking the status of a
 * specific descriptor slot, opt for rte_eth_tx_descriptor_status() instead.



>
>
> > + *
> > + * Since it's a fast-path function, no check is performed on port_id and
> > + * tx_queue_id. The caller must therefore ensure that the port is enabled
> > + * and the queue is configured and running.
> > + *
> > + * @param port_id
> > + *   The port identifier of the device.
> > + * @param tx_queue_id
> > + *   The index of the transmit queue.
> > + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > + *   to rte_eth_dev_configure().
> > + * @return
> > + *  The number of used descriptors in the specific queue, or:
> > + *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> > + *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
> > + *   - (-ENOTSUP) if the device does not support this function.
> > + *
> > + * @note This function is designed for fast-path use.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_eth_tx_queue_count(uint16_t port_id, uint16_t tx_queue_id)
> > +{
> > +     struct rte_eth_fp_ops *fops;
> > +     void *qd;
> > +     int rc;
> > +
> > +#ifdef RTE_ETHDEV_DEBUG_TX
> > +     if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
> > +             RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
> > +             rc = -ENODEV;
> > +             rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +             return rc;
> > +     }
> > +
> > +     rc = -EINVAL;
> > +     if (tx_queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> > +             RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> > +                                 tx_queue_id, port_id);
> > +             rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +             return rc;
> > +     }
> > +#endif
> > +
> > +     /* Fetch pointer to Tx queue data */
> > +     fops = &rte_eth_fp_ops[port_id];
> > +     qd = fops->txq.data[tx_queue_id];
> > +
> > +#ifdef RTE_ETHDEV_DEBUG_TX
> > +     if (qd == NULL) {
> > +             RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
> > +                                 tx_queue_id, port_id);
> > +             rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +             return rc;
> > +     }
> > +#endif
> > +     if (fops->tx_queue_count == NULL)
> > +             return -ENOTSUP;
> > +
> > +     rc = fops->tx_queue_count(qd);
> >
>
> Can you please put an empty line here to have a separate block for trace
> code?

Ack


>
> > +     rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
> > +
> > +     return rc;
> > +}
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
> > index 4bfaf79c6c..d3f09f390d 100644
> > --- a/lib/ethdev/rte_ethdev_core.h
> > +++ b/lib/ethdev/rte_ethdev_core.h
> > @@ -60,6 +60,9 @@ typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void *txq,
> >  /** @internal Refill Rx descriptors with the recycling mbufs */
> >  typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
> >
> > +/** @internal Get number of used descriptors on a transmit queue. */
> > +typedef int (*eth_tx_queue_count_t)(void *txq);
> > +
>
> Can you please move it above 'tx_descriptor_status', to keep same order
> kept in many other locations:
> rx_queue_count
> rx_descriptor_status
> tx_queue_count
> tx_descriptor_status

Ack


>
>
> >  /**
> >   * @internal
> >   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
> > @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
> >       eth_tx_descriptor_status_t tx_descriptor_status;
> >       /** Copy used mbufs from Tx mbuf ring into Rx. */
> >       eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> > -     uintptr_t reserved2[2];
> > +     /** Get the number of used Tx descriptors. */
> > +     eth_tx_queue_count_t tx_queue_count;
> >
>
> Similarly, can you please move it above 'tx_descriptor_status'?

Ack


>
>
> > +     uintptr_t reserved2[1];
> >       /**@}*/
> >
> >  } __rte_cache_aligned;
> > diff --git a/lib/ethdev/rte_ethdev_trace_fp.h b/lib/ethdev/rte_ethdev_trace_fp.h
> > index 186271c9ff..c98c488433 100644
> > --- a/lib/ethdev/rte_ethdev_trace_fp.h
> > +++ b/lib/ethdev/rte_ethdev_trace_fp.h
> > @@ -73,6 +73,14 @@ RTE_TRACE_POINT_FP(
> >       rte_trace_point_emit_u64(count);
> >  )
> >
> > +RTE_TRACE_POINT_FP(
> > +     rte_eth_trace_tx_queue_count,
> > +     RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id, int rc),
> > +     rte_trace_point_emit_u16(port_id);
> > +     rte_trace_point_emit_u16(tx_queue_id);
> > +     rte_trace_point_emit_int(rc);
> > +)
> > +
>
> Existing 'rx_queue_count' is missing tracepoints, for consistency, can
> you please add that too in a separate patch?

I will do.

>
>
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> > index 5c4917c020..e03830902a 100644
> > --- a/lib/ethdev/version.map
> > +++ b/lib/ethdev/version.map
> > @@ -316,6 +316,9 @@ EXPERIMENTAL {
> >       rte_eth_recycle_rx_queue_info_get;
> >       rte_flow_group_set_miss_actions;
> >       rte_flow_calc_table_hash;
> > +
> > +     # added in 24.03
> > +     rte_eth_tx_queue_count;
> >  };
> >
> >  INTERNAL {
>
  
Konstantin Ananyev Jan. 18, 2024, 10:17 a.m. UTC | #19
Hi Jerin,

> > > Introduce a new API to retrieve the number of used descriptors
> > > in a Tx queue. Applications can leverage this API in the fast path to
> > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > available free descriptors.
> > >
> > > A notable use case could be implementing Random Early Discard (RED)
> > > in software based on Tx queue occupancy.
> > >
> > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> 
> > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
> > >  __rte_experimental
> > >  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > > + *
> > > + * Get the number of used descriptors of a Tx queue
> > > + *
> > > + * This function retrieves the number of used descriptors of a transmit queue.
> > > + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> > > + * appropriate actions based on the available free descriptors.
> > > + * An example action could be implementing the Random Early Discard (RED).
> >
> > Sorry, I probably misunderstood your previous mails, but wouldn't it be more convenient
> > for user to have rte_eth_tx_queue_free_count(...) as fast-op, and
> > have rte_eth_tx_queue_count(...) {  queue_txd_num - rte_eth_tx_queue_free_count(...);}
> > as a slow-path function in rte_ethdev.c?
> 
> The general feedback is to align with the Rx queue API, specifically
> rte_eth_rx_queue_count,
> and it's noted that there is no equivalent rte_eth_rx_queue_free_count.
> 
> Given that the free count can be obtained by subtracting the used
> count from queue_txd_num,
> it is considered that either approach is acceptable.
> 
> The application configures queue_txd_num with tx_queue_setup(), and
> the application can store that value in its structure.
> This would enable fast-path usage for both base cases (whether the
> application needs information about free or used descriptors)
> with just one API(rte_eth_tx_queue_count())

Right now I don't use these functions, but if I think what most people are interested in:
- how many packets you can receive immediately (rx_queue_count)
- how many packets you can transmit immediately (tx_queue_free_count)
Sure, I understand that user can store txd_num  somewhere and then do subtraction himself. 
Though it means more effort for the user, and the only reason for that, as I can see,
is to have RX and TX function naming symmetric.
Which seems much less improtant to me comparing to user convenience.
Anyway, as I stated above, I don't use these functions right now,
so if the majority of users are happy with current approach, I would not insist :)  
Konstantin
  
Jerin Jacob Jan. 18, 2024, 11:21 a.m. UTC | #20
On Thu, Jan 18, 2024 at 3:47 PM Konstantin Ananyev
<konstantin.ananyev@huawei.com> wrote:
>
>
> Hi Jerin,

Hi Konstantin,


>
> > > > Introduce a new API to retrieve the number of used descriptors
> > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > > available free descriptors.
> > > >
> > > > A notable use case could be implementing Random Early Discard (RED)
> > > > in software based on Tx queue occupancy.
> > > >
> > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> >
> > > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
> > > >  __rte_experimental
> > > >  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
> > > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > > > + *
> > > > + * Get the number of used descriptors of a Tx queue
> > > > + *
> > > > + * This function retrieves the number of used descriptors of a transmit queue.
> > > > + * Applications can use this API in the fast path to inspect Tx queue occupancy and take
> > > > + * appropriate actions based on the available free descriptors.
> > > > + * An example action could be implementing the Random Early Discard (RED).
> > >
> > > Sorry, I probably misunderstood your previous mails, but wouldn't it be more convenient
> > > for user to have rte_eth_tx_queue_free_count(...) as fast-op, and
> > > have rte_eth_tx_queue_count(...) {  queue_txd_num - rte_eth_tx_queue_free_count(...);}
> > > as a slow-path function in rte_ethdev.c?
> >
> > The general feedback is to align with the Rx queue API, specifically
> > rte_eth_rx_queue_count,
> > and it's noted that there is no equivalent rte_eth_rx_queue_free_count.
> >
> > Given that the free count can be obtained by subtracting the used
> > count from queue_txd_num,
> > it is considered that either approach is acceptable.
> >
> > The application configures queue_txd_num with tx_queue_setup(), and
> > the application can store that value in its structure.
> > This would enable fast-path usage for both base cases (whether the
> > application needs information about free or used descriptors)
> > with just one API(rte_eth_tx_queue_count())
>
> Right now I don't use these functions, but if I think what most people are interested in:
> - how many packets you can receive immediately (rx_queue_count)
> - how many packets you can transmit immediately (tx_queue_free_count)

Yes. That's why initially I kept the free version.

It seems like other prominent use cases for _used_ version is for QoS
to enable something  like
in positive logic,

if < 98% % used then action Tail drop
else if  < 60% used then action RED

> Sure, I understand that user can store txd_num  somewhere and then do subtraction himself.
> Though it means more effort for the user, and the only reason for that, as I can see,
> is to have RX and TX function naming symmetric.
> Which seems much less improtant to me comparing to user convenience.
> Anyway, as I stated above, I don't use these functions right now,
> so if the majority of users are happy with current approach, I would not insist :)

No strong opinion for me either. Looks like majority users like used version.
Let's go with. I am open for changing to free version, if the majority
of users like that.

> Konstantin
>
  
Morten Brørup Jan. 18, 2024, 1:36 p.m. UTC | #21
> From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> Sent: Thursday, 18 January 2024 11.17
> 
> Hi Jerin,
> 
> > > > Introduce a new API to retrieve the number of used descriptors
> > > > in a Tx queue. Applications can leverage this API in the fast
> path to
> > > > inspect the Tx queue occupancy and take appropriate actions based
> on the
> > > > available free descriptors.
> > > >
> > > > A notable use case could be implementing Random Early Discard
> (RED)
> > > > in software based on Tx queue occupancy.
> > > >
> > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> >
> > > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id,
> uint16_t rx_queue_id,
> > > >  __rte_experimental
> > > >  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t
> port_id, uint32_t *ptypes, int num);
> > > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change, or be removed, without
> prior notice
> > > > + *
> > > > + * Get the number of used descriptors of a Tx queue
> > > > + *
> > > > + * This function retrieves the number of used descriptors of a
> transmit queue.
> > > > + * Applications can use this API in the fast path to inspect Tx
> queue occupancy and take
> > > > + * appropriate actions based on the available free descriptors.
> > > > + * An example action could be implementing the Random Early
> Discard (RED).
> > >
> > > Sorry, I probably misunderstood your previous mails, but wouldn't
> it be more convenient
> > > for user to have rte_eth_tx_queue_free_count(...) as fast-op, and
> > > have rte_eth_tx_queue_count(...) {  queue_txd_num -
> rte_eth_tx_queue_free_count(...);}
> > > as a slow-path function in rte_ethdev.c?
> >
> > The general feedback is to align with the Rx queue API, specifically
> > rte_eth_rx_queue_count,
> > and it's noted that there is no equivalent
> rte_eth_rx_queue_free_count.
> >
> > Given that the free count can be obtained by subtracting the used
> > count from queue_txd_num,
> > it is considered that either approach is acceptable.
> >
> > The application configures queue_txd_num with tx_queue_setup(), and
> > the application can store that value in its structure.
> > This would enable fast-path usage for both base cases (whether the
> > application needs information about free or used descriptors)
> > with just one API(rte_eth_tx_queue_count())
> 
> Right now I don't use these functions, but if I think what most people
> are interested in:
> - how many packets you can receive immediately (rx_queue_count)

Agreed that "used" (not "free") is the preferred info for RX.

> - how many packets you can transmit immediately (tx_queue_free_count)
> Sure, I understand that user can store txd_num  somewhere and then do
> subtraction himself.
> Though it means more effort for the user, and the only reason for that,
> as I can see,
> is to have RX and TX function naming symmetric.
> Which seems much less improtant to me comparing to user convenience.

I agree 100 % with your prioritization: Usability has higher priority than symmetric naming.

So here are some example use cases supporting the TX "Used" API:
- RED (and similar queueing algorithms) need to know how many packets the queue holds (not how much room the queue has for more packets).
- Load Balancing across multiple links, in use cases where packet reordering is allowed.
- Monitoring egress queueing, especially in many-to-one-port traffic patterns, e.g. to catch micro-burst induced spikes (which may cause latency/"bufferbloat").
- The (obsolete) ifOutQLen object in the Interfaces MIB for SNMP, which I suppose was intended for monitoring egress queueing.

> Anyway, as I stated above, I don't use these functions right now,
> so if the majority of users are happy with current approach, I would
> not insist :)

I'm very happy with the current approach. :-)
  
Konstantin Ananyev Jan. 19, 2024, 9:52 a.m. UTC | #22
> > > > > Introduce a new API to retrieve the number of used descriptors
> > > > > in a Tx queue. Applications can leverage this API in the fast
> > path to
> > > > > inspect the Tx queue occupancy and take appropriate actions based
> > on the
> > > > > available free descriptors.
> > > > >
> > > > > A notable use case could be implementing Random Early Discard
> > (RED)
> > > > > in software based on Tx queue occupancy.
> > > > >
> > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > > > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id,
> > uint16_t rx_queue_id,
> > > > >  __rte_experimental
> > > > >  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t
> > port_id, uint32_t *ptypes, int num);
> > > > >
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this API may change, or be removed, without
> > prior notice
> > > > > + *
> > > > > + * Get the number of used descriptors of a Tx queue
> > > > > + *
> > > > > + * This function retrieves the number of used descriptors of a
> > transmit queue.
> > > > > + * Applications can use this API in the fast path to inspect Tx
> > queue occupancy and take
> > > > > + * appropriate actions based on the available free descriptors.
> > > > > + * An example action could be implementing the Random Early
> > Discard (RED).
> > > >
> > > > Sorry, I probably misunderstood your previous mails, but wouldn't
> > it be more convenient
> > > > for user to have rte_eth_tx_queue_free_count(...) as fast-op, and
> > > > have rte_eth_tx_queue_count(...) {  queue_txd_num -
> > rte_eth_tx_queue_free_count(...);}
> > > > as a slow-path function in rte_ethdev.c?
> > >
> > > The general feedback is to align with the Rx queue API, specifically
> > > rte_eth_rx_queue_count,
> > > and it's noted that there is no equivalent
> > rte_eth_rx_queue_free_count.
> > >
> > > Given that the free count can be obtained by subtracting the used
> > > count from queue_txd_num,
> > > it is considered that either approach is acceptable.
> > >
> > > The application configures queue_txd_num with tx_queue_setup(), and
> > > the application can store that value in its structure.
> > > This would enable fast-path usage for both base cases (whether the
> > > application needs information about free or used descriptors)
> > > with just one API(rte_eth_tx_queue_count())
> >
> > Right now I don't use these functions, but if I think what most people
> > are interested in:
> > - how many packets you can receive immediately (rx_queue_count)
> 
> Agreed that "used" (not "free") is the preferred info for RX.
> 
> > - how many packets you can transmit immediately (tx_queue_free_count)
> > Sure, I understand that user can store txd_num  somewhere and then do
> > subtraction himself.
> > Though it means more effort for the user, and the only reason for that,
> > as I can see,
> > is to have RX and TX function naming symmetric.
> > Which seems much less improtant to me comparing to user convenience.
> 
> I agree 100 % with your prioritization: Usability has higher priority than symmetric naming.
> 
> So here are some example use cases supporting the TX "Used" API:
> - RED (and similar queueing algorithms) need to know how many packets the queue holds (not how much room the queue has for
> more packets).

Ok, but to calculate percentage we do need both numbers: txd_num and txd_used_num (or txd_free_num).
So in such case user still has to store txd_num somewhere and do the math after getting txd_used_num.
So probably  no advantage between tx_queue_used_count() and tx_queue_free_count() for that case.
 
> - Load Balancing across multiple links, in use cases where packet reordering is allowed.

I suppose for that case, you also will need to calc percentage, not the raw txd_used_num, no?
 
> - Monitoring egress queueing, especially in many-to-one-port traffic patterns, e.g. to catch micro-burst induced spikes (which may
> cause latency/"bufferbloat").
> - The (obsolete) ifOutQLen object in the Interfaces MIB for SNMP, which I suppose was intended for monitoring egress queueing.
> 
> > Anyway, as I stated above, I don't use these functions right now,
> > so if the majority of users are happy with current approach, I would
> > not insist :)
> 
> I'm very happy with the current approach. :-)

As I said, if end users are happy, then I am fine too ;)
  
Morten Brørup Jan. 19, 2024, 10:32 a.m. UTC | #23
> From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> Sent: Friday, 19 January 2024 10.53
> 
> > > > > > Introduce a new API to retrieve the number of used
> descriptors
> > > > > > in a Tx queue. Applications can leverage this API in the fast
> > > path to
> > > > > > inspect the Tx queue occupancy and take appropriate actions
> based
> > > on the
> > > > > > available free descriptors.
> > > > > >
> > > > > > A notable use case could be implementing Random Early Discard
> > > (RED)
> > > > > > in software based on Tx queue occupancy.
> > > > > >
> > > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > >
> > > > The general feedback is to align with the Rx queue API,
> specifically
> > > > rte_eth_rx_queue_count,
> > > > and it's noted that there is no equivalent
> > > rte_eth_rx_queue_free_count.
> > > >
> > > > Given that the free count can be obtained by subtracting the used
> > > > count from queue_txd_num,
> > > > it is considered that either approach is acceptable.
> > > >
> > > > The application configures queue_txd_num with tx_queue_setup(),
> and
> > > > the application can store that value in its structure.
> > > > This would enable fast-path usage for both base cases (whether
> the
> > > > application needs information about free or used descriptors)
> > > > with just one API(rte_eth_tx_queue_count())
> > >
> > > Right now I don't use these functions, but if I think what most
> people
> > > are interested in:
> > > - how many packets you can receive immediately (rx_queue_count)
> >
> > Agreed that "used" (not "free") is the preferred info for RX.
> >
> > > - how many packets you can transmit immediately
> (tx_queue_free_count)
> > > Sure, I understand that user can store txd_num  somewhere and then
> do
> > > subtraction himself.
> > > Though it means more effort for the user, and the only reason for
> that,
> > > as I can see,
> > > is to have RX and TX function naming symmetric.
> > > Which seems much less improtant to me comparing to user
> convenience.
> >
> > I agree 100 % with your prioritization: Usability has higher priority
> than symmetric naming.
> >
> > So here are some example use cases supporting the TX "Used" API:
> > - RED (and similar queueing algorithms) need to know how many packets
> the queue holds (not how much room the queue has for
> > more packets).
> 
> Ok, but to calculate percentage we do need both numbers: txd_num and
> txd_used_num (or txd_free_num).
> So in such case user still has to store txd_num somewhere and do the
> math after getting txd_used_num.
> So probably  no advantage between tx_queue_used_count() and
> tx_queue_free_count() for that case.

If used for simple mitigation of tail-dropping, you are correct. Then the percentages would be appropriate.

But not if used for traffic engineering. In that case, the queueing algorithm's thresholds should be based on absolute numbers, configured by the network engineer.
As an optional application optimization, if an application has RED queueing configured for 100 % dropping at 200 packets in queue, the application can configure the NIC with only 256 descriptors instead of 512 or whatever the default is. In this way, the NIC queue size depends on the RED parameters, not vice versa.

> 
> > - Load Balancing across multiple links, in use cases where packet
> reordering is allowed.
> 
> I suppose for that case, you also will need to calc percentage, not the
> raw txd_used_num, no?

Not if optimizing for low latency. If one of the NICs supports 512 TX descriptors and another of the NICs supports 4096 TX descriptors, the application should transmit packets via the NIC with the lowest absolute number of used descriptors, assuming that this queue has the shortest queuing delay.

> 
> > - Monitoring egress queueing, especially in many-to-one-port traffic
> patterns, e.g. to catch micro-burst induced spikes (which may
> > cause latency/"bufferbloat").
> > - The (obsolete) ifOutQLen object in the Interfaces MIB for SNMP,
> which I suppose was intended for monitoring egress queueing.
> >
> > > Anyway, as I stated above, I don't use these functions right now,
> > > so if the majority of users are happy with current approach, I
> would
> > > not insist :)
> >
> > I'm very happy with the current approach. :-)
> 
> As I said, if end users are happy, then I am fine too ;)

Then everyone are happy; we can enjoy the coming weekend, and leave the remaining work on this to Jerin. :-))

Great discussion, Konstantin! It is important to remain focused on what the users need, and keep challenging if that is really what we are doing.
  

Patch

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index f7d9980849..0d5a8733fc 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -962,6 +962,16 @@  management (see :doc:`../prog_guide/power_man` for more details).
 
 * **[implements] eth_dev_ops**: ``get_monitor_addr``
 
+.. _nic_features_tx_queue_used_count:
+
+Tx queue count
+--------------
+
+Supports to get the number of used descriptors of a Tx queue.
+
+* **[implements] eth_dev_ops**: ``tx_queue_count``.
+* **[related] API**: ``rte_eth_tx_queue_count()``.
+
 .. _nic_features_other:
 
 Other dev ops not represented by a Feature
diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index 806cb033ff..3ef6d45c0e 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -59,6 +59,7 @@  Packet type parsing  =
 Timesync             =
 Rx descriptor status =
 Tx descriptor status =
+Tx queue count       =
 Basic stats          =
 Extended stats       =
 Stats per queue      =
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index b482cd12bb..f05f68a67c 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -58,6 +58,8 @@  struct rte_eth_dev {
 	eth_rx_queue_count_t rx_queue_count;
 	/** Check the status of a Rx descriptor */
 	eth_rx_descriptor_status_t rx_descriptor_status;
+	/** Get the number of used Tx descriptors */
+	eth_tx_queue_count_t tx_queue_count;
 	/** Check the status of a Tx descriptor */
 	eth_tx_descriptor_status_t tx_descriptor_status;
 	/** Pointer to PMD transmit mbufs reuse function */
diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
index a656df293c..626524558a 100644
--- a/lib/ethdev/ethdev_private.c
+++ b/lib/ethdev/ethdev_private.c
@@ -273,6 +273,7 @@  eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
 	fpo->tx_pkt_prepare = dev->tx_pkt_prepare;
 	fpo->rx_queue_count = dev->rx_queue_count;
 	fpo->rx_descriptor_status = dev->rx_descriptor_status;
+	fpo->tx_queue_count = dev->tx_queue_count;
 	fpo->tx_descriptor_status = dev->tx_descriptor_status;
 	fpo->recycle_tx_mbufs_reuse = dev->recycle_tx_mbufs_reuse;
 	fpo->recycle_rx_descriptors_refill = dev->recycle_rx_descriptors_refill;
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 91f71d868b..e618414392 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -481,6 +481,9 @@  RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
 	lib.ethdev.map_aggr_tx_affinity)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_count,
+	lib.ethdev.tx_queue_count)
+
 RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
 	lib.ethdev.flow.copy)
 
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 21e3a21903..af59da9652 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -6803,6 +6803,80 @@  rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
 __rte_experimental
 int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Get the number of used descriptors of a Tx queue
+ *
+ * This function retrieves the number of used descriptors of a transmit queue.
+ * Applications can use this API in the fast path to inspect Tx queue occupancy and take
+ * appropriate actions based on the available free descriptors.
+ * An example action could be implementing the Random Early Discard (RED).
+ *
+ * Since it's a fast-path function, no check is performed on port_id and
+ * tx_queue_id. The caller must therefore ensure that the port is enabled
+ * and the queue is configured and running.
+ *
+ * @param port_id
+ *   The port identifier of the device.
+ * @param tx_queue_id
+ *   The index of the transmit queue.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *  The number of used descriptors in the specific queue, or:
+ *   - (-ENODEV) if *port_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
+ *   - (-EINVAL) if *queue_id* is invalid. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
+ *   - (-ENOTSUP) if the device does not support this function.
+ *
+ * @note This function is designed for fast-path use.
+ */
+__rte_experimental
+static inline int
+rte_eth_tx_queue_count(uint16_t port_id, uint16_t tx_queue_id)
+{
+	struct rte_eth_fp_ops *fops;
+	void *qd;
+	int rc;
+
+#ifdef RTE_ETHDEV_DEBUG_TX
+	if (port_id >= RTE_MAX_ETHPORTS || !rte_eth_dev_is_valid_port(port_id)) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Invalid port_id=%u", port_id);
+		rc = -ENODEV;
+		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
+		return rc;
+	}
+
+	rc = -EINVAL;
+	if (tx_queue_id >= RTE_MAX_QUEUES_PER_PORT) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
+				    tx_queue_id, port_id);
+		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
+		return rc;
+	}
+#endif
+
+	/* Fetch pointer to Tx queue data */
+	fops = &rte_eth_fp_ops[port_id];
+	qd = fops->txq.data[tx_queue_id];
+
+#ifdef RTE_ETHDEV_DEBUG_TX
+	if (qd == NULL) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Invalid Tx queue_id=%u for port_id=%u",
+				    tx_queue_id, port_id);
+		rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
+		return rc;
+	}
+#endif
+	if (fops->tx_queue_count == NULL)
+		return -ENOTSUP;
+
+	rc = fops->tx_queue_count(qd);
+	rte_eth_trace_tx_queue_count(port_id, tx_queue_id, rc);
+
+	return rc;
+}
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
index 4bfaf79c6c..d3f09f390d 100644
--- a/lib/ethdev/rte_ethdev_core.h
+++ b/lib/ethdev/rte_ethdev_core.h
@@ -60,6 +60,9 @@  typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void *txq,
 /** @internal Refill Rx descriptors with the recycling mbufs */
 typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
 
+/** @internal Get number of used descriptors on a transmit queue. */
+typedef int (*eth_tx_queue_count_t)(void *txq);
+
 /**
  * @internal
  * Structure used to hold opaque pointers to internal ethdev Rx/Tx
@@ -116,7 +119,9 @@  struct rte_eth_fp_ops {
 	eth_tx_descriptor_status_t tx_descriptor_status;
 	/** Copy used mbufs from Tx mbuf ring into Rx. */
 	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
-	uintptr_t reserved2[2];
+	/** Get the number of used Tx descriptors. */
+	eth_tx_queue_count_t tx_queue_count;
+	uintptr_t reserved2[1];
 	/**@}*/
 
 } __rte_cache_aligned;
diff --git a/lib/ethdev/rte_ethdev_trace_fp.h b/lib/ethdev/rte_ethdev_trace_fp.h
index 186271c9ff..c98c488433 100644
--- a/lib/ethdev/rte_ethdev_trace_fp.h
+++ b/lib/ethdev/rte_ethdev_trace_fp.h
@@ -73,6 +73,14 @@  RTE_TRACE_POINT_FP(
 	rte_trace_point_emit_u64(count);
 )
 
+RTE_TRACE_POINT_FP(
+	rte_eth_trace_tx_queue_count,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id, int rc),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(tx_queue_id);
+	rte_trace_point_emit_int(rc);
+)
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 5c4917c020..e03830902a 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -316,6 +316,9 @@  EXPERIMENTAL {
 	rte_eth_recycle_rx_queue_info_get;
 	rte_flow_group_set_miss_actions;
 	rte_flow_calc_table_hash;
+
+	# added in 24.03
+	rte_eth_tx_queue_count;
 };
 
 INTERNAL {