[v5,1/2] ethdev: introduce the Tx map API for aggregated ports

Message ID 20230214154836.9681-2-jiaweiw@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series Added support for Tx queue mapping with an aggregated port

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-testing warning apply patch failure

Commit Message

Jiawei Wang Feb. 14, 2023, 3:48 p.m. UTC
When multiple ports are aggregated into a single DPDK port
(for example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port is used for Tx via a given queue.

This patch introduces the new ethdev API
rte_eth_dev_map_aggr_tx_affinity(), which maps a Tx queue
to an aggregated port of the DPDK port (specified with port_id).
The affinity is the number of the aggregated port.
Value 0 means no affinity, so traffic could be routed to any
aggregated port; this is the default and current behavior.

The maximum affinity value is given by rte_eth_dev_count_aggr_ports().

Add trace points for the ethdev rte_eth_dev_count_aggr_ports()
and rte_eth_dev_map_aggr_tx_affinity() functions.

Add the testpmd command line:
testpmd> port config (port_id) txq (queue_id) affinity (value)

For example, assume there are two physical ports connected to
a single DPDK port (port id 0); affinity 1 stands for
the first physical port and affinity 2 stands for the second
physical port.
Use the commands below to configure the Tx physical affinity
for each Tx queue:
        port config 0 txq 0 affinity 1
        port config 0 txq 1 affinity 1
        port config 0 txq 2 affinity 2
        port config 0 txq 3 affinity 2

These commands configure Tx queue index 0 and Tx queue index 1 with
physical affinity 1, so packets sent on Tx queue 0 or Tx queue 1
will leave through the first physical port; similarly, packets sent
on Tx queue 2 or Tx queue 3 will leave through the second physical
port.
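
As an illustration, here is a minimal application-side sketch of the
intended call sequence (a hypothetical helper; the round-robin queue
spreading policy is only an example, not part of the API):

	#include <rte_ethdev.h>

	/* Map every Tx queue of an aggregated DPDK port to one of its
	 * aggregated ports. Intended to be called after
	 * rte_eth_dev_configure() and rte_eth_tx_queue_setup(),
	 * but before rte_eth_dev_start().
	 */
	static int
	setup_tx_affinity(uint16_t port_id, uint16_t nb_txq)
	{
		int aggr_ports = rte_eth_dev_count_aggr_ports(port_id);
		uint16_t qid;

		if (aggr_ports <= 0)
			return aggr_ports; /* error, or no aggregated ports */

		for (qid = 0; qid < nb_txq; qid++) {
			/* Aggregated ports are numbered 1..aggr_ports */
			uint8_t affinity = (uint8_t)(qid % aggr_ports) + 1;
			int ret = rte_eth_dev_map_aggr_tx_affinity(port_id,
								   qid, affinity);
			if (ret != 0)
				return ret;
		}
		return 0;
	}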

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 app/test-pmd/cmdline.c                      | 96 +++++++++++++++++++++
 doc/guides/rel_notes/release_23_03.rst      |  7 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 14 +++
 lib/ethdev/ethdev_driver.h                  | 39 +++++++++
 lib/ethdev/ethdev_trace.h                   | 17 ++++
 lib/ethdev/ethdev_trace_points.c            |  6 ++
 lib/ethdev/rte_ethdev.c                     | 49 +++++++++++
 lib/ethdev/rte_ethdev.h                     | 46 ++++++++++
 lib/ethdev/version.map                      |  2 +
 9 files changed, 276 insertions(+)
  

Comments

Jiawei Wang Feb. 15, 2023, 11:41 a.m. UTC | #1
Hi Ori, Thomas and Ferruh,

Could you please help to review it?

Thanks.

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@nvidia.com>
> Sent: Tuesday, February 14, 2023 11:49 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>; Ferruh Yigit <ferruh.yigit@amd.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
> 
snip
  
Thomas Monjalon Feb. 16, 2023, 5:42 p.m. UTC | #2
For the title, I suggest
ethdev: add Tx queue mapping of aggregated ports

14/02/2023 16:48, Jiawei Wang:
> <...>
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>
  
Ferruh Yigit Feb. 16, 2023, 5:58 p.m. UTC | #3
On 2/14/2023 3:48 PM, Jiawei Wang wrote:
<...>

> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index dc0a4eb12c..1d5b3a16b2 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6915,6 +6915,55 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
>  	return j;
>  }
>  
> +int rte_eth_dev_count_aggr_ports(uint16_t port_id)
> +{
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	if (*dev->dev_ops->count_aggr_ports == NULL)
> +		return -ENOTSUP;

What do you think about returning a default value when the dev_ops is
not defined, assuming the device is not a bonded device?
Not sure which one is better for the application, returning a default
value or an error.


> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> +
> +	rte_eth_trace_count_aggr_ports(port_id, ret);
> +
> +	return ret;
> +}
> +
> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
> +				     uint8_t affinity)
> +{
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	if (tx_queue_id >= dev->data->nb_tx_queues) {
> +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
> +		return -EINVAL;
> +	}
> +

Although the documentation says this API should be called before configure,
if the user misses it I guess the above can crash. Is there a way to add a
runtime check, like checking 'dev->data->dev_configured'?


> +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> +		return -ENOTSUP;
> +
> +	if (dev->data->dev_started) {
> +		RTE_ETHDEV_LOG(ERR,
> +			"Port %u must be stopped to allow configuration\n",
> +			port_id);
> +		return -EBUSY;
> +	}
> +
> +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
> +				tx_queue_id, affinity));
> +

Should the API check if port_id is a bonding port before it continues
with the mapping?

> +	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity, ret);
> +
> +	return ret;
> +}
> +
>  RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>  
>  RTE_INIT(ethdev_init_telemetry)
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index c129ca1eaf..07b8250eb8 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -2589,6 +2589,52 @@ int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port);
>  __rte_experimental
>  int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the number of aggregated ports.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @return
> + *   - (>=0) the number of aggregated port if success.
> + *   - (-ENOTSUP) if not supported.
> + */
> +__rte_experimental
> +int rte_eth_dev_count_aggr_ports(uint16_t port_id);


Can you please give more details in the function description? In the
context of this patch it is clear, but someone seeing it for the first
time can be confused about what "aggregated ports" are.

What is the expected value for a regular physical port that doesn't have
any sub-devices, 0 or 1? Can you please document it?


> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + *  Map a Tx queue with an aggregated port of the DPDK port (specified with port_id).
> + *  When multiple ports are aggregated into a single one,
> + *  it allows to choose which port to use for Tx via a queue.
> + *
> + *  The application should use rte_eth_dev_map_aggr_tx_affinity()
> + *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
> + *  before rte_eth_dev_start().
> + *
> + * @param port_id
> + *   The identifier of the port used in rte_eth_tx_burst().
> + * @param tx_queue_id
> + *   The index of the transmit queue used in rte_eth_tx_burst().
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @param affinity
> + *   The number of the aggregated port.
> + *   Value 0 means no affinity and traffic could be routed to any aggregated port.
> + *   The first aggregated port is number 1 and so on.
> + *   The maximum number is given by rte_eth_dev_count_aggr_ports().
> + *
> + * @return
> + *   Zero if successful. Non-zero otherwise.
> + */
> +__rte_experimental
> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
> +				     uint8_t affinity);
> +
>  /**
>   * Return the NUMA socket to which an Ethernet device is connected
>   *
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index dbc2bffe64..685aa71e51 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -300,6 +300,8 @@ EXPERIMENTAL {
>  	rte_mtr_meter_profile_get;
>  
>  	# added in 23.03
> +	rte_eth_dev_count_aggr_ports;
> +	rte_eth_dev_map_aggr_tx_affinity;
>  	rte_flow_async_create_by_index;
>  };
>
  
Jiawei Wang Feb. 17, 2023, 6:44 a.m. UTC | #4
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 17, 2023 1:58 AM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> <...>
> 
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> > dc0a4eb12c..1d5b3a16b2 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -6915,6 +6915,55 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> >  	return j;
> >  }
> >
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	if (*dev->dev_ops->count_aggr_ports == NULL)
> > +		return -ENOTSUP;
> 
> What do you think about returning a default value when the dev_ops is not defined,
> assuming the device is not a bonded device?
> Not sure which one is better for the application, returning a default value or an error.
> 

For a device which isn't a bonded device, the count should be zero.
So, we can return 0 as the default value if the PMD doesn't support the op.

From the application's perspective, it only needs to check that the count > 0.

> 
> > +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> > +
> > +	rte_eth_trace_count_aggr_ports(port_id, ret);
> > +
> > +	return ret;
> > +}
> > +
> > +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> > +				     uint8_t affinity)
> > +{
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	if (tx_queue_id >= dev->data->nb_tx_queues) {
> > +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n",
> tx_queue_id);
> > +		return -EINVAL;
> > +	}
> > +
> 
> Although the documentation says this API should be called before configure, if the
> user misses it I guess the above can crash. Is there a way to add a runtime check,
> like checking 'dev->data->dev_configured'?
> 

OK, I will add the check and report an error if (dev->data->dev_configured == 0).
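
A minimal sketch of such a check (the exact log message is an
assumption, not necessarily the final v6 wording):

	if (dev->data->dev_configured == 0) {
		RTE_ETHDEV_LOG(ERR,
			"Port %u must be configured before Tx affinity mapping\n",
			port_id);
		return -EINVAL;
	}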

> 
> > +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> > +		return -ENOTSUP;
> > +
> > +	if (dev->data->dev_started) {
> > +		RTE_ETHDEV_LOG(ERR,
> > +			"Port %u must be stopped to allow configuration\n",
> > +			port_id);
> > +		return -EBUSY;
> > +	}
> > +
> > +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
> > +				tx_queue_id, affinity));
> > +
> 
> Should the API check if port_id is a bonding port before it continues with the mapping?
> 

I added this check in the app before; I will move it to the ethdev layer.

> > +	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity,
> > +ret);
> > +
> > +	return ret;
> > +}
> > +
> >  RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
> >
> >  RTE_INIT(ethdev_init_telemetry)
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > c129ca1eaf..07b8250eb8 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -2589,6 +2589,52 @@ int rte_eth_hairpin_bind(uint16_t tx_port,
> > uint16_t rx_port);  __rte_experimental  int
> > rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Get the number of aggregated ports.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @return
> > + *   - (>=0) the number of aggregated port if success.
> > + *   - (-ENOTSUP) if not supported.
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id);
> 
> 
> Can you please give more details in the function description? In the context of
> this patch it is clear, but someone seeing it for the first time can be confused
> about what "aggregated ports" are.
> 

OK, when multiple ports are aggregated into a single one, we can call these ports "aggregated ports".
Will add more description in the next patch.

> What is the expected value for a regular physical port that doesn't have any
> sub-devices, 0 or 1? Can you please document it?
> 

OK, the API will return 0 for a regular physical port (without bonding).
Will add documentation in the next patch.

> 
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + *  Map a Tx queue with an aggregated port of the DPDK port (specified with
> port_id).
> > + *  When multiple ports are aggregated into a single one,
> > + *  it allows to choose which port to use for Tx via a queue.
> > + *
> > + *  The application should use rte_eth_dev_map_aggr_tx_affinity()
> > + *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
> > + *  before rte_eth_dev_start().
> > + *
> > + * @param port_id
> > + *   The identifier of the port used in rte_eth_tx_burst().
> > + * @param tx_queue_id
> > + *   The index of the transmit queue used in rte_eth_tx_burst().
> > + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > + *   to rte_eth_dev_configure().
> > + * @param affinity
> > + *   The number of the aggregated port.
> > + *   Value 0 means no affinity and traffic could be routed to any aggregated
> port.
> > + *   The first aggregated port is number 1 and so on.
> > + *   The maximum number is given by rte_eth_dev_count_aggr_ports().
> > + *
> > + * @return
> > + *   Zero if successful. Non-zero otherwise.
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> > +				     uint8_t affinity);
> > +
> >  /**
> >   * Return the NUMA socket to which an Ethernet device is connected
> >   *
> > diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> > dbc2bffe64..685aa71e51 100644
> > --- a/lib/ethdev/version.map
> > +++ b/lib/ethdev/version.map
> > @@ -300,6 +300,8 @@ EXPERIMENTAL {
> >  	rte_mtr_meter_profile_get;
> >
> >  	# added in 23.03
> > +	rte_eth_dev_count_aggr_ports;
> > +	rte_eth_dev_map_aggr_tx_affinity;
> >  	rte_flow_async_create_by_index;
> >  };
> >

Thanks.
  
Jiawei Wang Feb. 17, 2023, 6:45 a.m. UTC | #5
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, February 17, 2023 1:42 AM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>
> Cc: Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>; Ferruh Yigit <ferruh.yigit@amd.com>;
> dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> For the title, I suggest
> ethdev: add Tx queue mapping of aggregated ports
> 
> 14/02/2023 16:48, Jiawei Wang:
> > <...>
> > Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> 
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> 

OK, I will update the title in the next patch. Thanks for the Ack.
  
Andrew Rybchenko Feb. 17, 2023, 8:24 a.m. UTC | #6
On 2/16/23 20:58, Ferruh Yigit wrote:
> On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> <...>
> 
>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>> index dc0a4eb12c..1d5b3a16b2 100644
>> --- a/lib/ethdev/rte_ethdev.c
>> +++ b/lib/ethdev/rte_ethdev.c
>> @@ -6915,6 +6915,55 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
>>   	return j;
>>   }
>>   
>> +int rte_eth_dev_count_aggr_ports(uint16_t port_id)
>> +{
>> +	struct rte_eth_dev *dev;
>> +	int ret;
>> +
>> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>> +	dev = &rte_eth_devices[port_id];
>> +
>> +	if (*dev->dev_ops->count_aggr_ports == NULL)
>> +		return -ENOTSUP;
> 
> What do you think about returning a default value when the dev_ops is not defined,
> assuming the device is not a bonded device?
> Not sure which one is better for the application, returning a default value
> or an error.

I think 0 is better here. It simply means that
rte_eth_dev_map_aggr_tx_affinity() cannot be used, and neither
can the corresponding flow API item.
This will be true even for bonding, as long as the corresponding
API is not supported.
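
As a sketch, the suggested change would amount to something like this
(assuming v6 keeps the same structure):

	if (*dev->dev_ops->count_aggr_ports == NULL)
		return 0; /* no driver op: treat as a non-aggregated device */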

>> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
>> +
>> +	rte_eth_trace_count_aggr_ports(port_id, ret);
>> +
>> +	return ret;
>> +}
>> +
>> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
>> +				     uint8_t affinity)
>> +{
>> +	struct rte_eth_dev *dev;
>> +	int ret;
>> +
>> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>> +	dev = &rte_eth_devices[port_id];
>> +
>> +	if (tx_queue_id >= dev->data->nb_tx_queues) {
>> +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
>> +		return -EINVAL;
>> +	}
>> +
> 
> Although the documentation says this API should be called before configure,

The documentation says "after". Anyway, it is better to check against
dev_configured.

> if the user misses it I guess the above can crash. Is there a way to add a runtime
> check, like checking 'dev->data->dev_configured'?
> 
> 
>> +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
>> +		return -ENOTSUP;
>> +
>> +	if (dev->data->dev_started) {
>> +		RTE_ETHDEV_LOG(ERR,
>> +			"Port %u must be stopped to allow configuration\n",
>> +			port_id);
>> +		return -EBUSY;
>> +	}
>> +
>> +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
>> +				tx_queue_id, affinity));
>> +
> 
> Should the API check if port_id is a bonding port before it continues with
> the mapping?

Since it is a control path, I think it is a good idea to
call rte_eth_dev_count_aggr_ports() and check the affinity value.
  
Jiawei Wang Feb. 17, 2023, 9:50 a.m. UTC | #7
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Friday, February 17, 2023 4:24 PM
> To: Ferruh Yigit <ferruh.yigit@amd.com>; Jiawei(Jonny) Wang
> <jiaweiw@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam
> <orika@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Aman Singh <aman.deep.singh@intel.com>; Yuying
> Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> On 2/16/23 20:58, Ferruh Yigit wrote:
> > On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> > <...>
> >
> >> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> >> dc0a4eb12c..1d5b3a16b2 100644
> >> --- a/lib/ethdev/rte_ethdev.c
> >> +++ b/lib/ethdev/rte_ethdev.c
> >> @@ -6915,6 +6915,55 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> >>   	return j;
> >>   }
> >>
> >> +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> >> +	struct rte_eth_dev *dev;
> >> +	int ret;
> >> +
> >> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >> +	dev = &rte_eth_devices[port_id];
> >> +
> >> +	if (*dev->dev_ops->count_aggr_ports == NULL)
> >> +		return -ENOTSUP;
> >
> > What do you think about returning a default value when the dev_ops is not
> > defined, assuming the device is not a bonded device?
> > Not sure which one is better for the application, returning a default value
> > or an error.
> 
> I think 0 is better here. It simply means that
> rte_eth_dev_map_aggr_tx_affinity() cannot be used, and neither can the
> corresponding flow API item.
> This will be true even for bonding, as long as the corresponding API is not supported.
> 

Will send the new patch later with this change.

> >> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> >> +
> >> +	rte_eth_trace_count_aggr_ports(port_id, ret);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> >> +				     uint8_t affinity)
> >> +{
> >> +	struct rte_eth_dev *dev;
> >> +	int ret;
> >> +
> >> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >> +	dev = &rte_eth_devices[port_id];
> >> +
> >> +	if (tx_queue_id >= dev->data->nb_tx_queues) {
> >> +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n",
> tx_queue_id);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >
> > Although the documentation says this API should be called before
> > configure,
> 
> The documentation says "after". Anyway, it is better to check against dev_configured.
> 

Yes, after device configure. I will add the check and send the new patch.

> > if the user misses it I guess the above can crash. Is there a way to add
> > a runtime check, like checking 'dev->data->dev_configured'?
> >
> >
> >> +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> >> +		return -ENOTSUP;
> >> +
> >> +	if (dev->data->dev_started) {
> >> +		RTE_ETHDEV_LOG(ERR,
> >> +			"Port %u must be stopped to allow configuration\n",
> >> +			port_id);
> >> +		return -EBUSY;
> >> +	}
> >> +
> >> +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
> >> +				tx_queue_id, affinity));
> >> +
> >
> > Should the API check if port_id is a bonding port before it continues with
> > the mapping?
> 
> Since it is a control path, I think it is a good idea to call
> rte_eth_dev_count_aggr_ports() and check the affinity value.

OK, I will add the rte_eth_dev_count_aggr_ports() call before the map and check the affinity value as well.
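
A rough sketch of the ethdev-layer validation being agreed on here
(variable names and messages are illustrative, not the final v6 code):

	int aggr_ports = rte_eth_dev_count_aggr_ports(port_id);

	if (aggr_ports <= 0) {
		RTE_ETHDEV_LOG(ERR,
			"Port %u has no aggregated ports to map\n", port_id);
		return -ENOTSUP;
	}
	if (affinity > aggr_ports) {
		RTE_ETHDEV_LOG(ERR,
			"Affinity %u exceeds the number of aggregated ports %d\n",
			affinity, aggr_ports);
		return -EINVAL;
	}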

Will send the v6 patch including all comments/suggestions.

Thanks.
  

Patch

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index bb7ff2b449..36c798ac45 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -776,6 +776,10 @@  static void cmd_help_long_parsed(void *parsed_result,
 
 			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
 			"    Cleanup txq mbufs for a specific Tx queue\n\n"
+
+			"port config (port_id) txq (queue_id) affinity (value)\n"
+			"    Map a Tx queue with an aggregated port "
+			"of the DPDK port\n\n"
 		);
 	}
 
@@ -12636,6 +12640,97 @@  static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
 	}
 };
 
+/* *** configure port txq affinity value *** */
+struct cmd_config_tx_affinity_map {
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t config;
+	portid_t portid;
+	cmdline_fixed_string_t txq;
+	uint16_t qid;
+	cmdline_fixed_string_t affinity;
+	uint8_t value;
+};
+
+static void
+cmd_config_tx_affinity_map_parsed(void *parsed_result,
+				  __rte_unused struct cmdline *cl,
+				  __rte_unused void *data)
+{
+	struct cmd_config_tx_affinity_map *res = parsed_result;
+	int ret;
+
+	if (port_id_is_invalid(res->portid, ENABLED_WARN))
+		return;
+
+	if (res->portid == (portid_t)RTE_PORT_ALL) {
+		printf("Invalid port id\n");
+		return;
+	}
+
+	if (strcmp(res->txq, "txq")) {
+		printf("Unknown parameter\n");
+		return;
+	}
+	if (tx_queue_id_is_invalid(res->qid))
+		return;
+
+	ret = rte_eth_dev_count_aggr_ports(res->portid);
+	if (ret < 0) {
+		printf("Failed to count the aggregated ports: (%s)\n",
+			strerror(-ret));
+		return;
+	}
+	if (ret == 0) {
+		printf("Number of aggregated ports is 0 which is invalid for affinity mapping\n");
+		return;
+	}
+	if (ret < res->value) {
+		printf("The affinity value %u is invalid, exceeds the "
+		       "number of aggregated ports\n", res->value);
+		return;
+	}
+
+	rte_eth_dev_map_aggr_tx_affinity(res->portid, res->qid, res->value);
+}
+
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 port, "port");
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_config =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 config, "config");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_portid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 portid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_txq =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 txq, "txq");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_qid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      qid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_affinity =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 affinity, "affinity");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      value, RTE_UINT8);
+
+static cmdline_parse_inst_t cmd_config_tx_affinity_map = {
+	.f = cmd_config_tx_affinity_map_parsed,
+	.data = (void *)0,
+	.help_str = "port config <port_id> txq <queue_id> affinity <value>",
+	.tokens = {
+		(void *)&cmd_config_tx_affinity_map_port,
+		(void *)&cmd_config_tx_affinity_map_config,
+		(void *)&cmd_config_tx_affinity_map_portid,
+		(void *)&cmd_config_tx_affinity_map_txq,
+		(void *)&cmd_config_tx_affinity_map_qid,
+		(void *)&cmd_config_tx_affinity_map_affinity,
+		(void *)&cmd_config_tx_affinity_map_value,
+		NULL,
+	},
+};
+
 /* ******************************************************************************** */
 
 /* list of instructions */
@@ -12869,6 +12964,7 @@  static cmdline_parse_ctx_t builtin_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
 	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
+	(cmdline_parse_inst_t *)&cmd_config_tx_affinity_map,
 	NULL,
 };
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index ab998a5357..becf6fed91 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -68,6 +68,13 @@  New Features
   * Applications can register a callback at startup via
     ``rte_lcore_register_usage_cb()`` to provide lcore usage information.
 
+* **Added support for Tx queue mapping with an aggregated port.**
+
+  * Introduced new function ``rte_eth_dev_count_aggr_ports()``
+    to get the number of aggregated ports.
+  * Introduced new function ``rte_eth_dev_map_aggr_tx_affinity()``
+    to map a Tx queue with an aggregated port of the DPDK port.
+
 * **Added rte_flow support for matching IPv6 routing extension header fields.**
 
   Added ``ipv6_routing_ext`` items in rte_flow to match IPv6 routing extension
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 357adb09d7..1d3c372601 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1612,6 +1612,20 @@  Enable or disable a per queue Tx offloading only on a specific Tx queue::
 
 This command should be run when the port is stopped, or else it will fail.
 
+config per queue Tx affinity mapping
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Map a Tx queue with an aggregated port of the DPDK port (specified with port_id)::
+
+   testpmd> port config (port_id) txq (queue_id) affinity (value)
+
+* ``affinity``: the number of the aggregated port.
+                When multiple ports are aggregated into a single one,
+                it allows to choose which port to use for Tx via a queue.
+
+This command should be run when the port is stopped, otherwise it fails.
+
+
 Config VXLAN Encap outer layers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 6a550cfc83..b7fdc454a8 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1171,6 +1171,40 @@  typedef int (*eth_tx_descriptor_dump_t)(const struct rte_eth_dev *dev,
 					uint16_t queue_id, uint16_t offset,
 					uint16_t num, FILE *file);
 
+/**
+ * @internal
+ * Get the number of aggregated ports.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ *
+ * @return
+ *   Negative errno value on error, 0 or positive on success.
+ *
+ * @retval >=0
+ *   The number of aggregated port if success.
+ * @retval -ENOTSUP
+ *   Get aggregated ports API is not supported.
+ */
+typedef int (*eth_count_aggr_ports_t)(uint16_t port_id);
+
+/**
+ * @internal
+ * Map a Tx queue with an aggregated port of the DPDK port.
+ *
+ * @param port_id
+ *   The identifier of the port used in rte_eth_tx_burst().
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ * @param affinity
+ *   The number of the aggregated port.
+ *
+ * @return
+ *   Negative on error, 0 on success.
+ */
+typedef int (*eth_map_aggr_tx_affinity_t)(uint16_t port_id, uint16_t tx_queue_id,
+					  uint8_t affinity);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -1403,6 +1437,11 @@  struct eth_dev_ops {
 	eth_cman_config_set_t cman_config_set;
 	/** Retrieve congestion management configuration */
 	eth_cman_config_get_t cman_config_get;
+
+	/** Get the number of aggregated ports */
+	eth_count_aggr_ports_t count_aggr_ports;
+	/** Map a Tx queue with an aggregated port of the DPDK port */
+	eth_map_aggr_tx_affinity_t map_aggr_tx_affinity;
 };
 
 /**
diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
index 9fae22c490..4e210d099b 100644
--- a/lib/ethdev/ethdev_trace.h
+++ b/lib/ethdev/ethdev_trace.h
@@ -1385,6 +1385,23 @@  RTE_TRACE_POINT(
 	rte_trace_point_emit_int(ret);
 )
 
+RTE_TRACE_POINT(
+	rte_eth_trace_count_aggr_ports,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_int(ret);
+)
+
+RTE_TRACE_POINT(
+	rte_eth_trace_map_aggr_tx_affinity,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id,
+			     uint8_t affinity, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(tx_queue_id);
+	rte_trace_point_emit_u8(affinity);
+	rte_trace_point_emit_int(ret);
+)
+
 RTE_TRACE_POINT(
 	rte_flow_trace_dynf_metadata_register,
 	RTE_TRACE_POINT_ARGS(int offset, uint64_t flag),
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 34d12e2859..61010cae56 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -475,6 +475,12 @@  RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_set,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_get,
 	lib.ethdev.cman_config_get)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
+	lib.ethdev.count_aggr_ports)
+
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
+	lib.ethdev.map_aggr_tx_affinity)
+
 RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
 	lib.ethdev.flow.copy)
 
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index dc0a4eb12c..1d5b3a16b2 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6915,6 +6915,55 @@  rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
 	return j;
 }
 
+int rte_eth_dev_count_aggr_ports(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (*dev->dev_ops->count_aggr_ports == NULL)
+		return -ENOTSUP;
+	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
+
+	rte_eth_trace_count_aggr_ports(port_id, ret);
+
+	return ret;
+}
+
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (tx_queue_id >= dev->data->nb_tx_queues) {
+		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
+		return -EINVAL;
+	}
+
+	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
+		return -ENOTSUP;
+
+	if (dev->data->dev_started) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u must be stopped to allow configuration\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
+				tx_queue_id, affinity));
+
+	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity, ret);
+
+	return ret;
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..07b8250eb8 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2589,6 +2589,52 @@  int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port);
 __rte_experimental
 int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the number of aggregated ports.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - (>=0) the number of aggregated port if success.
+ *   - (-ENOTSUP) if not supported.
+ */
+__rte_experimental
+int rte_eth_dev_count_aggr_ports(uint16_t port_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ *  Map a Tx queue with an aggregated port of the DPDK port (specified with port_id).
+ *  When multiple ports are aggregated into a single one,
+ *  it allows to choose which port to use for Tx via a queue.
+ *
+ *  The application should use rte_eth_dev_map_aggr_tx_affinity()
+ *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
+ *  before rte_eth_dev_start().
+ *
+ * @param port_id
+ *   The identifier of the port used in rte_eth_tx_burst().
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param affinity
+ *   The number of the aggregated port.
+ *   Value 0 means no affinity and traffic could be routed to any aggregated port.
+ *   The first aggregated port is number 1 and so on.
+ *   The maximum number is given by rte_eth_dev_count_aggr_ports().
+ *
+ * @return
+ *   Zero if successful. Non-zero otherwise.
+ */
+__rte_experimental
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity);
+
 /**
  * Return the NUMA socket to which an Ethernet device is connected
  *
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index dbc2bffe64..685aa71e51 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -300,6 +300,8 @@  EXPERIMENTAL {
 	rte_mtr_meter_profile_get;
 
 	# added in 23.03
+	rte_eth_dev_count_aggr_ports;
+	rte_eth_dev_map_aggr_tx_affinity;
 	rte_flow_async_create_by_index;
 };