[1/1] net/mlx5: show rx/tx descriptor ring limitations in rte_eth_dev_info

Message ID 20240616173803.424025-2-igootorov@gmail.com (mailing list archive)
State New
Delegated to: Raslan Darawsheh
Headers
Series net/mlx5: show rx/tx descriptor ring limitations in rte_eth_dev_info |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/intel-Functional success Functional PASS
ci/github-robot: build success github build: passed
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/intel-Testing success Testing PASS

Commit Message

Igor Gutorov June 16, 2024, 5:38 p.m. UTC
  Currently, rte_eth_dev_info.rx_desc_lim.nb_max shows 65535 as a limit,
which results in a few problems:

* It is an incorrect value
* Allocating an RX queue and passing `rx_desc_lim.nb_max` results in an
  integer overflow and 0 ring size:

```
rte_eth_rx_queue_setup(0, 0, rx_desc_lim.nb_max, 0, NULL, mb_pool);
```

Which overflows ring size and generates the following log:
```
mlx5_net: port 0 increased number of descriptors in Rx queue 0 to the
next power of two (0)
```

This patch fixes these issues.

Signed-off-by: Igor Gutorov <igootorov@gmail.com>
---
 drivers/net/mlx5/mlx5_defs.h   | 3 +++
 drivers/net/mlx5/mlx5_ethdev.c | 5 ++++-
 2 files changed, 7 insertions(+), 1 deletion(-)
  

Comments

Slava Ovsiienko June 17, 2024, 7:18 a.m. UTC | #1
Hi, Igor

Thank you for the patch.

1. The absolute max descriptor number supported by ConnectX hardware is 32768.
2. The actual max descriptor number supported by the port (and its related representors)
    reported in log_max_wq_sz in HCA.caps.  This value should be queried and save in mlx5_devx_cmd_query_hca_attr() routine.
3. mlx5_rx_queue_pre_setup() should check requested descriptor number and reject if it exceeds log_max_wq_sz
4. Please, format your patch according to the "fix" template.

With best regards,
Slava

> -----Original Message-----
> From: Igor Gutorov <igootorov@gmail.com>
> Sent: Sunday, June 16, 2024 8:38 PM
> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; Suanming Mou
> <suanmingm@nvidia.com>; Matan Azrad <matan@nvidia.com>
> Cc: dev@dpdk.org; Igor Gutorov <igootorov@gmail.com>
> Subject: [PATCH 1/1] net/mlx5: show rx/tx descriptor ring limitations in
> rte_eth_dev_info
> 
> Currently, rte_eth_dev_info.rx_desc_lim.nb_max shows 65535 as a limit, which
> results in a few problems:
> 
> * It is an incorrect value
> * Allocating an RX queue and passing `rx_desc_lim.nb_max` results in an
>   integer overflow and 0 ring size:
> 
> ```
> rte_eth_rx_queue_setup(0, 0, rx_desc_lim.nb_max, 0, NULL, mb_pool); ```
> 
> Which overflows ring size and generates the following log:
> ```
> mlx5_net: port 0 increased number of descriptors in Rx queue 0 to the next
> power of two (0) ```
> 
> This patch fixes these issues.
> 
> Signed-off-by: Igor Gutorov <igootorov@gmail.com>
> ---
>  drivers/net/mlx5/mlx5_defs.h   | 3 +++
>  drivers/net/mlx5/mlx5_ethdev.c | 5 ++++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index
> dc5216cb24..df608f0921 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -84,6 +84,9 @@
>  #define MLX5_RX_DEFAULT_BURST 64U
>  #define MLX5_TX_DEFAULT_BURST 64U
> 
> +/* Maximum number of descriptors in an RX/TX ring */ #define
> +MLX5_MAX_RING_DESC 8192
> +
>  /* Number of packets vectorized Rx can simultaneously process in a loop. */
>  #define MLX5_VPMD_DESCS_PER_LOOP      4
> 
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index aea799341c..d5be1ff1aa 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -22,6 +22,7 @@
> 
>  #include <mlx5_malloc.h>
> 
> +#include "mlx5_defs.h"
>  #include "mlx5_rxtx.h"
>  #include "mlx5_rx.h"
>  #include "mlx5_tx.h"
> @@ -345,6 +346,8 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct
> rte_eth_dev_info *info)
>  	info->flow_type_rss_offloads = ~MLX5_RSS_HF_MASK;
>  	mlx5_set_default_params(dev, info);
>  	mlx5_set_txlimit_params(dev, info);
> +	info->rx_desc_lim.nb_max = MLX5_MAX_RING_DESC;
> +	info->tx_desc_lim.nb_max = MLX5_MAX_RING_DESC;
>  	if (priv->sh->cdev->config.hca_attr.mem_rq_rmp &&
>  	    priv->obj_ops.rxq_obj_new == devx_obj_ops.rxq_obj_new)
>  		info->dev_capa |= RTE_ETH_DEV_CAPA_RXQ_SHARE; @@ -
> 774,7 +777,7 @@ mlx5_hairpin_cap_get(struct rte_eth_dev *dev, struct
> rte_eth_hairpin_cap *cap)
>  	cap->max_nb_queues = UINT16_MAX;
>  	cap->max_rx_2_tx = 1;
>  	cap->max_tx_2_rx = 1;
> -	cap->max_nb_desc = 8192;
> +	cap->max_nb_desc = MLX5_MAX_RING_DESC;
>  	hca_attr = &priv->sh->cdev->config.hca_attr;
>  	cap->rx_cap.locked_device_memory = hca_attr-
> >hairpin_data_buffer_locked;
>  	cap->rx_cap.rte_memory = 0;
> --
> 2.45.2
  
Slava Ovsiienko June 23, 2024, 11:34 a.m. UTC | #2
Hi, Igor

Thank you for the v2. The patch looks good to me,  please see my further comments below.

> > 1. The absolute max descriptor number supported by ConnectX hardware is
> 32768.
> > 2. The actual max descriptor number supported by the port (and its related
> representors)
> >     reported in log_max_wq_sz in HCA.caps.  This value should be queried and
> save in mlx5_devx_cmd_query_hca_attr() routine.
> > 3. mlx5_rx_queue_pre_setup() should check requested descriptor number
> > and reject if it exceeds log_max_wq_sz
> 
> Thank you for the guidelines! I've also added the same check to
> mlx5_tx_queue_pre_setup(), I'm assuming log_max_wq_sz can be used for
> both RX and TX.
> 
> Is an `int` appropriate for `log_max_wq_sz`? Seems like a `uint8_t` is sufficient,
> but I've left it an `int` for consistency with the other `log_max_*` values.

Right, uint8_t looks to be enough. No objection to optimize others to uint8_t.

> Also, I've noticed a similar issue with MTU, it is also reported as 65535 in
> `rte_eth_dev_info.max_mtu`. I'd like to send a separate patch to fix that too.
> What's the procedure for reading max supported MTU?

MTU is not reported directly by HCA. It is per port settings and can be read from
PMTU - Port MTU Register.  ACCESS_REGISTER command should be used.
Please, see:
 https://network.nvidia.com/files/doc-2020/ethernet-adapters-programming-manual.pdf

And thorough  testing of accessing this register is needed - over physical port,
over the representors, over the VFs and SFs. Rollback to 0xFFFF should be implemented,
if register can't be accessed.

Also, this reported max MTU might be not supported for SFs/VFs, where MTU is defined
by hypervisor settings.

> 
> > 4. Please, format your patch according to the "fix" template.
> 
> I've reworded the commit message a little bit. But I don't see these issues on
> Bugzilla, I've stumbled upon them independently. If you'd like the bug reports to
> be created, let me know.

I meant this: https://doc.dpdk.org/guides/contributing/patches.html
Please see chapter "8.7. Commit Messages: Body" about "Fixes" and "Cc: stable@dpdk.org".

Also, please run checking script: /devtools/check-git-log.sh' -1 to verify commit message compliance.

With best regards,
Slava
  

Patch

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index dc5216cb24..df608f0921 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -84,6 +84,9 @@ 
 #define MLX5_RX_DEFAULT_BURST 64U
 #define MLX5_TX_DEFAULT_BURST 64U
 
+/* Maximum number of descriptors in an RX/TX ring */
+#define MLX5_MAX_RING_DESC 8192
+
 /* Number of packets vectorized Rx can simultaneously process in a loop. */
 #define MLX5_VPMD_DESCS_PER_LOOP      4
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index aea799341c..d5be1ff1aa 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -22,6 +22,7 @@ 
 
 #include <mlx5_malloc.h>
 
+#include "mlx5_defs.h"
 #include "mlx5_rxtx.h"
 #include "mlx5_rx.h"
 #include "mlx5_tx.h"
@@ -345,6 +346,8 @@  mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	info->flow_type_rss_offloads = ~MLX5_RSS_HF_MASK;
 	mlx5_set_default_params(dev, info);
 	mlx5_set_txlimit_params(dev, info);
+	info->rx_desc_lim.nb_max = MLX5_MAX_RING_DESC;
+	info->tx_desc_lim.nb_max = MLX5_MAX_RING_DESC;
 	if (priv->sh->cdev->config.hca_attr.mem_rq_rmp &&
 	    priv->obj_ops.rxq_obj_new == devx_obj_ops.rxq_obj_new)
 		info->dev_capa |= RTE_ETH_DEV_CAPA_RXQ_SHARE;
@@ -774,7 +777,7 @@  mlx5_hairpin_cap_get(struct rte_eth_dev *dev, struct rte_eth_hairpin_cap *cap)
 	cap->max_nb_queues = UINT16_MAX;
 	cap->max_rx_2_tx = 1;
 	cap->max_tx_2_rx = 1;
-	cap->max_nb_desc = 8192;
+	cap->max_nb_desc = MLX5_MAX_RING_DESC;
 	hca_attr = &priv->sh->cdev->config.hca_attr;
 	cap->rx_cap.locked_device_memory = hca_attr->hairpin_data_buffer_locked;
 	cap->rx_cap.rte_memory = 0;