mlx5: initially reading xstats does not cause seg fault

Message ID 20220818123014.2515261-1-huzaifa.rahman@emumba.com (mailing list archive)
State Changes Requested, archived
Delegated to: Raslan Darawsheh
Headers
Series mlx5: initially reading xstats does not cause seg fault |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS

Commit Message

Huzaifa Rahman Aug. 18, 2022, 12:30 p.m. UTC
  Bugzilla ID: 296

the size of counters array in mlx5_xstats_get() was smaller
than the memory we are setting for this array in
mlx5_os_read_dev_counters(). due to which the extra memory is
corrupted and thus corrupting the seemingly unrelated variables.
this happens at the first run only because the n function arg
of mlx5_xstats_get() which is used to init counters array is
initialized by adding the preceding statistics which in our case
(i.e first run) is zero. after the initialization in
mlx5_os_stats_init() the mlx5_stats_n is populated and thus from
then onward the counters array size is correct

my changes will only affect the flow of the first run when we
need to initialize stats in mlx5_os_stats_init(). the size of the
counters array is set according the mlx5_stats_n variable. by doing
this we will avoid the memset corrupting other variables` memory

Signed-off-by: huzaifa.rahman <huzaifa.rahman@emumba.com>
---
 drivers/net/mlx5/mlx5_stats.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
  

Comments

Kamil Vojanec Aug. 23, 2022, 7:33 a.m. UTC | #1
On 8/18/22 14:30, huzaifa.rahman wrote:

> Bugzilla ID: 296
>
> the size of counters array in mlx5_xstats_get() was smaller
> than the memory we are setting for this array in
> mlx5_os_read_dev_counters(). due to which the extra memory is
> corrupted and thus corrupting the seemingly unrelated variables.
> this happens at the first run only because the n function arg
> of mlx5_xstats_get() which is used to init counters array is
> initialized by adding the preceding statistics which in our case
> (i.e first run) is zero. after the initialization in
> mlx5_os_stats_init() the mlx5_stats_n is populated and thus from
> then onward the counters array size is correct
>
> my changes will only affect the flow of the first run when we
> need to initialize stats in mlx5_os_stats_init(). the size of the
> counters array is set according the mlx5_stats_n variable. by doing
> this we will avoid the memset corrupting other variables` memory
>
> Signed-off-by: huzaifa.rahman<huzaifa.rahman@emumba.com>

Tested-by: Kamil Vojanec<vojanec@cesnet.cz>
  
Huzaifa Rahman Sept. 22, 2022, 10:39 a.m. UTC | #2
The bugzilla ID of this bug is 701:
https://bugs.dpdk.org/show_bug.cgi?id=701

On Tue, Aug 23, 2022 at 12:33 PM Kamil Vojanec <vojanec@cesnet.cz> wrote:

> On 8/18/22 14:30, huzaifa.rahman wrote:
>
> Bugzilla ID: 296
>
> the size of counters array in mlx5_xstats_get() was smaller
> than the memory we are setting for this array in
> mlx5_os_read_dev_counters(). due to which the extra memory is
> corrupted and thus corrupting the seemingly unrelated variables.
> this happens at the first run only because the n function arg
> of mlx5_xstats_get() which is used to init counters array is
> initialized by adding the preceding statistics which in our case
> (i.e first run) is zero. after the initialization in
> mlx5_os_stats_init() the mlx5_stats_n is populated and thus from
> then onward the counters array size is correct
>
> my changes will only affect the flow of the first run when we
> need to initialize stats in mlx5_os_stats_init(). the size of the
> counters array is set according the mlx5_stats_n variable. by doing
> this we will avoid the memset corrupting other variables` memory
>
> Signed-off-by: huzaifa.rahman <huzaifa.rahman@emumba.com> <huzaifa.rahman@emumba.com>
>
> Tested-by: Kamil Vojanec <vojanec@cesnet.cz> <vojanec@cesnet.cz>
>
>
  
Huzaifa Rahman Nov. 10, 2022, 10:07 a.m. UTC | #3
Hi,

Is there any other work/changes required for this patch to be submitted?

Thanks


On Thu, Sep 22, 2022 at 3:39 PM Huzaifa Rahman <huzaifa.rahman@emumba.com>
wrote:

> The bugzilla ID of this bug is 701:
> https://bugs.dpdk.org/show_bug.cgi?id=701
>
> On Tue, Aug 23, 2022 at 12:33 PM Kamil Vojanec <vojanec@cesnet.cz> wrote:
>
>> On 8/18/22 14:30, huzaifa.rahman wrote:
>>
>> Bugzilla ID: 296
>>
>> the size of counters array in mlx5_xstats_get() was smaller
>> than the memory we are setting for this array in
>> mlx5_os_read_dev_counters(). due to which the extra memory is
>> corrupted and thus corrupting the seemingly unrelated variables.
>> this happens at the first run only because the n function arg
>> of mlx5_xstats_get() which is used to init counters array is
>> initialized by adding the preceding statistics which in our case
>> (i.e first run) is zero. after the initialization in
>> mlx5_os_stats_init() the mlx5_stats_n is populated and thus from
>> then onward the counters array size is correct
>>
>> my changes will only affect the flow of the first run when we
>> need to initialize stats in mlx5_os_stats_init(). the size of the
>> counters array is set according the mlx5_stats_n variable. by doing
>> this we will avoid the memset corrupting other variables` memory
>>
>> Signed-off-by: huzaifa.rahman <huzaifa.rahman@emumba.com> <huzaifa.rahman@emumba.com>
>>
>> Tested-by: Kamil Vojanec <vojanec@cesnet.cz> <vojanec@cesnet.cz>
>>
>>
  
Kamil Vojanec Nov. 10, 2022, 10:53 a.m. UTC | #4
Hello,

On 11/10/22 11:07, Huzaifa Rahman wrote:

> Hi,
>
> Is there any other work/changes required for this patch to be submitted?
>
> Thanks
>
>
> On Thu, Sep 22, 2022 at 3:39 PM Huzaifa Rahman<huzaifa.rahman@emumba.com>
> wrote:
>
>> The bugzilla ID of this bug is 701:
>> https://bugs.dpdk.org/show_bug.cgi?id=701
>>
>> On Tue, Aug 23, 2022 at 12:33 PM Kamil Vojanec<vojanec@cesnet.cz>  wrote:
>>
>>> On 8/18/22 14:30, huzaifa.rahman wrote:
>>>
>>> Bugzilla ID: 296
>>>
>>> the size of counters array in mlx5_xstats_get() was smaller
>>> than the memory we are setting for this array in
>>> mlx5_os_read_dev_counters(). due to which the extra memory is
>>> corrupted and thus corrupting the seemingly unrelated variables.
>>> this happens at the first run only because the n function arg
>>> of mlx5_xstats_get() which is used to init counters array is
>>> initialized by adding the preceding statistics which in our case
>>> (i.e first run) is zero. after the initialization in
>>> mlx5_os_stats_init() the mlx5_stats_n is populated and thus from
>>> then onward the counters array size is correct
>>>
>>> my changes will only affect the flow of the first run when we
>>> need to initialize stats in mlx5_os_stats_init(). the size of the
>>> counters array is set according the mlx5_stats_n variable. by doing
>>> this we will avoid the memset corrupting other variables` memory
>>>
>>> Signed-off-by: huzaifa.rahman<huzaifa.rahman@emumba.com>  <huzaifa.rahman@emumba.com>
>>>
>>> Tested-by: Kamil Vojanec<vojanec@cesnet.cz>  <vojanec@cesnet.cz>
>>>
>>>
Looks good to me
  
Slava Ovsiienko March 7, 2023, 4:42 p.m. UTC | #5
Hi, Huzaifa

Could you, please, format commit message with capitals in the
sentence beginnings? And explanation can be less wordy a little bit.

With best regards,
Slava

> -----Original Message-----
> From: huzaifa.rahman <huzaifa.rahman@emumba.com>
> Sent: четверг, 18 августа 2022 г. 15:30
> To: Matan Azrad <matan@nvidia.com>
> Cc: dev@dpdk.org; Slava Ovsiienko <viacheslavo@nvidia.com>;
> huzaifa.rahman <huzaifa.rahman@emumba.com>
> Subject: [PATCH] mlx5: initially reading xstats does not cause seg fault
> 
> Bugzilla ID: 296
> 
> the size of counters array in mlx5_xstats_get() was smaller than the memory
> we are setting for this array in mlx5_os_read_dev_counters(). due to which
> the extra memory is corrupted and thus corrupting the seemingly unrelated
> variables.
> this happens at the first run only because the n function arg of
> mlx5_xstats_get() which is used to init counters array is initialized by adding
> the preceding statistics which in our case (i.e first run) is zero. after the
> initialization in
> mlx5_os_stats_init() the mlx5_stats_n is populated and thus from then
> onward the counters array size is correct
> 
> my changes will only affect the flow of the first run when we need to initialize
> stats in mlx5_os_stats_init(). the size of the counters array is set according the
> mlx5_stats_n variable. by doing this we will avoid the memset corrupting
> other variables` memory
> 
> Signed-off-by: huzaifa.rahman <huzaifa.rahman@emumba.com>
> ---
>  drivers/net/mlx5/mlx5_stats.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
> index f64fa3587b..bccfec10fb 100644
> --- a/drivers/net/mlx5/mlx5_stats.c
> +++ b/drivers/net/mlx5/mlx5_stats.c
> @@ -40,7 +40,6 @@ mlx5_xstats_get(struct rte_eth_dev *dev, struct
> rte_eth_xstat *stats,  {
>  	struct mlx5_priv *priv = dev->data->dev_private;
>  	unsigned int i;
> -	uint64_t counters[n];
>  	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
>  	uint16_t mlx5_stats_n = xstats_ctrl->mlx5_stats_n;
> 
> @@ -51,8 +50,11 @@ mlx5_xstats_get(struct rte_eth_dev *dev, struct
> rte_eth_xstat *stats,
>  		stats_n = mlx5_os_get_stats_n(dev);
>  		if (stats_n < 0)
>  			return stats_n;
> -		if (xstats_ctrl->stats_n != stats_n)
> +		if (xstats_ctrl->stats_n != stats_n) {
>  			mlx5_os_stats_init(dev);
> +			n = xstats_ctrl->mlx5_stats_n;
> +		}
> +		uint64_t counters[n];
>  		ret = mlx5_os_read_dev_counters(dev, counters);
>  		if (ret)
>  			return ret;
> --
> 2.25.1
  
Slava Ovsiienko March 7, 2023, 4:51 p.m. UTC | #6
Hi, Huzaifa

"n" - is the parameter of the mlx5_xstats_get() routine, provided by caller.
We should not change this - it specified the size of "struct rte_eth_xstat *stats " array.

With best regards,
Slava

> -----Original Message-----
> From: huzaifa.rahman <huzaifa.rahman@emumba.com>
> Sent: четверг, 18 августа 2022 г. 15:30
> To: Matan Azrad <matan@nvidia.com>
> Cc: dev@dpdk.org; Slava Ovsiienko <viacheslavo@nvidia.com>;
> huzaifa.rahman <huzaifa.rahman@emumba.com>
> Subject: [PATCH] mlx5: initially reading xstats does not cause seg fault
> 
> Bugzilla ID: 296
> 
> the size of counters array in mlx5_xstats_get() was smaller than the memory
> we are setting for this array in mlx5_os_read_dev_counters(). due to which
> the extra memory is corrupted and thus corrupting the seemingly unrelated
> variables.
> this happens at the first run only because the n function arg of
> mlx5_xstats_get() which is used to init counters array is initialized by adding
> the preceding statistics which in our case (i.e first run) is zero. after the
> initialization in
> mlx5_os_stats_init() the mlx5_stats_n is populated and thus from then
> onward the counters array size is correct
> 
> my changes will only affect the flow of the first run when we need to initialize
> stats in mlx5_os_stats_init(). the size of the counters array is set according the
> mlx5_stats_n variable. by doing this we will avoid the memset corrupting
> other variables` memory
> 
> Signed-off-by: huzaifa.rahman <huzaifa.rahman@emumba.com>
> ---
>  drivers/net/mlx5/mlx5_stats.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
> index f64fa3587b..bccfec10fb 100644
> --- a/drivers/net/mlx5/mlx5_stats.c
> +++ b/drivers/net/mlx5/mlx5_stats.c
> @@ -40,7 +40,6 @@ mlx5_xstats_get(struct rte_eth_dev *dev, struct
> rte_eth_xstat *stats,  {
>  	struct mlx5_priv *priv = dev->data->dev_private;
>  	unsigned int i;
> -	uint64_t counters[n];
>  	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
>  	uint16_t mlx5_stats_n = xstats_ctrl->mlx5_stats_n;
> 
> @@ -51,8 +50,11 @@ mlx5_xstats_get(struct rte_eth_dev *dev, struct
> rte_eth_xstat *stats,
>  		stats_n = mlx5_os_get_stats_n(dev);
>  		if (stats_n < 0)
>  			return stats_n;
> -		if (xstats_ctrl->stats_n != stats_n)
> +		if (xstats_ctrl->stats_n != stats_n) {
>  			mlx5_os_stats_init(dev);
> +			n = xstats_ctrl->mlx5_stats_n;
> +		}
> +		uint64_t counters[n];
>  		ret = mlx5_os_read_dev_counters(dev, counters);
>  		if (ret)
>  			return ret;
> --
> 2.25.1
  

Patch

diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index f64fa3587b..bccfec10fb 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -40,7 +40,6 @@  mlx5_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *stats,
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	unsigned int i;
-	uint64_t counters[n];
 	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
 	uint16_t mlx5_stats_n = xstats_ctrl->mlx5_stats_n;
 
@@ -51,8 +50,11 @@  mlx5_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *stats,
 		stats_n = mlx5_os_get_stats_n(dev);
 		if (stats_n < 0)
 			return stats_n;
-		if (xstats_ctrl->stats_n != stats_n)
+		if (xstats_ctrl->stats_n != stats_n) {
 			mlx5_os_stats_init(dev);
+			n = xstats_ctrl->mlx5_stats_n;
+		}
+		uint64_t counters[n];
 		ret = mlx5_os_read_dev_counters(dev, counters);
 		if (ret)
 			return ret;