[dpdk-dev] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

Message ID 1478786449-44745-1-git-send-email-alejandro.lucero@netronome.com (mailing list archive)
State Superseded, archived

Checks

Context                 Check     Description
checkpatch/checkpatch   warning   coding style issues

Commit Message

Alejandro Lucero Nov. 10, 2016, 2 p.m. UTC
  From: Bert van Leeuwen <bert.vanleeuwen@netronome.com>

A device can have more queues than RTE_ETHDEV_QUEUE_STAT_CNTRS, the constant
used to size the per-queue arrays inside struct rte_eth_stats. Ideally, DPDK
would be built with RTE_ETHDEV_QUEUE_STAT_CNTRS set to the maximum number of
queues a device can support, 65536, since a uint16_t holds those queue counts
for RX and TX. But of course, having such big arrays inside struct rte_eth_stats
is not a good idea.

The current default value is 16, which could likely be raised to 32 or 64
without too much opposition. And maybe it would be a good idea to modify
struct rte_eth_stats to allow dynamically allocated arrays, plus some extra
fields to keep the array sizes.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
---
 lib/librte_ether/rte_ethdev.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)
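
For reference, an abridged sketch of the struct discussed in the commit message
above (field names as recalled from rte_ethdev.h of this era; the aggregate
counters are trimmed, so treat this as an illustration rather than the full
definition):

    #include <stdint.h>

    #define RTE_ETHDEV_QUEUE_STAT_CNTRS 16   /* the build-time default */

    /* Abridged sketch of struct rte_eth_stats: the per-queue counters are
     * fixed-size arrays, so queues beyond the constant simply have no slot. */
    struct rte_eth_stats_sketch {
        uint64_t ipackets;                                 /* total RX packets */
        uint64_t opackets;                                 /* total TX packets */
        /* ... further aggregate counters elided ... */
        uint64_t q_ipackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];  /* per-RX-queue packets */
        uint64_t q_opackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];  /* per-TX-queue packets */
        uint64_t q_ibytes[RTE_ETHDEV_QUEUE_STAT_CNTRS];    /* per-RX-queue bytes */
        uint64_t q_obytes[RTE_ETHDEV_QUEUE_STAT_CNTRS];    /* per-TX-queue bytes */
        uint64_t q_errors[RTE_ETHDEV_QUEUE_STAT_CNTRS];    /* per-RX-queue errors */
    };

    /* Raising the constant to 1024 would grow these five arrays alone to
     * 5 * 1024 * 8 = 40960 bytes per port, which is the size concern above. */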
  

Comments

Thomas Monjalon Nov. 10, 2016, 2:42 p.m. UTC | #1
2016-11-10 14:00, Alejandro Lucero:
> From: Bert van Leeuwen <bert.vanleeuwen@netronome.com>
> 
> A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
> is used inside struct rte_eth_stats. Ideally, DPDK should be built with
> RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
> can support, 65536, as uint16_t is used for keeping those values for
> RX and TX. But of course, having such big arrays inside struct rte_eth_stats
> is not a good idea.

RTE_ETHDEV_QUEUE_STAT_CNTRS comes from a limitation in Intel devices.
They have a limited number of registers to store the per-queue stats.
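
That limitation is also why ethdev exposes an explicit queue-to-counter mapping
API; the prototypes below are quoted from memory of this period's rte_ethdev.h,
so verify them before relying on them:

    /* Map a queue onto one of the limited per-queue stat registers, for PMDs
     * whose hardware has fewer stat counters than queues (e.g. Intel ixgbe). */
    int rte_eth_dev_set_rx_queue_stats_mapping(uint8_t port_id,
                                                uint16_t rx_queue_id,
                                                uint8_t stat_idx);
    int rte_eth_dev_set_tx_queue_stats_mapping(uint8_t port_id,
                                                uint16_t tx_queue_id,
                                                uint8_t stat_idx);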

> Current default value is 16, which could likely be changed to 32 or 64
> without too much opposition. And maybe it would be a good idea to modify
> struct rte_eth_stats for allowing dynamically allocated arrays and maybe
> some extra fields for keeping the array sizes.

Yes
and? what is your issue exactly? with which device?
Please explain the idea brought by your patch.
  
Alejandro Lucero Nov. 10, 2016, 3:43 p.m. UTC | #2
On Thu, Nov 10, 2016 at 2:42 PM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> 2016-11-10 14:00, Alejandro Lucero:
> > From: Bert van Leeuwen <bert.vanleeuwen@netronome.com>
> >
> > A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
> > is used inside struct rte_eth_stats. Ideally, DPDK should be built with
> > RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
> > can support, 65536, as uint16_t is used for keeping those values for
> > RX and TX. But of course, having such big arrays inside struct
> rte_eth_stats
> > is not a good idea.
>
> RTE_ETHDEV_QUEUE_STAT_CNTRS come from a limitation in Intel devices.
> They have limited number of registers to store the stats per queue.
>
> > Current default value is 16, which could likely be changed to 32 or 64
> > without too much opposition. And maybe it would be a good idea to modify
> > struct rte_eth_stats for allowing dynamically allocated arrays and maybe
> > some extra fields for keeping the array sizes.
>
> Yes
> and? what is your issue exactly? with which device?
> Please explain the idea brought by your patch.
>

Netronome NFP devices support 128 queues and future versions will support
1024.

A particular VF (our PMD only supports VFs) could get as many as 128.
Although that is not likely, it could be an option for some client.

Clients want to use the DPDK that comes with a distribution, so changing
RTE_ETHDEV_QUEUE_STAT_CNTRS depending on the devices present is not an
option.
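
(For reference, the compile-time knob in question is, if memory serves, this
line in config/common_base, which is exactly what a distribution bakes in at
build time:)

    CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16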

We would be happy if RTE_ETHDEV_QUEUE_STAT_CNTRS could be set to 1024,
covering current and future requirements for our cards, but maybe having
such big arrays inside struct rte_eth_stats is something people do not want
to have.

A solution could be to create such arrays dynamically, based on the device
the stats are read from. For example, the call to rte_eth_dev_configure could
take an extra field for allocating an rte_eth_stats struct, sized from the
nb_rx_q and nb_tx_q params already given to that function.
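
A purely hypothetical sketch of that idea, just to make the proposal concrete
(none of these names exist in DPDK; they are made up for illustration):

    /* Hypothetical only: per-queue stats arrays sized when the device is
     * configured, instead of fixed RTE_ETHDEV_QUEUE_STAT_CNTRS entries. */
    struct rte_eth_stats_dyn {
        uint64_t ipackets;
        uint64_t opackets;
        uint16_t nb_rx_queues;   /* array sizes stored next to the arrays */
        uint16_t nb_tx_queues;
        uint64_t *q_ipackets;    /* nb_rx_queues entries, allocated when
                                  * rte_eth_dev_configure() learns nb_rx_q */
        uint64_t *q_opackets;    /* nb_tx_queues entries, likewise */
    };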

Maybe the first thing to find out is what people think about just increasing
RTE_ETHDEV_QUEUE_STAT_CNTRS to 1024.

So Thomas, what do you think about this?
  
Thomas Monjalon Nov. 10, 2016, 4:01 p.m. UTC | #3
2016-11-10 15:43, Alejandro Lucero:
> On Thu, Nov 10, 2016 at 2:42 PM, Thomas Monjalon <thomas.monjalon@6wind.com>
> wrote:
> 
> > 2016-11-10 14:00, Alejandro Lucero:
> > > From: Bert van Leeuwen <bert.vanleeuwen@netronome.com>
> > >
> > > A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
> > > is used inside struct rte_eth_stats. Ideally, DPDK should be built with
> > > RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
> > > can support, 65536, as uint16_t is used for keeping those values for
> > > RX and TX. But of course, having such big arrays inside struct
> > rte_eth_stats
> > > is not a good idea.
> >
> > RTE_ETHDEV_QUEUE_STAT_CNTRS come from a limitation in Intel devices.
> > They have limited number of registers to store the stats per queue.
> >
> > > Current default value is 16, which could likely be changed to 32 or 64
> > > without too much opposition. And maybe it would be a good idea to modify
> > > struct rte_eth_stats for allowing dynamically allocated arrays and maybe
> > > some extra fields for keeping the array sizes.
> >
> > Yes
> > and? what is your issue exactly? with which device?
> > Please explain the idea brought by your patch.
> >
> 
> Netronome NFP devices support 128 queues and future version will support
> 1024.
> 
> A particular VF, our PMD just supports VFs, could get as much as 128.
> Although that is not likely, that could be an option for some client.
> 
> Clients want to use a DPDK coming with a distribution, so changing the
> RTE_ETHDEV_QUEUE_STAT_CNTRS depending on the present devices is not an
> option.
> 
> We would be happy if RTE_ETHDEV_QUEUE_STAT_CNTRS could be set to 1024,
> covering current and future requirements for our cards, but maybe having
> such big arrays inside struct rte_eth_stats is something people do not want
> to have.
> 
> A solution could be to create such arrays dynamically based on the device
> to get the stats from. For example, call to rte_eth_dev_configure could
> have ax extra field for allocating a rte_eth_stats struct, which will be
> based on nb_rx_q and nb_tx_q params already given to that function.
> 
> Maybe the first thing to know is what people think about just incrementing
> RTE_ETHDEV_QUEUE_STAT_CNTRS to 1024.
> 
> So Thomas, what do you think about this?

I think this patch is doing something else :)

I'm not sure what is better between big arrays and variable size.
I think you must explain these 2 options in another thread,
because I'm not sure you will have enough attention in a thread starting with
"check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS".
  
Alejandro Lucero Nov. 10, 2016, 4:04 p.m. UTC | #4
On Thu, Nov 10, 2016 at 4:01 PM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> 2016-11-10 15:43, Alejandro Lucero:
> > On Thu, Nov 10, 2016 at 2:42 PM, Thomas Monjalon <
> thomas.monjalon@6wind.com>
> > wrote:
> >
> > > 2016-11-10 14:00, Alejandro Lucero:
> > > > From: Bert van Leeuwen <bert.vanleeuwen@netronome.com>
> > > >
> > > > A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
> > > > is used inside struct rte_eth_stats. Ideally, DPDK should be built
> with
> > > > RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
> > > > can support, 65536, as uint16_t is used for keeping those values for
> > > > RX and TX. But of course, having such big arrays inside struct
> > > rte_eth_stats
> > > > is not a good idea.
> > >
> > > RTE_ETHDEV_QUEUE_STAT_CNTRS come from a limitation in Intel devices.
> > > They have limited number of registers to store the stats per queue.
> > >
> > > > Current default value is 16, which could likely be changed to 32 or
> 64
> > > > without too much opposition. And maybe it would be a good idea to
> modify
> > > > struct rte_eth_stats for allowing dynamically allocated arrays and
> maybe
> > > > some extra fields for keeping the array sizes.
> > >
> > > Yes
> > > and? what is your issue exactly? with which device?
> > > Please explain the idea brought by your patch.
> > >
> >
> > Netronome NFP devices support 128 queues and future version will support
> > 1024.
> >
> > A particular VF, our PMD just supports VFs, could get as much as 128.
> > Although that is not likely, that could be an option for some client.
> >
> > Clients want to use a DPDK coming with a distribution, so changing the
> > RTE_ETHDEV_QUEUE_STAT_CNTRS depending on the present devices is not an
> > option.
> >
> > We would be happy if RTE_ETHDEV_QUEUE_STAT_CNTRS could be set to 1024,
> > covering current and future requirements for our cards, but maybe having
> > such big arrays inside struct rte_eth_stats is something people do not
> want
> > to have.
> >
> > A solution could be to create such arrays dynamically based on the device
> > to get the stats from. For example, call to rte_eth_dev_configure could
> > have ax extra field for allocating a rte_eth_stats struct, which will be
> > based on nb_rx_q and nb_tx_q params already given to that function.
> >
> > Maybe the first thing to know is what people think about just
> incrementing
> > RTE_ETHDEV_QUEUE_STAT_CNTRS to 1024.
> >
> > So Thomas, what do you think about this?
>
> I think this patch is doing something else :)
>
>
Sure. But the problem the patch solves points to this, IMHO, bigger
issue.


> I'm not sure what is better between big arrays and variable size.
> I think you must explain these 2 options in another thread,
> because I'm not sure you will have enough attention in a thread starting
> with
> "check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS".
>

Agree. I'll do that then.

Thanks
  
Alejandro Lucero Nov. 11, 2016, 9:16 a.m. UTC | #5
Thomas,

We are wondering if you realize this patch fixes a bug in the current ethdev
code, since a device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues.

Maybe the commit message is giving the wrong impression and, as you
commented, it should just focus on the bug it fixes, leaving the discussion
of how to solve the RTE_ETHDEV_QUEUE_STAT_CNTRS problem for another email
thread.

Should we remove this from patchwork and send another patch that way?


On Thu, Nov 10, 2016 at 4:04 PM, Alejandro Lucero <
alejandro.lucero@netronome.com> wrote:

>
>
> On Thu, Nov 10, 2016 at 4:01 PM, Thomas Monjalon <
> thomas.monjalon@6wind.com> wrote:
>
>> 2016-11-10 15:43, Alejandro Lucero:
>> > On Thu, Nov 10, 2016 at 2:42 PM, Thomas Monjalon <
>> thomas.monjalon@6wind.com>
>> > wrote:
>> >
>> > > 2016-11-10 14:00, Alejandro Lucero:
>> > > > From: Bert van Leeuwen <bert.vanleeuwen@netronome.com>
>> > > >
>> > > > A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
>> > > > is used inside struct rte_eth_stats. Ideally, DPDK should be built
>> with
>> > > > RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
>> > > > can support, 65536, as uint16_t is used for keeping those values for
>> > > > RX and TX. But of course, having such big arrays inside struct
>> > > rte_eth_stats
>> > > > is not a good idea.
>> > >
>> > > RTE_ETHDEV_QUEUE_STAT_CNTRS come from a limitation in Intel devices.
>> > > They have limited number of registers to store the stats per queue.
>> > >
>> > > > Current default value is 16, which could likely be changed to 32 or
>> 64
>> > > > without too much opposition. And maybe it would be a good idea to
>> modify
>> > > > struct rte_eth_stats for allowing dynamically allocated arrays and
>> maybe
>> > > > some extra fields for keeping the array sizes.
>> > >
>> > > Yes
>> > > and? what is your issue exactly? with which device?
>> > > Please explain the idea brought by your patch.
>> > >
>> >
>> > Netronome NFP devices support 128 queues and future version will support
>> > 1024.
>> >
>> > A particular VF, our PMD just supports VFs, could get as much as 128.
>> > Although that is not likely, that could be an option for some client.
>> >
>> > Clients want to use a DPDK coming with a distribution, so changing the
>> > RTE_ETHDEV_QUEUE_STAT_CNTRS depending on the present devices is not an
>> > option.
>> >
>> > We would be happy if RTE_ETHDEV_QUEUE_STAT_CNTRS could be set to 1024,
>> > covering current and future requirements for our cards, but maybe having
>> > such big arrays inside struct rte_eth_stats is something people do not
>> want
>> > to have.
>> >
>> > A solution could be to create such arrays dynamically based on the
>> device
>> > to get the stats from. For example, call to rte_eth_dev_configure could
>> > have ax extra field for allocating a rte_eth_stats struct, which will be
>> > based on nb_rx_q and nb_tx_q params already given to that function.
>> >
>> > Maybe the first thing to know is what people think about just
>> incrementing
>> > RTE_ETHDEV_QUEUE_STAT_CNTRS to 1024.
>> >
>> > So Thomas, what do you think about this?
>>
>> I think this patch is doing something else :)
>>
>>
> Sure. But the problem the patch solves is pointing to this, IMHO, bigger
> issue.
>
>
>> I'm not sure what is better between big arrays and variable size.
>> I think you must explain these 2 options in another thread,
>> because I'm not sure you will have enough attention in a thread starting
>> with
>> "check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS".
>>
>
> Agree. I'll do that then.
>
> Thanks
>
  
Thomas Monjalon Nov. 11, 2016, 9:29 a.m. UTC | #6
2016-11-11 09:16, Alejandro Lucero:
> Thomas,
> 
> We are wondering if you realize this patch fixes a bug with current ethdev
> code as a device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS.
> 
> Maybe the commit message is giving the wrong impression and as you
> commented, it should just focus on the bug it fixes and to leave for
> another email thread the discussion of how to solve the
> RTE_ETHDEV_QUEUE_STAT_CNTRS
> problem.
> 
> Should we remove this from patchwork and to send another patch that way?

Yes please. That was my first comment: we don't understand the exact issue
you are fixing.
And I have a bad feeling it could break something else (really just a feeling).
It is not the kind of patch we can apply the last day of a release.
That's why I think it should wait 17.02.

Of course you can try to convince me and others to apply it as a last-minute
patch. But why are you sending a patch against the generic API in the last days?

Last argument: it is not fixing a regression of 16.11, so it is not so urgent.
  
Alejandro Lucero Nov. 11, 2016, 9:32 a.m. UTC | #7
On Fri, Nov 11, 2016 at 9:29 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> 2016-11-11 09:16, Alejandro Lucero:
> > Thomas,
> >
> > We are wondering if you realize this patch fixes a bug with current
> ethdev
> > code as a device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS.
> >
> > Maybe the commit message is giving the wrong impression and as you
> > commented, it should just focus on the bug it fixes and to leave for
> > another email thread the discussion of how to solve the
> > RTE_ETHDEV_QUEUE_STAT_CNTRS
> > problem.
> >
> > Should we remove this from patchwork and to send another patch that way?
>
> Yes please. It was my first comment, we don't understand the exact issue
> you are fixing.
>

OK


> And I have a bad feeling it could break something else (really just a
> feeling).
> It is not the kind of patch we can apply the last day of a release.
> That's why I think it should wait 17.02.
>
>
Fine.


> Of course you can try to convince me and others to apply it as a last
> minute
> patch. But why are you sending a patch on the generic API in the last days?
>
>
We just found it a couple of days ago.


> Last argument: it is not fixing a regression of 16.11, so it is not so
> urgent.
>
  
Bert van Leeuwen Nov. 11, 2016, 9:48 a.m. UTC | #8
> On 11 Nov 2016, at 11:29 AM, Thomas Monjalon <thomas.monjalon@6wind.com> wrote:
> 
> 2016-11-11 09:16, Alejandro Lucero:
>> Thomas,
>> 
>> We are wondering if you realize this patch fixes a bug with current ethdev
>> code as a device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS.
>> 
>> Maybe the commit message is giving the wrong impression and as you
>> commented, it should just focus on the bug it fixes and to leave for
>> another email thread the discussion of how to solve the
>> RTE_ETHDEV_QUEUE_STAT_CNTRS
>> problem.
>> 
>> Should we remove this from patchwork and to send another patch that way?
> 
> Yes please. It was my first comment, we don't understand the exact issue
> you are fixing.

In a nutshell, the rte_eth_xstats_get function was reading memory beyond what was stored in the internal arrays (and thus returning junk values). The patch simply prevents it from going out of bounds; it does not address the fact that the remaining counters are lost, though. This issue is not specific to our device: in fact we found it while using a multiqueue virtio device (32 queues), and traced the issue into this core ethdev functionality.
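
To make the failure mode concrete, here is a minimal caller-side sketch using
the public xstats API of this period (prototypes from memory, error handling
trimmed); before the patch, with more queues configured than
RTE_ETHDEV_QUEUE_STAT_CNTRS, the tail of the per-queue entries printed here
came back as junk:

    #include <stdio.h>
    #include <stdlib.h>
    #include <inttypes.h>
    #include <rte_ethdev.h>

    static void dump_xstats(uint8_t port_id)
    {
        /* Passing NULL/0 returns the number of entries the port exposes. */
        int n = rte_eth_xstats_get(port_id, NULL, 0);
        if (n <= 0)
            return;

        struct rte_eth_xstat *vals = calloc(n, sizeof(*vals));
        struct rte_eth_xstat_name *names = calloc(n, sizeof(*names));
        if (vals != NULL && names != NULL &&
            rte_eth_xstats_get_names(port_id, names, n) == n &&
            rte_eth_xstats_get(port_id, vals, n) == n) {
            for (int i = 0; i < n; i++)
                printf("%s: %" PRIu64 "\n", names[i].name, vals[i].value);
        }
        free(vals);
        free(names);
    }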

> And I have a bad feeling it could break something else (really just a feeling).
> It is not the kind of patch we can apply the last day of a release.
> That's why I think it should wait 17.02.
> 
> Of course you can try to convince me and others to apply it as a last minute
> patch. But why are you sending a patch on the generic API in the last days?
> 
> Last argument: it is not fixing a regression of 16.11, so it is not so urgent.

Yes, it looks like this bug was present even in DPDK 2.2, and possibly before that too.
  

Patch

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fde8112..4209ad0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1343,8 +1343,10 @@  get_xstats_count(uint8_t port_id)
 	} else
 		count = 0;
 	count += RTE_NB_STATS;
-	count += dev->data->nb_rx_queues * RTE_NB_RXQ_STATS;
-	count += dev->data->nb_tx_queues * RTE_NB_TXQ_STATS;
+	count += RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
+		 RTE_NB_RXQ_STATS;
+	count += RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
+		 RTE_NB_TXQ_STATS;
 	return count;
 }
 
@@ -1358,6 +1360,7 @@  rte_eth_xstats_get_names(uint8_t port_id,
 	int cnt_expected_entries;
 	int cnt_driver_entries;
 	uint32_t idx, id_queue;
+	uint16_t num_q;
 
 	cnt_expected_entries = get_xstats_count(port_id);
 	if (xstats_names == NULL || cnt_expected_entries < 0 ||
@@ -1374,7 +1377,8 @@  rte_eth_xstats_get_names(uint8_t port_id,
 			"%s", rte_stats_strings[idx].name);
 		cnt_used_entries++;
 	}
-	for (id_queue = 0; id_queue < dev->data->nb_rx_queues; id_queue++) {
+	num_q = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+	for (id_queue = 0; id_queue < num_q; id_queue++) {
 		for (idx = 0; idx < RTE_NB_RXQ_STATS; idx++) {
 			snprintf(xstats_names[cnt_used_entries].name,
 				sizeof(xstats_names[0].name),
@@ -1384,7 +1388,8 @@  rte_eth_xstats_get_names(uint8_t port_id,
 		}
 
 	}
-	for (id_queue = 0; id_queue < dev->data->nb_tx_queues; id_queue++) {
+	num_q = RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+	for (id_queue = 0; id_queue < num_q; id_queue++) {
 		for (idx = 0; idx < RTE_NB_TXQ_STATS; idx++) {
 			snprintf(xstats_names[cnt_used_entries].name,
 				sizeof(xstats_names[0].name),
@@ -1420,14 +1425,18 @@  rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
 	unsigned count = 0, i, q;
 	signed xcount = 0;
 	uint64_t val, *stats_ptr;
+	uint16_t nb_rxqs, nb_txqs;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
 	dev = &rte_eth_devices[port_id];
 
+	nb_rxqs = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+	nb_txqs = RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+
 	/* Return generic statistics */
-	count = RTE_NB_STATS + (dev->data->nb_rx_queues * RTE_NB_RXQ_STATS) +
-		(dev->data->nb_tx_queues * RTE_NB_TXQ_STATS);
+	count = RTE_NB_STATS + (nb_rxqs * RTE_NB_RXQ_STATS) +
+		(nb_txqs * RTE_NB_TXQ_STATS);
 
 	/* implemented by the driver */
 	if (dev->dev_ops->xstats_get != NULL) {
@@ -1458,7 +1467,7 @@  rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
 	}
 
 	/* per-rxq stats */
-	for (q = 0; q < dev->data->nb_rx_queues; q++) {
+	for (q = 0; q < nb_rxqs; q++) {
 		for (i = 0; i < RTE_NB_RXQ_STATS; i++) {
 			stats_ptr = RTE_PTR_ADD(&eth_stats,
 					rte_rxq_stats_strings[i].offset +
@@ -1469,7 +1478,7 @@  rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
 	}
 
 	/* per-txq stats */
-	for (q = 0; q < dev->data->nb_tx_queues; q++) {
+	for (q = 0; q < nb_txqs; q++) {
 		for (i = 0; i < RTE_NB_TXQ_STATS; i++) {
 			stats_ptr = RTE_PTR_ADD(&eth_stats,
 					rte_txq_stats_strings[i].offset +