netvsc: optimize stats counters performance
Checks
Commit Message
Optimized the performance of updating the statistics counters by reducing
the number of branches.
Ordered the packet size comparisons according to the probability with
typical internet traffic mix.
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
drivers/net/netvsc/hn_rxtx.c | 32 ++++++++++----------------------
1 file changed, 10 insertions(+), 22 deletions(-)
Comments
> Subject: [PATCH] netvsc: optimize stats counters performance
>
> Optimized the performance of updating the statistics counters by reducing the
> number of branches.
>
> Ordered the packet size comparisons according to the probability with typical
> internet traffic mix.
>
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
> drivers/net/netvsc/hn_rxtx.c | 32 ++++++++++----------------------
> 1 file changed, 10 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/net/netvsc/hn_rxtx.c b/drivers/net/netvsc/hn_rxtx.c index
> 9bf1ec5509..b704b2c971 100644
> --- a/drivers/net/netvsc/hn_rxtx.c
> +++ b/drivers/net/netvsc/hn_rxtx.c
> @@ -110,30 +110,18 @@ hn_update_packet_stats(struct hn_stats *stats,
> const struct rte_mbuf *m)
> uint32_t s = m->pkt_len;
> const struct rte_ether_addr *ea;
>
> - if (s == 64) {
> - stats->size_bins[1]++;
> - } else if (s > 64 && s < 1024) {
> - uint32_t bin;
> -
> - /* count zeros, and offset into correct bin */
> - bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
> - stats->size_bins[bin]++;
> - } else {
> - if (s < 64)
> - stats->size_bins[0]++;
> - else if (s < 1519)
> - stats->size_bins[6]++;
> - else
> - stats->size_bins[7]++;
> - }
> + if (s >= 1024)
> + stats->size_bins[6 + (s > 1518)]++;
> + else if (s <= 64)
> + stats->size_bins[s >> 6]++;
> + else
> + stats->size_bins[32UL - rte_clz32(s) - 5]++;
This part looks good.
>
> ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> - if (rte_is_multicast_ether_addr(ea)) {
> - if (rte_is_broadcast_ether_addr(ea))
> - stats->broadcast++;
> - else
> - stats->multicast++;
> - }
> + RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> + offsetof(struct hn_stats, multicast) + sizeof(uint64_t));
> + if (unlikely(rte_is_multicast_ether_addr(ea)))
> + (&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
> }
This makes the code a little harder to read. How about just add "unlikely" to rte_is_multicast_ether_addr(ea) and keep the rest unchanged?
Thank you,
Long
> From: Long Li [mailto:longli@microsoft.com]
> Sent: Friday, 2 August 2024 18.49
>
> > Subject: [PATCH] netvsc: optimize stats counters performance
> >
> > Optimized the performance of updating the statistics counters by
> reducing the
> > number of branches.
> >
> > Ordered the packet size comparisons according to the probability with
> typical
> > internet traffic mix.
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> > drivers/net/netvsc/hn_rxtx.c | 32 ++++++++++----------------------
> > 1 file changed, 10 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/net/netvsc/hn_rxtx.c
> b/drivers/net/netvsc/hn_rxtx.c index
> > 9bf1ec5509..b704b2c971 100644
> > --- a/drivers/net/netvsc/hn_rxtx.c
> > +++ b/drivers/net/netvsc/hn_rxtx.c
> > @@ -110,30 +110,18 @@ hn_update_packet_stats(struct hn_stats *stats,
> > const struct rte_mbuf *m)
> > uint32_t s = m->pkt_len;
> > const struct rte_ether_addr *ea;
> >
> > - if (s == 64) {
> > - stats->size_bins[1]++;
> > - } else if (s > 64 && s < 1024) {
> > - uint32_t bin;
> > -
> > - /* count zeros, and offset into correct bin */
> > - bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
> > - stats->size_bins[bin]++;
> > - } else {
> > - if (s < 64)
> > - stats->size_bins[0]++;
> > - else if (s < 1519)
> > - stats->size_bins[6]++;
> > - else
> > - stats->size_bins[7]++;
> > - }
> > + if (s >= 1024)
> > + stats->size_bins[6 + (s > 1518)]++;
> > + else if (s <= 64)
> > + stats->size_bins[s >> 6]++;
> > + else
> > + stats->size_bins[32UL - rte_clz32(s) - 5]++;
>
> This part looks good.
>
> >
> > ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > - if (rte_is_multicast_ether_addr(ea)) {
> > - if (rte_is_broadcast_ether_addr(ea))
> > - stats->broadcast++;
> > - else
> > - stats->multicast++;
> > - }
> > + RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > + offsetof(struct hn_stats, multicast) +
> sizeof(uint64_t));
> > + if (unlikely(rte_is_multicast_ether_addr(ea)))
> > + (&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
> > }
>
> This makes the code a little harder to read.
I agree it is somewhat convoluted.
It's a tradeoff... I preferred performance at the cost of making the code somewhat harder to read.
The RTE_BUILD_BUG_ON() also helps showing what is going on with the weird indexing.
> How about just add
> "unlikely" to rte_is_multicast_ether_addr(ea) and keep the rest
> unchanged?
Then I could also add "unlikely" to is_broadcast().
In theory, there should be minimal broadcast traffic in any normal network. (Except when experiencing broadcast storms due to network loops or other network problems.)
But in reality, I think most real life networks see less multicast (non-broadcast) than broadcast traffic.
I don' think the following alternative makes the code significantly more readable than the method in this patch, but I'll mention it for the sake of discussion:
I could modify the hn_stats type like this:
struct hn_stats {
other fields...
+ union {
+ struct {
uint64_t multicast;
uint64_t broadcast;
+ };
+ uint64_t multicast_broadcast[2];
+ };
other fields...
};
So the code would become:
+ if (unlikely(rte_is_multicast_ether_addr(ea)))
+ stats->multicast_broadcast[rte_is_broadcast_ether_addr(ea)]++;
Whatever we decide, we should use the same method in all three patches (netvsc, virtio, vhost-user).
The choices are:
1. Stick with the original code (with two branches), and add unlikely().
2. Use the method provided in this patch (with only one branch).
3. Introduce a multicast_broadcast[2] to the stats structure as described above (also with only one branch).
@Long, @Wei, @Maxime, @Chenbo, @Stephen and anyone else interested, please cast your votes.
Comments are also welcome! :-)
I'm in favor of #2.
On Fri, 2 Aug 2024 19:28:26 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:
> > > ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > > - if (rte_is_multicast_ether_addr(ea)) {
> > > - if (rte_is_broadcast_ether_addr(ea))
> > > - stats->broadcast++;
> > > - else
> > > - stats->multicast++;
> > > - }
> > > + RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > > + offsetof(struct hn_stats, multicast) +
> > sizeof(uint64_t));
> > > + if (unlikely(rte_is_multicast_ether_addr(ea)))
> > > + (&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
> > > }
> >
> > This makes the code a little harder to read.
>
> I agree it is somewhat convoluted.
> It's a tradeoff... I preferred performance at the cost of making the code somewhat harder to read.
> The RTE_BUILD_BUG_ON() also helps showing what is going on with the weird indexing.
Optimizing for multicast packets is not worth bothering.
Keep the original code it is simpler.
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Friday, 2 August 2024 19.33
>
> On Fri, 2 Aug 2024 19:28:26 +0200
> Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > > > ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > > > - if (rte_is_multicast_ether_addr(ea)) {
> > > > - if (rte_is_broadcast_ether_addr(ea))
> > > > - stats->broadcast++;
> > > > - else
> > > > - stats->multicast++;
> > > > - }
> > > > + RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > > > + offsetof(struct hn_stats, multicast) +
> > > sizeof(uint64_t));
> > > > + if (unlikely(rte_is_multicast_ether_addr(ea)))
> > > > + (&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
> > > > }
> > >
> > > This makes the code a little harder to read.
> >
> > I agree it is somewhat convoluted.
> > It's a tradeoff... I preferred performance at the cost of making the code
> somewhat harder to read.
> > The RTE_BUILD_BUG_ON() also helps showing what is going on with the weird
> indexing.
Similar patches have been accepted by other drivers:
[virtio]: https://patchwork.dpdk.org/project/dpdk/patch/20240801160312.205281-1-mb@smartsharesystems.com/
[vhost-user]: https://patchwork.dpdk.org/project/dpdk/patch/20240802143259.269827-1-mb@smartsharesystems.com/
>
> Optimizing for multicast packets is not worth bothering.
Optimizing for multicast/broadcast comes into play in multicast environments, and during network broadcast storms.
Although I don't know if any of those two scenarios are relevant for this specific driver.
> Keep the original code it is simpler.
Let's keep similar code similar across drivers.
@Long, @Wei, please Review/Ack, so the patch can be applied.
> Subject: RE: [PATCH] netvsc: optimize stats counters performance
>
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Friday, 2 August 2024 19.33
> >
> > On Fri, 2 Aug 2024 19:28:26 +0200
> > Morten Brørup <mb@smartsharesystems.com> wrote:
> >
> > > > > ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > > > > - if (rte_is_multicast_ether_addr(ea)) {
> > > > > - if (rte_is_broadcast_ether_addr(ea))
> > > > > - stats->broadcast++;
> > > > > - else
> > > > > - stats->multicast++;
> > > > > - }
> > > > > + RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > > > > + offsetof(struct hn_stats, multicast) +
> > > > sizeof(uint64_t));
> > > > > + if (unlikely(rte_is_multicast_ether_addr(ea)))
> > > > > + (&stats-
> >multicast)[rte_is_broadcast_ether_addr(ea)]++;
> > > > > }
> > > >
> > > > This makes the code a little harder to read.
> > >
> > > I agree it is somewhat convoluted.
> > > It's a tradeoff... I preferred performance at the cost of making the code
> > somewhat harder to read.
> > > The RTE_BUILD_BUG_ON() also helps showing what is going on with the
> weird
> > indexing.
>
> Similar patches have been accepted by other drivers:
> [virtio]:
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwor
> k.dpdk.org%2Fproject%2Fdpdk%2Fpatch%2F20240801160312.205281-1-
> mb%40smartsharesystems.com%2F&data=05%7C02%7Clongli%40microsoft.com
> %7C014c845ad8ec4ea25fb508dcd8b22474%7C72f988bf86f141af91ab2d7cd011
> db47%7C1%7C0%7C638623506854135960%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0
> %7C%7C%7C&sdata=DrVP6TuTruREAxAoSEBED11TOapSqDN5stsDJJTuux0%3D&r
> eserved=0
> [vhost-user]:
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwor
> k.dpdk.org%2Fproject%2Fdpdk%2Fpatch%2F20240802143259.269827-1-
> mb%40smartsharesystems.com%2F&data=05%7C02%7Clongli%40microsoft.com
> %7C014c845ad8ec4ea25fb508dcd8b22474%7C72f988bf86f141af91ab2d7cd011
> db47%7C1%7C0%7C638623506854150865%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0
> %7C%7C%7C&sdata=85T9qnC4nbAwU53sxnfQ2lMayp5soX8BMEJsbB58ZBw%3D
> &reserved=0
>
> >
> > Optimizing for multicast packets is not worth bothering.
>
> Optimizing for multicast/broadcast comes into play in multicast environments,
> and during network broadcast storms.
> Although I don't know if any of those two scenarios are relevant for this specific
> driver.
>
> > Keep the original code it is simpler.
>
> Let's keep similar code similar across drivers.
>
> @Long, @Wei, please Review/Ack, so the patch can be applied.
Reviewed-by: Long Li <longli@microsoft.com>
@@ -110,30 +110,18 @@ hn_update_packet_stats(struct hn_stats *stats, const struct rte_mbuf *m)
uint32_t s = m->pkt_len;
const struct rte_ether_addr *ea;
- if (s == 64) {
- stats->size_bins[1]++;
- } else if (s > 64 && s < 1024) {
- uint32_t bin;
-
- /* count zeros, and offset into correct bin */
- bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
- stats->size_bins[bin]++;
- } else {
- if (s < 64)
- stats->size_bins[0]++;
- else if (s < 1519)
- stats->size_bins[6]++;
- else
- stats->size_bins[7]++;
- }
+ if (s >= 1024)
+ stats->size_bins[6 + (s > 1518)]++;
+ else if (s <= 64)
+ stats->size_bins[s >> 6]++;
+ else
+ stats->size_bins[32UL - rte_clz32(s) - 5]++;
ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
- if (rte_is_multicast_ether_addr(ea)) {
- if (rte_is_broadcast_ether_addr(ea))
- stats->broadcast++;
- else
- stats->multicast++;
- }
+ RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
+ offsetof(struct hn_stats, multicast) + sizeof(uint64_t));
+ if (unlikely(rte_is_multicast_ether_addr(ea)))
+ (&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
}
static inline unsigned int hn_rndis_pktlen(const struct rndis_packet_msg *pkt)