[v2] ethdev: add Linux ethtool link mode conversion

Message ID: 20240229154343.1752555-1-thomas@monjalon.net (mailing list archive)
State: Superseded, archived
Delegated to: Ferruh Yigit
Series: [v2] ethdev: add Linux ethtool link mode conversion

Checks

Context                       Check    Description
ci/checkpatch                 success  coding style OK
ci/loongarch-compilation      success  Compilation OK
ci/loongarch-unit-testing     success  Unit Testing PASS
ci/Intel-compilation          success  Compilation OK
ci/intel-Testing              success  Testing PASS
ci/github-robot: build        success  github build: passed
ci/intel-Functional           success  Functional PASS
ci/iol-intel-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/iol-abi-testing            success  Testing PASS
ci/iol-unit-arm64-testing     success  Testing PASS
ci/iol-compile-amd64-testing  success  Testing PASS
ci/iol-compile-arm64-testing  success  Testing PASS
ci/iol-unit-amd64-testing     success  Testing PASS
ci/iol-intel-Functional       success  Functional Testing PASS
ci/iol-broadcom-Performance   success  Performance Testing PASS
ci/iol-broadcom-Functional    success  Functional Testing PASS
ci/iol-sample-apps-testing    success  Testing PASS

Commit Message

Thomas Monjalon Feb. 29, 2024, 3:42 p.m. UTC
Speed capabilities of a NIC may be discovered through its Linux
kernel driver. It is especially useful for bifurcated drivers,
so they don't have to duplicate the same logic in the DPDK driver.

Parsing ethtool speed capabilities is made easy thanks to
the functions added in ethdev for internal usage only.
Of course these functions work only on Linux,
so they are not compiled in other environments.

To ease parsing, the ethtool macro names are processed
externally by a shell command which generates a C array
included in this patch.
This also avoids depending on a kernel version.
The C array should be updated in the future to pick up the latest ethtool bits.
Note that it is easier to update this array than to add new cases
in parsing code.

The types in the functions follow the ethtool types:
uint32_t for bitmaps, and int8_t for the number of 32-bit words per bitmap.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---

A follow-up patch will be sent to use these functions in mlx5.
I suspect mana could use this parsing as well.

---
 lib/ethdev/ethdev_linux_ethtool.c | 161 ++++++++++++++++++++++++++++++
 lib/ethdev/ethdev_linux_ethtool.h |  41 ++++++++
 lib/ethdev/meson.build            |   9 ++
 lib/ethdev/version.map            |   3 +
 4 files changed, 214 insertions(+)
 create mode 100644 lib/ethdev/ethdev_linux_ethtool.c
 create mode 100644 lib/ethdev/ethdev_linux_ethtool.h
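
For readers skimming the thread, the encoding described in the commit message can be illustrated with a minimal standalone sketch. This is not the patch code: the names `demo_link_modes`, `struct demo_mode` and `demo_decode` are hypothetical, and only a handful of table entries are copied. It assumes, as the patch comment states, that each entry holds the speed in Mbps with the least significant bit set for half duplex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical excerpt: the index is the ethtool link mode bit number,
 * the value is the speed in Mbps with the LSB set for half duplex. */
static const uint32_t demo_link_modes[] = {
	 [0] =    11, /* 10baseT Half */
	 [1] =    10, /* 10baseT Full */
	 [2] =   101, /* 100baseT Half */
	 [5] =  1000, /* 1000baseT Full */
	[12] = 10000, /* 10000baseT Full */
};

struct demo_mode {
	uint32_t speed_mbps;
	int half_duplex;
};

/* Decode one bit number; returns 0 on success, -1 if the bit is unknown. */
static int
demo_decode(unsigned int bit, struct demo_mode *mode)
{
	uint32_t raw;

	assert(mode != NULL);
	/* Bits beyond the table are unknown. */
	if (bit >= sizeof(demo_link_modes) / sizeof(demo_link_modes[0]))
		return -1;
	raw = demo_link_modes[bit];
	/* Holes in the designated initializers are zero, hence unknown. */
	if (raw == 0)
		return -1;
	mode->half_duplex = raw & 1;
	mode->speed_mbps = raw & ~UINT32_C(1);
	return 0;
}
```

The trick only works because every speed in the table is an even number of Mbps, which leaves the lowest bit free to carry the duplex.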
  

Comments

Stephen Hemminger Feb. 29, 2024, 4:45 p.m. UTC | #1
On Thu, 29 Feb 2024 16:42:56 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> +/* Link modes sorted with index as defined in ethtool.
> + * Values are speed in Mbps with LSB indicating duplex.
> + *
> + * The ethtool bits definition should not change as it is a kernel API.
> + * Using raw numbers directly avoids checking API availability
> + * and allows to compile with new bits included even on an old kernel.
> + *
> + * The array below is built from bit definitions with this shell command:
> + *   sed -rn 's;.*(ETHTOOL_LINK_MODE_)([0-9]+)([0-9a-zA-Z_]*).*= *([0-9]*).*;'\
> + *           '[\4] = \2, /\* \1\2\3 *\/;p' /usr/include/linux/ethtool.h |
> + *   awk '/_Half_/{$3=$3+1","}1'
> + */
> +static uint32_t link_modes[] = {

Make it const please.

You could add a meson rule to generate it and then use non-numeric tags.
  
Morten Brørup Feb. 29, 2024, 4:58 p.m. UTC | #2
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 29 February 2024 17.45
> 
> On Thu, 29 Feb 2024 16:42:56 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > +/* Link modes sorted with index as defined in ethtool.
> > + * Values are speed in Mbps with LSB indicating duplex.
> > + *
> > + * The ethtool bits definition should not change as it is a kernel
> API.
> > + * Using raw numbers directly avoids checking API availability
> > + * and allows to compile with new bits included even on an old
> kernel.
> > + *
> > + * The array below is built from bit definitions with this shell
> command:
> > + *   sed -rn 's;.*(ETHTOOL_LINK_MODE_)([0-9]+)([0-9a-zA-Z_]*).*=
> *([0-9]*).*;'\
> > + *           '[\4] = \2, /\* \1\2\3 *\/;p'
> /usr/include/linux/ethtool.h |
> > + *   awk '/_Half_/{$3=$3+1","}1'
> > + */
> > +static uint32_t link_modes[] = {
> 
> Make it const please.
> 
> You could add meson rule to generate it and then use non-numeric tags.

However you do it, make sure it cross builds. The kernel/ethtool on the target system may differ from the one on the build system.
  
Stephen Hemminger Feb. 29, 2024, 5:38 p.m. UTC | #3
On Thu, 29 Feb 2024 17:58:13 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:

> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Thursday, 29 February 2024 17.45
> > 
> > On Thu, 29 Feb 2024 16:42:56 +0100
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> >   
> > > +/* Link modes sorted with index as defined in ethtool.
> > > + * Values are speed in Mbps with LSB indicating duplex.
> > > + *
> > > + * The ethtool bits definition should not change as it is a kernel  
> > API.  
> > > + * Using raw numbers directly avoids checking API availability
> > > + * and allows to compile with new bits included even on an old  
> > kernel.  
> > > + *
> > > + * The array below is built from bit definitions with this shell  
> > command:  
> > > + *   sed -rn 's;.*(ETHTOOL_LINK_MODE_)([0-9]+)([0-9a-zA-Z_]*).*=  
> > *([0-9]*).*;'\  
> > > + *           '[\4] = \2, /\* \1\2\3 *\/;p'  
> > /usr/include/linux/ethtool.h |  
> > > + *   awk '/_Half_/{$3=$3+1","}1'
> > > + */
> > > +static uint32_t link_modes[] = {  
> > 
> > Make it const please.
> > 
> > You could add meson rule to generate it and then use non-numeric tags.  
> 
> However you do it, make sure it cross builds. The kernel/ethtool on the target system may differ from the one on the build system.
> 

If the build system is older, the speed table will be smaller, and the code should just print "Unknown".
If the build system is newer, then the table will be larger than anything the kernel ever returns, which is OK.
  
Thomas Monjalon March 1, 2024, 10:27 a.m. UTC | #4
29/02/2024 18:38, Stephen Hemminger:
> On Thu, 29 Feb 2024 17:58:13 +0100
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > Sent: Thursday, 29 February 2024 17.45
> > > 
> > > On Thu, 29 Feb 2024 16:42:56 +0100
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > >   
> > > > +/* Link modes sorted with index as defined in ethtool.
> > > > + * Values are speed in Mbps with LSB indicating duplex.
> > > > + *
> > > > + * The ethtool bits definition should not change as it is a kernel  
> > > API.  
> > > > + * Using raw numbers directly avoids checking API availability
> > > > + * and allows to compile with new bits included even on an old  
> > > kernel.  
> > > > + *
> > > > + * The array below is built from bit definitions with this shell  
> > > command:  
> > > > + *   sed -rn 's;.*(ETHTOOL_LINK_MODE_)([0-9]+)([0-9a-zA-Z_]*).*=  
> > > *([0-9]*).*;'\  
> > > > + *           '[\4] = \2, /\* \1\2\3 *\/;p'  
> > > /usr/include/linux/ethtool.h |  
> > > > + *   awk '/_Half_/{$3=$3+1","}1'
> > > > + */
> > > > +static uint32_t link_modes[] = {  
> > > 
> > > Make it const please.

Yes


> > > You could add meson rule to generate it and then use non-numeric tags.  
> > 
> > However you do it, make sure it cross builds. The kernel/ethtool on the target system may differ from the one on the build system.
> > 
> 
> If the build system is older, the speed table will be smaller. And the code should just print "Unknown"
> If the build system is newer, then the table will be larger than kernel ever returns which is Ok.

There is no benefit in having a smaller table.
That's why I prefer using numeric indices with the best coverage possible.
  
Ferruh Yigit March 1, 2024, 1:12 p.m. UTC | #5
On 2/29/2024 3:42 PM, Thomas Monjalon wrote:
> Speed capabilities of a NIC may be discovered through its Linux
> kernel driver. It is especially useful for bifurcated drivers,
> so they don't have to duplicate the same logic in the DPDK driver.
> 
> Parsing ethtool speed capabilities is made easy thanks to
> the functions added in ethdev for internal usage only.
> Of course these functions work only on Linux,
> so they are not compiled in other environments.
> 
> In order to ease parsing, the ethtool macro names are parsed
> externally in a shell command which generates a C array
> included in this patch.
> It also avoids to depend on a kernel version.
> This C array should be updated in future to get latest ethtool bits.
> Note it is easier to update this array than adding new cases
> in a parsing code.
> 
> The types in the functions are following the ethtool type:
> uint32_t for bitmaps, and int8_t for the number of 32-bitmaps.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
> 
> A follow-up patch will be sent to use these functions in mlx5.
> I suspect mana could use this parsing as well.
>

Is the use case that the driver gets link info via ibverbs and converts it
to DPDK link info?

How complex is it, or how much effort is duplicated, to get link info
directly via DPDK functions?
Because this approach can be applied to only a limited set of devices in
DPDK and solves an issue for which DPDK already has a solution, is it
worth the code it adds?

> ---
>  lib/ethdev/ethdev_linux_ethtool.c | 161 ++++++++++++++++++++++++++++++
>  lib/ethdev/ethdev_linux_ethtool.h |  41 ++++++++
>  lib/ethdev/meson.build            |   9 ++
>  lib/ethdev/version.map            |   3 +
>  4 files changed, 214 insertions(+)
>  create mode 100644 lib/ethdev/ethdev_linux_ethtool.c
>  create mode 100644 lib/ethdev/ethdev_linux_ethtool.h
> 
> diff --git a/lib/ethdev/ethdev_linux_ethtool.c b/lib/ethdev/ethdev_linux_ethtool.c
> new file mode 100644
> index 0000000000..0ece172a75
> --- /dev/null
> +++ b/lib/ethdev/ethdev_linux_ethtool.c
> @@ -0,0 +1,161 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2024 NVIDIA Corporation & Affiliates
> + */
> +
> +#include <rte_bitops.h>
> +
> +#include "rte_ethdev.h"
> +#include "ethdev_linux_ethtool.h"
> +
> +/* Link modes sorted with index as defined in ethtool.
> + * Values are speed in Mbps with LSB indicating duplex.
> + *
> + * The ethtool bits definition should not change as it is a kernel API.
> + * Using raw numbers directly avoids checking API availability
> + * and allows to compile with new bits included even on an old kernel.
> + *
> + * The array below is built from bit definitions with this shell command:
> + *   sed -rn 's;.*(ETHTOOL_LINK_MODE_)([0-9]+)([0-9a-zA-Z_]*).*= *([0-9]*).*;'\
> + *           '[\4] = \2, /\* \1\2\3 *\/;p' /usr/include/linux/ethtool.h |
> + *   awk '/_Half_/{$3=$3+1","}1'
> + */
> +static uint32_t link_modes[] = {
> +	  [0] =      11, /* ETHTOOL_LINK_MODE_10baseT_Half_BIT */
> +	  [1] =      10, /* ETHTOOL_LINK_MODE_10baseT_Full_BIT */
> +	  [2] =     101, /* ETHTOOL_LINK_MODE_100baseT_Half_BIT */
> +	  [3] =     100, /* ETHTOOL_LINK_MODE_100baseT_Full_BIT */
> +	  [4] =    1001, /* ETHTOOL_LINK_MODE_1000baseT_Half_BIT */
> +	  [5] =    1000, /* ETHTOOL_LINK_MODE_1000baseT_Full_BIT */
> +	 [12] =   10000, /* ETHTOOL_LINK_MODE_10000baseT_Full_BIT */
> +	 [15] =    2500, /* ETHTOOL_LINK_MODE_2500baseX_Full_BIT */
> +	 [17] =    1000, /* ETHTOOL_LINK_MODE_1000baseKX_Full_BIT */
> +	 [18] =   10000, /* ETHTOOL_LINK_MODE_10000baseKX4_Full_BIT */
> +	 [19] =   10000, /* ETHTOOL_LINK_MODE_10000baseKR_Full_BIT */
> +	 [20] =   10000, /* ETHTOOL_LINK_MODE_10000baseR_FEC_BIT */
> +	 [21] =   20000, /* ETHTOOL_LINK_MODE_20000baseMLD2_Full_BIT */
> +	 [22] =   20000, /* ETHTOOL_LINK_MODE_20000baseKR2_Full_BIT */
> +	 [23] =   40000, /* ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT */
> +	 [24] =   40000, /* ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT */
> +	 [25] =   40000, /* ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT */
> +	 [26] =   40000, /* ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT */
> +	 [27] =   56000, /* ETHTOOL_LINK_MODE_56000baseKR4_Full_BIT */
> +	 [28] =   56000, /* ETHTOOL_LINK_MODE_56000baseCR4_Full_BIT */
> +	 [29] =   56000, /* ETHTOOL_LINK_MODE_56000baseSR4_Full_BIT */
> +	 [30] =   56000, /* ETHTOOL_LINK_MODE_56000baseLR4_Full_BIT */
> +	 [31] =   25000, /* ETHTOOL_LINK_MODE_25000baseCR_Full_BIT */
> +	 [32] =   25000, /* ETHTOOL_LINK_MODE_25000baseKR_Full_BIT */
> +	 [33] =   25000, /* ETHTOOL_LINK_MODE_25000baseSR_Full_BIT */
> +	 [34] =   50000, /* ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT */
> +	 [35] =   50000, /* ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT */
> +	 [36] =  100000, /* ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT */
> +	 [37] =  100000, /* ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT */
> +	 [38] =  100000, /* ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT */
> +	 [39] =  100000, /* ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT */
> +	 [40] =   50000, /* ETHTOOL_LINK_MODE_50000baseSR2_Full_BIT */
> +	 [41] =    1000, /* ETHTOOL_LINK_MODE_1000baseX_Full_BIT */
> +	 [42] =   10000, /* ETHTOOL_LINK_MODE_10000baseCR_Full_BIT */
> +	 [43] =   10000, /* ETHTOOL_LINK_MODE_10000baseSR_Full_BIT */
> +	 [44] =   10000, /* ETHTOOL_LINK_MODE_10000baseLR_Full_BIT */
> +	 [45] =   10000, /* ETHTOOL_LINK_MODE_10000baseLRM_Full_BIT */
> +	 [46] =   10000, /* ETHTOOL_LINK_MODE_10000baseER_Full_BIT */
> +	 [47] =    2500, /* ETHTOOL_LINK_MODE_2500baseT_Full_BIT */
> +	 [48] =    5000, /* ETHTOOL_LINK_MODE_5000baseT_Full_BIT */
> +	 [52] =   50000, /* ETHTOOL_LINK_MODE_50000baseKR_Full_BIT */
> +	 [53] =   50000, /* ETHTOOL_LINK_MODE_50000baseSR_Full_BIT */
> +	 [54] =   50000, /* ETHTOOL_LINK_MODE_50000baseCR_Full_BIT */
> +	 [55] =   50000, /* ETHTOOL_LINK_MODE_50000baseLR_ER_FR_Full_BIT */
> +	 [56] =   50000, /* ETHTOOL_LINK_MODE_50000baseDR_Full_BIT */
> +	 [57] =  100000, /* ETHTOOL_LINK_MODE_100000baseKR2_Full_BIT */
> +	 [58] =  100000, /* ETHTOOL_LINK_MODE_100000baseSR2_Full_BIT */
> +	 [59] =  100000, /* ETHTOOL_LINK_MODE_100000baseCR2_Full_BIT */
> +	 [60] =  100000, /* ETHTOOL_LINK_MODE_100000baseLR2_ER2_FR2_Full_BIT */
> +	 [61] =  100000, /* ETHTOOL_LINK_MODE_100000baseDR2_Full_BIT */
> +	 [62] =  200000, /* ETHTOOL_LINK_MODE_200000baseKR4_Full_BIT */
> +	 [63] =  200000, /* ETHTOOL_LINK_MODE_200000baseSR4_Full_BIT */
> +	 [64] =  200000, /* ETHTOOL_LINK_MODE_200000baseLR4_ER4_FR4_Full_BIT */
> +	 [65] =  200000, /* ETHTOOL_LINK_MODE_200000baseDR4_Full_BIT */
> +	 [66] =  200000, /* ETHTOOL_LINK_MODE_200000baseCR4_Full_BIT */
> +	 [67] =     100, /* ETHTOOL_LINK_MODE_100baseT1_Full_BIT */
> +	 [68] =    1000, /* ETHTOOL_LINK_MODE_1000baseT1_Full_BIT */
> +	 [69] =  400000, /* ETHTOOL_LINK_MODE_400000baseKR8_Full_BIT */
> +	 [70] =  400000, /* ETHTOOL_LINK_MODE_400000baseSR8_Full_BIT */
> +	 [71] =  400000, /* ETHTOOL_LINK_MODE_400000baseLR8_ER8_FR8_Full_BIT */
> +	 [72] =  400000, /* ETHTOOL_LINK_MODE_400000baseDR8_Full_BIT */
> +	 [73] =  400000, /* ETHTOOL_LINK_MODE_400000baseCR8_Full_BIT */
> +	 [75] =  100000, /* ETHTOOL_LINK_MODE_100000baseKR_Full_BIT */
> +	 [76] =  100000, /* ETHTOOL_LINK_MODE_100000baseSR_Full_BIT */
> +	 [77] =  100000, /* ETHTOOL_LINK_MODE_100000baseLR_ER_FR_Full_BIT */
> +	 [78] =  100000, /* ETHTOOL_LINK_MODE_100000baseCR_Full_BIT */
> +	 [79] =  100000, /* ETHTOOL_LINK_MODE_100000baseDR_Full_BIT */
> +	 [80] =  200000, /* ETHTOOL_LINK_MODE_200000baseKR2_Full_BIT */
> +	 [81] =  200000, /* ETHTOOL_LINK_MODE_200000baseSR2_Full_BIT */
> +	 [82] =  200000, /* ETHTOOL_LINK_MODE_200000baseLR2_ER2_FR2_Full_BIT */
> +	 [83] =  200000, /* ETHTOOL_LINK_MODE_200000baseDR2_Full_BIT */
> +	 [84] =  200000, /* ETHTOOL_LINK_MODE_200000baseCR2_Full_BIT */
> +	 [85] =  400000, /* ETHTOOL_LINK_MODE_400000baseKR4_Full_BIT */
> +	 [86] =  400000, /* ETHTOOL_LINK_MODE_400000baseSR4_Full_BIT */
> +	 [87] =  400000, /* ETHTOOL_LINK_MODE_400000baseLR4_ER4_FR4_Full_BIT */
> +	 [88] =  400000, /* ETHTOOL_LINK_MODE_400000baseDR4_Full_BIT */
> +	 [89] =  400000, /* ETHTOOL_LINK_MODE_400000baseCR4_Full_BIT */
> +	 [90] =     101, /* ETHTOOL_LINK_MODE_100baseFX_Half_BIT */
> +	 [91] =     100, /* ETHTOOL_LINK_MODE_100baseFX_Full_BIT */
> +	 [92] =      10, /* ETHTOOL_LINK_MODE_10baseT1L_Full_BIT */
> +	 [93] =  800000, /* ETHTOOL_LINK_MODE_800000baseCR8_Full_BIT */
> +	 [94] =  800000, /* ETHTOOL_LINK_MODE_800000baseKR8_Full_BIT */
> +	 [95] =  800000, /* ETHTOOL_LINK_MODE_800000baseDR8_Full_BIT */
> +	 [96] =  800000, /* ETHTOOL_LINK_MODE_800000baseDR8_2_Full_BIT */
> +	 [97] =  800000, /* ETHTOOL_LINK_MODE_800000baseSR8_Full_BIT */
> +	 [98] =  800000, /* ETHTOOL_LINK_MODE_800000baseVR8_Full_BIT */
> +	 [99] =      10, /* ETHTOOL_LINK_MODE_10baseT1S_Full_BIT */
> +	[100] =      11, /* ETHTOOL_LINK_MODE_10baseT1S_Half_BIT */
> +	[101] =      11, /* ETHTOOL_LINK_MODE_10baseT1S_P2MP_Half_BIT */
> +};
> +
> +uint32_t
> +rte_eth_link_speed_ethtool(enum ethtool_link_mode_bit_indices bit)
> +{
> +	uint32_t speed;
> +	int duplex;
> +
> +	/* get mode from array */
> +	if (bit >= RTE_DIM(link_modes))
> +		return RTE_ETH_LINK_SPEED_AUTONEG;
> +	speed = link_modes[bit];
> +	if (speed == 0)
> +		return RTE_ETH_LINK_SPEED_AUTONEG;
> +	RTE_BUILD_BUG_ON(RTE_ETH_LINK_SPEED_AUTONEG != 0);
>

I think that in the two checks above we can't really get the speed from
the provided ethtool enum, and the intention is to return something
ineffective rather than to really return AUTONEG, right? If so, why not
return 0 directly?

> +
> +	/* duplex is LSB */
> +	duplex = (speed & 1) ?
> +			RTE_ETH_LINK_HALF_DUPLEX :
> +			RTE_ETH_LINK_FULL_DUPLEX;
> +	speed &= RTE_GENMASK32(31, 1);
>

Since this is trying to zero the LSB, the following also works:

speed &= ~UINT32_C(1)
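
A quick standalone check of that equivalence, as a sketch only. `DEMO_GENMASK32` is a hypothetical local macro written for this example to build a mask of bits low..high; it is assumed to behave like `RTE_GENMASK32`, whose definition is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: mask with bits low..high (inclusive) set. */
#define DEMO_GENMASK32(high, low) \
	((UINT32_MAX << (low)) & (UINT32_MAX >> (31 - (high))))

/* Both spellings describe "every bit except bit 0". */
static_assert(DEMO_GENMASK32(31, 1) == ~UINT32_C(1), "masks differ");

/* Clear the LSB using the range mask. */
static uint32_t
demo_clear_lsb_genmask(uint32_t v)
{
	return v & DEMO_GENMASK32(31, 1);
}

/* Clear the LSB using the complement of 1. */
static uint32_t
demo_clear_lsb_not(uint32_t v)
{
	return v & ~UINT32_C(1);
}
```

With either helper, a table value such as 101 (100 Mbps, half duplex) becomes 100.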

> +
> +	return rte_eth_speed_bitflag(speed, duplex);
> +}
> +
> +uint32_t
> +rte_eth_link_speed_glink(const uint32_t *bitmap, int8_t nwords)
> +{
> +	uint8_t word, bit;
> +	uint32_t ethdev_bitmap = 0;
> +
> +	if (nwords < 1)
> +		return 0;
> +
> +	for (word = 0; word < nwords; word++) {
> +		for (bit = 0; bit < 32; bit++) {
>

Maybe (sizeof(bitmap) * CHAR_BIT) instead of the hardcoded 32, although I
am not sure it is required.
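
One caveat worth a standalone sketch: in this function `bitmap` is a pointer parameter, so `sizeof(bitmap)` measures the pointer, while `sizeof(*bitmap)` measures one 32-bit word. The helper names below are hypothetical and exist only to show the difference.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* The word width the loop relies on. */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "unexpected word width");

/* Bits in one element of the bitmap: 32 for uint32_t words. */
static size_t
demo_bits_per_word(const uint32_t *bitmap)
{
	return sizeof(*bitmap) * CHAR_BIT;
}

/* Bits in the pointer itself: platform dependent, often 64. */
static size_t
demo_bits_per_pointer(const uint32_t *bitmap)
{
	return sizeof(bitmap) * CHAR_BIT;
}
```

On a typical 64-bit build the second helper yields 64, which would make the inner loop run past the end of each word.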

> +			if ((bitmap[word] & RTE_BIT32(bit)) == 0)
> +				continue;
> +			ethdev_bitmap |= rte_eth_link_speed_ethtool(word * 32 + bit);
> +		}
> +	}
> +
> +	return ethdev_bitmap;
> +}
> +
> +uint32_t
> +rte_eth_link_speed_gset(uint32_t legacy_bitmap)
> +{
> +	return rte_eth_link_speed_glink(&legacy_bitmap, 1);
> +}
> diff --git a/lib/ethdev/ethdev_linux_ethtool.h b/lib/ethdev/ethdev_linux_ethtool.h
> new file mode 100644
> index 0000000000..de235bd5f4
> --- /dev/null
> +++ b/lib/ethdev/ethdev_linux_ethtool.h
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2024 NVIDIA Corporation & Affiliates
> + */
> +
> +#ifndef ETHDEV_ETHTOOL_H
> +#define ETHDEV_ETHTOOL_H
> +
> +#include <stdint.h>
> +#include <linux/ethtool.h>
> +
> +#include <rte_compat.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> + * Convert bit from ETHTOOL_LINK_MODE_* to RTE_ETH_LINK_SPEED_*
> + */
> +__rte_internal
> +uint32_t rte_eth_link_speed_ethtool(enum ethtool_link_mode_bit_indices bit);
> +
> +/*
> + * Convert bitmap from ETHTOOL_GLINKSETTINGS ethtool_link_settings::link_mode_masks
> + * to bitmap RTE_ETH_LINK_SPEED_*
> + */
> +__rte_internal
> +uint32_t rte_eth_link_speed_glink(const uint32_t *bitmap, int8_t nwords);
> +
> +/*
> + * Convert bitmap from deprecated ETHTOOL_GSET ethtool_cmd::supported
> + * to bitmap RTE_ETH_LINK_SPEED_*
> + */
> +__rte_internal
> +uint32_t rte_eth_link_speed_gset(uint32_t legacy_bitmap);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* ETHDEV_ETHTOOL_H */
> diff --git a/lib/ethdev/meson.build b/lib/ethdev/meson.build
> index d11f06bc88..f1d2586591 100644
> --- a/lib/ethdev/meson.build
> +++ b/lib/ethdev/meson.build
> @@ -44,6 +44,15 @@ driver_sdk_headers += files(
>          'ethdev_vdev.h',
>  )
>  
> +if is_linux
> +    driver_sdk_headers += files(
> +            'ethdev_linux_ethtool.h',
> +    )
> +    sources += files(
> +            'ethdev_linux_ethtool.c',
> +    )
> +endif
> +
>

Should meson check that 'linux/ethtool.h' exists, just in case?
  
Thomas Monjalon March 1, 2024, 1:37 p.m. UTC | #6
01/03/2024 14:12, Ferruh Yigit:
> On 2/29/2024 3:42 PM, Thomas Monjalon wrote:
> > Speed capabilities of a NIC may be discovered through its Linux
> > kernel driver. It is especially useful for bifurcated drivers,
> > so they don't have to duplicate the same logic in the DPDK driver.
> > 
> > Parsing ethtool speed capabilities is made easy thanks to
> > the functions added in ethdev for internal usage only.
> > Of course these functions work only on Linux,
> > so they are not compiled in other environments.
> > 
> > In order to ease parsing, the ethtool macro names are parsed
> > externally in a shell command which generates a C array
> > included in this patch.
> > It also avoids to depend on a kernel version.
> > This C array should be updated in future to get latest ethtool bits.
> > Note it is easier to update this array than adding new cases
> > in a parsing code.
> > 
> > The types in the functions are following the ethtool type:
> > uint32_t for bitmaps, and int8_t for the number of 32-bitmaps.
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> > 
> > A follow-up patch will be sent to use these functions in mlx5.
> > I suspect mana could use this parsing as well.
> >
> 
> Is the usecase driver get link info via ibverbs and convert it to DPDK
> link info?

The use case is to get capabilities from the kernel driver via ethtool ioctl.

> How complex or duplicated effort to get link info directly via DPDK
> functions?

This is done by the driver.
This is how the mlx5 driver gets speed capabilities.

> Because this approach is can be applied to only limited devices in DPDK
> and solving an issue DPDK already has a solution, does it worth to the
> code it adds?

It is going to replace code in mlx5 driver.
I could add this code in mlx5 driver,
but it could help other drivers in future like mana.

> > +	speed = link_modes[bit];
> > +	if (speed == 0)
> > +		return RTE_ETH_LINK_SPEED_AUTONEG;
> > +	RTE_BUILD_BUG_ON(RTE_ETH_LINK_SPEED_AUTONEG != 0);
> >
> 
> I think for above two checks, we can't really get the speed from
> provided ethtool enum, and intention is to return something ineffective,
> intention is not really return AUTONEG, right? If so why not directly
> return 0?

Yes it could return 0 directly, but the namespace of the returned value
is RTE_ETH_LINK_SPEED_.
Also it is semantically correct: if no other capability is found,
there is no other choice than autoneg.
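
The practical effect can be seen in a small standalone sketch of the bitmap walk. The flag values and all `demo_*` names are hypothetical and are not the real RTE_ETH_LINK_SPEED_* definitions; the only property carried over from the patch is that the autoneg value is zero, so OR-ing it into the result for an unknown bit changes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SPEED_AUTONEG 0u        /* hypothetical: no fixed speed */
#define DEMO_SPEED_10M     (1u << 1) /* hypothetical capability flags */
#define DEMO_SPEED_100M    (1u << 2)

/* Hypothetical per-bit conversion: unknown bits map to zero. */
static uint32_t
demo_bit_to_flag(unsigned int bit)
{
	switch (bit) {
	case 1:
		return DEMO_SPEED_10M;  /* 10baseT Full */
	case 3:
		return DEMO_SPEED_100M; /* 100baseT Full */
	default:
		return DEMO_SPEED_AUTONEG;
	}
}

/* Walk every set bit of every 32-bit word and accumulate the flags. */
static uint32_t
demo_bitmap_to_flags(const uint32_t *bitmap, int nwords)
{
	uint32_t flags = 0;
	int word, bit;

	assert(bitmap != NULL || nwords <= 0);
	for (word = 0; word < nwords; word++) {
		for (bit = 0; bit < 32; bit++) {
			if ((bitmap[word] & (UINT32_C(1) << bit)) == 0)
				continue;
			flags |= demo_bit_to_flag(word * 32 + bit);
		}
	}
	return flags;
}
```

A bitmap containing only unknown bits therefore converts to zero, which reads as autoneg in this scheme.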

> > +
> > +	/* duplex is LSB */
> > +	duplex = (speed & 1) ?
> > +			RTE_ETH_LINK_HALF_DUPLEX :
> > +			RTE_ETH_LINK_FULL_DUPLEX;
> > +	speed &= RTE_GENMASK32(31, 1);
> 
> As trying to zero the LSB, following also work,
> 
> speed &= ~UINT32_C(1)

Indeed, this is what RTE_GENMASK32 is doing.
But I think using RTE_GENMASK32 conveys the intent better.

[...]
> > +	for (word = 0; word < nwords; word++) {
> > +		for (bit = 0; bit < 32; bit++) {
> 
> May be (sizeof(bitmap) * CHAR_BIT) instead of hardcoded 32, although not
> sure if it is required.

Anyway we are using RTE_BIT32 below, so we must know it is 32 bits.

> > +			if ((bitmap[word] & RTE_BIT32(bit)) == 0)
> > +				continue;
> > +			ethdev_bitmap |= rte_eth_link_speed_ethtool(word * 32 + bit);

[...]
> > --- a/lib/ethdev/meson.build
> > +++ b/lib/ethdev/meson.build
> > +if is_linux
> > +    driver_sdk_headers += files(
> > +            'ethdev_linux_ethtool.h',
> > +    )
> > +    sources += files(
> > +            'ethdev_linux_ethtool.c',
> > +    )
> > +endif
> 
> Should meson check if 'linux/ethtool.h' exists, for anycase?

It is an old API header file. Why would it not be there?
If we make it conditional here, we'll need to make it conditional in the caller.
  
Ferruh Yigit March 1, 2024, 3:08 p.m. UTC | #7
On 3/1/2024 1:37 PM, Thomas Monjalon wrote:
> 01/03/2024 14:12, Ferruh Yigit:
>> On 2/29/2024 3:42 PM, Thomas Monjalon wrote:
>>> Speed capabilities of a NIC may be discovered through its Linux
>>> kernel driver. It is especially useful for bifurcated drivers,
>>> so they don't have to duplicate the same logic in the DPDK driver.
>>>
>>> Parsing ethtool speed capabilities is made easy thanks to
>>> the functions added in ethdev for internal usage only.
>>> Of course these functions work only on Linux,
>>> so they are not compiled in other environments.
>>>
>>> In order to ease parsing, the ethtool macro names are parsed
>>> externally in a shell command which generates a C array
>>> included in this patch.
>>> It also avoids to depend on a kernel version.
>>> This C array should be updated in future to get latest ethtool bits.
>>> Note it is easier to update this array than adding new cases
>>> in a parsing code.
>>>
>>> The types in the functions are following the ethtool type:
>>> uint32_t for bitmaps, and int8_t for the number of 32-bitmaps.
>>>
>>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>>> ---
>>>
>>> A follow-up patch will be sent to use these functions in mlx5.
>>> I suspect mana could use this parsing as well.
>>>
>>
>> Is the usecase driver get link info via ibverbs and convert it to DPDK
>> link info?
> 
> The use case is to get capabilities from the kernel driver via ethtool ioctl.
> 

Sure, as it is adding kernel ethtool conversion, the DPDK driver will get
link info from the kernel driver. Thanks for the clarification.

>> How complex or duplicated effort to get link info directly via DPDK
>> functions?
> 
> This is done by the driver.
> This is how mlx5 driver is getting speed capabilities.
> 
>> Because this approach is can be applied to only limited devices in DPDK
>> and solving an issue DPDK already has a solution, does it worth to the
>> code it adds?
> 
> It is going to replace code in mlx5 driver.
> I could add this code in mlx5 driver,
> but it could help other drivers in future like mana.
> 

Why replace it? Is there anything to fix in the DPDK link get code?


>>> +	speed = link_modes[bit];
>>> +	if (speed == 0)
>>> +		return RTE_ETH_LINK_SPEED_AUTONEG;
>>> +	RTE_BUILD_BUG_ON(RTE_ETH_LINK_SPEED_AUTONEG != 0);
>>>
>>
>> I think for above two checks, we can't really get the speed from
>> provided ethtool enum, and intention is to return something ineffective,
>> intention is not really return AUTONEG, right? If so why not directly
>> return 0?
> 
> Yes it could return 0 directly, but the namespace of the returned value
> is RTE_ETH_LINK_SPEED_.
> Also it is semantically correct: if no other capability found,
> there is no other choice than autoneg.
> 
>>> +
>>> +	/* duplex is LSB */
>>> +	duplex = (speed & 1) ?
>>> +			RTE_ETH_LINK_HALF_DUPLEX :
>>> +			RTE_ETH_LINK_FULL_DUPLEX;
>>> +	speed &= RTE_GENMASK32(31, 1);
>>
>> As trying to zero the LSB, following also work,
>>
>> speed &= ~UINT32_C(1)
> 
> Indeed, this is what RTE_GENMASK32 is doing.
> But I think using RTE_GENMASK32 better convey the intent.
> 
> [...]
>>> +	for (word = 0; word < nwords; word++) {
>>> +		for (bit = 0; bit < 32; bit++) {
>>
>> May be (sizeof(bitmap) * CHAR_BIT) instead of hardcoded 32, although not
>> sure if it is required.
> 
> Anyway we are using RTE_BIT32 below, so we must know it is 32 bits.
> 
>>> +			if ((bitmap[word] & RTE_BIT32(bit)) == 0)
>>> +				continue;
>>> +			ethdev_bitmap |= rte_eth_link_speed_ethtool(word * 32 + bit);
> 
> [...]
>>> --- a/lib/ethdev/meson.build
>>> +++ b/lib/ethdev/meson.build
>>> +if is_linux
>>> +    driver_sdk_headers += files(
>>> +            'ethdev_linux_ethtool.h',
>>> +    )
>>> +    sources += files(
>>> +            'ethdev_linux_ethtool.c',
>>> +    )
>>> +endif
>>
>> Should meson check if 'linux/ethtool.h' exists, for anycase?
> 
> It is an old API header file. Why would not be there?
>

Just to be cautious, but I just realized this dependency already
exists in some drivers, and they don't check for the header. It seems it
is OK not to check for the header.

> If we make it conditional here, we'll need to make it conditional in the caller.
> 
>
  
Thomas Monjalon March 1, 2024, 3:20 p.m. UTC | #8
01/03/2024 16:08, Ferruh Yigit:
> On 3/1/2024 1:37 PM, Thomas Monjalon wrote:
> > 01/03/2024 14:12, Ferruh Yigit:
> >> On 2/29/2024 3:42 PM, Thomas Monjalon wrote:
> >>> Speed capabilities of a NIC may be discovered through its Linux
> >>> kernel driver. It is especially useful for bifurcated drivers,
> >>> so they don't have to duplicate the same logic in the DPDK driver.
> >>>
> >>> Parsing ethtool speed capabilities is made easy thanks to
> >>> the functions added in ethdev for internal usage only.
> >>> Of course these functions work only on Linux,
> >>> so they are not compiled in other environments.
> >>>
> >>> In order to ease parsing, the ethtool macro names are parsed
> >>> externally in a shell command which generates a C array
> >>> included in this patch.
> >>> It also avoids to depend on a kernel version.
> >>> This C array should be updated in future to get latest ethtool bits.
> >>> Note it is easier to update this array than adding new cases
> >>> in a parsing code.
> >>>
> >>> The types in the functions are following the ethtool type:
> >>> uint32_t for bitmaps, and int8_t for the number of 32-bitmaps.
> >>>
> >>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> >>> ---
> >>>
> >>> A follow-up patch will be sent to use these functions in mlx5.
> >>> I suspect mana could use this parsing as well.
> >>>
> >>
> >> Is the usecase driver get link info via ibverbs and convert it to DPDK
> >> link info?
> > 
> > The use case is to get capabilities from the kernel driver via ethtool ioctl.
> > 
> 
> Sure, as it is adding kernel ethtool conversion, DPDK driver will get
> link from kernel driver, thanks for clarification.

Yes, the PMD uses the ethtool API to get device capabilities.

> >> How complex or duplicated effort to get link info directly via DPDK
> >> functions?
> > 
> > This is done by the driver.
> > This is how mlx5 driver is getting speed capabilities.
> > 
> >> Because this approach is can be applied to only limited devices in DPDK
> >> and solving an issue DPDK already has a solution, does it worth to the
> >> code it adds?
> > 
> > It is going to replace code in mlx5 driver.
> > I could add this code in mlx5 driver,
> > but it could help other drivers in future like mana.
> 
> Why replace, is there anything to fix in the DPDK link get code?

There is nothing to fix in the ethdev layer.
I want to replace PMD code doing ethtool queries
with something cleaner and easier to update.
  
Ferruh Yigit March 1, 2024, 5:16 p.m. UTC | #9
On 3/1/2024 3:20 PM, Thomas Monjalon wrote:
> 01/03/2024 16:08, Ferruh Yigit:
>> On 3/1/2024 1:37 PM, Thomas Monjalon wrote:
>>> 01/03/2024 14:12, Ferruh Yigit:
>>>> On 2/29/2024 3:42 PM, Thomas Monjalon wrote:
>>>>> Speed capabilities of a NIC may be discovered through its Linux
>>>>> kernel driver. It is especially useful for bifurcated drivers,
>>>>> so they don't have to duplicate the same logic in the DPDK driver.
>>>>>
>>>>> Parsing ethtool speed capabilities is made easy thanks to
>>>>> the functions added in ethdev for internal usage only.
>>>>> Of course these functions work only on Linux,
>>>>> so they are not compiled in other environments.
>>>>>
>>>>> In order to ease parsing, the ethtool macro names are parsed
>>>>> externally in a shell command which generates a C array
>>>>> included in this patch.
>>>>> It also avoids to depend on a kernel version.
>>>>> This C array should be updated in future to get latest ethtool bits.
>>>>> Note it is easier to update this array than adding new cases
>>>>> in a parsing code.
>>>>>
>>>>> The types in the functions are following the ethtool type:
>>>>> uint32_t for bitmaps, and int8_t for the number of 32-bitmaps.
>>>>>
>>>>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>>>>> ---
>>>>>
>>>>> A follow-up patch will be sent to use these functions in mlx5.
>>>>> I suspect mana could use this parsing as well.
>>>>>
>>>>
>>>> Is the use case that the driver gets link info via ibverbs and converts it to
>>>> DPDK link info?
>>>
>>> The use case is to get capabilities from the kernel driver via ethtool ioctl.
>>>
>>
>> Sure, as it is adding kernel ethtool conversion, DPDK driver will get
>> link from kernel driver, thanks for clarification.
> 
> Yes the PMD uses ethtool API to get device capabilities.
> 
>>>> How complex or duplicated is the effort to get link info directly via DPDK
>>>> functions?
>>>
>>> This is done by the driver.
>>> This is how mlx5 driver is getting speed capabilities.
>>>
>>>> Because this approach can be applied to only a limited set of devices in DPDK
>>>> and solves an issue for which DPDK already has a solution, is it worth the
>>>> code it adds?
>>>
>>> It is going to replace code in mlx5 driver.
>>> I could add this code in mlx5 driver,
>>> but it could help other drivers in future like mana.
>>
>> Why replace, is there anything to fix in the DPDK link get code?
> 
> There is nothing to fix in ethdev layer.
> I want to replace PMD code doing ethtool queries
> with something cleaner and easier to update.
> 

ack, I will proceed with the patch for -rc2
  
Stephen Hemminger March 1, 2024, 6 p.m. UTC | #10
On Fri, 01 Mar 2024 16:20:56 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> > > 
> > > The use case is to get capabilities from the kernel driver via ethtool ioctl.
> > >   
> > 
> > Sure, as it is adding kernel ethtool conversion, DPDK driver will get
> > link from kernel driver, thanks for clarification.  
> 
> Yes the PMD uses ethtool API to get device capabilities.

Is this the old ioctl interface, or the new (and preferred) ethtool over
netlink API?
  
Thomas Monjalon March 3, 2024, 9:36 a.m. UTC | #11
01/03/2024 19:00, Stephen Hemminger:
> On Fri, 01 Mar 2024 16:20:56 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > > > 
> > > > The use case is to get capabilities from the kernel driver via ethtool ioctl.
> > > >   
> > > 
> > > Sure, as it is adding kernel ethtool conversion, DPDK driver will get
> > > link from kernel driver, thanks for clarification.  
> > 
> > Yes the PMD uses ethtool API to get device capabilities.
> 
> Is this the old ioctl interface, or the new (and preferred) ethtool over
> netlink API?

mlx5 is using ioctl commands ETHTOOL_GSET and ETHTOOL_GLINKSETTINGS
  

Patch

diff --git a/lib/ethdev/ethdev_linux_ethtool.c b/lib/ethdev/ethdev_linux_ethtool.c
new file mode 100644
index 0000000000..0ece172a75
--- /dev/null
+++ b/lib/ethdev/ethdev_linux_ethtool.c
@@ -0,0 +1,161 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2024 NVIDIA Corporation & Affiliates
+ */
+
+#include <rte_bitops.h>
+
+#include "rte_ethdev.h"
+#include "ethdev_linux_ethtool.h"
+
+/* Link modes sorted with index as defined in ethtool.
+ * Values are speed in Mbps with LSB indicating duplex.
+ *
+ * The ethtool bits definition should not change as it is a kernel API.
+ * Using raw numbers directly avoids checking API availability
+ * and allows compiling with new bits included even on an old kernel.
+ *
+ * The array below is built from bit definitions with this shell command:
+ *   sed -rn 's;.*(ETHTOOL_LINK_MODE_)([0-9]+)([0-9a-zA-Z_]*).*= *([0-9]*).*;'\
+ *           '[\4] = \2, /\* \1\2\3 *\/;p' /usr/include/linux/ethtool.h |
+ *   awk '/_Half_/{$3=$3+1","}1'
+ */
+static const uint32_t link_modes[] = {
+	  [0] =      11, /* ETHTOOL_LINK_MODE_10baseT_Half_BIT */
+	  [1] =      10, /* ETHTOOL_LINK_MODE_10baseT_Full_BIT */
+	  [2] =     101, /* ETHTOOL_LINK_MODE_100baseT_Half_BIT */
+	  [3] =     100, /* ETHTOOL_LINK_MODE_100baseT_Full_BIT */
+	  [4] =    1001, /* ETHTOOL_LINK_MODE_1000baseT_Half_BIT */
+	  [5] =    1000, /* ETHTOOL_LINK_MODE_1000baseT_Full_BIT */
+	 [12] =   10000, /* ETHTOOL_LINK_MODE_10000baseT_Full_BIT */
+	 [15] =    2500, /* ETHTOOL_LINK_MODE_2500baseX_Full_BIT */
+	 [17] =    1000, /* ETHTOOL_LINK_MODE_1000baseKX_Full_BIT */
+	 [18] =   10000, /* ETHTOOL_LINK_MODE_10000baseKX4_Full_BIT */
+	 [19] =   10000, /* ETHTOOL_LINK_MODE_10000baseKR_Full_BIT */
+	 [20] =   10000, /* ETHTOOL_LINK_MODE_10000baseR_FEC_BIT */
+	 [21] =   20000, /* ETHTOOL_LINK_MODE_20000baseMLD2_Full_BIT */
+	 [22] =   20000, /* ETHTOOL_LINK_MODE_20000baseKR2_Full_BIT */
+	 [23] =   40000, /* ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT */
+	 [24] =   40000, /* ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT */
+	 [25] =   40000, /* ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT */
+	 [26] =   40000, /* ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT */
+	 [27] =   56000, /* ETHTOOL_LINK_MODE_56000baseKR4_Full_BIT */
+	 [28] =   56000, /* ETHTOOL_LINK_MODE_56000baseCR4_Full_BIT */
+	 [29] =   56000, /* ETHTOOL_LINK_MODE_56000baseSR4_Full_BIT */
+	 [30] =   56000, /* ETHTOOL_LINK_MODE_56000baseLR4_Full_BIT */
+	 [31] =   25000, /* ETHTOOL_LINK_MODE_25000baseCR_Full_BIT */
+	 [32] =   25000, /* ETHTOOL_LINK_MODE_25000baseKR_Full_BIT */
+	 [33] =   25000, /* ETHTOOL_LINK_MODE_25000baseSR_Full_BIT */
+	 [34] =   50000, /* ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT */
+	 [35] =   50000, /* ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT */
+	 [36] =  100000, /* ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT */
+	 [37] =  100000, /* ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT */
+	 [38] =  100000, /* ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT */
+	 [39] =  100000, /* ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT */
+	 [40] =   50000, /* ETHTOOL_LINK_MODE_50000baseSR2_Full_BIT */
+	 [41] =    1000, /* ETHTOOL_LINK_MODE_1000baseX_Full_BIT */
+	 [42] =   10000, /* ETHTOOL_LINK_MODE_10000baseCR_Full_BIT */
+	 [43] =   10000, /* ETHTOOL_LINK_MODE_10000baseSR_Full_BIT */
+	 [44] =   10000, /* ETHTOOL_LINK_MODE_10000baseLR_Full_BIT */
+	 [45] =   10000, /* ETHTOOL_LINK_MODE_10000baseLRM_Full_BIT */
+	 [46] =   10000, /* ETHTOOL_LINK_MODE_10000baseER_Full_BIT */
+	 [47] =    2500, /* ETHTOOL_LINK_MODE_2500baseT_Full_BIT */
+	 [48] =    5000, /* ETHTOOL_LINK_MODE_5000baseT_Full_BIT */
+	 [52] =   50000, /* ETHTOOL_LINK_MODE_50000baseKR_Full_BIT */
+	 [53] =   50000, /* ETHTOOL_LINK_MODE_50000baseSR_Full_BIT */
+	 [54] =   50000, /* ETHTOOL_LINK_MODE_50000baseCR_Full_BIT */
+	 [55] =   50000, /* ETHTOOL_LINK_MODE_50000baseLR_ER_FR_Full_BIT */
+	 [56] =   50000, /* ETHTOOL_LINK_MODE_50000baseDR_Full_BIT */
+	 [57] =  100000, /* ETHTOOL_LINK_MODE_100000baseKR2_Full_BIT */
+	 [58] =  100000, /* ETHTOOL_LINK_MODE_100000baseSR2_Full_BIT */
+	 [59] =  100000, /* ETHTOOL_LINK_MODE_100000baseCR2_Full_BIT */
+	 [60] =  100000, /* ETHTOOL_LINK_MODE_100000baseLR2_ER2_FR2_Full_BIT */
+	 [61] =  100000, /* ETHTOOL_LINK_MODE_100000baseDR2_Full_BIT */
+	 [62] =  200000, /* ETHTOOL_LINK_MODE_200000baseKR4_Full_BIT */
+	 [63] =  200000, /* ETHTOOL_LINK_MODE_200000baseSR4_Full_BIT */
+	 [64] =  200000, /* ETHTOOL_LINK_MODE_200000baseLR4_ER4_FR4_Full_BIT */
+	 [65] =  200000, /* ETHTOOL_LINK_MODE_200000baseDR4_Full_BIT */
+	 [66] =  200000, /* ETHTOOL_LINK_MODE_200000baseCR4_Full_BIT */
+	 [67] =     100, /* ETHTOOL_LINK_MODE_100baseT1_Full_BIT */
+	 [68] =    1000, /* ETHTOOL_LINK_MODE_1000baseT1_Full_BIT */
+	 [69] =  400000, /* ETHTOOL_LINK_MODE_400000baseKR8_Full_BIT */
+	 [70] =  400000, /* ETHTOOL_LINK_MODE_400000baseSR8_Full_BIT */
+	 [71] =  400000, /* ETHTOOL_LINK_MODE_400000baseLR8_ER8_FR8_Full_BIT */
+	 [72] =  400000, /* ETHTOOL_LINK_MODE_400000baseDR8_Full_BIT */
+	 [73] =  400000, /* ETHTOOL_LINK_MODE_400000baseCR8_Full_BIT */
+	 [75] =  100000, /* ETHTOOL_LINK_MODE_100000baseKR_Full_BIT */
+	 [76] =  100000, /* ETHTOOL_LINK_MODE_100000baseSR_Full_BIT */
+	 [77] =  100000, /* ETHTOOL_LINK_MODE_100000baseLR_ER_FR_Full_BIT */
+	 [78] =  100000, /* ETHTOOL_LINK_MODE_100000baseCR_Full_BIT */
+	 [79] =  100000, /* ETHTOOL_LINK_MODE_100000baseDR_Full_BIT */
+	 [80] =  200000, /* ETHTOOL_LINK_MODE_200000baseKR2_Full_BIT */
+	 [81] =  200000, /* ETHTOOL_LINK_MODE_200000baseSR2_Full_BIT */
+	 [82] =  200000, /* ETHTOOL_LINK_MODE_200000baseLR2_ER2_FR2_Full_BIT */
+	 [83] =  200000, /* ETHTOOL_LINK_MODE_200000baseDR2_Full_BIT */
+	 [84] =  200000, /* ETHTOOL_LINK_MODE_200000baseCR2_Full_BIT */
+	 [85] =  400000, /* ETHTOOL_LINK_MODE_400000baseKR4_Full_BIT */
+	 [86] =  400000, /* ETHTOOL_LINK_MODE_400000baseSR4_Full_BIT */
+	 [87] =  400000, /* ETHTOOL_LINK_MODE_400000baseLR4_ER4_FR4_Full_BIT */
+	 [88] =  400000, /* ETHTOOL_LINK_MODE_400000baseDR4_Full_BIT */
+	 [89] =  400000, /* ETHTOOL_LINK_MODE_400000baseCR4_Full_BIT */
+	 [90] =     101, /* ETHTOOL_LINK_MODE_100baseFX_Half_BIT */
+	 [91] =     100, /* ETHTOOL_LINK_MODE_100baseFX_Full_BIT */
+	 [92] =      10, /* ETHTOOL_LINK_MODE_10baseT1L_Full_BIT */
+	 [93] =  800000, /* ETHTOOL_LINK_MODE_800000baseCR8_Full_BIT */
+	 [94] =  800000, /* ETHTOOL_LINK_MODE_800000baseKR8_Full_BIT */
+	 [95] =  800000, /* ETHTOOL_LINK_MODE_800000baseDR8_Full_BIT */
+	 [96] =  800000, /* ETHTOOL_LINK_MODE_800000baseDR8_2_Full_BIT */
+	 [97] =  800000, /* ETHTOOL_LINK_MODE_800000baseSR8_Full_BIT */
+	 [98] =  800000, /* ETHTOOL_LINK_MODE_800000baseVR8_Full_BIT */
+	 [99] =      10, /* ETHTOOL_LINK_MODE_10baseT1S_Full_BIT */
+	[100] =      11, /* ETHTOOL_LINK_MODE_10baseT1S_Half_BIT */
+	[101] =      11, /* ETHTOOL_LINK_MODE_10baseT1S_P2MP_Half_BIT */
+};
+
+uint32_t
+rte_eth_link_speed_ethtool(enum ethtool_link_mode_bit_indices bit)
+{
+	uint32_t speed;
+	int duplex;
+
+	/* get mode from array */
+	if (bit >= RTE_DIM(link_modes))
+		return RTE_ETH_LINK_SPEED_AUTONEG;
+	speed = link_modes[bit];
+	if (speed == 0)
+		return RTE_ETH_LINK_SPEED_AUTONEG;
+	RTE_BUILD_BUG_ON(RTE_ETH_LINK_SPEED_AUTONEG != 0);
+
+	/* duplex is LSB */
+	duplex = (speed & 1) ?
+			RTE_ETH_LINK_HALF_DUPLEX :
+			RTE_ETH_LINK_FULL_DUPLEX;
+	speed &= RTE_GENMASK32(31, 1);
+
+	return rte_eth_speed_bitflag(speed, duplex);
+}
+
+uint32_t
+rte_eth_link_speed_glink(const uint32_t *bitmap, int8_t nwords)
+{
+	uint8_t word, bit;
+	uint32_t ethdev_bitmap = 0;
+
+	if (nwords < 1)
+		return 0;
+
+	for (word = 0; word < nwords; word++) {
+		for (bit = 0; bit < 32; bit++) {
+			if ((bitmap[word] & RTE_BIT32(bit)) == 0)
+				continue;
+			ethdev_bitmap |= rte_eth_link_speed_ethtool(word * 32 + bit);
+		}
+	}
+
+	return ethdev_bitmap;
+}
+
+uint32_t
+rte_eth_link_speed_gset(uint32_t legacy_bitmap)
+{
+	return rte_eth_link_speed_glink(&legacy_bitmap, 1);
+}
diff --git a/lib/ethdev/ethdev_linux_ethtool.h b/lib/ethdev/ethdev_linux_ethtool.h
new file mode 100644
index 0000000000..de235bd5f4
--- /dev/null
+++ b/lib/ethdev/ethdev_linux_ethtool.h
@@ -0,0 +1,41 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2024 NVIDIA Corporation & Affiliates
+ */
+
+#ifndef ETHDEV_ETHTOOL_H
+#define ETHDEV_ETHTOOL_H
+
+#include <stdint.h>
+#include <linux/ethtool.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * Convert bit from ETHTOOL_LINK_MODE_* to RTE_ETH_LINK_SPEED_*
+ */
+__rte_internal
+uint32_t rte_eth_link_speed_ethtool(enum ethtool_link_mode_bit_indices bit);
+
+/*
+ * Convert bitmap from ETHTOOL_GLINKSETTINGS ethtool_link_settings::link_mode_masks
+ * to bitmap RTE_ETH_LINK_SPEED_*
+ */
+__rte_internal
+uint32_t rte_eth_link_speed_glink(const uint32_t *bitmap, int8_t nwords);
+
+/*
+ * Convert bitmap from deprecated ETHTOOL_GSET ethtool_cmd::supported
+ * to bitmap RTE_ETH_LINK_SPEED_*
+ */
+__rte_internal
+uint32_t rte_eth_link_speed_gset(uint32_t legacy_bitmap);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* ETHDEV_ETHTOOL_H */
diff --git a/lib/ethdev/meson.build b/lib/ethdev/meson.build
index d11f06bc88..f1d2586591 100644
--- a/lib/ethdev/meson.build
+++ b/lib/ethdev/meson.build
@@ -44,6 +44,15 @@  driver_sdk_headers += files(
         'ethdev_vdev.h',
 )
 
+if is_linux
+    driver_sdk_headers += files(
+            'ethdev_linux_ethtool.h',
+    )
+    sources += files(
+            'ethdev_linux_ethtool.c',
+    )
+endif
+
 deps += ['net', 'kvargs', 'meter', 'telemetry']
 
 if is_freebsd
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 17e4eac8a4..79f6f5293b 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -350,6 +350,9 @@  INTERNAL {
 	rte_eth_hairpin_queue_peer_unbind;
 	rte_eth_hairpin_queue_peer_update;
 	rte_eth_ip_reassembly_dynfield_register;
+	rte_eth_link_speed_ethtool; # WINDOWS_NO_EXPORT
+	rte_eth_link_speed_glink; # WINDOWS_NO_EXPORT
+	rte_eth_link_speed_gset; # WINDOWS_NO_EXPORT
 	rte_eth_pkt_burst_dummy;
 	rte_eth_representor_id_get;
 	rte_eth_switch_domain_alloc;
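
The table encoding used in ethdev_linux_ethtool.c above (speed in Mbps with the LSB indicating half duplex) can be tried without DPDK. The sketch below is a standalone illustration under assumptions: the names demo_modes, demo_decode and demo_max_speed are hypothetical, and the table is a small subset of the real one. The encoding works because every listed speed is an even number, so bit 0 is free to carry the duplex flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of the patch table: index is the ethtool link mode
 * bit, value is the speed in Mbps with bit 0 set for half duplex.
 * Unlisted indices are zero and mean "no speed for this bit".
 */
static const uint32_t demo_modes[] = {
	 [0] =     11, /* 10baseT half */
	 [1] =     10, /* 10baseT full */
	 [2] =    101, /* 100baseT half */
	 [3] =    100, /* 100baseT full */
	 [5] =   1000, /* 1000baseT full */
	[12] =  10000, /* 10000baseT full */
	[36] = 100000, /* 100000baseKR4 full */
};

struct demo_mode {
	uint32_t mbps;
	int half_duplex;
};

/* Decode one ethtool bit; return 0 on success, -1 if the bit is unknown. */
static int
demo_decode(unsigned int bit, struct demo_mode *out)
{
	uint32_t value;

	assert(out != NULL);
	if (bit >= sizeof(demo_modes) / sizeof(demo_modes[0]))
		return -1;
	value = demo_modes[bit];
	if (value == 0)
		return -1;
	out->half_duplex = value & 1;
	out->mbps = value & ~UINT32_C(1);
	return 0;
}

/* Walk a multi-word bitmap, as the glink conversion does,
 * and return the highest speed found in Mbps (0 if none).
 */
static uint32_t
demo_max_speed(const uint32_t *bitmap, unsigned int nwords)
{
	struct demo_mode mode;
	unsigned int word, bit;
	uint32_t max = 0;

	for (word = 0; word < nwords; word++) {
		for (bit = 0; bit < 32; bit++) {
			if ((bitmap[word] & (UINT32_C(1) << bit)) == 0)
				continue;
			if (demo_decode(word * 32 + bit, &mode) < 0)
				continue;
			if (mode.mbps > max)
				max = mode.mbps;
		}
	}
	return max;
}
```

Unlike this sketch, the patch maps each decoded speed to an RTE_ETH_LINK_SPEED_* flag through rte_eth_speed_bitflag() and ORs the flags together.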