[v3,04/33] net/ena: sub-optimal configuration notifications support

Message ID 20240306122445.4350-5-shaibran@amazon.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series net/ena: v2.9.0 driver release

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Brandes, Shai March 6, 2024, 12:24 p.m. UTC
From: Shai Brandes <shaibran@amazon.com>

ENA device will send asynchronous notifications to the
driver in order to notify users about sub-optimal configurations
and refer them to public AWS documentation for further action.

Signed-off-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
---
 doc/guides/rel_notes/release_24_03.rst        |  1 +
 .../net/ena/base/ena_defs/ena_admin_defs.h    | 11 +++++++-
 drivers/net/ena/ena_ethdev.c                  | 26 +++++++++++++++++--
 3 files changed, 35 insertions(+), 3 deletions(-)
  

Comments

Ferruh Yigit March 8, 2024, 5:23 p.m. UTC | #1
On 3/6/2024 12:24 PM, shaibran@amazon.com wrote:
> From: Shai Brandes <shaibran@amazon.com>
> 
> ENA device will send asynchronous notifications to the
> driver in order to notify users about sub-optimal configurations
> and refer them to public AWS documentation for further action.
> 

Hi Shai,

This is an interesting feature, I am curious, is there more public
detail provided by AWS on how it detects sub-optimal configuration and
what are the possible types of the notifications?

> Signed-off-by: Shai Brandes <shaibran@amazon.com>
> Reviewed-by: Amit Bernstein <amitbern@amazon.com>
> ---
>  doc/guides/rel_notes/release_24_03.rst        |  1 +
>  .../net/ena/base/ena_defs/ena_admin_defs.h    | 11 +++++++-
>  drivers/net/ena/ena_ethdev.c                  | 26 +++++++++++++++++--
>  3 files changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
> index fb66d67d32..f47073c7dc 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -104,6 +104,7 @@ New Features
>  * **Updated Amazon ena (Elastic Network Adapter) net driver.**
>  
>    * Removed the reporting of `rx_overruns` errors from xstats and instead updated `imissed` stat with its value.
> +  * Added support for sub-optimal configuration notifications from the device.
>  
>  * **Updated Atomic Rules' Arkville driver.**
>  
> diff --git a/drivers/net/ena/base/ena_defs/ena_admin_defs.h b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
> index fa43e22918..4172916551 100644
> --- a/drivers/net/ena/base/ena_defs/ena_admin_defs.h
> +++ b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
> @@ -1214,7 +1214,8 @@ enum ena_admin_aenq_group {
>  	ENA_ADMIN_NOTIFICATION                      = 3,
>  	ENA_ADMIN_KEEP_ALIVE                        = 4,
>  	ENA_ADMIN_REFRESH_CAPABILITIES              = 5,
> -	ENA_ADMIN_AENQ_GROUPS_NUM                   = 6,
> +	ENA_ADMIN_CONF_NOTIFICATIONS                = 6,
> +	ENA_ADMIN_AENQ_GROUPS_NUM                   = 7,
>  };
>  
>  enum ena_admin_aenq_notification_syndrome {
> @@ -1251,6 +1252,14 @@ struct ena_admin_aenq_keep_alive_desc {
>  	uint32_t rx_overruns_high;
>  };
>  
> +struct ena_admin_aenq_conf_notifications_desc {
> +	struct ena_admin_aenq_common_desc aenq_common_desc;
> +
> +	uint64_t notifications_bitmap;
> +
> +	uint64_t reserved;
> +};
> +
>  struct ena_admin_ena_mmio_req_read_less_resp {
>  	uint16_t req_id;
>  
> diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
> index d3f395a832..3157237c0d 100644
> --- a/drivers/net/ena/ena_ethdev.c
> +++ b/drivers/net/ena/ena_ethdev.c
> @@ -36,6 +36,10 @@
>  
>  #define ENA_MIN_RING_DESC	128
>  
> +#define BITS_PER_BYTE 8
> +
> +#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> +
>

'CHAR_BIT' macro can be used here, but I can see there are multiple
drivers defining similar macros. So no need to update this patch, but
noting for the record that this is something to address DPDK-wide.

If ena team volunteers to tackle this update, it is welcomed ;)
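
For reference, the suggestion above could look roughly like the sketch below. `CHAR_BIT` is the standard constant from `<limits.h>` and `BITS_PER_TYPE` is the macro name from the patch; `ena_bitmap_width()` is a hypothetical helper used only for illustration and is not part of the driver.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>

/* Same macro as in the patch, with CHAR_BIT replacing the local BITS_PER_BYTE. */
#define BITS_PER_TYPE(type) (sizeof(type) * CHAR_BIT)

/* Compile-time sanity check: the notifications bitmap is a 64-bit field. */
static_assert(BITS_PER_TYPE(uint64_t) == 64, "uint64_t must be 64 bits wide");

/* Hypothetical helper: width in bits of the 64-bit notifications bitmap. */
static inline int ena_bitmap_width(void)
{
	return (int)BITS_PER_TYPE(uint64_t);
}
```

With this variant the driver-local `BITS_PER_BYTE` define would no longer be needed.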
  
Brandes, Shai March 10, 2024, 2:43 p.m. UTC | #2
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, March 8, 2024 7:23 PM
> To: Brandes, Shai <shaibran@amazon.com>
> Cc: dev@dpdk.org
> Subject: RE: [EXTERNAL] [PATCH v3 04/33] net/ena: sub-optimal
> configuration notifications support
> 
> On 3/6/2024 12:24 PM, shaibran@amazon.com wrote:
> > From: Shai Brandes <shaibran@amazon.com>
> >
> > ENA device will send asynchronous notifications to the driver in order
> > to notify users about sub-optimal configurations and refer them to
> > public AWS documentation for further action.
> >
> 
> Hi Shai,
> 
> This is an interesting feature, I am curious, is there more public detail
> provided by AWS on how it detects sub-optimal configuration and what are
> the possible types of the notifications?
> 
[Brandes, Shai] This is only a framework to allow notifications to the user. Currently, the only notifications the device supports relate to sub-optimal configuration when enabling the ena-express feature.
The public documentation for it has not been published yet, but it currently contains only two codes: one indicating to the user that it is better to run with normal-llq when working with ena-express, and one suggesting that the Tx queue depth can be increased to double the default size when working with ena-express on specific hardware that has a larger BAR (known only at run time).
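
As an illustration of how such codes could be read out of the bitmap, here is a minimal sketch. It assumes, as the handler in this patch does, that bit N of `notifications_bitmap` maps to notification code N + 1. The function name `collect_notification_codes()` is hypothetical, and the meaning of each code is left to the AWS documentation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: collect the 1-based codes of all set bits in a
 * 64-bit notifications bitmap, mirroring the "bit + 1" numbering that
 * the handler in this patch logs. Returns the number of codes written.
 */
static size_t collect_notification_codes(uint64_t bitmap, int *codes,
					 size_t max_codes)
{
	const int num_bits = (int)(sizeof(bitmap) * CHAR_BIT);
	size_t count = 0;

	assert(codes != NULL);
	for (int bit = 0; bit < num_bits && count < max_codes; bit++) {
		if (bitmap & (UINT64_C(1) << bit))
			codes[count++] = bit + 1;
	}
	return count;
}
```

A caller would pass the `notifications_bitmap` field of the AENQ descriptor and an array of up to 64 entries.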


> > Signed-off-by: Shai Brandes <shaibran@amazon.com>
> > Reviewed-by: Amit Bernstein <amitbern@amazon.com>
> > ---
> >  doc/guides/rel_notes/release_24_03.rst        |  1 +
> >  .../net/ena/base/ena_defs/ena_admin_defs.h    | 11 +++++++-
> >  drivers/net/ena/ena_ethdev.c                  | 26 +++++++++++++++++--
> >  3 files changed, 35 insertions(+), 3 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_24_03.rst
> > b/doc/guides/rel_notes/release_24_03.rst
> > index fb66d67d32..f47073c7dc 100644
> > --- a/doc/guides/rel_notes/release_24_03.rst
> > +++ b/doc/guides/rel_notes/release_24_03.rst
> > @@ -104,6 +104,7 @@ New Features
> >  * **Updated Amazon ena (Elastic Network Adapter) net driver.**
> >
> >    * Removed the reporting of `rx_overruns` errors from xstats and instead updated `imissed` stat with its value.
> > +  * Added support for sub-optimal configuration notifications from the device.
> >
> >  * **Updated Atomic Rules' Arkville driver.**
> >
> > diff --git a/drivers/net/ena/base/ena_defs/ena_admin_defs.h
> > b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
> > index fa43e22918..4172916551 100644
> > --- a/drivers/net/ena/base/ena_defs/ena_admin_defs.h
> > +++ b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
> > @@ -1214,7 +1214,8 @@ enum ena_admin_aenq_group {
> >       ENA_ADMIN_NOTIFICATION                      = 3,
> >       ENA_ADMIN_KEEP_ALIVE                        = 4,
> >       ENA_ADMIN_REFRESH_CAPABILITIES              = 5,
> > -     ENA_ADMIN_AENQ_GROUPS_NUM                   = 6,
> > +     ENA_ADMIN_CONF_NOTIFICATIONS                = 6,
> > +     ENA_ADMIN_AENQ_GROUPS_NUM                   = 7,
> >  };
> >
> >  enum ena_admin_aenq_notification_syndrome {
> > @@ -1251,6 +1252,14 @@ struct ena_admin_aenq_keep_alive_desc {
> >       uint32_t rx_overruns_high;
> >  };
> >
> > +struct ena_admin_aenq_conf_notifications_desc {
> > +     struct ena_admin_aenq_common_desc aenq_common_desc;
> > +
> > +     uint64_t notifications_bitmap;
> > +
> > +     uint64_t reserved;
> > +};
> > +
> >  struct ena_admin_ena_mmio_req_read_less_resp {
> >       uint16_t req_id;
> >
> > diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
> > index d3f395a832..3157237c0d 100644
> > --- a/drivers/net/ena/ena_ethdev.c
> > +++ b/drivers/net/ena/ena_ethdev.c
> > @@ -36,6 +36,10 @@
> >
> >  #define ENA_MIN_RING_DESC    128
> >
> > +#define BITS_PER_BYTE 8
> > +
> > +#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> > +
> >
> 
> 'CHAR_BIT' macro can be used here, but I can see there are multiple drivers
> defining similar macros. So no need to update this patch, but noting for the
> record that this is something to address DPDK-wide.
> 
> If ena team volunteers to tackle this update, it is welcomed ;)
[Brandes, Shai] sure, can be done
  
Ferruh Yigit March 13, 2024, 11:18 a.m. UTC | #3
On 3/10/2024 2:43 PM, Brandes, Shai wrote:
> 
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Friday, March 8, 2024 7:23 PM
>> To: Brandes, Shai <shaibran@amazon.com>
>> Cc: dev@dpdk.org
>> Subject: RE: [EXTERNAL] [PATCH v3 04/33] net/ena: sub-optimal
>> configuration notifications support
>>
>> On 3/6/2024 12:24 PM, shaibran@amazon.com wrote:
>>> From: Shai Brandes <shaibran@amazon.com>
>>>
>>> ENA device will send asynchronous notifications to the driver in order
>>> to notify users about sub-optimal configurations and refer them to
>>> public AWS documentation for further action.
>>>
>>
>> Hi Shai,
>>
>> This is an interesting feature, I am curious, is there more public detail
>> provided by AWS on how it detects sub-optimal configuration and what are
>> the possible types of the notifications?
>>
> [Brandes, Shai] This is only a framework to allow notifications to the user. Currently, the only notifications the device supports relate to sub-optimal configuration when enabling the ena-express feature.
> The public documentation for it has not been published yet, but it currently contains only two codes: one indicating to the user that it is better to run with normal-llq when working with ena-express, and one suggesting that the Tx queue depth can be increased to double the default size when working with ena-express on specific hardware that has a larger BAR (known only at run time).
> 

Thanks for the info. Once public documentation for the feature is
available, can you please reference it from the driver documentation?

> 
>>> Signed-off-by: Shai Brandes <shaibran@amazon.com>
>>> Reviewed-by: Amit Bernstein <amitbern@amazon.com>
>>> ---
>>>  doc/guides/rel_notes/release_24_03.rst        |  1 +
>>>  .../net/ena/base/ena_defs/ena_admin_defs.h    | 11 +++++++-
>>>  drivers/net/ena/ena_ethdev.c                  | 26 +++++++++++++++++--
>>>  3 files changed, 35 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/doc/guides/rel_notes/release_24_03.rst
>>> b/doc/guides/rel_notes/release_24_03.rst
>>> index fb66d67d32..f47073c7dc 100644
>>> --- a/doc/guides/rel_notes/release_24_03.rst
>>> +++ b/doc/guides/rel_notes/release_24_03.rst
>>> @@ -104,6 +104,7 @@ New Features
>>>  * **Updated Amazon ena (Elastic Network Adapter) net driver.**
>>>
>>>    * Removed the reporting of `rx_overruns` errors from xstats and instead updated `imissed` stat with its value.
>>> +  * Added support for sub-optimal configuration notifications from the device.
>>>
>>>  * **Updated Atomic Rules' Arkville driver.**
>>>
>>> diff --git a/drivers/net/ena/base/ena_defs/ena_admin_defs.h
>>> b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
>>> index fa43e22918..4172916551 100644
>>> --- a/drivers/net/ena/base/ena_defs/ena_admin_defs.h
>>> +++ b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
>>> @@ -1214,7 +1214,8 @@ enum ena_admin_aenq_group {
>>>       ENA_ADMIN_NOTIFICATION                      = 3,
>>>       ENA_ADMIN_KEEP_ALIVE                        = 4,
>>>       ENA_ADMIN_REFRESH_CAPABILITIES              = 5,
>>> -     ENA_ADMIN_AENQ_GROUPS_NUM                   = 6,
>>> +     ENA_ADMIN_CONF_NOTIFICATIONS                = 6,
>>> +     ENA_ADMIN_AENQ_GROUPS_NUM                   = 7,
>>>  };
>>>
>>>  enum ena_admin_aenq_notification_syndrome {
>>> @@ -1251,6 +1252,14 @@ struct ena_admin_aenq_keep_alive_desc {
>>>       uint32_t rx_overruns_high;
>>>  };
>>>
>>> +struct ena_admin_aenq_conf_notifications_desc {
>>> +     struct ena_admin_aenq_common_desc aenq_common_desc;
>>> +
>>> +     uint64_t notifications_bitmap;
>>> +
>>> +     uint64_t reserved;
>>> +};
>>> +
>>>  struct ena_admin_ena_mmio_req_read_less_resp {
>>>       uint16_t req_id;
>>>
>>> diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
>>> index d3f395a832..3157237c0d 100644
>>> --- a/drivers/net/ena/ena_ethdev.c
>>> +++ b/drivers/net/ena/ena_ethdev.c
>>> @@ -36,6 +36,10 @@
>>>
>>>  #define ENA_MIN_RING_DESC    128
>>>
>>> +#define BITS_PER_BYTE 8
>>> +
>>> +#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
>>> +
>>>
>>
>> 'CHAR_BIT' macro can be used here, but I can see there are multiple drivers
>> defining similar macros. So no need to update this patch, but noting for the
>> record that this is something to address DPDK-wide.
>>
>> If ena team volunteers to tackle this update, it is welcomed ;)
> [Brandes, Shai] sure, can be done
> 

Thanks, appreciated.
  

Patch

diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index fb66d67d32..f47073c7dc 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -104,6 +104,7 @@  New Features
 * **Updated Amazon ena (Elastic Network Adapter) net driver.**
 
   * Removed the reporting of `rx_overruns` errors from xstats and instead updated `imissed` stat with its value.
+  * Added support for sub-optimal configuration notifications from the device.
 
 * **Updated Atomic Rules' Arkville driver.**
 
diff --git a/drivers/net/ena/base/ena_defs/ena_admin_defs.h b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
index fa43e22918..4172916551 100644
--- a/drivers/net/ena/base/ena_defs/ena_admin_defs.h
+++ b/drivers/net/ena/base/ena_defs/ena_admin_defs.h
@@ -1214,7 +1214,8 @@  enum ena_admin_aenq_group {
 	ENA_ADMIN_NOTIFICATION                      = 3,
 	ENA_ADMIN_KEEP_ALIVE                        = 4,
 	ENA_ADMIN_REFRESH_CAPABILITIES              = 5,
-	ENA_ADMIN_AENQ_GROUPS_NUM                   = 6,
+	ENA_ADMIN_CONF_NOTIFICATIONS                = 6,
+	ENA_ADMIN_AENQ_GROUPS_NUM                   = 7,
 };
 
 enum ena_admin_aenq_notification_syndrome {
@@ -1251,6 +1252,14 @@  struct ena_admin_aenq_keep_alive_desc {
 	uint32_t rx_overruns_high;
 };
 
+struct ena_admin_aenq_conf_notifications_desc {
+	struct ena_admin_aenq_common_desc aenq_common_desc;
+
+	uint64_t notifications_bitmap;
+
+	uint64_t reserved;
+};
+
 struct ena_admin_ena_mmio_req_read_less_resp {
 	uint16_t req_id;
 
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index d3f395a832..3157237c0d 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -36,6 +36,10 @@ 
 
 #define ENA_MIN_RING_DESC	128
 
+#define BITS_PER_BYTE 8
+
+#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
+
 /*
  * We should try to keep ENA_CLEANUP_BUF_SIZE lower than
  * RTE_MEMPOOL_CACHE_MAX_SIZE, so we can fit this in mempool local cache.
@@ -1842,7 +1846,8 @@  static int ena_device_init(struct ena_adapter *adapter,
 		      BIT(ENA_ADMIN_NOTIFICATION) |
 		      BIT(ENA_ADMIN_KEEP_ALIVE) |
 		      BIT(ENA_ADMIN_FATAL_ERROR) |
-		      BIT(ENA_ADMIN_WARNING);
+		      BIT(ENA_ADMIN_WARNING) |
+		      BIT(ENA_ADMIN_CONF_NOTIFICATIONS);
 
 	aenq_groups &= get_feat_ctx->aenq.supported_groups;
 
@@ -4021,6 +4026,22 @@  static void ena_keep_alive(void *adapter_data,
 	adapter->dev_stats.tx_drops = tx_drops;
 }
 
+static void ena_suboptimal_configuration(__rte_unused void *adapter_data,
+					 struct ena_admin_aenq_entry *aenq_e)
+{
+	struct ena_admin_aenq_conf_notifications_desc *desc;
+	int bit, num_bits;
+
+	desc = (struct ena_admin_aenq_conf_notifications_desc *)aenq_e;
+	num_bits = BITS_PER_TYPE(desc->notifications_bitmap);
+	for (bit = 0; bit < num_bits; bit++) {
+		if (desc->notifications_bitmap & RTE_BIT64(bit)) {
+			PMD_DRV_LOG(WARNING,
+				"Sub-optimal configuration notification code: %d\n", bit + 1);
+		}
+	}
+}
+
 /**
  * This handler will called for unknown event group or unimplemented handlers
  **/
@@ -4035,7 +4056,8 @@  static struct ena_aenq_handlers aenq_handlers = {
 	.handlers = {
 		[ENA_ADMIN_LINK_CHANGE] = ena_update_on_link_change,
 		[ENA_ADMIN_NOTIFICATION] = ena_notification,
-		[ENA_ADMIN_KEEP_ALIVE] = ena_keep_alive
+		[ENA_ADMIN_KEEP_ALIVE] = ena_keep_alive,
+		[ENA_ADMIN_CONF_NOTIFICATIONS] = ena_suboptimal_configuration
 	},
 	.unimplemented_handler = unimplemented_aenq_handler
 };
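
The two driver-side changes in the patch (requesting the new AENQ group and registering a handler for it) follow a common pattern that can be sketched in isolation. The sketch below is not the ena_com implementation. All `sketch_*` and `SKETCH_*` names are hypothetical, only the group numbers 4, 6 and 7 are taken from the patch, and the fallback to an "unimplemented" handler for empty slots is an assumption about how the dispatch behaves.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BIT(n) (UINT32_C(1) << (n))

/* Group numbers as they appear in the patch; names are illustrative. */
enum {
	SKETCH_KEEP_ALIVE         = 4,
	SKETCH_CONF_NOTIFICATIONS = 6,
	SKETCH_GROUPS_NUM         = 7,
};

static_assert(SKETCH_GROUPS_NUM <= 32, "group bits must fit in a 32-bit mask");

/* Keep only the requested groups that the device also reports as supported. */
static uint32_t sketch_negotiate_groups(uint32_t wanted, uint32_t supported)
{
	return wanted & supported;
}

typedef int (*sketch_handler_t)(void);

static int sketch_conf_handler(void)  { return 1; }   /* stands in for a real handler */
static int sketch_unimplemented(void) { return -1; }  /* fallback for empty slots */

/* Dispatch table indexed by group number; unlisted slots stay NULL. */
static const sketch_handler_t sketch_handlers[SKETCH_GROUPS_NUM] = {
	[SKETCH_CONF_NOTIFICATIONS] = sketch_conf_handler,
};

/* Call the handler registered for the group, or the fallback if there is none. */
static int sketch_dispatch(unsigned int group)
{
	if (group >= SKETCH_GROUPS_NUM || sketch_handlers[group] == NULL)
		return sketch_unimplemented();
	return sketch_handlers[group]();
}
```

Masking with the supported groups means a device that does not advertise group 6 never has it enabled, so the new handler is harmless on such hardware.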