[RFC,v1,0/7] relax barriers for ENA PMD and small fixes

Message ID 20200313091835.58039-1-gavin.hu@arm.com (mailing list archive)

Gavin Hu March 13, 2020, 9:18 a.m. UTC
To ensure that stores to host memory are observed by the NIC HW before
the doorbell is rung and the HW starts acting on them (for example, by
doing DMA), a barrier is required on weakly ordered platforms such as
aarch64.

However, unnecessarily strong barriers such as 'dsb' on aarch64 hurt
performance.

In a typical doorbell use case, the NIC and the CPU are in the outer
shareable domain, so a lighter-weight 'dmb osh' barrier is sufficient.
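
As a rough illustration of the doorbell pattern described above (a
sketch only, not code from the patch set): the descriptor stores must be
ordered before the doorbell store. Every name below (io_wmb_sketch,
tx_ring_sketch, ring_post_sketch) is hypothetical, and the doorbell is
an ordinary variable standing in for a device register. On aarch64 the
macro is assumed to expand to the store-only form 'dmb oshst'; on other
targets a C11 release fence is used purely as a stand-in.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical write barrier for this sketch, not the DPDK or ENA macro.
 * aarch64: a store-store barrier in the outer shareable domain.
 * Other targets: a C11 release fence as a placeholder. */
#if defined(__aarch64__)
#define io_wmb_sketch() __asm__ volatile("dmb oshst" ::: "memory")
#else
#define io_wmb_sketch() atomic_thread_fence(memory_order_release)
#endif

#define RING_SIZE_SKETCH 8u /* power of two, chosen for illustration */

struct tx_desc_sketch {
	uint64_t addr;
	uint32_t len;
	uint32_t flags;
};

struct tx_ring_sketch {
	struct tx_desc_sketch desc[RING_SIZE_SKETCH];
	volatile uint32_t *doorbell; /* stands in for the NIC doorbell register */
	uint32_t tail;
};

/* Fill one descriptor in host memory, then ring the doorbell. */
static void
ring_post_sketch(struct tx_ring_sketch *r, uint64_t addr, uint32_t len)
{
	struct tx_desc_sketch *d;

	assert(r != NULL && r->doorbell != NULL);
	d = &r->desc[r->tail & (RING_SIZE_SKETCH - 1)];

	d->addr = addr;
	d->len = len;
	d->flags = 1;
	r->tail++;

	/* Order the descriptor stores before the doorbell store. */
	io_wmb_sketch();
	*r->doorbell = r->tail;
}
```

Note that on x86 the release fence used as the fallback is a
compiler-only barrier, so this sketch says nothing about cases that mix
memory types on that platform.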

The patch set relaxes the barriers in similar places. It also includes
one patch that switches statistics logging to relaxed ordering and
another that removes a duplicate memset.

Note that this set is submitted as an RFC: we don't have physical ENA
NICs in the lab, so the patch set has been neither verified nor
benchmarked.

Gavin Hu (7):
  net/ena: remove duplicate barrier
  net/ena: relax the barrier for doorbell ring
  net/ena: relax the rmb for DMA
  net/ena: relax barrier for completion queue update
  net/ena: relax the barrier for bounce buffer
  net/ena: use c11 atomic for statistics
  net/ena: remove duplicate memset

 drivers/net/ena/base/ena_eth_com.c   |  2 +-
 drivers/net/ena/base/ena_eth_com.h   |  6 ++--
 drivers/net/ena/base/ena_plat_dpdk.h |  2 +-
 drivers/net/ena/ena_ethdev.c         | 46 +++++++++++++++++-----------
 drivers/net/ena/ena_ethdev.h         |  8 ++---
 5 files changed, 38 insertions(+), 26 deletions(-)
  

Comments

Chauskin, Igor March 16, 2020, 9:34 a.m. UTC | #1
Hi Gavin,

Thank you for the contribution.
Please do not merge these changes (patches 0..7) till we (the ENA team) properly review and ack/nack.
These changes can potentially provide performance improvement, yet we need to ensure they are applicable for all possible scenarios. Specifically, the behavior on x86 platforms is likely to be different.
What testing have you done for these patches? Was x86 tested?

Thanks,
Igor

  
Gavin Hu March 17, 2020, 7:58 a.m. UTC | #2
Hi Igor,

> Hi Gavin,
> 
> Thank you for the contribution.
> Please do not merge these changes (patches 0..7) till we (the ENA team)
> properly review and ack/nack.
> These changes can potentially provide performance improvement, yet we need
> to ensure they are applicable for all possible scenarios. Specifically, the
> behavior on x86 platforms is likely to be different.
> What testing have you done for these patches? Was x86 tested?
As noted in the cover letter, these patches were not tested, as we don't have ENA NICs.
We rely on you to do that; any concerns and comments are welcome.
Yes, the behavior on x86 platforms is also different; Intel people are welcome to comment.
/Gavin
  
Ferruh Yigit April 15, 2020, 3:27 p.m. UTC | #3
On 3/16/2020 9:34 AM, Chauskin, Igor wrote:
> Hi Gavin,
> 
> Thank you for the contribution.
> Please do not merge these changes (patches 0..7) till we (the ENA team) properly review and ack/nack.

Hi Igor,

Is there any progress on reviewing the set?

Thanks,
ferruh

  
Chauskin, Igor April 15, 2020, 3:59 p.m. UTC | #4
Hi,

These changes are ARM-oriented and the behavior will be different on x86-based systems. As such, we need to generalize them and it's not straightforward - we're still reviewing the implications. I will update when we're ready.

Thanks,
Igor 

  
Chauskin, Igor April 16, 2020, 1:37 p.m. UTC | #5
Hi all,

Please see the first batch of comments related to these patches:

1. Relaxing the register read/write isn't always good enough. Specifically, when barriers are required between different memory types, reordering can occur even on x86. Yet in DPDK the io/cio/smp barrier flavors for x86 are defined as compiler-only barriers, which is not enough in cases involving different memory types. In the ENA driver, when LLQ is on, there is a regular register memory access before the barrier and a write-combined memory access after the barrier.

We're working on a more extensive change that will include the proposed barrier-relaxing optimizations while making them applicable to all platforms.

2. Regarding the changes for statistics logging: the patch relies on C11 features. I'm not sure that is acceptable in all situations, since we've already encountered reports where even C99-compliant changes caused compilation issues.

3. Removing the redundant zeroing of the sub-struct: we're currently working on some extensive changes to the Tx flow, which will include this change among others.
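
To make point 1 concrete, here is a minimal sketch of the access pattern in question: a regular register write, a barrier, then a copy into a write-combined window. It is not ENA driver code. The names mixed_wmb_sketch, llq_sketch and llq_push_sketch are hypothetical, and both the "register" and the "write-combined window" are ordinary memory standing in for device mappings. The sketch assumes that a real store fence ('sfence' on x86-64, 'dmb oshst' on aarch64) is wanted here instead of a compiler-only barrier; whether that is the right strength for the real LLQ path is exactly what is still under review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical barrier for this sketch: a hardware store fence,
 * as opposed to a compiler-only barrier. */
#if defined(__x86_64__)
#define mixed_wmb_sketch() __asm__ volatile("sfence" ::: "memory")
#elif defined(__aarch64__)
#define mixed_wmb_sketch() __asm__ volatile("dmb oshst" ::: "memory")
#else
#define mixed_wmb_sketch() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

struct llq_sketch {
	volatile uint32_t *reg; /* stands in for a regular device register */
	uint8_t *wc_window;     /* stands in for write-combined device memory */
	size_t wc_size;
};

/* Write a register, then push a descriptor into the write-combined
 * window, with a store fence between the two memory types.
 * Returns 0 on success, -1 if the descriptor does not fit. */
static int
llq_push_sketch(struct llq_sketch *q, uint32_t reg_val,
		const void *desc, size_t len)
{
	assert(q != NULL && desc != NULL);
	if (len > q->wc_size)
		return -1;

	*q->reg = reg_val;               /* access before the barrier */
	mixed_wmb_sketch();
	memcpy(q->wc_window, desc, len); /* access after the barrier */
	return 0;
}
```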

Thanks,
Igor

  
Gavin Hu April 21, 2020, 7:45 a.m. UTC | #6
> Hi all,
> 
> Please see the first batch of comments related to these patches:
> 
> 1. Relaxing the register read/write isn't always good enough. Specifically,
> when barriers are required between different memory types, the reordering
> can occur even on x86. Yet in DPDK the io/cio/smp barrier flavors for x86
> are defined as compiler-only barriers, which is not enough in cases involving
> different memory types. In ENA driver, when LLQ is on, there is a regular
> register memory access before the barrier and write-combined memory
> access after the barrier.
That makes sense. We realized that as well, and we don't mean to change x86 behavior.
> 
> We're working on a more extensive change that will include the
> optimizations proposed for barriers relaxing while making them applicable
> to all platforms.
We are also working on an extensive change that helps arm64 without impacting x86.
More testing is ongoing.
> 
> 2.  Regarding the changes for statistics logging - the patch relies on c11
> features. I'm not sure it's acceptable for all situations since we've already
> encountered a reports when even c99-compliant changes caused
> compilation issues.
C11 is already widely used in other components, and in other projects such as vpp and ovs.
Maybe it is time to drop C99 as a stringent requirement.
> 
> 3. Removing redundant zeroing of sub-struct - we're currently working on
> some extensive changes to the Tx flow, which will include this change
> among other.
Ok, thanks. 
> 
> Thanks,
> Igor
  
Honnappa Nagarahalli May 12, 2020, 9:22 p.m. UTC | #7
<snip>

Hi Igor,
	A few comments inline.

> > Subject: RE: [PATCH RFC v1 0/7] relax barriers for ENA PMD and small
> > fixes
> >
> > Hi all,
> >
> > Please see the first batch of comments related to these patches:
> >
> > 1. Relaxing the register read/write isn't always good enough.
> > Specifically, when barriers are required between different memory
> > types, the reordering can occur even on x86. Yet in DPDK the
> > io/cio/smp barrier flavors for x86 are defined as compiler-only
> > barriers, which is not enough in cases involving different memory
> > types. In ENA driver, when LLQ is on, there is a regular register
> > memory access before the barrier and write-combined memory access
> > after the barrier.
> That makes sense, we realized that also, we don't mean to change x86
> behaviors.
I think https://patches.dpdk.org/patch/70091/ should help in this. I have added to that patch, please take a look.

> >
> > We're working on a more extensive change that will include the
> > optimizations proposed for barriers relaxing while making them
> > applicable to all platforms.
> We are working also on an extensive change, helpful to arm64 while not
> impacting x86.
> More testing is ongoing.
> >
> > 2.  Regarding the changes for statistics logging - the patch relies on
> > c11 features. I'm not sure it's acceptable for all situations since
> > we've already encountered reports where even c99-compliant changes
> > caused compilation issues.
> C11 is already widely used in other components, even in other projects like
> vpp and ovs.
> Maybe it is time to drop C99 as a stringent requirement.
Do you have any particular compiler version in mind? There is a proposal [1] to stop using rte_atomicNN_xxx APIs and use wrappers built around c11 __atomic built-ins. C11 is supported in GCC from 4.7 and in clang from 3.1.

[1] https://patches.dpdk.org/cover/70097/
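
For reference, a minimal sketch of what relaxed statistics counters look like with the __atomic built-ins. This is not the code from the ENA patch or from the proposal in [1]; the struct and function names are made up for illustration. It assumes only that the counters need atomicity and not ordering against other memory accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue counters, for illustration only. */
struct queue_stats_sketch {
	uint64_t cnt;
	uint64_t bytes;
};

/* Bump the counters with relaxed ordering: each update is atomic,
 * but no ordering against surrounding loads/stores is requested. */
static inline void
stats_add_sketch(struct queue_stats_sketch *s, uint64_t pkts, uint64_t bytes)
{
	assert(s != NULL);
	__atomic_fetch_add(&s->cnt, pkts, __ATOMIC_RELAXED);
	__atomic_fetch_add(&s->bytes, bytes, __ATOMIC_RELAXED);
}

/* Read the packet counter with relaxed ordering. */
static inline uint64_t
stats_read_cnt_sketch(struct queue_stats_sketch *s)
{
	assert(s != NULL);
	return __atomic_load_n(&s->cnt, __ATOMIC_RELAXED);
}
```

Because the two counters are updated by separate atomic operations, a concurrent reader may see one updated and not the other; for statistics that is usually acceptable.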

> >
> > 3. Removing redundant zeroing of sub-struct - we're currently working
> > on some extensive changes to the Tx flow, which will include this
> > change among others.
> Ok, thanks.
> >
> > Thanks,
> > Igor