[v3] net/i40e: relaxed barrier in the tx fastpath

Message ID 20200208134808.212027-1-gavin.hu@arm.com (mailing list archive)
State Superseded, archived
Delegated to: xiaolong ye
Series [v3] net/i40e: relaxed barrier in the tx fastpath

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/iol-nxp-Performance       fail     Performance Testing issues
ci/iol-mellanox-Performance  success  Performance Testing PASS
ci/iol-testing               success  Testing PASS
ci/travis-robot              success  Travis build: passed
ci/Intel-compilation         success  Compilation OK

Commit Message

Gavin Hu Feb. 8, 2020, 1:48 p.m. UTC
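  To keep the ordering of mixed accesses (descriptor writes to coherent
memory followed by the MMIO write to the tail register), rte_cio_wmb()
is sufficient. The rte_io_wmb() barrier implied by I40E_PCI_REG_WRITE()
is overkill.[1]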

[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9qf0Kpn89EMdGDajepKoZQ@mail.gmail.com

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
---
V3:
- optimize the barriers in the fast path only; leave the barriers in the
  slow path and control path as they are <jerin>
- drop the virtio patches from the series, as they are in the control path
- it makes more sense to relax the barrier in the fast path, at the PMD level.
  Relaxing the fundamental rte_io_x barrier APIs requires scrutiny of
  each PMD that uses the barriers directly or indirectly.
V2:
- remove the virtio_pci_read/write64 API definitions; they are not needed and
  generate compile errors like "error: unused function 'virtio_pci_write64'
  [-Werror,-Wunused-function]"
- update the reference link to kernel source code
---
 drivers/net/i40e/i40e_rxtx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Xiaolong Ye Feb. 11, 2020, 2:11 a.m. UTC | #1
Hi, Gavin

Please help add the Fixes tag and cc stable.

Thanks,
Xiaolong

On 02/08, Gavin Hu wrote:
>To keep ordering of mixed accesses, rte_cio is sufficient.
>The rte_io barrier is overkill.[1]
>
>[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9
>qf0Kpn89EMdGDajepKoZQ@mail.gmail.com
>
>Signed-off-by: Gavin Hu <gavin.hu@arm.com>
>---
>V3:
>- optimize the barriers in the fast path only, leave as it is for the
>  barriers in the slow path and control path <jerin>
>- drop the virtio patches from the list as they are in the control path
>- it makes more sense to relax the barrier in the fast path, at the PMD level.
>  relaxing the fundamental rte_io_x barriers APIs requires scrutinizations for
>  each PMDs which use the barriers directly or indirectly.
>V2:
>- remove virtio_pci_read/write64 APIs definitions, they are not needed and generate compiling errors like " error: unused function 'virtio_pci_write64' [-Werror,-Wunused-function]"
>- update the reference link to kernel source code
>---
> drivers/net/i40e/i40e_rxtx.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
>index fd1ae80da..8c0f7cc67 100644
>--- a/drivers/net/i40e/i40e_rxtx.c
>+++ b/drivers/net/i40e/i40e_rxtx.c
>@@ -1248,7 +1248,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> 		   (unsigned) txq->port_id, (unsigned) txq->queue_id,
> 		   (unsigned) tx_id, (unsigned) nb_tx);
> 
>-	I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
>+	rte_cio_wmb();
>+	I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
> 	txq->tx_tail = tx_id;
> 
> 	return nb_tx;
>-- 
>2.17.1
>
  
Gavin Hu Feb. 12, 2020, 6:02 a.m. UTC | #2
Hi Xiaolong,

I just sent v4 for this, thanks for the reminder!

Best Regards,
Gavin

  
Jerin Jacob Feb. 15, 2020, 8:25 a.m. UTC | #3
On Sat, Feb 8, 2020 at 7:22 PM Gavin Hu <gavin.hu@arm.com> wrote:
>
> To keep ordering of mixed accesses, rte_cio is sufficient.
> The rte_io barrier is overkill.[1]
>
> [1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9
> qf0Kpn89EMdGDajepKoZQ@mail.gmail.com
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> ---
> V3:
> - optimize the barriers in the fast path only, leave as it is for the
>   barriers in the slow path and control path <jerin>
> - drop the virtio patches from the list as they are in the control path
> - it makes more sense to relax the barrier in the fast path, at the PMD level.
>   relaxing the fundamental rte_io_x barriers APIs requires scrutinizations for
>   each PMDs which use the barriers directly or indirectly.
> V2:
> - remove virtio_pci_read/write64 APIs definitions, they are not needed and generate compiling errors like " error: unused function 'virtio_pci_write64' [-Werror,-Wunused-function]"
> - update the reference link to kernel source code
> ---
>  drivers/net/i40e/i40e_rxtx.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index fd1ae80da..8c0f7cc67 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -1248,7 +1248,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
>                    (unsigned) txq->port_id, (unsigned) txq->queue_id,
>                    (unsigned) tx_id, (unsigned) nb_tx);
>
> -       I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
> +       rte_cio_wmb();
> +       I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);

Looks good to me from the ARM64 perspective or in general.

Reviewed-by: Jerin Jacob <jerinj@marvell.com>





>         txq->tx_tail = tx_id;
>
>         return nb_tx;
> --
> 2.17.1
>
  

Patch

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index fd1ae80da..8c0f7cc67 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1248,7 +1248,8 @@  i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		   (unsigned) txq->port_id, (unsigned) txq->queue_id,
 		   (unsigned) tx_id, (unsigned) nb_tx);
 
-	I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
+	rte_cio_wmb();
+	I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
 	txq->tx_tail = tx_id;
 
 	return nb_tx;
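
For readers outside DPDK, the hunk above can be sketched as a small standalone model. As far as I can tell from the DPDK headers of that period, I40E_PCI_REG_WRITE() goes through rte_write32(), which issues rte_io_wmb() before the relaxed store, so the patch swaps that implied barrier for an explicit, lighter rte_cio_wmb(). Everything named model_* below, along with RING_SIZE and the struct layout, is hypothetical and only stands in for the real driver types; the C11 release fence is an approximation of rte_cio_wmb() chosen for portability, not what DPDK emits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical ring size, must be a power of two */

/* Hypothetical stand-in for a Tx queue: descriptors live in ordinary
 * (coherent) memory, while tail_reg models the device's MMIO doorbell. */
struct model_txq {
	uint64_t desc[RING_SIZE];   /* descriptor ring in coherent memory */
	volatile uint32_t tail_reg; /* models the MMIO tail register */
	uint16_t tx_tail;           /* software copy of the tail index */
};

/* Assumed stand-in for rte_cio_wmb(): earlier stores are ordered before
 * later stores. A C11 release fence only approximates that here. */
static inline void model_cio_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Assumed stand-in for rte_write32_relaxed(): a plain volatile store
 * with no barrier of its own. */
static inline void model_write32_relaxed(uint32_t value,
					 volatile uint32_t *addr)
{
	*addr = value;
}

/* Fill n descriptors, then ring the doorbell once for the whole burst. */
static uint16_t model_xmit(struct model_txq *txq, const uint64_t *pkts,
			   uint16_t n)
{
	uint16_t tx_id = txq->tx_tail;
	uint16_t i;

	assert(n <= RING_SIZE);
	for (i = 0; i < n; i++) {
		txq->desc[tx_id] = pkts[i];
		tx_id = (uint16_t)((tx_id + 1) & (RING_SIZE - 1));
	}

	/* Descriptor writes must be visible before the doorbell write. */
	model_cio_wmb();
	model_write32_relaxed(tx_id, &txq->tail_reg);
	txq->tx_tail = tx_id;

	return n;
}
```

The point of the shape is that the barrier is paid once per burst, just before the doorbell, instead of being hidden inside the register-write helper. Calling model_xmit() on a zero-initialized queue with three packets leaves both tail_reg and tx_tail at 3.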