[v4] net/i40e: relaxed barrier in the tx fastpath

Message ID: 20200212055621.118363-1-gavin.hu@arm.com
State: Accepted, archived
Delegated to: xiaolong ye
Series: [v4] net/i40e: relaxed barrier in the tx fastpath

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/travis-robot              success  Travis build: passed
ci/iol-testing               success  Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS
ci/Intel-compilation         success  Compilation OK

Commit Message

Gavin Hu Feb. 12, 2020, 5:56 a.m. UTC
  To keep ordering of mixed accesses, rte_cio is sufficient.
The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]

[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9qf0Kpn89EMdGDajepKoZQ@mail.gmail.com

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
---
V4:
- add the Fixes tag and CC stable <Xiaolong Ye>
V3:
- optimize the barriers in the fast path only; leave the barriers in the
  slow path and control path as they are <jerin>
- drop the virtio patches from the list as they are in the control path
- it makes more sense to relax the barrier in the fast path, at the PMD level.
  Relaxing the fundamental rte_io_x barrier APIs would require scrutinizing
  every PMD that uses the barriers directly or indirectly.
V2:
- remove the virtio_pci_read/write64 API definitions; they are not needed and cause compile errors such as "error: unused function 'virtio_pci_write64' [-Werror,-Wunused-function]"
- update the reference link to kernel source code
---
 drivers/net/i40e/i40e_rxtx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
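
For readers unfamiliar with the two write flavours, the following is a minimal
model of the relationship the commit message describes: a "strict" register
write that carries a heavy I/O barrier inside it, versus an explicit lighter
barrier followed by a relaxed write. Every model_* name and both counters are
hypothetical stand-ins introduced for illustration; the C11 fences only
approximate the real barriers on the host side and are not the DPDK
definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counters so the cost of each path can be observed. */
static unsigned io_barriers;   /* calls to the heavy barrier stand-in */
static unsigned cio_barriers;  /* calls to the lighter barrier stand-in */

/* Stand-in for the rte_io write barrier (assumed to be the stronger one). */
static inline void model_io_wmb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
	io_barriers++;
}

/* Stand-in for rte_cio_wmb() (assumed to be the lighter one). */
static inline void model_cio_wmb(void)
{
	atomic_thread_fence(memory_order_release);
	cio_barriers++;
}

/* Relaxed register write: the store itself, with no barrier attached. */
static inline void model_write32_relaxed(uint32_t value, volatile uint32_t *addr)
{
	*addr = value;
}

/* Strict register write: heavy barrier first, then the same store. */
static inline void model_write32(uint32_t value, volatile uint32_t *addr)
{
	model_io_wmb();
	model_write32_relaxed(value, addr);
}
```

In this model the patch amounts to replacing one model_write32() call with
model_cio_wmb() followed by model_write32_relaxed(): the store is unchanged and
only the barrier in front of it gets cheaper.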
  

Comments

Xiaolong Ye Feb. 15, 2020, 3:16 p.m. UTC | #1
s/relaxed/relax

On 02/12, Gavin Hu wrote:
>To keep ordering of mixed accesses, rte_cio is sufficient.
>The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
>
>[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9qf0Kpn89EMdGDajepKoZQ@mail.gmail.com
>
>Fixes: 4861cde46116 ("i40e: new poll mode driver")
>Cc: stable@dpdk.org
>
>Signed-off-by: Gavin Hu <gavin.hu@arm.com>
>---
>V4:
>- add the Fixes tag and CC stable <Xiaolong Ye>
>V3:
>- optimize the barriers in the fast path only; leave the barriers in the
>  slow path and control path as they are <jerin>
>- drop the virtio patches from the list as they are in the control path
>- it makes more sense to relax the barrier in the fast path, at the PMD level.
>  Relaxing the fundamental rte_io_x barrier APIs would require scrutinizing
>  every PMD that uses the barriers directly or indirectly.
>V2:
>- remove the virtio_pci_read/write64 API definitions; they are not needed and cause compile errors such as "error: unused function 'virtio_pci_write64' [-Werror,-Wunused-function]"
>- update the reference link to kernel source code
>---
> drivers/net/i40e/i40e_rxtx.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
>index fd1ae80da..8c0f7cc67 100644
>--- a/drivers/net/i40e/i40e_rxtx.c
>+++ b/drivers/net/i40e/i40e_rxtx.c
>@@ -1248,7 +1248,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> 		   (unsigned) txq->port_id, (unsigned) txq->queue_id,
> 		   (unsigned) tx_id, (unsigned) nb_tx);
> 
>-	I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
>+	rte_cio_wmb();
>+	I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
> 	txq->tx_tail = tx_id;
> 
> 	return nb_tx;
>-- 
>2.17.1
>

Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.
  
Thomas Monjalon Feb. 16, 2020, 9:51 a.m. UTC | #2
15/02/2020 16:16, Ye Xiaolong:
> s/relaxed/relax
> 
> On 02/12, Gavin Hu wrote:
> >To keep ordering of mixed accesses, rte_cio is sufficient.
> >The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
[...]
> 
> Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.

I assume it is too risky to apply such an optimization post-rc3.

Ferruh, Xiaolong, you don't plan any more pulls from dpdk-next-net-intel
in 20.02?
  
Xiaolong Ye Feb. 16, 2020, 4:38 p.m. UTC | #3
Hi, Thomas

On 02/16, Thomas Monjalon wrote:
>15/02/2020 16:16, Ye Xiaolong:
>> s/relaxed/relax
>> 
>> On 02/12, Gavin Hu wrote:
>> >To keep ordering of mixed accesses, rte_cio is sufficient.
>> >The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
>[...]
>> 
>> Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.
>
>I assume it is too risky to apply such an optimization post-rc3.

Yes, this is a valid concern; I agree to postpone it to the next release.

>
>Ferruh, Xiaolong, you don't plan any more pulls from dpdk-next-net-intel
>in 20.02?

There is still some bug-fixing work going on in PRC, so I assume there
will be some fix patches after RC3. They are still allowed to be merged
to 20.02 if the fix is relatively small in terms of lines of code and scope,
right?


Thanks,
Xiaolong

>
>
  
Thomas Monjalon Feb. 16, 2020, 5:36 p.m. UTC | #4
16/02/2020 17:38, Ye Xiaolong:
> Hi, Thomas
> 
> On 02/16, Thomas Monjalon wrote:
> >15/02/2020 16:16, Ye Xiaolong:
> >> s/relaxed/relax
> >> 
> >> On 02/12, Gavin Hu wrote:
> >> >To keep ordering of mixed accesses, rte_cio is sufficient.
> >> >The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
> >[...]
> >> 
> >> Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.
> >
> >I assume it is too risky to apply such an optimization post-rc3.
> 
> Yes, this is a valid concern; I agree to postpone it to the next release.
> 
> >
> >Ferruh, Xiaolong, you don't plan any more pulls from dpdk-next-net-intel
> >in 20.02?
> 
> There is still some bug-fixing work going on in PRC, so I assume there
> will be some fix patches after RC3. They are still allowed to be merged
> to 20.02 if the fix is relatively small in terms of lines of code and scope,
> right?

Right
  

Patch

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index fd1ae80da..8c0f7cc67 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1248,7 +1248,8 @@  i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		   (unsigned) txq->port_id, (unsigned) txq->queue_id,
 		   (unsigned) tx_id, (unsigned) nb_tx);
 
-	I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
+	rte_cio_wmb();
+	I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
 	txq->tx_tail = tx_id;
 
 	return nb_tx;
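
The ordering that the hunk above relies on can be sketched outside DPDK as
follows: descriptors are written to ordinary memory, a write barrier is issued,
and only then is the tail (doorbell) value stored. This is an illustrative
model, not the driver code. struct model_txq, model_xmit and RING_SIZE are
hypothetical names, the "register" is a plain volatile variable, and the C11
release fence is only a host-side stand-in for rte_cio_wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical ring size; must be a power of two */

struct model_txq {
	uint64_t ring[RING_SIZE];     /* descriptors in normal memory */
	volatile uint32_t *tail_reg;  /* stands in for txq->qtx_tail */
	uint16_t tx_tail;             /* software copy, like txq->tx_tail */
};

/* Queue nb descriptors, then publish the new tail to the "device". */
static inline uint16_t model_xmit(struct model_txq *txq,
				  const uint64_t *descs, uint16_t nb)
{
	uint16_t tx_id = txq->tx_tail;
	uint16_t i;

	assert(nb <= RING_SIZE);
	for (i = 0; i < nb; i++) {
		txq->ring[tx_id] = descs[i];
		tx_id = (uint16_t)((tx_id + 1) & (RING_SIZE - 1));
	}

	/* Stand-in for rte_cio_wmb(): descriptor stores must not be
	 * reordered after the tail store that follows. */
	atomic_thread_fence(memory_order_release);

	/* Stand-in for I40E_PCI_REG_WRITE_RELAXED(): a bare store with no
	 * barrier of its own, since the fence above already orders it. */
	*txq->tail_reg = tx_id;
	txq->tx_tail = tx_id;

	return nb;
}
```

A call such as model_xmit(&q, descs, 3) on a queue whose tx_tail is 6 fills
slots 6, 7 and 0, then stores 1 to the tail. The point of the sketch is that
exactly one barrier sits between the last descriptor store and the tail store,
and the tail store itself carries none.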