[dpdk-dev] igb_uio: use non-threaded ISR

Message ID 1484953699-3156-1-git-send-email-david.w.su@intel.com
State Accepted, archived
Delegated to: Thomas Monjalon
Headers show

Checks

Context Check Description
ci/Intel compilation success Compilation OK
ci/checkpatch success coding style OK

Commit Message

David Su Jan. 20, 2017, 11:08 p.m.
This eliminates the overhead of a task switch when an interrupt arrives.

Signed-off-by: David Su <david.w.su@intel.com>
---
 lib/librte_eal/linuxapp/igb_uio/igb_uio.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

Comments

Stephen Hemminger Jan. 20, 2017, 11:50 p.m. | #1
On Fri, 20 Jan 2017 15:08:19 -0800
David Su <david.w.su@intel.com> wrote:

> This eliminates the overhead of a task switch when an interrupt arrives.
> 
> Signed-off-by: David Su <david.w.su@intel.com>
> ---
>  lib/librte_eal/linuxapp/igb_uio/igb_uio.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> index df41e45..9338e14 100644
> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> @@ -382,6 +382,7 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  		msix_entry.entry = 0;
>  		if (pci_enable_msix(dev, &msix_entry, 1) == 0) {
>  			dev_dbg(&dev->dev, "using MSI-X");
> +			udev->info.irq_flags = IRQF_NO_THREAD;
>  			udev->info.irq = msix_entry.vector;
>  			udev->mode = RTE_INTR_MODE_MSIX;
>  			break;
> @@ -390,7 +391,7 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  	case RTE_INTR_MODE_LEGACY:
>  		if (pci_intx_mask_supported(dev)) {
>  			dev_dbg(&dev->dev, "using INTX");
> -			udev->info.irq_flags = IRQF_SHARED;
> +			udev->info.irq_flags = IRQF_SHARED | IRQF_NO_THREAD;
>  			udev->info.irq = dev->irq;
>  			udev->mode = RTE_INTR_MODE_LEGACY;
>  			break;

Since interrupts are only used for link state transistions with igb_uio,
I can't see how the overhead of task switch would matter.
David Su Jan. 23, 2017, 7:20 p.m. | #2
If rte_eth_conf.intr_conf.rxq != 0 and rte_eth_conf.intr_conf.lsc = 0 when rte_eth_dev_configure() is called, rx queue interrupts can be enabled/disabled with rte_eth_dev_rx_intr_{enable|disable} and DPDK applications can wait for rx queue interrupts with rte_epoll_wait().  This is the case for both igb_uio and vfio-pci drivers.

vfio-pci already uses non-threaded ISR, but currently we can only use igb_uio when running DPDK applications in virtual machines.

David
  

-----Original Message-----
From: Stephen Hemminger [mailto:stephen@networkplumber.org] 
Sent: Friday, January 20, 2017 3:51 PM
To: Su, David W <david.w.su@intel.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] igb_uio: use non-threaded ISR

On Fri, 20 Jan 2017 15:08:19 -0800
David Su <david.w.su@intel.com> wrote:

> This eliminates the overhead of a task switch when an interrupt arrives.
> 
> Signed-off-by: David Su <david.w.su@intel.com>
> ---
>  lib/librte_eal/linuxapp/igb_uio/igb_uio.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> index df41e45..9338e14 100644
> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> @@ -382,6 +382,7 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  		msix_entry.entry = 0;
>  		if (pci_enable_msix(dev, &msix_entry, 1) == 0) {
>  			dev_dbg(&dev->dev, "using MSI-X");
> +			udev->info.irq_flags = IRQF_NO_THREAD;
>  			udev->info.irq = msix_entry.vector;
>  			udev->mode = RTE_INTR_MODE_MSIX;
>  			break;
> @@ -390,7 +391,7 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  	case RTE_INTR_MODE_LEGACY:
>  		if (pci_intx_mask_supported(dev)) {
>  			dev_dbg(&dev->dev, "using INTX");
> -			udev->info.irq_flags = IRQF_SHARED;
> +			udev->info.irq_flags = IRQF_SHARED | IRQF_NO_THREAD;
>  			udev->info.irq = dev->irq;
>  			udev->mode = RTE_INTR_MODE_LEGACY;
>  			break;

Since interrupts are only used for link state transistions with igb_uio,
I can't see how the overhead of task switch would matter.
Stephen Hemminger Jan. 23, 2017, 9:12 p.m. | #3
On Mon, 23 Jan 2017 19:20:02 +0000
"Su, David W" <david.w.su@intel.com> wrote:

> If rte_eth_conf.intr_conf.rxq != 0 and rte_eth_conf.intr_conf.lsc = 0 when rte_eth_dev_configure() is called, rx queue interrupts can be enabled/disabled with rte_eth_dev_rx_intr_{enable|disable} and DPDK applications can wait for rx queue interrupts with rte_epoll_wait().  This is the case for both igb_uio and vfio-pci drivers.
> 
> vfio-pci already uses non-threaded ISR, but currently we can only use igb_uio when running DPDK applications in virtual machines.
> 

Can't you use vfio-noiommu mode?
Ferruh Yigit Feb. 24, 2017, 5:54 p.m. | #4
On 1/20/2017 11:08 PM, David Su wrote:
> This eliminates the overhead of a task switch when an interrupt arrives.

Hi David,

Did you test patch with l3fwd-power (or any app that uses Rx
interrupts), is there any performance gain?

> 
> Signed-off-by: David Su <david.w.su@intel.com>
<...>
David Su Feb. 25, 2017, 12:21 a.m. | #5
>-----Original Message-----
>From: Yigit, Ferruh
>Sent: Friday, February 24, 2017 9:55 AM
>To: Su, David W <david.w.su@intel.com>; dev@dpdk.org
>Subject: Re: [dpdk-dev] [PATCH] igb_uio: use non-threaded ISR
>
>On 1/20/2017 11:08 PM, David Su wrote:
>> This eliminates the overhead of a task switch when an interrupt arrives.
>
>Hi David,
>
>Did you test patch with l3fwd-power (or any app that uses Rx
>interrupts), is there any performance gain?
>
Hi Ferruh,

The test is a simple l2 forward app and it uses the same idle heuristic as l3fwd-power, i.e. it enables rx interrupts and goes to sleep after about 300us without receiving a packet.  A packet generator is configured to send a time stamped packet every 400us to ensure the test will go through the sleep-wakeup cycle with every inbound packet.  The packet generator measures packet round trip latency with the timestamps.

The average latency is more than 100us on a 2.30GHz Xeon E5-2699 v3 platform with Intel X540-AT2 NIC.  The long latency is mostly because the DPDK ixgbe driver enables interrupt throttling in the NIC and sets the minimum inter-interrupt interval to about 500us.  With this patch alone, there is no significant change to the average latency, the maximum latency is reduced from 418us to 392us.  If interrupt throttling is not enabled (http://dpdk.org/dev/patchwork/patch/19856/), the average latency is reduced from 17us to 14us and the maximum latency from 30us to 21us.

>>
>> Signed-off-by: David Su <david.w.su@intel.com>
><...>
Ferruh Yigit Feb. 27, 2017, 4:15 p.m. | #6
On 1/20/2017 11:08 PM, David Su wrote:
> This eliminates the overhead of a task switch when an interrupt arrives.
> 
> Signed-off-by: David Su <david.w.su@intel.com>

Overall I agree with Stephen to switch vfio when possible but for the
cases igb_uio still required:

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Thomas Monjalon March 30, 2017, 8:28 p.m. | #7
2017-02-27 16:15, Ferruh Yigit:
> On 1/20/2017 11:08 PM, David Su wrote:
> > This eliminates the overhead of a task switch when an interrupt arrives.
> > 
> > Signed-off-by: David Su <david.w.su@intel.com>
> 
> Overall I agree with Stephen to switch vfio when possible but for the
> cases igb_uio still required:
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied, thanks

Patch

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index df41e45..9338e14 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -382,6 +382,7 @@  igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 		msix_entry.entry = 0;
 		if (pci_enable_msix(dev, &msix_entry, 1) == 0) {
 			dev_dbg(&dev->dev, "using MSI-X");
+			udev->info.irq_flags = IRQF_NO_THREAD;
 			udev->info.irq = msix_entry.vector;
 			udev->mode = RTE_INTR_MODE_MSIX;
 			break;
@@ -390,7 +391,7 @@  igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	case RTE_INTR_MODE_LEGACY:
 		if (pci_intx_mask_supported(dev)) {
 			dev_dbg(&dev->dev, "using INTX");
-			udev->info.irq_flags = IRQF_SHARED;
+			udev->info.irq_flags = IRQF_SHARED | IRQF_NO_THREAD;
 			udev->info.irq = dev->irq;
 			udev->mode = RTE_INTR_MODE_LEGACY;
 			break;