[RFC,v3,2/3] eal: add mask and unmask interrupt APIs

Message ID 20190716164424.16776-2-ndabilpuram@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [RFC] eal: add mask and unmask interrupt apis |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation fail Compilation issues

Commit Message

Nithin Dabilpuram July 16, 2019, 4:44 p.m. UTC
  Add new mask and unmask interrupt APIs to avoid using
VFIO_IRQ_SET_ACTION_TRIGGER for masking/unmasking purpose for VFIO
based handlers. This implementation is specific to Linux.

Using action trigger for masking and unmasking has below issues

 * Time consuming to do for every interrupt received as it will
   free_irq() followed by request_irq() and all other initializations
 * A race condition because of a window between free_irq() and
   request_irq() with packet reception still on and device still
   enabled and would throw warning messages like below.
   [158764.159833] do_IRQ: 9.34 No irq handler for vector

In this patch, mask/unmask is a no-op for VFIO_MSIX/VFIO_MSI interrupts
as they are edge triggered and kernel would not mask the interrupt before
delivering the event to userspace.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
---
v3:
* Re-org patch to move driver change to 3/3
* Add stub implementation for freebsd.
* Fix version map file for new apis.
v2:
Make change in qede driver for unmask
 lib/librte_eal/common/include/rte_interrupts.h |  23 ++++
 lib/librte_eal/freebsd/eal/eal_interrupts.c    |  54 +++++++++
 lib/librte_eal/linux/eal/eal_interrupts.c      | 155 +++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map             |   2 +
 4 files changed, 234 insertions(+)
  

Comments

Hyong Youb Kim (hyonkim) July 17, 2019, 5:55 a.m. UTC | #1
> -----Original Message-----
> From: Nithin Dabilpuram <ndabilpuram@marvell.com>
> Sent: Wednesday, July 17, 2019 1:44 AM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: jerinj@marvell.com; John Daley (johndale) <johndale@cisco.com>;
> Shahed Shaikh <shshaikh@marvell.com>; dev@dpdk.org; Nithin Dabilpuram
> <ndabilpuram@marvell.com>
> Subject: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> Add new mask and unmask interrupt APIs to avoid using
> VFIO_IRQ_SET_ACTION_TRIGGER for masking/unmasking purpose for VFIO
> based handlers. This implementation is specific to Linux.
> 
> Using action trigger for masking and unmasking has below issues
> 
>  * Time consuming to do for every interrupt received as it will
>    free_irq() followed by request_irq() and all other initializations
>  * A race condition because of a window between free_irq() and
>    request_irq() with packet reception still on and device still
>    enabled and would throw warning messages like below.
>    [158764.159833] do_IRQ: 9.34 No irq handler for vector
> 
> In this patch, mask/unmask is a no-op for VFIO_MSIX/VFIO_MSI interrupts
> as they are edge triggered and kernel would not mask the interrupt before
> delivering the event to userspace.
> 
> Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
> ---
> v3:
> * Re-org patch to move driver change to 3/3
> * Add stub implementation for freebsd.
> * Fix version map file for new apis.
> v2:
> Make change in qede driver for unmask
>  lib/librte_eal/common/include/rte_interrupts.h |  23 ++++
>  lib/librte_eal/freebsd/eal/eal_interrupts.c    |  54 +++++++++
>  lib/librte_eal/linux/eal/eal_interrupts.c      | 155
> +++++++++++++++++++++++++
>  lib/librte_eal/rte_eal_version.map             |   2 +
>  4 files changed, 234 insertions(+)
> 
> diff --git a/lib/librte_eal/common/include/rte_interrupts.h
> b/lib/librte_eal/common/include/rte_interrupts.h
> index c1e912c..b0675be 100644
> --- a/lib/librte_eal/common/include/rte_interrupts.h
> +++ b/lib/librte_eal/common/include/rte_interrupts.h
> @@ -118,6 +118,29 @@ int rte_intr_enable(const struct rte_intr_handle
> *intr_handle);
>   */
>  int rte_intr_disable(const struct rte_intr_handle *intr_handle);
> 
> +/**
> + * It masks the interrupt for the specified handle.
> + *
> + * @param intr_handle
> + *  pointer to the interrupt handle.
> + *
> + * @return
> + *  - On success, zero.
> + *  - On failure, a negative value.
> + */
> +int rte_intr_mask(const struct rte_intr_handle *intr_handle);
> +
> +/**
> + * It unmasks the interrupt for the specified handle.
> + *
> + * @param intr_handle
> + *  pointer to the interrupt handle.
> + *
> + * @return
> + *  - On success, zero.
> + *  - On failure, a negative value.
> + */
> +int rte_intr_unmask(const struct rte_intr_handle *intr_handle);
>  #ifdef __cplusplus
>  }
>  #endif
[...]
> diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c
> b/lib/librte_eal/linux/eal/eal_interrupts.c
> index 79ad5e8..f619981 100644
> --- a/lib/librte_eal/linux/eal/eal_interrupts.c
> +++ b/lib/librte_eal/linux/eal/eal_interrupts.c
[...]
>  int
> +rte_intr_mask(const struct rte_intr_handle *intr_handle)
> +{
> +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> +		return 0;
> +
> +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
> +		return -1;
> +
> +	switch (intr_handle->type){
> +	/* Both masking and disabling are same for UIO */
> +	case RTE_INTR_HANDLE_UIO:
> +		if (uio_intr_disable(intr_handle))
> +			return -1;
> +		break;
> +	case RTE_INTR_HANDLE_UIO_INTX:
> +		if (uio_intx_intr_disable(intr_handle))
> +			return -1;
> +		break;
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_ALARM:
> +		return -1;
> +#ifdef VFIO_PRESENT
> +	case RTE_INTR_HANDLE_VFIO_MSIX:
> +	case RTE_INTR_HANDLE_VFIO_MSI:
> +		return 0;

Isn't this a little confusing? It returns success, but irq is not
masked.

As is, return code 0 means...
- igb_uio: irq is masked for INTx, MSI, MSI-X
- vfio-pci + INTx: irq is masked
- vfio-pci + MSI/MSI-X: no changes

Masking is useful only for INTx, IMO...

Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
Table) has no practical use for drivers. Handshaking/masking/unmasking
is done via device/vendor specific ways, as needed. See all those
ack/block/unblock/credit/... mechanisms used in various drivers/NICs
to control interrupts their own way.

A long time ago in early PCIe days, the linux kernel did auto-masking
for MSI/MSI-X (i.e. mask before calling netdev irq handler). It was
soon removed as it was unnecessary overhead (expensive PIOs to NIC for
every interrupt). Windows and FreeBSD do not do auto-masking either.

Thanks.
-Hyong

> +	case RTE_INTR_HANDLE_VFIO_LEGACY:
> +		if (vfio_mask_intx(intr_handle))
> +			return -1;
> +		break;
> +#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
> +	case RTE_INTR_HANDLE_VFIO_REQ:
> +		return -1;
> +#endif
> +#endif
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_DEV_EVENT:
> +		return -1;
> +	/* unknown handle type */
> +	default:
> +		RTE_LOG(ERR, EAL,
> +			"Unknown handle type of fd %d\n",
> +					intr_handle->fd);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +rte_intr_unmask(const struct rte_intr_handle *intr_handle)
> +{
> +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> +		return 0;
> +
> +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
> +		return -1;
> +
> +	switch (intr_handle->type){
> +	/* Both unmasking and disabling are same for UIO */
> +	case RTE_INTR_HANDLE_UIO:
> +		if (uio_intr_enable(intr_handle))
> +			return -1;
> +		break;
> +	case RTE_INTR_HANDLE_UIO_INTX:
> +		if (uio_intx_intr_enable(intr_handle))
> +			return -1;
> +		break;
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_ALARM:
> +		return -1;
> +#ifdef VFIO_PRESENT
> +	case RTE_INTR_HANDLE_VFIO_MSIX:
> +	case RTE_INTR_HANDLE_VFIO_MSI:
> +		return 0;
> +	case RTE_INTR_HANDLE_VFIO_LEGACY:
> +		if (vfio_unmask_intx(intr_handle))
> +			return -1;
> +		break;
> +#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
> +	case RTE_INTR_HANDLE_VFIO_REQ:
> +		return -1;
> +#endif
> +#endif
> +	/* not used at this moment */
> +	case RTE_INTR_HANDLE_DEV_EVENT:
> +		return -1;
> +	/* unknown handle type */
> +	default:
> +		RTE_LOG(ERR, EAL,
> +			"Unknown handle type of fd %d\n",
> +					intr_handle->fd);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +int
>  rte_intr_disable(const struct rte_intr_handle *intr_handle)
>  {
>  	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> diff --git a/lib/librte_eal/rte_eal_version.map
> b/lib/librte_eal/rte_eal_version.map
> index 1892d9e..b3b2df4 100644
> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -309,6 +309,8 @@ DPDK_19.08 {
>  	rte_service_lcore_attr_reset_all;
>  	rte_service_may_be_active;
>  	rte_srand;
> +	rte_intr_unmask;
> +	rte_intr_mask;
> 
>  } DPDK_19.05;
> 
> --
> 2.8.4
  
Jerin Jacob Kollanukkaran July 17, 2019, 6:14 a.m. UTC | #2
> -----Original Message-----
> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> Sent: Wednesday, July 17, 2019 11:26 AM
> To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley (johndale)
> <johndale@cisco.com>; Shahed Shaikh <shshaikh@marvell.com>;
> dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> > +		return 0;
> > +
> > +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd <
> 0)
> > +		return -1;
> > +
> > +	switch (intr_handle->type){
> > +	/* Both masking and disabling are same for UIO */
> > +	case RTE_INTR_HANDLE_UIO:
> > +		if (uio_intr_disable(intr_handle))
> > +			return -1;
> > +		break;
> > +	case RTE_INTR_HANDLE_UIO_INTX:
> > +		if (uio_intx_intr_disable(intr_handle))
> > +			return -1;
> > +		break;
> > +	/* not used at this moment */
> > +	case RTE_INTR_HANDLE_ALARM:
> > +		return -1;
> > +#ifdef VFIO_PRESENT
> > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > +		return 0;
> 
> Isn't this a little confusing? It returns success, but irq is not masked.

Yes. How about changing the API to rte_intr_ack()(Acknowledge the interrupt)
Or something similar? i.e replace rte_intr_unmask() with rte_intr_ack() for this use
case.
 
> As is, return code 0 means...
> - igb_uio: irq is masked for INTx, MSI, MSI-X
> - vfio-pci + INTx: irq is masked
> - vfio-pci + MSI/MSI-X: no changes
> 
> Masking is useful only for INTx, IMO...
> 
> Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> Table) has no practical use for drivers. Handshaking/masking/unmasking is
> done via device/vendor specific ways, as needed. See all those
> ack/block/unblock/credit/... mechanisms used in various drivers/NICs to
> control interrupts their own way.
> 
> A long time ago in early PCIe days, the linux kernel did auto-masking for
> MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon removed
> as it was unnecessary overhead (expensive PIOs to NIC for every interrupt).
> Windows and FreeBSD do not do auto-masking either.

rte_intr_ack() can abstract FreeBSD and Windows difference.
  
Hyong Youb Kim (hyonkim) July 17, 2019, 7:09 a.m. UTC | #3
> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 3:15 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > -----Original Message-----
> > From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> > Sent: Wednesday, July 17, 2019 11:26 AM
> > To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David
> Marchand
> > <david.marchand@redhat.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> > Richardson <bruce.richardson@intel.com>
> > Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley (johndale)
> > <johndale@cisco.com>; Shahed Shaikh <shshaikh@marvell.com>;
> > dev@dpdk.org
> > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> > > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > > +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> > > +		return 0;
> > > +
> > > +	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd <
> > 0)
> > > +		return -1;
> > > +
> > > +	switch (intr_handle->type){
> > > +	/* Both masking and disabling are same for UIO */
> > > +	case RTE_INTR_HANDLE_UIO:
> > > +		if (uio_intr_disable(intr_handle))
> > > +			return -1;
> > > +		break;
> > > +	case RTE_INTR_HANDLE_UIO_INTX:
> > > +		if (uio_intx_intr_disable(intr_handle))
> > > +			return -1;
> > > +		break;
> > > +	/* not used at this moment */
> > > +	case RTE_INTR_HANDLE_ALARM:
> > > +		return -1;
> > > +#ifdef VFIO_PRESENT
> > > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > > +		return 0;
> >
> > Isn't this a little confusing? It returns success, but irq is not masked.
> 
> Yes. How about changing the API to rte_intr_ack()(Acknowledge the
> interrupt)
> Or something similar? i.e replace rte_intr_unmask() with rte_intr_ack() for
> this use
> case.
> 

Not sure. I do not have a good suggestion here :-) Like to hear from
David when he comes back, as he spent most time on this issue..

Why not return -1 and let the caller deal with it?

Optimist view:
Maintainers will see the error as vfio-pci + MSI/MSI_X is on
everyone's test list. And it forces them to confront the issue. Do I
really need unmask here, etc.

Pessimist view:
Wastes a lot of people's time. Potentially duplicate code like this
everywhere.

  if (INTx) unmask();

BTW, are you targeting 19.08 or 19.11? Not sure how much change we can
tolerate in 19.08.

Requirements for 19.08 seem to be...
- Must fix the redhat bz (lost interrupt issue with qede + MSI/MSI-X)
- Fix potentially similar issues in other drivers too?

Thanks..
-Hyong

> > As is, return code 0 means...
> > - igb_uio: irq is masked for INTx, MSI, MSI-X
> > - vfio-pci + INTx: irq is masked
> > - vfio-pci + MSI/MSI-X: no changes
> >
> > Masking is useful only for INTx, IMO...
> >
> > Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> > Table) has no practical use for drivers. Handshaking/masking/unmasking is
> > done via device/vendor specific ways, as needed. See all those
> > ack/block/unblock/credit/... mechanisms used in various drivers/NICs to
> > control interrupts their own way.
> >
> > A long time ago in early PCIe days, the linux kernel did auto-masking for
> > MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon
> removed
> > as it was unnecessary overhead (expensive PIOs to NIC for every interrupt).
> > Windows and FreeBSD do not do auto-masking either.
> 
> rte_intr_ack() can abstract FreeBSD and Windows difference.
>
  
Jerin Jacob Kollanukkaran July 17, 2019, 8:03 a.m. UTC | #4
> > > -----Original Message-----
> > > From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> > > Sent: Wednesday, July 17, 2019 11:26 AM
> > > To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David
> > Marchand
> > > <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>;
> > > Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
> > > <bruce.richardson@intel.com>
> > > Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley
> > > (johndale) <johndale@cisco.com>; Shahed Shaikh
> > > <shshaikh@marvell.com>; dev@dpdk.org
> > > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
> > > APIs
> > > > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > > > +	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
> > > > +		return 0;
> > > > +
> > > > +	if (!intr_handle || intr_handle->fd < 0 ||
> > > > +intr_handle->uio_cfg_fd <
> > > 0)
> > > > +		return -1;
> > > > +
> > > > +	switch (intr_handle->type){
> > > > +	/* Both masking and disabling are same for UIO */
> > > > +	case RTE_INTR_HANDLE_UIO:
> > > > +		if (uio_intr_disable(intr_handle))
> > > > +			return -1;
> > > > +		break;
> > > > +	case RTE_INTR_HANDLE_UIO_INTX:
> > > > +		if (uio_intx_intr_disable(intr_handle))
> > > > +			return -1;
> > > > +		break;
> > > > +	/* not used at this moment */
> > > > +	case RTE_INTR_HANDLE_ALARM:
> > > > +		return -1;
> > > > +#ifdef VFIO_PRESENT
> > > > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > > > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > > > +		return 0;
> > >
> > > Isn't this a little confusing? It returns success, but irq is not masked.
> >
> > Yes. How about changing the API to rte_intr_ack()(Acknowledge the
> > interrupt)
> > Or something similar? i.e replace rte_intr_unmask() with
> > rte_intr_ack() for this use case.
> >
> 
> Not sure. I do not have a good suggestion here :-) Like to hear from
> David when he comes back, as he spent most time on this issue..

Sure. He is on vacation.
Any reason for thinking, rte_intr_ack()  may not be semantically correct?
I think, it is very much correct if there are no better suggestions.
Anyway it the name, From VFIO perspective, we know what is expected so I think it is fine.

> 
> Why not return -1 and let the caller deal with it?

If we make it as rte_intr_ack() no need to return -1 for MSIX-VFIO+Linux
as it is semantically correct.

> 
> Optimist view:
> Maintainers will see the error as vfio-pci + MSI/MSI_X is on
> everyone's test list. And it forces them to confront the issue. Do I
> really need unmask here, etc.

If we make it as ack then it fine as driver does not need to know the fine 
details.

> 
> Pessimist view:
> Wastes a lot of people's time. Potentially duplicate code like this
> everywhere.
> 
>   if (INTx) unmask();
> 
> BTW, are you targeting 19.08 or 19.11? Not sure how much change we can
> tolerate in 19.08.


19.08 as fundamentally it correct. Finer adjustment can made by existing
drivers if required in the testing phase.

It is trivial change as scope is limited to interrupt hander rte_intr_enable()
replacement with rte_intr_ack(). For MSIX case, it should be real NOP,
so I don't think there issue. It should be much better than the existing
state, where almost everything broken.

> Requirements for 19.08 seem to be...
> - Must fix the redhat bz (lost interrupt issue with qede + MSI/MSI-X)
> - Fix potentially similar issues in other drivers too?

Proposed patch will fix the above mentioned issues.

> 
> Thanks..
> -Hyong
> 
> > > As is, return code 0 means...
> > > - igb_uio: irq is masked for INTx, MSI, MSI-X
> > > - vfio-pci + INTx: irq is masked
> > > - vfio-pci + MSI/MSI-X: no changes
> > >
> > > Masking is useful only for INTx, IMO...
> > >
> > > Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> > > Table) has no practical use for drivers. Handshaking/masking/unmasking
> is
> > > done via device/vendor specific ways, as needed. See all those
> > > ack/block/unblock/credit/... mechanisms used in various drivers/NICs to
> > > control interrupts their own way.
> > >
> > > A long time ago in early PCIe days, the linux kernel did auto-masking for
> > > MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon
> > removed
> > > as it was unnecessary overhead (expensive PIOs to NIC for every
> interrupt).
> > > Windows and FreeBSD do not do auto-masking either.
> >
> > rte_intr_ack() can abstract FreeBSD and Windows difference.
> >
  
Hyong Youb Kim (hyonkim) July 17, 2019, 8:45 a.m. UTC | #5
> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 5:03 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > > > -----Original Message-----
> > > > From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> > > > Sent: Wednesday, July 17, 2019 11:26 AM
> > > > To: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; David
> > > Marchand
> > > > <david.marchand@redhat.com>; Thomas Monjalon
> > <thomas@monjalon.net>;
> > > > Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
> > > > <bruce.richardson@intel.com>
> > > > Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; John Daley
> > > > (johndale) <johndale@cisco.com>; Shahed Shaikh
> > > > <shshaikh@marvell.com>; dev@dpdk.org
> > > > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
> > > > APIs
> > > > > +rte_intr_mask(const struct rte_intr_handle *intr_handle) {
> > > > > +	if (intr_handle && intr_handle->type ==
> RTE_INTR_HANDLE_VDEV)
> > > > > +		return 0;
> > > > > +
> > > > > +	if (!intr_handle || intr_handle->fd < 0 ||
> > > > > +intr_handle->uio_cfg_fd <
> > > > 0)
> > > > > +		return -1;
> > > > > +
> > > > > +	switch (intr_handle->type){
> > > > > +	/* Both masking and disabling are same for UIO */
> > > > > +	case RTE_INTR_HANDLE_UIO:
> > > > > +		if (uio_intr_disable(intr_handle))
> > > > > +			return -1;
> > > > > +		break;
> > > > > +	case RTE_INTR_HANDLE_UIO_INTX:
> > > > > +		if (uio_intx_intr_disable(intr_handle))
> > > > > +			return -1;
> > > > > +		break;
> > > > > +	/* not used at this moment */
> > > > > +	case RTE_INTR_HANDLE_ALARM:
> > > > > +		return -1;
> > > > > +#ifdef VFIO_PRESENT
> > > > > +	case RTE_INTR_HANDLE_VFIO_MSIX:
> > > > > +	case RTE_INTR_HANDLE_VFIO_MSI:
> > > > > +		return 0;
> > > >
> > > > Isn't this a little confusing? It returns success, but irq is not masked.
> > >
> > > Yes. How about changing the API to rte_intr_ack()(Acknowledge the
> > > interrupt)
> > > Or something similar? i.e replace rte_intr_unmask() with
> > > rte_intr_ack() for this use case.
> > >
> >
> > Not sure. I do not have a good suggestion here :-) Like to hear from
> > David when he comes back, as he spent most time on this issue..
> 
> Sure. He is on vacation.
> Any reason for thinking, rte_intr_ack()  may not be semantically correct?
> I think, it is very much correct if there are no better suggestions.
> Anyway it the name, From VFIO perspective, we know what is expected so I
> think it is fine.
> 
> >
> > Why not return -1 and let the caller deal with it?
> 
> If we make it as rte_intr_ack() no need to return -1 for MSIX-VFIO+Linux
> as it is semantically correct.
> 

Ack can be ambiguous. For INTx, ack usually means PIO to a NIC
register, saying "I got your interrupt, please de-assert irq".

Besides the name, are we agreeing that we want these?
- Unmask if INTx
- Nothing if MSI/MSI-X

So, really just "unmask if INTx". I am ok with rte_intr_unmask() if we
make this intention clear. rte_unmask_if_intx() looks messy.

Thanks..
-Hyong

> >
> > Optimist view:
> > Maintainers will see the error as vfio-pci + MSI/MSI_X is on
> > everyone's test list. And it forces them to confront the issue. Do I
> > really need unmask here, etc.
> 
> If we make it as ack then it fine as driver does not need to know the fine
> details.
> 
> >
> > Pessimist view:
> > Wastes a lot of people's time. Potentially duplicate code like this
> > everywhere.
> >
> >   if (INTx) unmask();
> >
> > BTW, are you targeting 19.08 or 19.11? Not sure how much change we can
> > tolerate in 19.08.
> 
> 
> 19.08 as fundamentally it correct. Finer adjustment can made by existing
> drivers if required in the testing phase.
> 
> It is trivial change as scope is limited to interrupt hander rte_intr_enable()
> replacement with rte_intr_ack(). For MSIX case, it should be real NOP,
> so I don't think there issue. It should be much better than the existing
> state, where almost everything broken.
> 
> > Requirements for 19.08 seem to be...
> > - Must fix the redhat bz (lost interrupt issue with qede + MSI/MSI-X)
> > - Fix potentially similar issues in other drivers too?
> 
> Proposed patch will fix the above mentioned issues.
> 
> >
> > Thanks..
> > -Hyong
> >
> > > > As is, return code 0 means...
> > > > - igb_uio: irq is masked for INTx, MSI, MSI-X
> > > > - vfio-pci + INTx: irq is masked
> > > > - vfio-pci + MSI/MSI-X: no changes
> > > >
> > > > Masking is useful only for INTx, IMO...
> > > >
> > > > Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
> > > > Table) has no practical use for drivers.
> Handshaking/masking/unmasking
> > is
> > > > done via device/vendor specific ways, as needed. See all those
> > > > ack/block/unblock/credit/... mechanisms used in various drivers/NICs
> to
> > > > control interrupts their own way.
> > > >
> > > > A long time ago in early PCIe days, the linux kernel did auto-masking for
> > > > MSI/MSI-X (i.e. mask before calling netdev irq handler). It was soon
> > > removed
> > > > as it was unnecessary overhead (expensive PIOs to NIC for every
> > interrupt).
> > > > Windows and FreeBSD do not do auto-masking either.
> > >
> > > rte_intr_ack() can abstract FreeBSD and Windows difference.
> > >
  
Jerin Jacob Kollanukkaran July 17, 2019, 9:20 a.m. UTC | #6
> > > Not sure. I do not have a good suggestion here :-) Like to hear from
> > > David when he comes back, as he spent most time on this issue..
> >
> > Sure. He is on vacation.
> > Any reason for thinking, rte_intr_ack()  may not be semantically correct?
> > I think, it is very much correct if there are no better suggestions.
> > Anyway it the name, From VFIO perspective, we know what is expected so
> > I think it is fine.
> >
> > >
> > > Why not return -1 and let the caller deal with it?
> >
> > If we make it as rte_intr_ack() no need to return -1 for
> > MSIX-VFIO+Linux as it is semantically correct.
> >
> 
> Ack can be ambiguous. For INTx, ack usually means PIO to a NIC register,
> saying "I got your interrupt, please de-assert irq".

I think, it vary from the perspective of IRQ Chip(or controller) vs NIC register(Source) PoV.
Since the API starts from rte_intr_* it is more of controller so _ack_ make sense
Other reason for ack:
1) It will enforce that it needs to be called form ISR
2) It would be have been really correct to unmask if VFIO+MSIx+Linux supports
it
3) if it is ack, no need to add unmask counterpart, the _mask_ API

> 
> Besides the name, are we agreeing that we want these?
> - Unmask if INTx

Yes

> - Nothing if MSI/MSI-X
Yes for MSI over VFIO
No for MSI over UIO/igb_uio

I don't  have very strong opinion unmask vs ack. I prefer to have ack due the reasons stated above.
If you really have strong opinion on using unmask, we will stick with that to make forward progress.
Let us know.


> 
> So, really just "unmask if INTx". I am ok with rte_intr_unmask() if we make
> this intention clear. rte_unmask_if_intx() looks messy.
> 
> Thanks..
> -Hyong
  
Hyong Youb Kim (hyonkim) July 17, 2019, 9:51 a.m. UTC | #7
> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 6:21 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > > > Not sure. I do not have a good suggestion here :-) Like to hear from
> > > > David when he comes back, as he spent most time on this issue..
> > >
> > > Sure. He is on vacation.
> > > Any reason for thinking, rte_intr_ack()  may not be semantically correct?
> > > I think, it is very much correct if there are no better suggestions.
> > > Anyway it the name, From VFIO perspective, we know what is expected
> so
> > > I think it is fine.
> > >
> > > >
> > > > Why not return -1 and let the caller deal with it?
> > >
> > > If we make it as rte_intr_ack() no need to return -1 for
> > > MSIX-VFIO+Linux as it is semantically correct.
> > >
> >
> > Ack can be ambiguous. For INTx, ack usually means PIO to a NIC register,
> > saying "I got your interrupt, please de-assert irq".
> 
> I think, it vary from the perspective of IRQ Chip(or controller) vs NIC
> register(Source) PoV.
> Since the API starts from rte_intr_* it is more of controller so _ack_ make
> sense
> Other reason for ack:
> 1) It will enforce that it needs to be called form ISR
> 2) It would be have been really correct to unmask if VFIO+MSIx+Linux
> supports
> it
> 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> 

Just curious, what you mean by irq controller? Ack/mask/unmask PIOs
all go to the NIC. It is the NIC that asserts/de-asserts irq..

> >
> > Besides the name, are we agreeing that we want these?
> > - Unmask if INTx
> 
> Yes
> 
> > - Nothing if MSI/MSI-X
> Yes for MSI over VFIO
> No for MSI over UIO/igb_uio
> 

I guess I was not clear. For MSI/MSI-X, we do not want to do
mask/unmask regardless of vfio-pci/igb_uio.  Below is my comment about
linux/windows/freebsd from an earlier email. Do you disagree? I am
sure there are plenty of kernel NIC driver guys here. Please correct
me if I am mistaken...

---
Masking is useful only for INTx, IMO...

Masking MSI/MSI-X via PCI-defined mechanisms (e.g. Mask bit in MSI-X
Table) has no practical use for drivers. Handshaking/masking/unmasking
is done via device/vendor specific ways, as needed. See all those
ack/block/unblock/credit/... mechanisms used in various drivers/NICs
to control interrupts their own way.

A long time ago in early PCIe days, the linux kernel did auto-masking
for MSI/MSI-X (i.e. mask before calling netdev irq handler). It was
soon removed as it was unnecessary overhead (expensive PIOs to NIC for
every interrupt). Windows and FreeBSD do not do auto-masking either.
---

Most drivers have a single irq callback.

handler() {
  do_action()
  rte_intr_umask/ack()
}

Suppose MSI/MSI-X is used (super likely since it is the default).
With igb_uio, rte_intr_umask/ack() will actually do PIO writes to the
NIC to unmask. This is unnecessary overhead.

> I don't  have very strong opinion unmask vs ack. I prefer to have ack due the
> reasons stated above.
> If you really have strong opinion on using unmask, we will stick with that to
> make forward progress.
> Let us know.
> 

I have no strong opinion either.

Thanks..
-Hyong
  
Jerin Jacob Kollanukkaran July 17, 2019, 10:43 a.m. UTC | #8
> > I think, it vary from the perspective of IRQ Chip(or controller) vs
> > NIC
> > register(Source) PoV.
> > Since the API starts from rte_intr_* it is more of controller so _ack_
> > make sense Other reason for ack:
> > 1) It will enforce that it needs to be called form ISR
> > 2) It would be have been really correct to unmask if VFIO+MSIx+Linux
> > supports it
> > 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> >
> 
> Just curious, what you mean by irq controller? Ack/mask/unmask PIOs all go

Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
The drivers in linux/drivers/irqchip/

> to the NIC. It is the NIC that asserts/de-asserts irq..
> 
> > >
> > > Besides the name, are we agreeing that we want these?
> > > - Unmask if INTx
> >
> > Yes
> >
> > > - Nothing if MSI/MSI-X
> > Yes for MSI over VFIO
> > No for MSI over UIO/igb_uio
> >
> 
> I guess I was not clear. For MSI/MSI-X, we do not want to do mask/unmask
> regardless of vfio-pci/igb_uio.  Below is my comment about
> linux/windows/freebsd from an earlier email. Do you disagree? I am sure
> there are plenty of kernel NIC driver guys here. Please correct me if I am
> mistaken...


For some reason, igb_uio kernel driver mask the interrupt for MSIx.
We need to ack or unmask if needs to work with MSIX + IGB_UIO.

See 
pci_uio_alloc_resource()
        if (dev->kdrv == RTE_KDRV_IGB_UIO)
                dev->intr_handle.type = RTE_INTR_HANDLE_UIO; 
        else {
                dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;

igbuio_pci_irqcontrol() for masking in kernel.

So it is more of making inline with igb_uio kernel driver AND not break
The existing drivers which was using rte_intr_enable in ISR with MSIX+IGB_UIO

I do agree with that for edge trigged interrupt mask may not require from kernel.
But I am not sure why it is added in igb_uio kernel driver. May  be it is just legacy.
Anyway this wont change schematics, when igb_uio kenrel fixed then the counter
Part can be changed in rte_intr_ack(). Ie. it is transparent to drivers.

> 
> > I don't  have very strong opinion unmask vs ack. I prefer to have ack
> > due the reasons stated above.
> > If you really have strong opinion on using unmask, we will stick with
> > that to make forward progress.
> > Let us know.
> >
> 
> I have no strong opinion either.

OK. Lets stick with rte_intr_ack().

> 
> Thanks..
> -Hyong
  
Hyong Youb Kim (hyonkim) July 17, 2019, 11:06 a.m. UTC | #9
> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Wednesday, July 17, 2019 7:44 PM
> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > > I think, it vary from the perspective of IRQ Chip(or controller) vs
> > > NIC
> > > register(Source) PoV.
> > > Since the API starts from rte_intr_* it is more of controller so _ack_
> > > make sense Other reason for ack:
> > > 1) It will enforce that it needs to be called form ISR
> > > 2) It would be have been really correct to unmask if VFIO+MSIx+Linux
> > > supports it
> > > 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> > >
> >
> > Just curious, what you mean by irq controller? Ack/mask/unmask PIOs all
> go
> 
> Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
> The drivers in linux/drivers/irqchip/
> 
> > to the NIC. It is the NIC that asserts/de-asserts irq..
> >
> > > >
> > > > Besides the name, are we agreeing that we want these?
> > > > - Unmask if INTx
> > >
> > > Yes
> > >
> > > > - Nothing if MSI/MSI-X
> > > Yes for MSI over VFIO
> > > No for MSI over UIO/igb_uio
> > >
> >
> > I guess I was not clear. For MSI/MSI-X, we do not want to do mask/unmask
> > regardless of vfio-pci/igb_uio.  Below is my comment about
> > linux/windows/freebsd from an earlier email. Do you disagree? I am sure
> > there are plenty of kernel NIC driver guys here. Please correct me if I am
> > mistaken...
> 
> 
> For some reason, igb_uio kernel driver mask the interrupt for MSIx.
> We need to ack or unmask if needs to work with MSIX + IGB_UIO.
> 
> See
> pci_uio_alloc_resource()
>         if (dev->kdrv == RTE_KDRV_IGB_UIO)
>                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
>         else {
>                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
> 
> igbuio_pci_irqcontrol() for masking in kernel.
> 

igb_uio does not auto-mask MSI/MSI-X.

static irqreturn_t
igbuio_pci_irqhandler(int irq, void *dev_id)
{
        struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
        struct uio_info *info = &udev->info;

        /* Legacy mode need to mask in hardware */
        if (udev->mode == RTE_INTR_MODE_LEGACY &&
            !pci_check_and_mask_intx(udev->pdev))
                return IRQ_NONE;

        uio_event_notify(info);

        /* Message signal mode, no share IRQ and automasked */
        return IRQ_HANDLED;
}

Also tested just now with igb_uio. The driver does not need to call
rte_intr_enable(), and it keeps getting interrupts without any issues.

Am I missing something?

-Hyong

> So it is more of making inline with igb_uio kernel driver AND not break
> The existing drivers which was using rte_intr_enable in ISR with
> MSIX+IGB_UIO
> 
> I do agree with that for edge trigged interrupt mask may not require from
> kernel.
> But I am not sure why it is added in igb_uio kernel driver. May  be it is just
> legacy.
> Anyway this wont change schematics, when igb_uio kenrel fixed then the
> counter
> Part can be changed in rte_intr_ack(). Ie. it is transparent to drivers.
> 
> >
> > > I don't  have very strong opinion unmask vs ack. I prefer to have ack
> > > due the reasons stated above.
> > > If you really have strong opinion on using unmask, we will stick with
> > > that to make forward progress.
> > > Let us know.
> > >
> >
> > I have no strong opinion either.
> 
> OK. Lets stick with rte_intr_ack().
> 
> >
> > Thanks..
> > -Hyong
  
Jerin Jacob Kollanukkaran July 17, 2019, 11:16 a.m. UTC | #10
> -----Original Message-----
> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
> Sent: Wednesday, July 17, 2019 4:36 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>
> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> <shshaikh@marvell.com>; dev@dpdk.org
> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
> 
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> > Sent: Wednesday, July 17, 2019 7:44 PM
> > To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
> > Dabilpuram <ndabilpuram@marvell.com>; David Marchand
> > <david.marchand@redhat.com>; Thomas Monjalon
> <thomas@monjalon.net>;
> > Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
> > <bruce.richardson@intel.com>
> > Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
> > <shshaikh@marvell.com>; dev@dpdk.org
> > Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
> > APIs
> >
> > > > I think, it vary from the perspective of IRQ Chip(or controller)
> > > > vs NIC
> > > > register(Source) PoV.
> > > > Since the API starts from rte_intr_* it is more of controller so
> > > > _ack_ make sense Other reason for ack:
> > > > 1) It will enforce that it needs to be called form ISR
> > > > 2) It would be have been really correct to unmask if
> > > > VFIO+MSIx+Linux supports it
> > > > 3) if it is ack, no need to add unmask counterpart, the _mask_ API
> > > >
> > >
> > > Just curious, what you mean by irq controller? Ack/mask/unmask PIOs
> > > all
> > go
> >
> > Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
> > The drivers in linux/drivers/irqchip/
> >
> > > to the NIC. It is the NIC that asserts/de-asserts irq..
> > >
> > > > >
> > > > > Besides the name, are we agreeing that we want these?
> > > > > - Unmask if INTx
> > > >
> > > > Yes
> > > >
> > > > > - Nothing if MSI/MSI-X
> > > > Yes for MSI over VFIO
> > > > No for MSI over UIO/igb_uio
> > > >
> > >
> > > I guess I was not clear. For MSI/MSI-X, we do not want to do
> > > mask/unmask regardless of vfio-pci/igb_uio.  Below is my comment
> > > about linux/windows/freebsd from an earlier email. Do you disagree?
> > > I am sure there are plenty of kernel NIC driver guys here. Please
> > > correct me if I am mistaken...
> >
> >
> > For some reason, igb_uio kernel driver mask the interrupt for MSIx.
> > We need to ack or unmask if needs to work with MSIX + IGB_UIO.
> >
> > See
> > pci_uio_alloc_resource()
> >         if (dev->kdrv == RTE_KDRV_IGB_UIO)
> >                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
> >         else {
> >                 dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
> >
> > igbuio_pci_irqcontrol() for masking in kernel.
> >
> 
> igb_uio does not auto-mask MSI/MSI-X.

I have not tested igbuio as we don't specific NIC + IGB_UIO platform.

The observation based on following code. see code under HAVE_PCI_MSI_MASK_IRQ

static int
igbuio_pci_irqcontrol(struct uio_info *info, s32 irq_state)
{
        struct rte_uio_pci_dev *udev = info->priv;
        struct pci_dev *pdev = udev->pdev;

#ifdef HAVE_PCI_MSI_MASK_IRQ
        struct irq_data *irq = irq_get_irq_data(udev->info.irq);
#endif

        pci_cfg_access_lock(pdev);

        if (udev->mode == RTE_INTR_MODE_MSIX || udev->mode == RTE_INTR_MODE_MSI) {
#ifdef HAVE_PCI_MSI_MASK_IRQ
                if (irq_state == 1)
                        pci_msi_unmask_irq(irq);
                else
                        pci_msi_mask_irq(irq);
#else
                igbuio_mask_irq(pdev, udev->mode, irq_state);
#endif
        }

        if (udev->mode == RTE_INTR_MODE_LEGACY)
                pci_intx(pdev, !!irq_state);

        pci_cfg_access_unlock(pdev);

        return 0;
}

> 
> static irqreturn_t
> igbuio_pci_irqhandler(int irq, void *dev_id) {
>         struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
>         struct uio_info *info = &udev->info;
> 
>         /* Legacy mode need to mask in hardware */
>         if (udev->mode == RTE_INTR_MODE_LEGACY &&
>             !pci_check_and_mask_intx(udev->pdev))
>                 return IRQ_NONE;
> 
>         uio_event_notify(info);
> 
>         /* Message signal mode, no share IRQ and automasked */
>         return IRQ_HANDLED;
> }
> 
> Also tested just now with igb_uio. The driver does not need to call
> rte_intr_enable(), and it keeps getting interrupts without any issues.

 If you are sure, we can make MSIX+IGB_UIO as NOP in rte_intr_ack()

> Am I missing something?
> 
> -Hyong
> 
> > So it is more of making inline with igb_uio kernel driver AND not
> > break The existing drivers which was using rte_intr_enable in ISR with
> > MSIX+IGB_UIO
> >
> > I do agree with that for edge trigged interrupt mask may not require
> > from kernel.
> > But I am not sure why it is added in igb_uio kernel driver. May  be it
> > is just legacy.
> > Anyway this wont change schematics, when igb_uio kenrel fixed then the
> > counter Part can be changed in rte_intr_ack(). Ie. it is transparent
> > to drivers.
> >
> > >
> > > > I don't  have very strong opinion unmask vs ack. I prefer to have
> > > > ack due the reasons stated above.
> > > > If you really have strong opinion on using unmask, we will stick
> > > > with that to make forward progress.
> > > > Let us know.
> > > >
> > >
> > > I have no strong opinion either.
> >
> > OK. Lets stick with rte_intr_ack().
> >
> > >
> > > Thanks..
> > > -Hyong
  
Nithin Dabilpuram July 17, 2019, 12:04 p.m. UTC | #11
On 7/17/2019 4:46 PM, Jerin Jacob Kollanukkaran wrote:
>> -----Original Message-----
>> From: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>
>> Sent: Wednesday, July 17, 2019 4:36 PM
>> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Nithin Kumar
>> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
>> <david.marchand@redhat.com>; Thomas Monjalon
>> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Bruce
>> Richardson <bruce.richardson@intel.com>
>> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
>> <shshaikh@marvell.com>; dev@dpdk.org
>> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt APIs
>>
>>> -----Original Message-----
>>> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>>> Sent: Wednesday, July 17, 2019 7:44 PM
>>> To: Hyong Youb Kim (hyonkim) <hyonkim@cisco.com>; Nithin Kumar
>>> Dabilpuram <ndabilpuram@marvell.com>; David Marchand
>>> <david.marchand@redhat.com>; Thomas Monjalon
>> <thomas@monjalon.net>;
>>> Ferruh Yigit <ferruh.yigit@intel.com>; Bruce Richardson
>>> <bruce.richardson@intel.com>
>>> Cc: John Daley (johndale) <johndale@cisco.com>; Shahed Shaikh
>>> <shshaikh@marvell.com>; dev@dpdk.org
>>> Subject: RE: [RFC PATCH v3 2/3] eal: add mask and unmask interrupt
>>> APIs
>>>
>>>>> I think, it vary from the perspective of IRQ Chip(or controller)
>>>>> vs NIC
>>>>> register(Source) PoV.
>>>>> Since the API starts from rte_intr_* it is more of controller so
>>>>> _ack_ make sense Other reason for ack:
>>>>> 1) It will enforce that it needs to be called form ISR
>>>>> 2) It would be have been really correct to unmask if
>>>>> VFIO+MSIx+Linux supports it
>>>>> 3) if it is ack, no need to add unmask counterpart, the _mask_ API
>>>>>
>>>> Just curious, what you mean by irq controller? Ack/mask/unmask PIOs
>>>> all
>>> go
>>>
>>> Programmable Interrupt Controller. Like Intel 8259A, GIC from ARM etc
>>> The drivers in linux/drivers/irqchip/
>>>
>>>> to the NIC. It is the NIC that asserts/de-asserts irq..
>>>>
>>>>>> Besides the name, are we agreeing that we want these?
>>>>>> - Unmask if INTx
>>>>> Yes
>>>>>
>>>>>> - Nothing if MSI/MSI-X
>>>>> Yes for MSI over VFIO
>>>>> No for MSI over UIO/igb_uio
>>>>>
>>>> I guess I was not clear. For MSI/MSI-X, we do not want to do
>>>> mask/unmask regardless of vfio-pci/igb_uio.  Below is my comment
>>>> about linux/windows/freebsd from an earlier email. Do you disagree?
>>>> I am sure there are plenty of kernel NIC driver guys here. Please
>>>> correct me if I am mistaken...
>>>
>>> For some reason, igb_uio kernel driver mask the interrupt for MSIx.
>>> We need to ack or unmask if needs to work with MSIX + IGB_UIO.
>>>
>>> See
>>> pci_uio_alloc_resource()
>>>          if (dev->kdrv == RTE_KDRV_IGB_UIO)
>>>                  dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
>>>          else {
>>>                  dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
>>>
>>> igbuio_pci_irqcontrol() for masking in kernel.
>>>
>> igb_uio does not auto-mask MSI/MSI-X.
> I have not tested igbuio as we don't specific NIC + IGB_UIO platform.
>
> The observation based on following code. see code under HAVE_PCI_MSI_MASK_IRQ
>
> static int
> igbuio_pci_irqcontrol(struct uio_info *info, s32 irq_state)
> {
>          struct rte_uio_pci_dev *udev = info->priv;
>          struct pci_dev *pdev = udev->pdev;
>
> #ifdef HAVE_PCI_MSI_MASK_IRQ
>          struct irq_data *irq = irq_get_irq_data(udev->info.irq);
> #endif
>
>          pci_cfg_access_lock(pdev);
>
>          if (udev->mode == RTE_INTR_MODE_MSIX || udev->mode == RTE_INTR_MODE_MSI) {
> #ifdef HAVE_PCI_MSI_MASK_IRQ
>                  if (irq_state == 1)
>                          pci_msi_unmask_irq(irq);
>                  else
>                          pci_msi_mask_irq(irq);
> #else
>                  igbuio_mask_irq(pdev, udev->mode, irq_state);
> #endif
>          }
>
>          if (udev->mode == RTE_INTR_MODE_LEGACY)
>                  pci_intx(pdev, !!irq_state);
>
>          pci_cfg_access_unlock(pdev);
>
>          return 0;
> }
>
>> static irqreturn_t
>> igbuio_pci_irqhandler(int irq, void *dev_id) {
>>          struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
>>          struct uio_info *info = &udev->info;
>>
>>          /* Legacy mode need to mask in hardware */
>>          if (udev->mode == RTE_INTR_MODE_LEGACY &&
>>              !pci_check_and_mask_intx(udev->pdev))
>>                  return IRQ_NONE;
>>
>>          uio_event_notify(info);
>>
>>          /* Message signal mode, no share IRQ and automasked */
>>          return IRQ_HANDLED;
>> }
>>
>> Also tested just now with igb_uio. The driver does not need to call
>> rte_intr_enable(), and it keeps getting interrupts without any issues.
>   If you are sure, we can make MSIX+IGB_UIO as NOP in rte_intr_ack()

Ok. Another problem is that we might not be able to distinguish in case 
of IGB_UIO
at rte_intr_ack() level if underlying interrupt is a INTx or MSIx. See 
igbuio_pci_enable_interrupts() that
finds and stores that mode in uio->mode.

So we think leaving the behavior as earlier is needed and simpler as it 
meets the current expectation.

>
>> Am I missing something?
>>
>> -Hyong
>>
>>> So it is more of making inline with igb_uio kernel driver AND not
>>> break The existing drivers which was using rte_intr_enable in ISR with
>>> MSIX+IGB_UIO
>>>
>>> I do agree with that for edge trigged interrupt mask may not require
>>> from kernel.
>>> But I am not sure why it is added in igb_uio kernel driver. May  be it
>>> is just legacy.
>>> Anyway this wont change schematics, when igb_uio kenrel fixed then the
>>> counter Part can be changed in rte_intr_ack(). Ie. it is transparent
>>> to drivers.
>>>
>>>>> I don't  have very strong opinion unmask vs ack. I prefer to have
>>>>> ack due the reasons stated above.
>>>>> If you really have strong opinion on using unmask, we will stick
>>>>> with that to make forward progress.
>>>>> Let us know.
>>>>>
>>>> I have no strong opinion either.
>>> OK. Lets stick with rte_intr_ack().
>>>
>>>> Thanks..
>>>> -Hyong
  

Patch

diff --git a/lib/librte_eal/common/include/rte_interrupts.h b/lib/librte_eal/common/include/rte_interrupts.h
index c1e912c..b0675be 100644
--- a/lib/librte_eal/common/include/rte_interrupts.h
+++ b/lib/librte_eal/common/include/rte_interrupts.h
@@ -118,6 +118,29 @@  int rte_intr_enable(const struct rte_intr_handle *intr_handle);
  */
 int rte_intr_disable(const struct rte_intr_handle *intr_handle);
 
+/**
+ * It masks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_mask(const struct rte_intr_handle *intr_handle);
+
+/**
+ * It unmasks the interrupt for the specified handle.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_intr_unmask(const struct rte_intr_handle *intr_handle);
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/freebsd/eal/eal_interrupts.c b/lib/librte_eal/freebsd/eal/eal_interrupts.c
index 10375bd..c2f487a 100644
--- a/lib/librte_eal/freebsd/eal/eal_interrupts.c
+++ b/lib/librte_eal/freebsd/eal/eal_interrupts.c
@@ -387,6 +387,60 @@  rte_intr_disable(const struct rte_intr_handle *intr_handle)
 	return 0;
 }
 
+int
+rte_intr_unmask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_intr_mask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
 static void
 eal_intr_process_interrupts(struct kevent *events, int nfds)
 {
diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c b/lib/librte_eal/linux/eal/eal_interrupts.c
index 79ad5e8..f619981 100644
--- a/lib/librte_eal/linux/eal/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal/eal_interrupts.c
@@ -197,6 +197,63 @@  vfio_disable_intx(const struct rte_intr_handle *intr_handle) {
 	return 0;
 }
 
+/* unmask legacy (INTx) interrupts */
+static int
+vfio_unmask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* unmask INTx */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	memset(irq_set, 0, len);
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error unmasking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+	return 0;
+}
+
+/* mask legacy (INTx) interrupts */
+static int
+vfio_mask_intx(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[IRQ_SET_BUF_LEN];
+	int len, ret;
+
+	len = sizeof(struct vfio_irq_set);
+
+	/* mask interrupts */
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 1;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK;
+	irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
+	irq_set->start = 0;
+
+	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Error masking INTx interrupts for fd %d\n",
+			intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
 /* enable MSI interrupts */
 static int
 vfio_enable_msi(const struct rte_intr_handle *intr_handle) {
@@ -694,6 +751,104 @@  rte_intr_enable(const struct rte_intr_handle *intr_handle)
 }
 
 int
+rte_intr_mask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both masking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_disable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_disable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return 0;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_mask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_intr_unmask(const struct rte_intr_handle *intr_handle)
+{
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type){
+	/* Both unmasking and disabling are same for UIO */
+	case RTE_INTR_HANDLE_UIO:
+		if (uio_intr_enable(intr_handle))
+			return -1;
+		break;
+	case RTE_INTR_HANDLE_UIO_INTX:
+		if (uio_intx_intr_enable(intr_handle))
+			return -1;
+		break;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+		return 0;
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		if (vfio_unmask_intx(intr_handle))
+			return -1;
+		break;
+#ifdef HAVE_VFIO_DEV_REQ_INTERFACE
+	case RTE_INTR_HANDLE_VFIO_REQ:
+		return -1;
+#endif
+#endif
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
 rte_intr_disable(const struct rte_intr_handle *intr_handle)
 {
 	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 1892d9e..b3b2df4 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -309,6 +309,8 @@  DPDK_19.08 {
 	rte_service_lcore_attr_reset_all;
 	rte_service_may_be_active;
 	rte_srand;
+	rte_intr_unmask;
+	rte_intr_mask;
 
 } DPDK_19.05;