[dpdk-dev,v2,2/2] i40evf: support interrupt based pf reset request

Message ID 1453859378-23912-3-git-send-email-jingjing.wu@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Bruce Richardson
Commit Message

Jingjing Wu Jan. 27, 2016, 1:49 a.m. UTC
Support interrupt-based PF reset requests by enabling adminq event
processing in the VF driver.
Users can register a callback for this interrupt event to be notified
when a PF reset request is detected, for example:
  rte_eth_dev_callback_register(portid,
		RTE_ETH_EVENT_INTR_RESET,
		reset_event_callback,
		arg);

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
---
 doc/guides/rel_notes/release_2_3.rst |   1 +
 drivers/net/i40e/i40e_ethdev_vf.c    | 274 +++++++++++++++++++++++++++++++----
 lib/librte_ether/rte_ethdev.h        |   1 +
 3 files changed, 246 insertions(+), 30 deletions(-)
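
The registration call in the commit message can be pictured with a small standalone model. This is only a sketch under stated assumptions: every `demo_*` name and `DEMO_*` constant below is a hypothetical stand-in written for illustration, not the ethdev API, and the callback deliberately does nothing except record the request so that the application can recover from its own context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ethdev event types touched by this patch. */
enum demo_event_type {
	DEMO_EVENT_UNKNOWN,
	DEMO_EVENT_INTR_LSC,
	DEMO_EVENT_INTR_RESET,
	DEMO_EVENT_MAX
};

typedef void (*demo_event_cb)(unsigned int port_id,
			      enum demo_event_type event, void *arg);

struct demo_cb_slot {
	demo_event_cb fn;
	void *arg;
};

/* One callback per event type keeps the sketch small. */
static struct demo_cb_slot demo_slots[DEMO_EVENT_MAX];

static int
demo_callback_register(enum demo_event_type event, demo_event_cb fn, void *arg)
{
	if (event == DEMO_EVENT_UNKNOWN || event >= DEMO_EVENT_MAX || fn == NULL)
		return -1;
	demo_slots[event].fn = fn;
	demo_slots[event].arg = arg;
	return 0;
}

static void
demo_callback_unregister(enum demo_event_type event)
{
	if (event < DEMO_EVENT_MAX) {
		demo_slots[event].fn = NULL;
		demo_slots[event].arg = NULL;
	}
}

/* What the driver side would do when the PF announces a reset. */
static void
demo_callback_process(unsigned int port_id, enum demo_event_type event)
{
	if (event >= DEMO_EVENT_MAX || demo_slots[event].fn == NULL)
		return;
	demo_slots[event].fn(port_id, event, demo_slots[event].arg);
}

/* Application callback: only record the request, recover later. */
static void
reset_event_callback(unsigned int port_id, enum demo_event_type event,
		     void *arg)
{
	int *reset_pending = arg;

	(void)port_id;
	if (event == DEMO_EVENT_INTR_RESET)
		*reset_pending = 1;
}

/* Register, let the "driver" raise the event once, return the flag. */
static int
demo_simulate_pf_reset(void)
{
	int reset_pending = 0;
	int ret;

	ret = demo_callback_register(DEMO_EVENT_INTR_RESET,
				     reset_event_callback, &reset_pending);
	assert(ret == 0);
	demo_callback_process(0, DEMO_EVENT_INTR_RESET);
	demo_callback_unregister(DEMO_EVENT_INTR_RESET);
	return reset_pending;
}
```

Calling `demo_simulate_pf_reset()` returns 1 once the simulated event has been delivered. What the application does after seeing the flag (stop, reconfigure, restart the port) is outside this patch and is the subject of the discussion below.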
  

Comments

David Marchand Jan. 27, 2016, 8:34 a.m. UTC | #1
Hello Jingjing,

On Wed, Jan 27, 2016 at 2:49 AM, Jingjing Wu <jingjing.wu@intel.com> wrote:
> Interrupt based request of PF reset from PF is supported by
> enabling the adminq event process in VF driver.
> Users can register a callback for this interrupt event to get
> informed, when a PF reset request detected like:
>   rte_eth_dev_callback_register(portid,
>                 RTE_ETH_EVENT_INTR_RESET,
>                 reset_event_callback,
>                 arg);
>
> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>

Just adding my previous comment in this thread.

Having this infrastructure is one thing, but the initial problem was
that the driver did not recover from this reset event.
The linux i40e vf driver handles this kind of event itself.
Could we have something similar ?

Thanks.
  
Zhe Tao Jan. 28, 2016, 7:03 a.m. UTC | #2
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jingjing Wu
> Sent: Wednesday, January 27, 2016 9:50 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 2/2] i40evf: support interrupt based pf reset
> request
> 
> Interrupt based request of PF reset from PF is supported by enabling the
> adminq event process in VF driver.
> Users can register a callback for this interrupt event to get informed, when a
> PF reset request detected like:
>   rte_eth_dev_callback_register(portid,
> 		RTE_ETH_EVENT_INTR_RESET,
> 		reset_event_callback,
> 		arg);
> 
> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
> ---
>  doc/guides/rel_notes/release_2_3.rst |   1 +
>  drivers/net/i40e/i40e_ethdev_vf.c    | 274
> +++++++++++++++++++++++++++++++----
>  lib/librte_ether/rte_ethdev.h        |   1 +
>  3 files changed, 246 insertions(+), 30 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_2_3.rst
> b/doc/guides/rel_notes/release_2_3.rst
> index 99de186..73d5f76 100644
> --- a/doc/guides/rel_notes/release_2_3.rst
> +++ b/doc/guides/rel_notes/release_2_3.rst
> @@ -4,6 +4,7 @@ DPDK Release 2.3
>  New Features
>  ------------
> 
> +* **Added pf reset event reported in i40e vf PMD driver.
> 
>  Resolved Issues
>  ---------------
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c
> b/drivers/net/i40e/i40e_ethdev_vf.c
> index 64e6957..1ffe64e 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -74,8 +74,6 @@
> +static void
> @@ -1662,7 +1869,8 @@ i40evf_enable_queues_intr(struct rte_eth_dev
> *dev)
>  		I40E_WRITE_REG(hw,
>  			       I40E_VFINT_DYN_CTL01,
>  			       I40E_VFINT_DYN_CTL01_INTENA_MASK |
> -			       I40E_VFINT_DYN_CTL01_CLEARPBA_MASK);
> +			       I40E_VFINT_DYN_CTL01_CLEARPBA_MASK |
> +			       I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
What are the ITR bits used for here?
>  		I40EVF_WRITE_FLUSH(hw);
>  		return;
>  	}
> @@ -1673,11 +1881,10 @@ i40evf_enable_queues_intr(struct rte_eth_dev
> *dev)
> 
> 	I40E_VFINT_DYN_CTLN1(I40EVF_VSI_DEFAULT_MSIX_INTR - 1),
>  			I40E_VFINT_DYN_CTLN1_INTENA_MASK |
>  			I40E_VFINT_DYN_CTLN_CLEARPBA_MASK);
> -	else
> -		/* To support Linux PF host */
> -		I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
> 
> --
> 2.4.0
  
Zhe Tao Jan. 29, 2016, 8:50 a.m. UTC | #3
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jingjing Wu
> Sent: Wednesday, January 27, 2016 9:50 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 2/2] i40evf: support interrupt based pf reset
> request
> 
> Interrupt based request of PF reset from PF is supported by enabling the
> adminq event process in VF driver.
> Users can register a callback for this interrupt event to get informed, when a
> PF reset request detected like:
>   rte_eth_dev_callback_register(portid,
> 		RTE_ETH_EVENT_INTR_RESET,
> 		reset_event_callback,
> 		arg);
> 
> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Two questions:
1. If the VF RX/TX uses msix 1 and the Admin Queue uses msix 0,
how can both interrupts be read in user space when VFIO is not supported in the VM?

2. If we want to run l3fwd-power in the VF, we can only assign the rx/tx intr to msix0,
but then both threads will use epoll to wait for the msix0 event.
The intr thread may miss the VF reset event if the l3fwd-power thread clears the msix0-related fd first.
If no more packets or msg responses come in, the intr thread will not be woken up again.
  
Jingjing Wu Feb. 14, 2016, 2:12 a.m. UTC | #4
> -----Original Message-----
> From: Tao, Zhe
> Sent: Thursday, January 28, 2016 3:03 PM
> To: Wu, Jingjing
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 2/2] i40evf: support interrupt based pf
> reset request
> 
> > @@ -74,8 +74,6 @@
> > +static void
> > @@ -1662,7 +1869,8 @@ (struct rte_eth_dev
> > *dev)
> >  		I40E_WRITE_REG(hw,
> >  			       I40E_VFINT_DYN_CTL01,
> >  			       I40E_VFINT_DYN_CTL01_INTENA_MASK |
> > -			       I40E_VFINT_DYN_CTL01_CLEARPBA_MASK);
> > +			       I40E_VFINT_DYN_CTL01_CLEARPBA_MASK |
> > +			       I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
> What the usage for ITR bits here?
According to the access type of register I40E_VFINT_DYN_CTL01, setting
ITR_INDX_MASK here means the ITR index is not updated.
> >  		I40EVF_WRITE_FLUSH(hw);
> >  		return;
> >  	}
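
To make the answer above concrete, here is a small sketch of how the written value decomposes. The bit positions are an assumption (a 1-bit enable at bit 0, a 1-bit clear-PBA at bit 1, a 2-bit ITR index field at bits 3..4) and should be checked against `i40e_register.h`; the `DEMO_*` names are hypothetical. The only value taken from this patch is 3, the constant of the removed `I40E_QINT_RQCTL_MSIX_INDX_NOITR` macro.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of I40E_VFINT_DYN_CTL01; verify against i40e_register.h. */
#define DEMO_INTENA_MASK    (UINT32_C(0x1) << 0)
#define DEMO_CLEARPBA_MASK  (UINT32_C(0x1) << 1)
#define DEMO_ITR_INDX_SHIFT 3
#define DEMO_ITR_INDX_MASK  (UINT32_C(0x3) << DEMO_ITR_INDX_SHIFT)

/* Value of the I40E_QINT_RQCTL_MSIX_INDX_NOITR macro removed by the patch. */
#define DEMO_ITR_INDX_NOITR 3

/* Extract the ITR index field from a register value. */
static uint32_t
demo_itr_index(uint32_t reg)
{
	return (reg & DEMO_ITR_INDX_MASK) >> DEMO_ITR_INDX_SHIFT;
}

/* Value written before the patch: enable + clear PBA, ITR index 0. */
static uint32_t
demo_old_enable_value(void)
{
	return DEMO_INTENA_MASK | DEMO_CLEARPBA_MASK;
}

/* Value written after the patch: all ITR index bits set as well. */
static uint32_t
demo_new_enable_value(void)
{
	uint32_t val = DEMO_INTENA_MASK | DEMO_CLEARPBA_MASK |
		       DEMO_ITR_INDX_MASK;

	/* Setting every bit of a 2-bit field selects index 3 (NOITR). */
	assert(demo_itr_index(val) == DEMO_ITR_INDX_NOITR);
	return val;
}
```

Under these assumptions the old write selects ITR index 0, while OR-ing in the full mask selects index 3, so enabling the interrupt no longer touches a real ITR setting.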
  
Jingjing Wu Feb. 14, 2016, 3:04 a.m. UTC | #5
> -----Original Message-----
> From: Tao, Zhe
> Sent: Friday, January 29, 2016 4:51 PM
> To: Wu, Jingjing; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 2/2] i40evf: support interrupt based pf
> reset request
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jingjing Wu
> > Sent: Wednesday, January 27, 2016 9:50 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v2 2/2] i40evf: support interrupt based pf
> > reset request
> >
> > Interrupt based request of PF reset from PF is supported by enabling
> > the adminq event process in VF driver.
> > Users can register a callback for this interrupt event to get
> > informed, when a PF reset request detected like:
> >   rte_eth_dev_callback_register(portid,
> > 		RTE_ETH_EVENT_INTR_RESET,
> > 		reset_event_callback,
> > 		arg);
> >
> > Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
> Two questions,
> 1.If the VF RX/TX using msix 1 and Admin Queue using msix 0, how the two
> interrupts can both be read in user spaces, in VM VFIO not supported,
> 
If the i40e kernel driver is used as the host driver, only msix 0 is reserved
for the DPDK i40e VF, whether or not VFIO is supported in the VM.
> 2.But if we want to  run l3fwd-power in VF, we can only assign the rx/tx intr
> to msix0, but both thread will using epoll to wait the msix0 event, and intr
> thread may miss the vf reset event if l3fwd-power thread clean the msix0
> related fd firstly  if there are no more packets and msg response come in, the
> intr thread will not be wake up again
Yes, you are right. As in the answer above, if the i40e kernel driver is used as
the host driver, then the RX interrupt is not supported.

Thanks
Jingjing
  
Jingjing Wu Feb. 14, 2016, 3:25 a.m. UTC | #6
> -----Original Message-----
> From: David Marchand [mailto:david.marchand@6wind.com]
> Sent: Wednesday, January 27, 2016 4:34 PM
> To: Wu, Jingjing
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 2/2] i40evf: support interrupt based pf
> reset request
> 
> Hello Jingjing,
> 
> On Wed, Jan 27, 2016 at 2:49 AM, Jingjing Wu <jingjing.wu@intel.com> wrote:
> > Interrupt based request of PF reset from PF is supported by enabling
> > the adminq event process in VF driver.
> > Users can register a callback for this interrupt event to get
> > informed, when a PF reset request detected like:
> >   rte_eth_dev_callback_register(portid,
> >                 RTE_ETH_EVENT_INTR_RESET,
> >                 reset_event_callback,
> >                 arg);
> >
> > Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
> 
> Just adding my previous comment in this thread.
> 
> Having this infrastructure is one thing, but the initial problem was that the
> driver did not recover from this reset event.
> The linux i40e vf driver handles this kind of event itself.
> Could we have something similar ?
> 

Hi, David

Considering how a DPDK PMD is used and how its resources are set up, many
resources are managed by the application. I think that, with the current
PMD framework, the driver cannot reset without the application's help.
If we need to support automatic driver recovery, we need to find a way to do that.
Do you have any ideas?

However, this patch can notify the application of the reset event, even if it is
not the perfect solution you hoped for.

Thanks
Jingjing
 
> Thanks.
> 
> --
> David Marchand
  
David Marchand Feb. 15, 2016, 1:16 p.m. UTC | #7
Hello,

On Sun, Feb 14, 2016 at 4:25 AM, Wu, Jingjing <jingjing.wu@intel.com> wrote:
>> -----Original Message-----
>> From: David Marchand [mailto:david.marchand@6wind.com]
>> Having this infrastructure is one thing, but the initial problem was that the
>> driver did not recover from this reset event.
>> The linux i40e vf driver handles this kind of event itself.
>> Could we have something similar ?
>>
>
> Considering about the how to use DPDK PMD, and how to setup resource, we can
> know that lots of resources are managed by application. I think based on current
> PMD driver framework, driver cannot reset without application's help.

I reported an issue on ixgbe.
What you provide here is a workaround for i40e.
I am not even sure this can be applied to ixgbe.

Does it mean that anytime we have a problem with drivers, workarounds
should be applied to ethdev / eal ... so that you don't have to handle
anything in the drivers ?
This is not the first time I complain about this kind of design issues.


> If we need to support driver recovery automatically, we'd better to find a way to do that.
> Do you have any idea?

First, list those "lots of resources" that "are managed by application".
If your driver needs to keep track of those, this is i40e driver job
to do this internally without requiring ethdev to be modified.

If this proves to be generic enough, maybe moving part of this to
ethdev will then make sense.


Thanks.
  
Zhe Tao Feb. 18, 2016, 4:06 a.m. UTC | #8
On Mon, Feb 15, 2016 at 02:16:16PM +0100, David Marchand wrote:
> Hello,
> 
> On Sun, Feb 14, 2016 at 4:25 AM, Wu, Jingjing <jingjing.wu@intel.com> wrote:
> >> -----Original Message-----
> >> From: David Marchand [mailto:david.marchand@6wind.com]
> >> Having this infrastructure is one thing, but the initial problem was that the
> >> driver did not recover from this reset event.
> >> The linux i40e vf driver handles this kind of event itself.
> >> Could we have something similar ?
> >>
> >
> > Considering about the how to use DPDK PMD, and how to setup resource, we can
> > know that lots of resources are managed by application. I think based on current
> > PMD driver framework, driver cannot reset without application's help.
> 
> I reported an issue on ixgbe.
> What you provide here is a workaround for i40e.
> I am not even sure this can be applied to ixgbe.
> 
> Does it mean that anytime we have a problem with drivers, workarounds
> should be applied to ethdev / eal ... so that you don't have to handle
> anything in the drivers ?
I think this patch provides a necessary framework for the i40e VF to handle
asynchronous events. Maybe Jingjing can change the description of this patch
so that it is not limited to the VF reset event.
> This is not the first time I complain about this kind of design issues.
> 
> 
> > If we need to support driver recovery automatically, we'd better to find a way to do that.
> > Do you have any idea?
> 
> First, list those "lots of resources" that "are managed by application".
> If your driver needs to keep track of those, this is i40e driver job
> to do this internally without requiring ethdev to be modified.
> 
> If this proves to be generic enough, maybe moving part of this to
> ethdev will then make sense.
> 
> 
> Thanks.
> 
> -- 
> David Marchand
Thanks,

Zhe Tao
  
Jingjing Wu Feb. 19, 2016, 5:51 a.m. UTC | #9
> I reported an issue on ixgbe.

Yes, thanks, we have also noticed this issue on ixgbe.

> What you provide here is a workaround for i40e.
> I am not even sure this can be applied to ixgbe.
>

Yes, but it is not just a workaround, it is also a basic requirement: without this
patch, the DPDK VF does not even know that a PF reset happened. I think ixgbe also
needs to know that.

> Does it mean that anytime we have a problem with drivers, workarounds
> should be applied to ethdev / eal ... so that you don't have to handle
> anything in the drivers ?


As I understand it, a DPDK PMD is currently part of the DPDK library.
Even driver loading happens in a thread created by the application. From this
point of view, there is no task managed internally by the driver. In fact, we also
hope the reset process can be done automatically, or at least to provide a simple
API that helps applications recover easily. Maybe the latter fits the
current DPDK framework. Otherwise, would we need a thread for each driver?

Back to this patch: it just lets the i40e VF PMD receive the PF reset
interrupt. It does not change ethdev/eal.
I don't think you have any objection to that, right?

As for how to process the reset event, can we start another thread to discuss it?

> This is not the first time I complain about this kind of design issues.
> 
> 
> > If we need to support driver recovery automatically, we'd better to find a way to do that.
> > Do you have any idea?
> 
> First, list those "lots of resources" that "are managed by application".
> If your driver needs to keep track of those, this is i40e driver job
> to do this internally without requiring ethdev to be modified.
>

I agree about listing the resources. But regarding "internally", can you share your idea?
As you know, a PMD does not even have an internal thread.

> If this proves to be generic enough, maybe moving part of this to
> ethdev will then make sense.
>

We can discuss this. I think most NICs may have this issue, and we need to reach agreement on it.

Thanks
Jingjing
> 
> Thanks.
> 
> --
> David Marchand
  
Zhang, Helin Feb. 22, 2016, 8:26 a.m. UTC | #10
> -----Original Message-----
> From: Wu, Jingjing
> Sent: Wednesday, January 27, 2016 9:50 AM
> To: dev@dpdk.org
> Cc: Wu, Jingjing; Zhang, Helin; Lu, Wenzhuo; Pei, Yulong
> Subject: [PATCH v2 2/2] i40evf: support interrupt based pf reset request
> 
> Interrupt based request of PF reset from PF is supported by enabling the
> adminq event process in VF driver.
> Users can register a callback for this interrupt event to get informed, when a
> PF reset request detected like:
>   rte_eth_dev_callback_register(portid,
> 		RTE_ETH_EVENT_INTR_RESET,
> 		reset_event_callback,
> 		arg);
> 
> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
> ---
>  doc/guides/rel_notes/release_2_3.rst |   1 +
>  drivers/net/i40e/i40e_ethdev_vf.c    | 274
> +++++++++++++++++++++++++++++++----
>  lib/librte_ether/rte_ethdev.h        |   1 +
>  3 files changed, 246 insertions(+), 30 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_2_3.rst
> b/doc/guides/rel_notes/release_2_3.rst
> index 99de186..73d5f76 100644
> --- a/doc/guides/rel_notes/release_2_3.rst
> +++ b/doc/guides/rel_notes/release_2_3.rst
> @@ -4,6 +4,7 @@ DPDK Release 2.3
>  New Features
>  ------------
> 
> +* **Added pf reset event reported in i40e vf PMD driver.
> 
>  Resolved Issues
>  ---------------
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c
> b/drivers/net/i40e/i40e_ethdev_vf.c
> index 64e6957..1ffe64e 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -74,8 +74,6 @@
>  #define I40EVF_BUSY_WAIT_DELAY 10
>  #define I40EVF_BUSY_WAIT_COUNT 50
>  #define MAX_RESET_WAIT_CNT     20
> -/*ITR index for NOITR*/
> -#define I40E_QINT_RQCTL_MSIX_INDX_NOITR     3
> 
>  struct i40evf_arq_msg_info {
>  	enum i40e_virtchnl_ops ops;
> @@ -151,6 +149,9 @@ static int
>  i40evf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t
> queue_id);  static int  i40evf_dev_rx_queue_intr_disable(struct rte_eth_dev
> *dev, uint16_t queue_id);
> +static void i40evf_handle_pf_event(__rte_unused struct rte_eth_dev *dev,
> +			   uint8_t *msg,
> +			   uint16_t msglen);
> 
>  /* Default hash key buffer for RSS */
>  static uint32_t rss_key_default[I40E_VFQF_HKEY_MAX_INDEX + 1]; @@ -
> 357,20 +358,42 @@ i40evf_execute_vf_cmd(struct rte_eth_dev *dev, struct
> vf_cmd_info *args)
>  		return err;
>  	}
> 
> -	do {
> -		/* Delay some time first */
> -		rte_delay_ms(ASQ_DELAY_MS);
> -		ret = i40evf_read_pfmsg(dev, &info);
> -		if (ret == I40EVF_MSG_CMD) {
> -			err = 0;
> -			break;
> -		} else if (ret == I40EVF_MSG_ERR) {
> -			err = -1;
> -			break;
> -		}
> -		/* If don't read msg or read sys event, continue */
> -	} while (i++ < MAX_TRY_TIMES);
> -	_clear_cmd(vf);
> +	switch (args->ops) {
> +	case I40E_VIRTCHNL_OP_RESET_VF:
> +		/*no need to process in this function */
> +		break;
> +	case I40E_VIRTCHNL_OP_VERSION:
> +	case I40E_VIRTCHNL_OP_GET_VF_RESOURCES:
> +		/* for init adminq commands, need to poll the response */
> +		do {
> +			/* Delay some time first */
> +			rte_delay_ms(ASQ_DELAY_MS);
> +			ret = i40evf_read_pfmsg(dev, &info);
> +			if (ret == I40EVF_MSG_CMD) {
> +				err = 0;
> +				break;
> +			} else if (ret == I40EVF_MSG_ERR) {
> +				err = -1;
> +				break;
> +			}
> +			/* If don't read msg or read sys event, continue */
> +		} while (i++ < MAX_TRY_TIMES);
> +		_clear_cmd(vf);
> +		break;
> +
> +	default:
> +		/* for other adminq in running time, waiting the cmd done
> flag */
> +		do {
> +			/* Delay some time first */
> +			rte_delay_ms(ASQ_DELAY_MS);
> +			if (vf->pend_cmd ==
> I40E_VIRTCHNL_OP_UNKNOWN) {
> +				err = 0;
> +				break;
> +			}
> +			/* If don't read msg or read sys event, continue */
> +		} while (i++ < MAX_TRY_TIMES);
> +		break;
> +	}
> 
>  	return (err | vf->cmd_retval);
>  }
> @@ -719,7 +742,7 @@ i40evf_config_irq_map(struct rte_eth_dev *dev)
> 
>  	map_info = (struct i40e_virtchnl_irq_map_info *)cmd_buffer;
>  	map_info->num_vectors = 1;
> -	map_info->vecmap[0].rxitr_idx =
> I40E_QINT_RQCTL_MSIX_INDX_NOITR;
> +	map_info->vecmap[0].rxitr_idx = I40E_ITR_INDEX_DEFAULT;
>  	map_info->vecmap[0].vsi_id = vf->vsi_res->vsi_id;
>  	/* Alway use default dynamic MSIX interrupt */
>  	map_info->vecmap[0].vector_id = vector_id; @@ -1093,6 +1116,38
> @@ i40evf_dev_atomic_write_link_status(struct rte_eth_dev *dev,
>  	return 0;
>  }
> 
> +/* Disable IRQ0 */
> +static inline void
> +i40evf_disable_irq0(struct i40e_hw *hw) {
> +	/* Disable all interrupt types */
> +	I40E_WRITE_REG(hw, I40E_VFINT_ICR0_ENA1, 0);
> +	I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
> +		       I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
> +	I40EVF_WRITE_FLUSH(hw);
> +}
> +
> +/* Enable IRQ0 */
> +static inline void
> +i40evf_enable_irq0(struct i40e_hw *hw)
> +{
> +	/* Enable admin queue interrupt trigger */
> +	uint32_t val;
> +
> +	i40evf_disable_irq0(hw);
> +	val = I40E_READ_REG(hw, I40E_VFINT_ICR0_ENA1);
> +	val |= I40E_VFINT_ICR0_ENA1_ADMINQ_MASK |
> +		I40E_VFINT_ICR0_ENA1_LINK_STAT_CHANGE_MASK;
> +	I40E_WRITE_REG(hw, I40E_VFINT_ICR0_ENA1, val);
> +
> +	I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
> +		I40E_VFINT_DYN_CTL01_INTENA_MASK |
> +		I40E_VFINT_DYN_CTL01_CLEARPBA_MASK |
> +		I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
> +
> +	I40EVF_WRITE_FLUSH(hw);
> +}
> +
>  static int
>  i40evf_reset_vf(struct i40e_hw *hw)
>  {
> @@ -1137,6 +1192,8 @@ i40evf_init_vf(struct rte_eth_dev *dev)
>  	int i, err, bufsz;
>  	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data-
> >dev_private);
>  	struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data-
> >dev_private);
> +	uint16_t interval =
> +		i40e_calc_itr_interval(I40E_QUEUE_ITR_INTERVAL_MAX);
> 
>  	vf->adapter = I40E_DEV_PRIVATE_TO_ADAPTER(dev->data-
> >dev_private);
>  	vf->dev_data = dev->data;
> @@ -1218,6 +1275,15 @@ i40evf_init_vf(struct rte_eth_dev *dev)
>  	ether_addr_copy((struct ether_addr *)vf->vsi_res-
> >default_mac_addr,
>  					(struct ether_addr *)hw->mac.addr);
> 
> +	/* If the PF host is not DPDK, set the interval of ITR0 to max*/
> +	if (vf->version_major != I40E_DPDK_VERSION_MAJOR) {
> +		I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
> +			       (I40E_ITR_INDEX_DEFAULT <<
> +				I40E_VFINT_DYN_CTL0_ITR_INDX_SHIFT) |
> +			       (interval <<
> +				I40E_VFINT_DYN_CTL0_INTERVAL_SHIFT));
> +	}
A write flush might be needed here?
  

Patch

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..73d5f76 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,7 @@  DPDK Release 2.3
 New Features
 ------------
 
+* **Added PF reset event reporting in the i40e VF PMD.**
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 64e6957..1ffe64e 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -74,8 +74,6 @@ 
 #define I40EVF_BUSY_WAIT_DELAY 10
 #define I40EVF_BUSY_WAIT_COUNT 50
 #define MAX_RESET_WAIT_CNT     20
-/*ITR index for NOITR*/
-#define I40E_QINT_RQCTL_MSIX_INDX_NOITR     3
 
 struct i40evf_arq_msg_info {
 	enum i40e_virtchnl_ops ops;
@@ -151,6 +149,9 @@  static int
 i40evf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id);
 static int
 i40evf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id);
+static void i40evf_handle_pf_event(__rte_unused struct rte_eth_dev *dev,
+			   uint8_t *msg,
+			   uint16_t msglen);
 
 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_VFQF_HKEY_MAX_INDEX + 1];
@@ -357,20 +358,42 @@  i40evf_execute_vf_cmd(struct rte_eth_dev *dev, struct vf_cmd_info *args)
 		return err;
 	}
 
-	do {
-		/* Delay some time first */
-		rte_delay_ms(ASQ_DELAY_MS);
-		ret = i40evf_read_pfmsg(dev, &info);
-		if (ret == I40EVF_MSG_CMD) {
-			err = 0;
-			break;
-		} else if (ret == I40EVF_MSG_ERR) {
-			err = -1;
-			break;
-		}
-		/* If don't read msg or read sys event, continue */
-	} while (i++ < MAX_TRY_TIMES);
-	_clear_cmd(vf);
+	switch (args->ops) {
+	case I40E_VIRTCHNL_OP_RESET_VF:
+		/*no need to process in this function */
+		break;
+	case I40E_VIRTCHNL_OP_VERSION:
+	case I40E_VIRTCHNL_OP_GET_VF_RESOURCES:
+		/* for init adminq commands, need to poll the response */
+		do {
+			/* Delay some time first */
+			rte_delay_ms(ASQ_DELAY_MS);
+			ret = i40evf_read_pfmsg(dev, &info);
+			if (ret == I40EVF_MSG_CMD) {
+				err = 0;
+				break;
+			} else if (ret == I40EVF_MSG_ERR) {
+				err = -1;
+				break;
+			}
+			/* If don't read msg or read sys event, continue */
+		} while (i++ < MAX_TRY_TIMES);
+		_clear_cmd(vf);
+		break;
+
+	default:
+		/* for other adminq in running time, waiting the cmd done flag */
+		do {
+			/* Delay some time first */
+			rte_delay_ms(ASQ_DELAY_MS);
+			if (vf->pend_cmd == I40E_VIRTCHNL_OP_UNKNOWN) {
+				err = 0;
+				break;
+			}
+			/* If don't read msg or read sys event, continue */
+		} while (i++ < MAX_TRY_TIMES);
+		break;
+	}
 
 	return (err | vf->cmd_retval);
 }
@@ -719,7 +742,7 @@  i40evf_config_irq_map(struct rte_eth_dev *dev)
 
 	map_info = (struct i40e_virtchnl_irq_map_info *)cmd_buffer;
 	map_info->num_vectors = 1;
-	map_info->vecmap[0].rxitr_idx = I40E_QINT_RQCTL_MSIX_INDX_NOITR;
+	map_info->vecmap[0].rxitr_idx = I40E_ITR_INDEX_DEFAULT;
 	map_info->vecmap[0].vsi_id = vf->vsi_res->vsi_id;
 	/* Alway use default dynamic MSIX interrupt */
 	map_info->vecmap[0].vector_id = vector_id;
@@ -1093,6 +1116,38 @@  i40evf_dev_atomic_write_link_status(struct rte_eth_dev *dev,
 	return 0;
 }
 
+/* Disable IRQ0 */
+static inline void
+i40evf_disable_irq0(struct i40e_hw *hw)
+{
+	/* Disable all interrupt types */
+	I40E_WRITE_REG(hw, I40E_VFINT_ICR0_ENA1, 0);
+	I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
+		       I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
+	I40EVF_WRITE_FLUSH(hw);
+}
+
+/* Enable IRQ0 */
+static inline void
+i40evf_enable_irq0(struct i40e_hw *hw)
+{
+	/* Enable admin queue interrupt trigger */
+	uint32_t val;
+
+	i40evf_disable_irq0(hw);
+	val = I40E_READ_REG(hw, I40E_VFINT_ICR0_ENA1);
+	val |= I40E_VFINT_ICR0_ENA1_ADMINQ_MASK |
+		I40E_VFINT_ICR0_ENA1_LINK_STAT_CHANGE_MASK;
+	I40E_WRITE_REG(hw, I40E_VFINT_ICR0_ENA1, val);
+
+	I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
+		I40E_VFINT_DYN_CTL01_INTENA_MASK |
+		I40E_VFINT_DYN_CTL01_CLEARPBA_MASK |
+		I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
+
+	I40EVF_WRITE_FLUSH(hw);
+}
+
 static int
 i40evf_reset_vf(struct i40e_hw *hw)
 {
@@ -1137,6 +1192,8 @@  i40evf_init_vf(struct rte_eth_dev *dev)
 	int i, err, bufsz;
 	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
+	uint16_t interval =
+		i40e_calc_itr_interval(I40E_QUEUE_ITR_INTERVAL_MAX);
 
 	vf->adapter = I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
 	vf->dev_data = dev->data;
@@ -1218,6 +1275,15 @@  i40evf_init_vf(struct rte_eth_dev *dev)
 	ether_addr_copy((struct ether_addr *)vf->vsi_res->default_mac_addr,
 					(struct ether_addr *)hw->mac.addr);
 
+	/* If the PF host is not DPDK, set the interval of ITR0 to max*/
+	if (vf->version_major != I40E_DPDK_VERSION_MAJOR) {
+		I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
+			       (I40E_ITR_INDEX_DEFAULT <<
+				I40E_VFINT_DYN_CTL0_ITR_INDX_SHIFT) |
+			       (interval <<
+				I40E_VFINT_DYN_CTL0_INTERVAL_SHIFT));
+	}
+
 	return 0;
 
 err_alloc:
@@ -1246,11 +1312,142 @@  i40evf_uninit_vf(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static void
+i40evf_handle_pf_event(__rte_unused struct rte_eth_dev *dev,
+			   uint8_t *msg,
+			   __rte_unused uint16_t msglen)
+{
+	struct i40e_virtchnl_pf_event *pf_msg =
+			(struct i40e_virtchnl_pf_event *)msg;
+
+	switch (pf_msg->event) {
+	case I40E_VIRTCHNL_EVENT_RESET_IMPENDING:
+		PMD_DRV_LOG(DEBUG, "VIRTCHNL_EVENT_RESET_IMPENDING event\n");
+		_rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_RESET);
+		break;
+	case I40E_VIRTCHNL_EVENT_LINK_CHANGE:
+		PMD_DRV_LOG(DEBUG, "VIRTCHNL_EVENT_LINK_CHANGE event\n");
+		break;
+	case I40E_VIRTCHNL_EVENT_PF_DRIVER_CLOSE:
+		PMD_DRV_LOG(DEBUG, "VIRTCHNL_EVENT_PF_DRIVER_CLOSE event\n");
+		break;
+	default:
+		PMD_DRV_LOG(ERR, " unknown event received %u", pf_msg->event);
+		break;
+	}
+}
+
+static void
+i40evf_handle_aq_msg(struct rte_eth_dev *dev)
+{
+	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
+	struct i40e_arq_event_info info;
+	struct i40e_virtchnl_msg *v_msg;
+	uint16_t pending, opcode;
+	int ret;
+
+	info.buf_len = I40E_AQ_BUF_SZ;
+	if (!vf->aq_resp) {
+		PMD_DRV_LOG(ERR, "Buffer for adminq resp should not be NULL");
+		return;
+	}
+	info.msg_buf = vf->aq_resp;
+	v_msg = (struct i40e_virtchnl_msg *)&info.desc;
+
+	pending = 1;
+	while (pending) {
+		ret = i40e_clean_arq_element(hw, &info, &pending);
+
+		if (ret != I40E_SUCCESS) {
+			PMD_DRV_LOG(INFO, "Failed to read msg from AdminQ,"
+				    "ret: %d", ret);
+			break;
+		}
+		opcode = rte_le_to_cpu_16(info.desc.opcode);
+
+		switch (opcode) {
+		case i40e_aqc_opc_send_msg_to_vf:
+			if (v_msg->v_opcode == I40E_VIRTCHNL_OP_EVENT)
+				/* process event*/
+				i40evf_handle_pf_event(dev, info.msg_buf,
+							info.msg_len);
+			else {
+				/* read message and it's expected one */
+				if (v_msg->v_opcode == vf->pend_cmd) {
+					vf->cmd_retval = v_msg->v_retval;
+					/* prevent compiler reordering */
+					rte_compiler_barrier();
+					_clear_cmd(vf);
+				} else
+					PMD_DRV_LOG(ERR, "command mismatch,"
+						"expect %u, get %u",
+						vf->pend_cmd, v_msg->v_opcode);
+				 PMD_DRV_LOG(DEBUG, "adminq response is received,"
+					     " opcode = %d\n", v_msg->v_opcode);
+			}
+			break;
+		default:
+			PMD_DRV_LOG(ERR, "Request %u is not supported yet",
+				    opcode);
+			break;
+		}
+	}
+}
+
+/**
+ * Interrupt handler triggered by NIC  for handling
+ * specific interrupt. Only adminq interrupt is processed in VF.
+ *
+ * @param handle
+ *  Pointer to interrupt handle.
+ * @param param
+ *  The address of parameter (struct rte_eth_dev *) registered before.
+ *
+ * @return
+ *  void
+ */
+static void
+i40evf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+			   void *param)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t icr0;
+
+	i40evf_disable_irq0(hw);
+
+	/* read out interrupt causes */
+	icr0 = I40E_READ_REG(hw, I40E_VFINT_ICR01);
+
+	/* No interrupt event indicated */
+	if (!(icr0 & I40E_VFINT_ICR01_INTEVENT_MASK)) {
+		PMD_DRV_LOG(DEBUG, "No interrupt event, nothing to do\n");
+		goto done;
+	}
+
+	if (icr0 & I40E_VFINT_ICR01_ADMINQ_MASK) {
+		PMD_DRV_LOG(DEBUG, "ICR01_ADMINQ is reported\n");
+		i40evf_handle_aq_msg(dev);
+	}
+
+	/* Link Status Change interrupt */
+	if (icr0 & I40E_VFINT_ICR01_LINK_STAT_CHANGE_MASK)
+		PMD_DRV_LOG(DEBUG, "LINK_STAT_CHANGE is reported,"
+				   " do nothing\n");
+
+done:
+	i40evf_enable_irq0(hw);
+	rte_intr_enable(&(dev->pci_dev->intr_handle));
+}
+
+
 static int
 i40evf_dev_init(struct rte_eth_dev *eth_dev)
 {
 	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(\
 			eth_dev->data->dev_private);
+	struct rte_pci_device *pci_dev = eth_dev->pci_dev;
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -1285,6 +1482,16 @@  i40evf_dev_init(struct rte_eth_dev *eth_dev)
 		return -1;
 	}
 
+	/* register callback func to eal lib */
+	rte_intr_callback_register(&(pci_dev->intr_handle),
+		i40evf_dev_interrupt_handler, (void *)eth_dev);
+
+	/* configure and enable device interrupt */
+	i40evf_enable_irq0(hw);
+	/* intr is enabled in i40evf_enable_queues_intr when dev_start */
+
+	/* enable uio intr after callback register */
+	rte_intr_enable(&(pci_dev->intr_handle));
 	/* copy mac addr */
 	eth_dev->data->mac_addrs = rte_zmalloc("i40evf_mac",
 					ETHER_ADDR_LEN, 0);
@@ -1662,7 +1869,8 @@  i40evf_enable_queues_intr(struct rte_eth_dev *dev)
 		I40E_WRITE_REG(hw,
 			       I40E_VFINT_DYN_CTL01,
 			       I40E_VFINT_DYN_CTL01_INTENA_MASK |
-			       I40E_VFINT_DYN_CTL01_CLEARPBA_MASK);
+			       I40E_VFINT_DYN_CTL01_CLEARPBA_MASK |
+			       I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
 		I40EVF_WRITE_FLUSH(hw);
 		return;
 	}
@@ -1673,11 +1881,10 @@  i40evf_enable_queues_intr(struct rte_eth_dev *dev)
 			I40E_VFINT_DYN_CTLN1(I40EVF_VSI_DEFAULT_MSIX_INTR - 1),
 			I40E_VFINT_DYN_CTLN1_INTENA_MASK |
 			I40E_VFINT_DYN_CTLN_CLEARPBA_MASK);
-	else
-		/* To support Linux PF host */
-		I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
-				I40E_VFINT_DYN_CTL01_INTENA_MASK |
-				I40E_VFINT_DYN_CTL01_CLEARPBA_MASK);
+	/* If host driver is kernel driver, do nothing.
+	 * Interrupt 0 is used for rx packets, but don't set I40E_VFINT_DYN_CTL01,
+	 * because it is already done in i40evf_enable_irq0.
+	 */
 
 	I40EVF_WRITE_FLUSH(hw);
 }
@@ -1690,7 +1897,8 @@  i40evf_disable_queues_intr(struct rte_eth_dev *dev)
 	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	if (!rte_intr_allow_others(intr_handle)) {
-		I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01, 0);
+		I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01,
+			       I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
 		I40EVF_WRITE_FLUSH(hw);
 		return;
 	}
@@ -1700,8 +1908,10 @@  i40evf_disable_queues_intr(struct rte_eth_dev *dev)
 			       I40E_VFINT_DYN_CTLN1(I40EVF_VSI_DEFAULT_MSIX_INTR
 						    - 1),
 			       0);
-	else
-		I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01, 0);
+	/* If host driver is kernel driver, do nothing.
+	 * Interrupt 0 is used for rx packets, but don't zero I40E_VFINT_DYN_CTL01,
+	 * because interrupt 0 is also used for adminq processing.
+	 */
 
 	I40EVF_WRITE_FLUSH(hw);
 }
@@ -1825,10 +2035,6 @@  i40evf_dev_start(struct rte_eth_dev *dev)
 		goto err_mac;
 	}
 
-	/* vf don't allow intr except for rxq intr */
-	if (dev->data->dev_conf.intr_conf.rxq != 0)
-		rte_intr_enable(intr_handle);
-
 	i40evf_enable_queues_intr(dev);
 	return 0;
 
@@ -2020,12 +2226,20 @@  static void
 i40evf_dev_close(struct rte_eth_dev *dev)
 {
 	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_pci_device *pci_dev = dev->pci_dev;
 
 	i40evf_dev_stop(dev);
 	hw->adapter_stopped = 1;
 	i40e_dev_free_queues(dev);
 	i40evf_reset_vf(hw);
 	i40e_shutdown_adminq(hw);
+	/* disable uio intr before callback unregister */
+	rte_intr_disable(&(pci_dev->intr_handle));
+
+	/* unregister callback func from eal lib */
+	rte_intr_callback_unregister(&(pci_dev->intr_handle),
+		i40evf_dev_interrupt_handler, (void *)dev);
+	i40evf_disable_irq0(hw);
 }
 
 static int
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index bada8ad..1be1783 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2666,6 +2666,7 @@  rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
 enum rte_eth_event_type {
 	RTE_ETH_EVENT_UNKNOWN,  /**< unknown event type */
 	RTE_ETH_EVENT_INTR_LSC, /**< lsc interrupt event */
+	RTE_ETH_EVENT_INTR_RESET, /**< reset interrupt event */
 	RTE_ETH_EVENT_MAX       /**< max value of this enum */
 };
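
Note on the `i40evf_execute_vf_cmd()` change above: for run-time commands the sender no longer reads the admin queue itself. It waits until the interrupt handler has stored the return value and cleared `pend_cmd`. The following standalone sketch models that handshake. All `demo_*` names, the opcode value and the `respond_at` parameter (which simulates the iteration on which the interrupt arrives) are hypothetical, and the delay between polls is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; only "unknown means no command pending" matters. */
enum demo_op {
	DEMO_OP_UNKNOWN = 0,
	DEMO_OP_CONFIG_IRQ_MAP = 7
};

struct demo_vf {
	volatile enum demo_op pend_cmd; /* command awaiting a response */
	int cmd_retval;                 /* return value from the PF */
};

#define DEMO_MAX_TRY_TIMES 20

/* Stand-in for the interrupt handler receiving a PF response. */
static void
demo_handle_response(struct demo_vf *vf, enum demo_op opcode, int retval)
{
	if (opcode != vf->pend_cmd)
		return; /* not the reply the sender is waiting for */
	vf->cmd_retval = retval;
	/* Clear the pending flag last so the waiter sees a valid retval. */
	vf->pend_cmd = DEMO_OP_UNKNOWN;
}

/*
 * Sender side: mark the command pending, then wait for the flag to clear.
 * A negative respond_at means the response never arrives.
 */
static int
demo_execute_cmd(struct demo_vf *vf, enum demo_op op, int respond_at)
{
	int i = 0;
	int err = -1;

	assert(vf != NULL);
	vf->pend_cmd = op;
	vf->cmd_retval = 0;

	do {
		/* The driver delays here, giving the handler time to run. */
		if (i == respond_at)
			demo_handle_response(vf, op, 0);
		if (vf->pend_cmd == DEMO_OP_UNKNOWN) {
			err = 0;
			break;
		}
	} while (i++ < DEMO_MAX_TRY_TIMES);

	return err | vf->cmd_retval;
}
```

In this model a timed-out command leaves `pend_cmd` set, which mirrors the `default:` branch of the patch, where `_clear_cmd()` is called only from the handler.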