[v4,2/3] net/iavf: enable PCI bus master after reset

Message ID 20210427133912.261993-3-haiyue.wang@intel.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Headers
Series fix PF reset causes VF memory request failure |

Checks

Context Check Description
ci/checkpatch warning coding style issues

Commit Message

Wang, Haiyue April 27, 2021, 1:39 p.m. UTC
  The VF reset can be triggerred by the PF reset event, in this case, the
PCI bus master will be cleared, then the VF is not allowed to issue any
Memory or I/O Requests.

So after the reset event is detected, always enable the PCI bus master.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
---
 drivers/net/iavf/iavf_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Qi Zhang April 28, 2021, 3:34 a.m. UTC | #1
> -----Original Message-----
> From: Wang, Haiyue <haiyue.wang@intel.com>
> Sent: Tuesday, April 27, 2021 9:39 PM
> To: dev@dpdk.org
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; Wang, Liang-min
> <liang-min.wang@intel.com>; david.marchand@redhat.com; Wang, Haiyue
> <haiyue.wang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>
> Subject: [PATCH v4 2/3] net/iavf: enable PCI bus master after reset
> 
> The VF reset can be triggerred by the PF reset event, in this case, the PCI bus
> master will be cleared, then the VF is not allowed to issue any Memory or I/O
> Requests.
> 
> So after the reset event is detected, always enable the PCI bus master.
> 
> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> ---
>  drivers/net/iavf/iavf_ethdev.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> index d523a0618..9a0a20a29 100644
> --- a/drivers/net/iavf/iavf_ethdev.c
> +++ b/drivers/net/iavf/iavf_ethdev.c
> @@ -2255,6 +2255,9 @@ iavf_dev_close(struct rte_eth_dev *dev)
>  	rte_free(vf->aq_resp);
>  	vf->aq_resp = NULL;
> 
> +	if (vf->vf_reset)
> +		rte_pci_set_bus_master(pci_dev, true);
> +
>  	vf->vf_reset = false;
> 
>  	return ret;
> --
> 2.31.1

Acked-by: Qi Zhang <qi.z.zhang@intel.com>
  
David Marchand May 4, 2021, 11:32 a.m. UTC | #2
On Tue, Apr 27, 2021 at 4:05 PM Haiyue Wang <haiyue.wang@intel.com> wrote:
>
> The VF reset can be triggerred by the PF reset event, in this case, the
> PCI bus master will be cleared, then the VF is not allowed to issue any
> Memory or I/O Requests.
>
> So after the reset event is detected, always enable the PCI bus master.
>
> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> ---
>  drivers/net/iavf/iavf_ethdev.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> index d523a0618..9a0a20a29 100644
> --- a/drivers/net/iavf/iavf_ethdev.c
> +++ b/drivers/net/iavf/iavf_ethdev.c
> @@ -2255,6 +2255,9 @@ iavf_dev_close(struct rte_eth_dev *dev)
>         rte_free(vf->aq_resp);
>         vf->aq_resp = NULL;
>
> +       if (vf->vf_reset)
> +               rte_pci_set_bus_master(pci_dev, true);
> +
>         vf->vf_reset = false;

Not checking for the return code can leave the device in an invalid state.
Then after this, calling the init code will fail.

I'd rather move rte_pci_set_bus_master() (it is a noop if bus master
is already enabled) in the init path and check for the return code
there?
WDYT?
  
Wang, Haiyue May 4, 2021, 3:07 p.m. UTC | #3
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, May 4, 2021 19:32
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: dev <dev@dpdk.org>; Zhang, Qi Z <qi.z.zhang@intel.com>; Wang, Liang-min <liang-min.wang@intel.com>;
> Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>
> Subject: Re: [PATCH v4 2/3] net/iavf: enable PCI bus master after reset
> 
> On Tue, Apr 27, 2021 at 4:05 PM Haiyue Wang <haiyue.wang@intel.com> wrote:
> >
> > The VF reset can be triggerred by the PF reset event, in this case, the
> > PCI bus master will be cleared, then the VF is not allowed to issue any
> > Memory or I/O Requests.
> >
> > So after the reset event is detected, always enable the PCI bus master.
> >
> > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> > ---
> >  drivers/net/iavf/iavf_ethdev.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> > index d523a0618..9a0a20a29 100644
> > --- a/drivers/net/iavf/iavf_ethdev.c
> > +++ b/drivers/net/iavf/iavf_ethdev.c
> > @@ -2255,6 +2255,9 @@ iavf_dev_close(struct rte_eth_dev *dev)
> >         rte_free(vf->aq_resp);
> >         vf->aq_resp = NULL;
> >
> > +       if (vf->vf_reset)
> > +               rte_pci_set_bus_master(pci_dev, true);
> > +
> >         vf->vf_reset = false;
> 
> Not checking for the return code can leave the device in an invalid state.
> Then after this, calling the init code will fail.
> 
> I'd rather move rte_pci_set_bus_master() (it is a noop if bus master
> is already enabled) in the init path and check for the return code

Currently, just called when vf_reset happened. And I noticed that the kernel
doesn't handle pci_read/write_config_word return code in most cases. I'm wondering
if the PCI ops really failed, the system can return back ?

> there?
> WDYT?
> 
> 
> --
> David Marchand
  
Wang, Haiyue May 5, 2021, 2:56 a.m. UTC | #4
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, May 4, 2021 19:32
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: dev <dev@dpdk.org>; Zhang, Qi Z <qi.z.zhang@intel.com>; Wang, Liang-min <liang-min.wang@intel.com>;
> Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>
> Subject: Re: [PATCH v4 2/3] net/iavf: enable PCI bus master after reset
> 
> On Tue, Apr 27, 2021 at 4:05 PM Haiyue Wang <haiyue.wang@intel.com> wrote:
> >
> > The VF reset can be triggerred by the PF reset event, in this case, the
> > PCI bus master will be cleared, then the VF is not allowed to issue any
> > Memory or I/O Requests.
> >
> > So after the reset event is detected, always enable the PCI bus master.
> >
> > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> > ---
> >  drivers/net/iavf/iavf_ethdev.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> > index d523a0618..9a0a20a29 100644
> > --- a/drivers/net/iavf/iavf_ethdev.c
> > +++ b/drivers/net/iavf/iavf_ethdev.c
> > @@ -2255,6 +2255,9 @@ iavf_dev_close(struct rte_eth_dev *dev)
> >         rte_free(vf->aq_resp);
> >         vf->aq_resp = NULL;
> >
> > +       if (vf->vf_reset)
> > +               rte_pci_set_bus_master(pci_dev, true);
> > +
> >         vf->vf_reset = false;
> 
> Not checking for the return code can leave the device in an invalid state.
> Then after this, calling the init code will fail.

From the upper application's view, if this bus master fix can't recover
the device into valid state, then the device hotplug API should be used
to make the device fully recover. So I'd prefer to call bus master "try
best" to fix. If still have error, the system may be in bad state. 

The init code is mostly called from PCI VFIO/UIO device management which
has enabled bus master. If we put the bus master here, people may confuse.
Bind setting the bus master with 'vf->vf_reset' together looks better.

> 
> I'd rather move rte_pci_set_bus_master() (it is a noop if bus master
> is already enabled) in the init path and check for the return code

In fact, not real noop, since the reading PCI ops may fail. ;-)

> there?
> WDYT?
> 
> 
> --
> David Marchand
  
David Marchand May 5, 2021, 8:39 a.m. UTC | #5
On Wed, May 5, 2021 at 4:56 AM Wang, Haiyue <haiyue.wang@intel.com> wrote:
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Tuesday, May 4, 2021 19:32
> > To: Wang, Haiyue <haiyue.wang@intel.com>
> > Cc: dev <dev@dpdk.org>; Zhang, Qi Z <qi.z.zhang@intel.com>; Wang, Liang-min <liang-min.wang@intel.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>
> > Subject: Re: [PATCH v4 2/3] net/iavf: enable PCI bus master after reset
> >
> > On Tue, Apr 27, 2021 at 4:05 PM Haiyue Wang <haiyue.wang@intel.com> wrote:
> > >
> > > The VF reset can be triggerred by the PF reset event, in this case, the
> > > PCI bus master will be cleared, then the VF is not allowed to issue any
> > > Memory or I/O Requests.
> > >
> > > So after the reset event is detected, always enable the PCI bus master.
> > >
> > > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> > > ---
> > >  drivers/net/iavf/iavf_ethdev.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> > > index d523a0618..9a0a20a29 100644
> > > --- a/drivers/net/iavf/iavf_ethdev.c
> > > +++ b/drivers/net/iavf/iavf_ethdev.c
> > > @@ -2255,6 +2255,9 @@ iavf_dev_close(struct rte_eth_dev *dev)
> > >         rte_free(vf->aq_resp);
> > >         vf->aq_resp = NULL;
> > >
> > > +       if (vf->vf_reset)
> > > +               rte_pci_set_bus_master(pci_dev, true);
> > > +
> > >         vf->vf_reset = false;
> >
> > Not checking for the return code can leave the device in an invalid state.
> > Then after this, calling the init code will fail.
>
> From the upper application's view, if this bus master fix can't recover
> the device into valid state, then the device hotplug API should be used
> to make the device fully recover. So I'd prefer to call bus master "try
> best" to fix. If still have error, the system may be in bad state.

I find it odd to put something required for (re)initialising in a
.dev_close ops.
Maybe the place is more into .dev_reset if you find it confusing in .dev_init.
But this is your driver, I'll let it to your judgement.


On the other hand, what is the point of not failing early/propagating
the rte_pci_set_bus_master error?
The driver can't work without bus master enabled, correct?
  
Wang, Haiyue May 6, 2021, 3:02 a.m. UTC | #6
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Wednesday, May 5, 2021 16:39
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: dev <dev@dpdk.org>; Zhang, Qi Z <qi.z.zhang@intel.com>; Wang, Liang-min <liang-min.wang@intel.com>;
> Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>
> Subject: Re: [PATCH v4 2/3] net/iavf: enable PCI bus master after reset
> 
> On Wed, May 5, 2021 at 4:56 AM Wang, Haiyue <haiyue.wang@intel.com> wrote:
> >
> > > -----Original Message-----
> > > From: David Marchand <david.marchand@redhat.com>
> > > Sent: Tuesday, May 4, 2021 19:32
> > > To: Wang, Haiyue <haiyue.wang@intel.com>
> > > Cc: dev <dev@dpdk.org>; Zhang, Qi Z <qi.z.zhang@intel.com>; Wang, Liang-min <liang-
> min.wang@intel.com>;
> > > Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>
> > > Subject: Re: [PATCH v4 2/3] net/iavf: enable PCI bus master after reset
> > >
> > > On Tue, Apr 27, 2021 at 4:05 PM Haiyue Wang <haiyue.wang@intel.com> wrote:
> > > >
> > > > The VF reset can be triggerred by the PF reset event, in this case, the
> > > > PCI bus master will be cleared, then the VF is not allowed to issue any
> > > > Memory or I/O Requests.
> > > >
> > > > So after the reset event is detected, always enable the PCI bus master.
> > > >
> > > > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> > > > ---
> > > >  drivers/net/iavf/iavf_ethdev.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> > > > index d523a0618..9a0a20a29 100644
> > > > --- a/drivers/net/iavf/iavf_ethdev.c
> > > > +++ b/drivers/net/iavf/iavf_ethdev.c
> > > > @@ -2255,6 +2255,9 @@ iavf_dev_close(struct rte_eth_dev *dev)
> > > >         rte_free(vf->aq_resp);
> > > >         vf->aq_resp = NULL;
> > > >
> > > > +       if (vf->vf_reset)
> > > > +               rte_pci_set_bus_master(pci_dev, true);
> > > > +
> > > >         vf->vf_reset = false;
> > >
> > > Not checking for the return code can leave the device in an invalid state.
> > > Then after this, calling the init code will fail.
> >
> > From the upper application's view, if this bus master fix can't recover
> > the device into valid state, then the device hotplug API should be used
> > to make the device fully recover. So I'd prefer to call bus master "try
> > best" to fix. If still have error, the system may be in bad state.
> 
> I find it odd to put something required for (re)initialising in a
> .dev_close ops.
> Maybe the place is more into .dev_reset if you find it confusing in .dev_init.
> But this is your driver, I'll let it to your judgement.
> 

In fact, the .dev_close is close to .dev_reset.

static int
iavf_dev_reset(struct rte_eth_dev *dev)
{
	int ret;

	ret = iavf_dev_uninit(dev); --> iavf_dev_close
	if (ret)
		return ret;

	return iavf_dev_init(dev);
}

> 
> On the other hand, what is the point of not failing early/propagating
> the rte_pci_set_bus_master error?
> The driver can't work without bus master enabled, correct?

From this point of view, correct. ;-)

Now I changed the vf_reset handling as bellow, if bus master failed, the
device will still be in VF reset state, so that other modules will return
immediately (the PMD has considered the VF reset state already).

if (vf->vf_reset &&
    !rte_pci_set_bus_master(pci_dev, true))
	vf->vf_reset = false;

> 
> 
> --
> David Marchand
  

Patch

diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index d523a0618..9a0a20a29 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -2255,6 +2255,9 @@  iavf_dev_close(struct rte_eth_dev *dev)
 	rte_free(vf->aq_resp);
 	vf->aq_resp = NULL;
 
+	if (vf->vf_reset)
+		rte_pci_set_bus_master(pci_dev, true);
+
 	vf->vf_reset = false;
 
 	return ret;