bus/vmbus: fix race in sub channel creation
Checks
Commit Message
When using multiple queues, there was a race with the kernel
in setting up the second channel. This is do due to a kernel change
whiche does not allow accessing sysfs files for Hyper-V
channels that are not opened.
The fix is simple, just move the logic to detect not ready
sub channels earlier in the existing loop.
Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Reported-by:Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
drivers/bus/vmbus/linux/vmbus_uio.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
Comments
On Fri, 2018-11-30 at 12:24 -0800, Stephen Hemminger wrote:
> When using multiple queues, there was a race with the kernel
> in setting up the second channel. This is do due to a kernel change
> whiche does not allow accessing sysfs files for Hyper-V
> channels that are not opened.
>
> The fix is simple, just move the logic to detect not ready
> sub channels earlier in the existing loop.
>
> Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
> Reported-by:Mohammed Gamal <mgamal@redhat.com>
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
> drivers/bus/vmbus/linux/vmbus_uio.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/bus/vmbus/linux/vmbus_uio.c
> b/drivers/bus/vmbus/linux/vmbus_uio.c
> index 12e97e3a420a..38df4d724ed5 100644
> --- a/drivers/bus/vmbus/linux/vmbus_uio.c
> +++ b/drivers/bus/vmbus/linux/vmbus_uio.c
> @@ -357,6 +357,12 @@ int vmbus_uio_get_subchan(struct vmbus_channel
> *primary,
> continue;
> }
>
> + if (!vmbus_isnew_subchannel(primary, relid))
> + continue; /* Already know about you
> */
> +
> + if (!vmbus_uio_ring_present(dev, relid))
> + continue; /* Ring may not be ready
> yet */
> +
> snprintf(subchan_path, sizeof(subchan_path),
> "%s/%lu",
> chan_path, relid);
> err = vmbus_uio_sysfs_read(subchan_path,
> "subchannel_id",
> @@ -370,12 +376,6 @@ int vmbus_uio_get_subchan(struct vmbus_channel
> *primary,
> if (subid == 0)
> continue; /* skip primary channel */
>
> - if (!vmbus_isnew_subchannel(primary, relid))
> - continue;
> -
> - if (!vmbus_uio_ring_present(dev, relid))
> - continue; /* Ring may not be ready
> yet */
> -
> err = vmbus_uio_sysfs_read(subchan_path,
> "monitor_id",
> &monid, UINT8_MAX);
> if (err) {
With this patch I am now getting the following error:
[...]
Configuring Port 0 (socket 0)
hn_dev_configure(): >>
hn_rndis_link_status(): link status 0x40020006
hn_subchan_configure(): open 1 subchannels
hn_subchan_configure(): open subchannel failed: -2
hn_dev_configure(): subchannel configuration failed
Port0 dev_configure = -2
hn_dev_rx_queue_release(): >>
hn_dev_rx_queue_release(): >>
hn_dev_tx_queue_release(): >>
hn_dev_tx_queue_release(): >>
Fail to configure port 0
EAL: Error - exiting with code: 1
Cause: Start ports failed
Apparently, no subchannels were ready. Anything I may have missed or
misconfigured?
Regards,
Mohammed
On Mon, 03 Dec 2018 07:02:55 +0100
Mohammed Gamal <mgamal@redhat.com> wrote:
> On Fri, 2018-11-30 at 12:24 -0800, Stephen Hemminger wrote:
> > When using multiple queues, there was a race with the kernel
> > in setting up the second channel. This is do due to a kernel change
> > whiche does not allow accessing sysfs files for Hyper-V
> > channels that are not opened.
> >
> > The fix is simple, just move the logic to detect not ready
> > sub channels earlier in the existing loop.
> >
> > Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
> > Reported-by:Mohammed Gamal <mgamal@redhat.com>
> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> > ---
> > drivers/bus/vmbus/linux/vmbus_uio.c | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/bus/vmbus/linux/vmbus_uio.c
> > b/drivers/bus/vmbus/linux/vmbus_uio.c
> > index 12e97e3a420a..38df4d724ed5 100644
> > --- a/drivers/bus/vmbus/linux/vmbus_uio.c
> > +++ b/drivers/bus/vmbus/linux/vmbus_uio.c
> > @@ -357,6 +357,12 @@ int vmbus_uio_get_subchan(struct vmbus_channel
> > *primary,
> > continue;
> > }
> >
> > + if (!vmbus_isnew_subchannel(primary, relid))
> > + continue; /* Already know about you
> > */
> > +
> > + if (!vmbus_uio_ring_present(dev, relid))
> > + continue; /* Ring may not be ready
> > yet */
> > +
> > snprintf(subchan_path, sizeof(subchan_path),
> > "%s/%lu",
> > chan_path, relid);
> > err = vmbus_uio_sysfs_read(subchan_path,
> > "subchannel_id",
> > @@ -370,12 +376,6 @@ int vmbus_uio_get_subchan(struct vmbus_channel
> > *primary,
> > if (subid == 0)
> > continue; /* skip primary channel */
> >
> > - if (!vmbus_isnew_subchannel(primary, relid))
> > - continue;
> > -
> > - if (!vmbus_uio_ring_present(dev, relid))
> > - continue; /* Ring may not be ready
> > yet */
> > -
> > err = vmbus_uio_sysfs_read(subchan_path,
> > "monitor_id",
> > &monid, UINT8_MAX);
> > if (err) {
>
> With this patch I am now getting the following error:
> [...]
> Configuring Port 0 (socket 0)
> hn_dev_configure(): >>
> hn_rndis_link_status(): link status 0x40020006
> hn_subchan_configure(): open 1 subchannels
> hn_subchan_configure(): open subchannel failed: -2
> hn_dev_configure(): subchannel configuration failed
> Port0 dev_configure = -2
> hn_dev_rx_queue_release(): >>
> hn_dev_rx_queue_release(): >>
> hn_dev_tx_queue_release(): >>
> hn_dev_tx_queue_release(): >>
> Fail to configure port 0
> EAL: Error - exiting with code: 1
> Cause: Start ports failed
>
> Apparently, no subchannels were ready. Anything I may have missed or
> misconfigured?
>
> Regards,
> Mohammed
Could you check the kernel log?
The way sub channel configuration works is that the userspace code in DPDK
sends a message to the hypervisor that it would like N subchannels, then
the response from the hypervisor is processed by the kernel causing sysfs
files to be created. Meanwhile the userspace is polling waiting for the
sysfs files to show up (for 10 seconds). You could increas the timeout or
go looking in the sysfs directory to see what is present.
There is no good way to handle errors here, the hypervisor doesn't really
give much feedback.
On Mon, 2018-12-03 at 08:48 -0800, Stephen Hemminger wrote:
> On Mon, 03 Dec 2018 07:02:55 +0100
> Mohammed Gamal <mgamal@redhat.com> wrote:
>
> > On Fri, 2018-11-30 at 12:24 -0800, Stephen Hemminger wrote:
> > > When using multiple queues, there was a race with the kernel
> > > in setting up the second channel. This is do due to a kernel
> > > change
> > > whiche does not allow accessing sysfs files for Hyper-V
> > > channels that are not opened.
> > >
> > > The fix is simple, just move the logic to detect not ready
> > > sub channels earlier in the existing loop.
> > >
> > > Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus
> > > support")
> > > Reported-by:Mohammed Gamal <mgamal@redhat.com>
> > > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> > > ---
> > > drivers/bus/vmbus/linux/vmbus_uio.c | 12 ++++++------
> > > 1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/bus/vmbus/linux/vmbus_uio.c
> > > b/drivers/bus/vmbus/linux/vmbus_uio.c
> > > index 12e97e3a420a..38df4d724ed5 100644
> > > --- a/drivers/bus/vmbus/linux/vmbus_uio.c
> > > +++ b/drivers/bus/vmbus/linux/vmbus_uio.c
> > > @@ -357,6 +357,12 @@ int vmbus_uio_get_subchan(struct
> > > vmbus_channel
> > > *primary,
> > > continue;
> > > }
> > >
> > > + if (!vmbus_isnew_subchannel(primary, relid))
> > > + continue; /* Already know about
> > > you
> > > */
> > > +
> > > + if (!vmbus_uio_ring_present(dev, relid))
> > > + continue; /* Ring may not be
> > > ready
> > > yet */
> > > +
> > > snprintf(subchan_path, sizeof(subchan_path),
> > > "%s/%lu",
> > > chan_path, relid);
> > > err = vmbus_uio_sysfs_read(subchan_path,
> > > "subchannel_id",
> > > @@ -370,12 +376,6 @@ int vmbus_uio_get_subchan(struct
> > > vmbus_channel
> > > *primary,
> > > if (subid == 0)
> > > continue; /* skip primary channel
> > > */
> > >
> > > - if (!vmbus_isnew_subchannel(primary, relid))
> > > - continue;
> > > -
> > > - if (!vmbus_uio_ring_present(dev, relid))
> > > - continue; /* Ring may not be
> > > ready
> > > yet */
> > > -
> > > err = vmbus_uio_sysfs_read(subchan_path,
> > > "monitor_id",
> > > &monid, UINT8_MAX);
> > > if (err) {
> >
> > With this patch I am now getting the following error:
> > [...]
> > Configuring Port 0 (socket 0)
> > hn_dev_configure(): >>
> > hn_rndis_link_status(): link status 0x40020006
> > hn_subchan_configure(): open 1 subchannels
> > hn_subchan_configure(): open subchannel failed: -2
> > hn_dev_configure(): subchannel configuration failed
> > Port0 dev_configure = -2
> > hn_dev_rx_queue_release(): >>
> > hn_dev_rx_queue_release(): >>
> > hn_dev_tx_queue_release(): >>
> > hn_dev_tx_queue_release(): >>
> > Fail to configure port 0
> > EAL: Error - exiting with code: 1
> > Cause: Start ports failed
> >
> > Apparently, no subchannels were ready. Anything I may have missed
> > or
> > misconfigured?
> >
> > Regards,
> > Mohammed
>
> Could you check the kernel log?
>
I did. No relevant messages seem to be there.
> The way sub channel configuration works is that the userspace code in
> DPDK
> sends a message to the hypervisor that it would like N subchannels,
> then
> the response from the hypervisor is processed by the kernel causing
> sysfs
> files to be created. Meanwhile the userspace is polling waiting for
> the
> sysfs files to show up (for 10 seconds). You could increas the
> timeout or
> go looking in the sysfs directory to see what is present.
Tried increasing that up to 100 seconds, still nothing. Could it be a
problem on my host? The VM I am using is on a local hyper-v instance.
>
> There is no good way to handle errors here, the hypervisor doesn't
> really
> give much feedback.
@@ -357,6 +357,12 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
continue;
}
+ if (!vmbus_isnew_subchannel(primary, relid))
+ continue; /* Already know about you */
+
+ if (!vmbus_uio_ring_present(dev, relid))
+ continue; /* Ring may not be ready yet */
+
snprintf(subchan_path, sizeof(subchan_path), "%s/%lu",
chan_path, relid);
err = vmbus_uio_sysfs_read(subchan_path, "subchannel_id",
@@ -370,12 +376,6 @@ int vmbus_uio_get_subchan(struct vmbus_channel *primary,
if (subid == 0)
continue; /* skip primary channel */
- if (!vmbus_isnew_subchannel(primary, relid))
- continue;
-
- if (!vmbus_uio_ring_present(dev, relid))
- continue; /* Ring may not be ready yet */
-
err = vmbus_uio_sysfs_read(subchan_path, "monitor_id",
&monid, UINT8_MAX);
if (err) {