[dpdk-dev,0/4,for,2.3] vhost-user live migration support

Message ID 20151215133612.GJ29571@yliu-dev.sh.intel.com (mailing list archive)
State Not Applicable, archived

Commit Message

Yuanhan Liu Dec. 15, 2015, 1:36 p.m. UTC
  On Tue, Dec 15, 2015 at 03:24:48PM +0300, Pavel Fedin wrote:
>  Hello!
> 
> > After a migration, to avoid a network outage, the guest must announce its new location to the L2 layer, typically with a GARP. Otherwise requests sent to
> > the guest arrive at the old host until an ARP request is sent (after 30 seconds) or the guest sends some data.
> > QEMU's implementation of self-announce after a migration with a vhost backend is as follows:
> > - If the VIRTIO_GUEST_ANNOUNCE feature has been negotiated, the guest automatically sends a GARP.
> > - Else, if the vhost backend implements VHOST_USER_SEND_RARP, this request is sent to the vhost backend. When this message is received, the vhost backend
> > must act as if it had received a RARP from the guest (the purpose of this RARP is to update the switches' MAC->port mapping, as a GARP does). This RARP is a fake one,
> > created by the vhost backend.
> > - Else nothing is done, and we have a network outage until an ARP is sent or the guest sends some data.
> 
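The three-way fallback described in the list above can be sketched in C as follows. This is only an illustration of the decision order, not QEMU's actual code: `enum announce_path`, `pick_announce_path()` and `announce_examples()` are hypothetical names, and whether the backend supports VHOST_USER_SEND_RARP is reduced to a plain boolean. The bit number 21 is the value of VIRTIO_NET_F_GUEST_ANNOUNCE in the virtio-net headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* virtio-net feature bit number for guest self-announce */
#define VIRTIO_NET_F_GUEST_ANNOUNCE 21

/* Hypothetical type: who (if anyone) announces the guest's new location. */
enum announce_path {
    ANNOUNCE_BY_GUEST,   /* guest sends a GARP itself */
    ANNOUNCE_BY_BACKEND, /* backend fakes a RARP on VHOST_USER_SEND_RARP */
    ANNOUNCE_NONE        /* outage until an ARP or guest traffic shows up */
};

/* Hypothetical helper mirroring the order of the three cases above. */
static enum announce_path
pick_announce_path(uint64_t negotiated_features, bool backend_has_send_rarp)
{
    if (negotiated_features & (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE))
        return ANNOUNCE_BY_GUEST;
    if (backend_has_send_rarp)
        return ANNOUNCE_BY_BACKEND;
    return ANNOUNCE_NONE;
}

/* Usage: the negotiated feature bit takes priority over the RARP request. */
void announce_examples(void)
{
    uint64_t with_bit = 1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE;

    assert(pick_announce_path(with_bit, true) == ANNOUNCE_BY_GUEST);
    assert(pick_announce_path(0, true) == ANNOUNCE_BY_BACKEND);
    assert(pick_announce_path(0, false) == ANNOUNCE_NONE);
}
```

The point of the ordering is that the backend-generated RARP is only a fallback for guests that did not negotiate the announce feature.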
>  But what is qemu_announce_self() then? It is triggered unconditionally after migration, but what it sends does look strange.
> 
> > VIRTIO_GUEST_ANNOUNCE feature is negotiated if:
> >  - the vhost backend announces support for this feature. Maybe QEMU can be updated to support this feature unconditionally
> 
>  Wrong. I tried to unconditionally enforce it in qemu (my guest does support it), and the link stopped working entirely. I don't understand why.

I'm wondering how you did that. Why do you need to enforce it in QEMU?
Isn't it already supported?

Actually, what we need to do is add this feature bit to the vhost
library, to claim we support it, so that the guest will send a
gratuitous ARP when migration is done (check virtio_net_load()).
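To illustrate why the backend has to advertise the bit, here is a minimal C sketch. It assumes negotiation can be modelled as the bitwise AND of what the backend offers and what the guest driver supports, which is a simplification (QEMU applies its own masks as well). The `BACKEND_FEATURES_*` masks and the helper names are hypothetical and reduced to two bits; only the bit numbers come from the virtio-net headers.

```c
#include <assert.h>
#include <stdint.h>

/* Bit numbers as defined in the virtio-net headers. */
#define VIRTIO_NET_F_MRG_RXBUF      15
#define VIRTIO_NET_F_GUEST_ANNOUNCE 21

/* Hypothetical, reduced backend masks: before and after adding the bit. */
#define BACKEND_FEATURES_OLD (1ULL << VIRTIO_NET_F_MRG_RXBUF)
#define BACKEND_FEATURES_NEW (BACKEND_FEATURES_OLD | \
                              (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE))

/* Simplified model: a feature is negotiated only if both sides have it. */
static uint64_t negotiate(uint64_t backend_features, uint64_t guest_features)
{
    return backend_features & guest_features;
}

/* Nonzero if the guest will announce itself (send a GARP) after migration. */
static int guest_will_announce(uint64_t negotiated)
{
    return (negotiated & (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE)) != 0;
}

/* Usage: the same guest only announces once the backend offers the bit. */
void negotiation_examples(void)
{
    uint64_t guest = (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
                     (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE);

    assert(!guest_will_announce(negotiate(BACKEND_FEATURES_OLD, guest)));
    assert(guest_will_announce(negotiate(BACKEND_FEATURES_NEW, guest)));
}
```

Under this model, a guest that supports the feature never gets it negotiated as long as the backend mask leaves the bit out.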

----

However, I found that the GARP is not sent out at all, due to an error
I hit and reported before:

    KVM: injection failed, MSI lost (Operation not permitted)

This happened when QEMU was about to send the interrupt to the
guest for the announce event. The injection failed, hence no GARP was received.

One thing worth noting is that it happened only when I did live migration
between two different hosts (both happen to run the same old
kernel: v3.11.10). It works fine on the same host. So it seems like
a KVM bug?

	--yliu
  

Comments

Pavel Fedin Dec. 15, 2015, 1:48 p.m. UTC | #1
Hello!

> >  Wrong. I tried to unconditionally enforce it in qemu (my guest does support it), and the
> > link stopped working entirely. I don't understand why.
>
> I'm wondering how you did that. Why do you need to enforce it in QEMU?
> Isn't it already supported?

 I mean - qemu first asks the vhost-user server (ovs/DPDK in our case) about its capabilities, then negotiates them with the guest. DPDK
doesn't report VIRTIO_NET_F_GUEST_ANNOUNCE, so I just ORed this flag in qemu before the negotiation with the guest (because my
logic says that the host should not have to do anything special about it). So the overall effect is the same as in your patch

> diff --git a/lib/librte_vhost/virtio-net.c
> b/lib/librte_vhost/virtio-net.c
> index 03044f6..0ba5045 100644
> --- a/lib/librte_vhost/virtio-net.c
> +++ b/lib/librte_vhost/virtio-net.c
> @@ -74,6 +74,7 @@ static struct virtio_net_config_ll *ll_root;
>  #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
>                                 (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
>                                 (1ULL << VIRTIO_NET_F_CTRL_RX) | \
> +                               (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
>                                 (VHOST_SUPPORTS_MQ)            | \
>                                 (1ULL << VIRTIO_F_VERSION_1)   | \
>                                 (1ULL << VHOST_F_LOG_ALL)      | \

 But I was somehow wrong, and this causes the whole thing to stop working instead. Even right after boot the network doesn't
work and pings do not get through.

> However, I found that the GARP is not sent out at all, due to an error
> I hit and reported before:
> 
>     KVM: injection failed, MSI lost (Operation not permitted)

 Interesting, I don't have this problem here. Some bug in your kernel/hardware?

> One thing worth noting is that it happened only when I did live migration
> between two different hosts (both happen to run the same old
> kernel: v3.11.10). It works fine on the same host. So it seems like
> a KVM bug?

 I'm on 3.18.9 here and don't see this problem.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
  
Yuanhan Liu Dec. 15, 2015, 1:59 p.m. UTC | #2
On Tue, Dec 15, 2015 at 04:48:12PM +0300, Pavel Fedin wrote:
>  Hello!
> 
> > >  Wrong. I tried to unconditionally enforce it in qemu (my guest does support it), and the
> > > link stopped working entirely. I don't understand why.
> >
> > I'm wondering how you did that. Why do you need to enforce it in QEMU?
> > Isn't it already supported?
>
>  I mean - qemu first asks the vhost-user server (ovs/DPDK in our case) about its capabilities, then negotiates them with the guest. DPDK
> doesn't report VIRTIO_NET_F_GUEST_ANNOUNCE, so I just ORed this flag in qemu before the negotiation with the guest (because my
> logic says that the host should not have to do anything special about it). So the overall effect is the same as in your patch

I see.

> 
> > diff --git a/lib/librte_vhost/virtio-net.c
> > b/lib/librte_vhost/virtio-net.c
> > index 03044f6..0ba5045 100644
> > --- a/lib/librte_vhost/virtio-net.c
> > +++ b/lib/librte_vhost/virtio-net.c
> > @@ -74,6 +74,7 @@ static struct virtio_net_config_ll *ll_root;
> >  #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
> >                                 (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
> >                                 (1ULL << VIRTIO_NET_F_CTRL_RX) | \
> > +                               (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
> >                                 (VHOST_SUPPORTS_MQ)            | \
> >                                 (1ULL << VIRTIO_F_VERSION_1)   | \
> >                                 (1ULL << VHOST_F_LOG_ALL)      | \
> 
>  But I was somehow wrong, and this causes the whole thing to stop working instead. Even right after boot the network doesn't
> work and pings do not get through.

No idea. Maybe you changed some other configuration (such as that of ovs)
without noticing? Or the ovs bridge interface was reset?

BTW, would you please try my v1 patch set with the above diff applied, to
see if the ping loss is still there? You might also want to run tcpdump
on the destination host's ovs bridge, to see if the GARP is actually sent.

> 
> > However, I found that the GARP is not sent out at all, due to an error
> > I hit and reported before:
> > 
> >     KVM: injection failed, MSI lost (Operation not permitted)

I was thinking it may be caused by the difference between my two hosts (a
desktop and a server). I will try to find two similar hosts tomorrow
to do more tests. Besides that, it'd be great if you could do one more
test with the above diff applied.

	--yliu
> 
>  Interesting, I don't have this problem here. Some bug in your kernel/hardware?
>
> > One thing worth noting is that it happened only when I did live migration
> > between two different hosts (both happen to run the same old
> > kernel: v3.11.10). It works fine on the same host. So it seems like
> > a KVM bug?
>
>  I'm on 3.18.9 here and don't see this problem.
> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
  

Patch

diff --git a/lib/librte_vhost/virtio-net.c
b/lib/librte_vhost/virtio-net.c
index 03044f6..0ba5045 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -74,6 +74,7 @@  static struct virtio_net_config_ll *ll_root;
 #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
                                (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
                                (1ULL << VIRTIO_NET_F_CTRL_RX) | \
+                               (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
                                (VHOST_SUPPORTS_MQ)            | \
                                (1ULL << VIRTIO_F_VERSION_1)   | \
                                (1ULL << VHOST_F_LOG_ALL)      | \