[dpdk-dev,v7,3/3] net/virtio: support GUEST ANNOUNCE

Message ID: 20180109142651.84582-4-xiao.w.wang@intel.com
State: Superseded, archived
Delegated to: Yuanhan Liu
Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  success  Compilation OK

Commit Message

Xiao Wang Jan. 9, 2018, 2:26 p.m. UTC
When live migration is done, for the backup VM, either the virtio
frontend or the vhost backend needs to send out a gratuitous RARP packet
to announce its new network location.

This patch enables the VIRTIO_NET_F_GUEST_ANNOUNCE feature to support
live migration scenarios where the vhost backend cannot generate the
RARP packet itself.

Brief introduction of the work flow:
1. QEMU finishes live migration, pokes the backup VM with an interrupt.
2. Virtio interrupt handler reads out the interrupt status value, and
   realizes it needs to send out RARP packet to announce its location.
3. Pause device to stop worker thread touching the queues.
4. Inject a RARP packet into a Tx Queue.
5. Ack the interrupt via control queue.
6. Resume device to continue packet processing.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
---
 drivers/net/virtio/virtio_ethdev.c | 95 +++++++++++++++++++++++++++++++++++++-
 drivers/net/virtio/virtio_ethdev.h |  1 +
 drivers/net/virtio/virtqueue.h     | 11 +++++
 3 files changed, 105 insertions(+), 2 deletions(-)
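
The work flow listed in the commit message can be sketched as a small
stand-alone C simulation. This is only an illustration under stated
assumptions, not the driver code: every sketch_* name is hypothetical, the
"device" is a plain struct, and SKETCH_S_ANNOUNCE is assumed to mirror
VIRTIO_NET_S_ANNOUNCE (value 2 in the virtio specification). The class and
cmd values are the ones the patch defines in virtqueue.h. The sketch follows
the order of the numbered steps; in the patch as posted,
virtio_interrupt_handler() calls virtio_ack_link_announce() after
virtio_notify_peers() has already resumed the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_S_ANNOUNCE        2u /* assumed to mirror VIRTIO_NET_S_ANNOUNCE */
#define SKETCH_CTRL_ANNOUNCE     3  /* class value defined by the patch */
#define SKETCH_CTRL_ANNOUNCE_ACK 0  /* cmd value defined by the patch */

/* Hypothetical stand-in for the device plus driver state. */
struct sketch_dev {
	uint16_t status;          /* device status field */
	bool started;             /* port has been started */
	bool paused;              /* worker threads are kept off the queues */
	unsigned int tx_injected; /* RARP packets placed on a Tx queue */
	unsigned int acks;        /* ANNOUNCE_ACK commands seen by the device */
};

/* Step 3: pausing fails if the port is not running. */
static int sketch_pause(struct sketch_dev *d)
{
	if (!d->started)
		return -1;
	d->paused = true;
	return 0;
}

/* Step 6: let the workers touch the queues again. */
static void sketch_resume(struct sketch_dev *d)
{
	d->paused = false;
}

/* Step 4: injection is only allowed while the device is paused. */
static int sketch_inject_rarp(struct sketch_dev *d)
{
	if (!d->paused)
		return -1;
	d->tx_injected++;
	return 0;
}

/* Step 5: the device clears the ANNOUNCE bit when it receives the ack. */
static void sketch_ctrl_cmd(struct sketch_dev *d, uint8_t ctrl_class,
		uint8_t cmd)
{
	if (ctrl_class == SKETCH_CTRL_ANNOUNCE &&
	    cmd == SKETCH_CTRL_ANNOUNCE_ACK) {
		d->status = (uint16_t)(d->status & ~SKETCH_S_ANNOUNCE);
		d->acks++;
	}
}

/* Steps 2 to 6, driven from the config-change interrupt. */
static void sketch_config_interrupt(struct sketch_dev *d)
{
	bool paused;

	if (!(d->status & SKETCH_S_ANNOUNCE))
		return;                  /* nothing to announce */

	paused = (sketch_pause(d) == 0);
	if (paused)
		sketch_inject_rarp(d);   /* a stopped port sends nothing */

	/* The ack is sent even when no packet went out, as in the patch. */
	sketch_ctrl_cmd(d, SKETCH_CTRL_ANNOUNCE, SKETCH_CTRL_ANNOUNCE_ACK);

	if (paused)
		sketch_resume(d);
}
```

Calling sketch_config_interrupt() on a started device whose status has
SKETCH_S_ANNOUNCE set injects one packet, sends one ack, and leaves the bit
cleared; a second call is then a no-op.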
  

Comments

Maxime Coquelin Jan. 9, 2018, 8:49 a.m. UTC | #1
On 01/09/2018 03:26 PM, Xiao Wang wrote:
> When live migration is done, for the backup VM, either the virtio
> frontend or the vhost backend needs to send out gratuitous RARP packet
> to announce its new network location.
> 
> This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support live
> migration scenario where the vhost backend doesn't have the ability to
> generate RARP packet.
> 
> Brief introduction of the work flow:
> 1. QEMU finishes live migration, pokes the backup VM with an interrupt.
> 2. Virtio interrupt handler reads out the interrupt status value, and
>     realizes it needs to send out RARP packet to announce its location.
> 3. Pause device to stop worker thread touching the queues.
> 4. Inject a RARP packet into a Tx Queue.
> 5. Ack the interrupt via control queue.
> 6. Resume device to continue packet processing.
> 
> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
> ---
>   drivers/net/virtio/virtio_ethdev.c | 95 +++++++++++++++++++++++++++++++++++++-
>   drivers/net/virtio/virtio_ethdev.h |  1 +
>   drivers/net/virtio/virtqueue.h     | 11 +++++
>   3 files changed, 105 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index e8ff1e449..9606df514 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -19,6 +19,8 @@
>   #include <rte_pci.h>
>   #include <rte_bus_pci.h>
>   #include <rte_ether.h>
> +#include <rte_ip.h>
> +#include <rte_arp.h>
>   #include <rte_common.h>
>   #include <rte_errno.h>
>   #include <rte_cpuflags.h>
> @@ -78,6 +80,11 @@ static int virtio_dev_queue_stats_mapping_set(
>   	uint8_t stat_idx,
>   	uint8_t is_rx);
>   
> +static int make_rarp_packet(struct rte_mbuf *rarp_mbuf,
> +		const struct ether_addr *mac);
> +static void virtio_notify_peers(struct rte_eth_dev *dev);
> +static void virtio_ack_link_announce(struct rte_eth_dev *dev);
> +
>   /*
>    * The set of PCI devices this driver supports
>    */
> @@ -1272,9 +1279,89 @@ virtio_inject_pkts(struct rte_eth_dev *dev, struct rte_mbuf **tx_pkts,
>   	return ret;
>   }
>   
> +#define RARP_PKT_SIZE	64
> +static int
> +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr *mac)
> +{
> +	struct ether_hdr *eth_hdr;
> +	struct arp_hdr *rarp;
> +
> +	if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
> +		PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
> +				rarp_mbuf->buf_len, RARP_PKT_SIZE);
> +		return -1;
> +	}
> +
> +	/* Ethernet header. */
> +	eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
> +	memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> +	ether_addr_copy(mac, &eth_hdr->s_addr);
> +	eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> +
> +	/* RARP header. */
> +	rarp = (struct arp_hdr *)(eth_hdr + 1);
> +	rarp->arp_hrd = htons(ARP_HRD_ETHER);
> +	rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> +	rarp->arp_hln = ETHER_ADDR_LEN;
> +	rarp->arp_pln = 4;
> +	rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> +
> +	ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> +	ether_addr_copy(mac, &rarp->arp_data.arp_tha);
> +	memset(&rarp->arp_data.arp_sip, 0x00, 4);
> +	memset(&rarp->arp_data.arp_tip, 0x00, 4);
> +
> +	rarp_mbuf->data_len = RARP_PKT_SIZE;
> +	rarp_mbuf->pkt_len = RARP_PKT_SIZE;
> +
> +	return 0;
> +}

Do you think it could make sense to have this function in a lib, as
vhost user lib does exactly the same?

I don't know if it could be useful to others than vhost/virtio though.

Thanks,
Maxime
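
To make the frame layout under discussion concrete, here is a minimal sketch
of the same RARP request written against a raw byte buffer in plain C, with
no DPDK types. It is only an illustration of what a shared helper would have
to produce: sketch_make_rarp() and the SKETCH_* constants are hypothetical
names, not an existing or proposed library API. Unlike the quoted function,
the sketch also zeroes the padding up to 64 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN        6
#define SKETCH_RARP_PKT_SIZE  64     /* same padded size the patch uses */
#define SKETCH_ETHERTYPE_RARP 0x8035
#define SKETCH_ETHERTYPE_IPV4 0x0800
#define SKETCH_ARP_HRD_ETHER  1
#define SKETCH_ARP_OP_REVREQ  3      /* RARP "reverse request" opcode */

/* Store a 16-bit value in network (big-endian) byte order. */
static void sketch_put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Hypothetical helper: fill buf with a broadcast RARP request for mac.
 * Returns the frame length, or -1 if buf is too small.
 */
static int sketch_make_rarp(uint8_t *buf, size_t buf_len,
		const uint8_t mac[SKETCH_MAC_LEN])
{
	uint8_t *p = buf;

	if (buf_len < SKETCH_RARP_PKT_SIZE)
		return -1;

	/* Zero everything first: covers both IP fields and the padding. */
	memset(buf, 0, SKETCH_RARP_PKT_SIZE);

	/* Ethernet header: broadcast destination, our MAC as source. */
	memset(p, 0xff, SKETCH_MAC_LEN);
	p += SKETCH_MAC_LEN;
	memcpy(p, mac, SKETCH_MAC_LEN);
	p += SKETCH_MAC_LEN;
	sketch_put_be16(p, SKETCH_ETHERTYPE_RARP);
	p += 2;

	/* RARP header: hardware type, protocol type, lengths, opcode. */
	sketch_put_be16(p, SKETCH_ARP_HRD_ETHER);
	p += 2;
	sketch_put_be16(p, SKETCH_ETHERTYPE_IPV4);
	p += 2;
	*p++ = SKETCH_MAC_LEN;  /* hardware address length */
	*p++ = 4;               /* protocol (IPv4) address length */
	sketch_put_be16(p, SKETCH_ARP_OP_REVREQ);
	p += 2;

	/* Sender and target hardware address are both our MAC; IPs stay 0. */
	memcpy(p, mac, SKETCH_MAC_LEN);
	p += SKETCH_MAC_LEN + 4;        /* skip the zeroed sender IP */
	memcpy(p, mac, SKETCH_MAC_LEN);
	p += SKETCH_MAC_LEN + 4;        /* skip the zeroed target IP */

	/* 14-byte Ethernet header plus 28-byte RARP body. */
	assert((size_t)(p - buf) == 42);

	return SKETCH_RARP_PKT_SIZE;
}
```

A caller passes a buffer of at least 64 bytes and the port MAC, then hands
the returned length to whatever transmit path it uses. Keeping the builder
free of driver state is what would let both vhost and virtio share it.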
  
Xiao Wang Jan. 9, 2018, 10:58 a.m. UTC | #2
Hi,

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Tuesday, January 9, 2018 4:50 PM
> To: Wang, Xiao W <xiao.w.wang@intel.com>; yliu@fridaylinux.org
> Cc: Bie, Tiwei <tiwei.bie@intel.com>; dev@dpdk.org;
> stephen@networkplumber.org
> Subject: Re: [dpdk-dev] [PATCH v7 3/3] net/virtio: support GUEST ANNOUNCE
>
> On 01/09/2018 03:26 PM, Xiao Wang wrote:
> > +#define RARP_PKT_SIZE	64
> > +static int
> > +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr *mac)
> > +{
> > +	struct ether_hdr *eth_hdr;
> > +	struct arp_hdr *rarp;
> > +
> > +	if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
> > +		PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
> > +				rarp_mbuf->buf_len, RARP_PKT_SIZE);
> > +		return -1;
> > +	}
> > +
> > +	/* Ethernet header. */
> > +	eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
> > +	memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> > +	ether_addr_copy(mac, &eth_hdr->s_addr);
> > +	eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> > +
> > +	/* RARP header. */
> > +	rarp = (struct arp_hdr *)(eth_hdr + 1);
> > +	rarp->arp_hrd = htons(ARP_HRD_ETHER);
> > +	rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> > +	rarp->arp_hln = ETHER_ADDR_LEN;
> > +	rarp->arp_pln = 4;
> > +	rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> > +
> > +	ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> > +	ether_addr_copy(mac, &rarp->arp_data.arp_tha);
> > +	memset(&rarp->arp_data.arp_sip, 0x00, 4);
> > +	memset(&rarp->arp_data.arp_tip, 0x00, 4);
> > +
> > +	rarp_mbuf->data_len = RARP_PKT_SIZE;
> > +	rarp_mbuf->pkt_len = RARP_PKT_SIZE;
> > +
> > +	return 0;
> > +}
>
> Do you think it could make sense to have this function in a lib, as
> vhost user lib does exactly the same?
>
> I don't know if it could be useful to others than vhost/virtio though.
>
> Thanks,
> Maxime


Hi Thomas,

Do you think it's worth adding a new helper for ARP in lib/librte_net/?
Currently we just need a helper to build RARP packet (the above make_rarp_packet)

BRs,
Xiao
  
Xiao Wang Jan. 9, 2018, 11:03 a.m. UTC | #3
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> On 01/09/2018 03:26 PM, Xiao Wang wrote:
> > +#define RARP_PKT_SIZE	64
> > +static int
> > +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr *mac)
> > +{
> > +	struct ether_hdr *eth_hdr;
> > +	struct arp_hdr *rarp;
> > +
> > +	if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
> > +		PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
> > +				rarp_mbuf->buf_len, RARP_PKT_SIZE);
> > +		return -1;
> > +	}
> > +
> > +	/* Ethernet header. */
> > +	eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
> > +	memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> > +	ether_addr_copy(mac, &eth_hdr->s_addr);
> > +	eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> > +
> > +	/* RARP header. */
> > +	rarp = (struct arp_hdr *)(eth_hdr + 1);
> > +	rarp->arp_hrd = htons(ARP_HRD_ETHER);
> > +	rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> > +	rarp->arp_hln = ETHER_ADDR_LEN;
> > +	rarp->arp_pln = 4;
> > +	rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> > +
> > +	ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> > +	ether_addr_copy(mac, &rarp->arp_data.arp_tha);
> > +	memset(&rarp->arp_data.arp_sip, 0x00, 4);
> > +	memset(&rarp->arp_data.arp_tip, 0x00, 4);
> > +
> > +	rarp_mbuf->data_len = RARP_PKT_SIZE;
> > +	rarp_mbuf->pkt_len = RARP_PKT_SIZE;
> > +
> > +	return 0;
> > +}
>
> Do you think it could make sense to have this function in a lib, as
> vhost user lib does exactly the same?
>
> I don't know if it could be useful to others than vhost/virtio though.

Hi Thomas,

Do you think it's worth adding a new helper for ARP in lib/librte_net/?
Currently we just need a helper to build RARP packet (the above make_rarp_packet)

BRs,
Xiao

>
> Thanks,
> Maxime
  
Thomas Monjalon Jan. 9, 2018, 11:41 a.m. UTC | #4
09/01/2018 12:03, Wang, Xiao W:
> > From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> > On 01/09/2018 03:26 PM, Xiao Wang wrote:
> > > +#define RARP_PKT_SIZE	64
> > > +static int
> > > +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr
> > *mac)
> > > +{
> > > +	struct ether_hdr *eth_hdr;
> > > +	struct arp_hdr *rarp;
> > > +
> > > +	if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
> > > +		PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
> > > +				rarp_mbuf->buf_len, RARP_PKT_SIZE);
> > > +		return -1;
> > > +	}
> > > +
> > > +	/* Ethernet header. */
> > > +	eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
> > > +	memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> > > +	ether_addr_copy(mac, &eth_hdr->s_addr);
> > > +	eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> > > +
> > > +	/* RARP header. */
> > > +	rarp = (struct arp_hdr *)(eth_hdr + 1);
> > > +	rarp->arp_hrd = htons(ARP_HRD_ETHER);
> > > +	rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> > > +	rarp->arp_hln = ETHER_ADDR_LEN;
> > > +	rarp->arp_pln = 4;
> > > +	rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> > > +
> > > +	ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> > > +	ether_addr_copy(mac, &rarp->arp_data.arp_tha);
> > > +	memset(&rarp->arp_data.arp_sip, 0x00, 4);
> > > +	memset(&rarp->arp_data.arp_tip, 0x00, 4);
> > > +
> > > +	rarp_mbuf->data_len = RARP_PKT_SIZE;
> > > +	rarp_mbuf->pkt_len = RARP_PKT_SIZE;
> > > +
> > > +	return 0;
> > > +}
> > 
> > Do you think it could make sense to have this function in a lib, as
> > vhost user lib does exactly the same?
> > 
> > I don't know if it could be useful to others than vhost/virtio though.
> 
> Hi Thomas,
> 
> Do you think it's worth adding a new helper for ARP in lib/librte_net/?
> Currently we just need a helper to build RARP packet (the above make_rarp_packet)

Yes, good idea
  
Xiao Wang Jan. 9, 2018, 1:26 p.m. UTC | #5
v8:
- Add a helper in lib/librte_net to make rarp packet, it's used by
  both vhost and virtio.

v7:
- Improve comment for state_lock.
- Rename spinlock variable 'sl' to 'lock'.

v6:
- Use rte_pktmbuf_alloc() instead of rte_mbuf_raw_alloc().
- Remove the 'len' parameter in calling virtio_send_command().
- Remove extra space between type and var.
- Improve comment and alignment.
- Remove the unnecessary header file.
- A better usage of 'unlikely' indication.

v5:
- Remove txvq parameter in virtio_inject_pkts.
- Zero hw->special_buf after using it.
- Return the retval of tx_pkt_burst().
- Allocate a mbuf pointer on stack directly.

v4:
- Move spinlock lock/unlock into dev_pause/resume.
- Separate out a patch for packet injection.

v3:
- Remove Tx function code duplication, use a special pointer for rarp
  injection.
- Rename function generate_rarp to virtio_notify_peers, replace
  'virtnet_' with 'virtio_'.
- Add comment for state_lock.
- Typo fix and comment improvement.

v2:
- Use spaces instead of tabs between the code and comments.
- Remove unnecessary parentheses.
- Use rte_pktmbuf_mtod directly to get eth_hdr addr.
- Fix virtio_dev_pause return value check.

Xiao Wang (5):
  net/virtio: make control queue thread-safe
  net/virtio: add packet injection method
  net: add a helper for making RARP packet
  vhost: use lib API to make RARP packet
  net/virtio: support GUEST ANNOUNCE

 drivers/net/virtio/virtio_ethdev.c      | 118 +++++++++++++++++++++++++++++++-
 drivers/net/virtio/virtio_ethdev.h      |   6 ++
 drivers/net/virtio/virtio_pci.h         |   7 ++
 drivers/net/virtio/virtio_rxtx.c        |   3 +-
 drivers/net/virtio/virtio_rxtx.h        |   1 +
 drivers/net/virtio/virtio_rxtx_simple.c |   2 +-
 drivers/net/virtio/virtqueue.h          |  11 +++
 lib/Makefile                            |   3 +-
 lib/librte_net/Makefile                 |   1 +
 lib/librte_net/rte_arp.c                |  42 ++++++++++++
 lib/librte_net/rte_arp.h                |  14 ++++
 lib/librte_net/rte_net_version.map      |   6 ++
 lib/librte_vhost/Makefile               |   2 +-
 lib/librte_vhost/virtio_net.c           |  41 +----------
 14 files changed, 210 insertions(+), 47 deletions(-)
 create mode 100644 lib/librte_net/rte_arp.c
  
Yuanhan Liu Jan. 9, 2018, 1:36 p.m. UTC | #6
On Tue, Jan 09, 2018 at 12:41:53PM +0100, Thomas Monjalon wrote:
> > > Do you think it could make sense to have this function in a lib, as
> > > vhost user lib does exactly the same?
> > > 
> > > I don't know if it could be useful to others than vhost/virtio though.
> > 
> > Hi Thomas,
> > 
> > Do you think it's worth adding a new helper for ARP in lib/librte_net/?
> > Currently we just need a helper to build RARP packet (the above make_rarp_packet)
> 
> Yes, good idea

+1

	--yliu
  
Maxime Coquelin Jan. 9, 2018, 2:38 p.m. UTC | #7
On 01/09/2018 02:26 PM, Xiao Wang wrote:
> v8:
> - Add a helper in lib/librte_net to make rarp packet, it's used by
>    both vhost and virtio.
> 
> v7:
> - Improve comment for state_lock.
> - Rename spinlock variable 'sl' to 'lock'.
> 
> v6:
> - Use rte_pktmbuf_alloc() instead of rte_mbuf_raw_alloc().
> - Remove the 'len' parameter in calling virtio_send_command().
> - Remove extra space between type and var.
> - Improve comment and alignment.
> - Remove the unnecessary header file.
> - A better usage of 'unlikely' indication.
> 
> v5:
> - Remove txvq parameter in virtio_inject_pkts.
> - Zero hw->special_buf after using it.
> - Return the retval of tx_pkt_burst().
> - Allocate a mbuf pointer on stack directly.
> 
> v4:
> - Move spinlock lock/unlock into dev_pause/resume.
> - Separate out a patch for packet injection.
> 
> v3:
> - Remove Tx function code duplication, use a special pointer for rarp
>    injection.
> - Rename function generate_rarp to virtio_notify_peers, replace
>    'virtnet_' with 'virtio_'.
> - Add comment for state_lock.
> - Typo fix and comment improvement.
> 
> v2:
> - Use spaces instead of tabs between the code and comments.
> - Remove unnecessary parentheses.
> - Use rte_pktmbuf_mtod directly to get eth_hdr addr.
> - Fix virtio_dev_pause return value check.
> 
> Xiao Wang (5):
>    net/virtio: make control queue thread-safe
>    net/virtio: add packet injection method
>    net: add a helper for making RARP packet
Thanks for handling the change!

>    vhost: use lib API to make RARP packet
>    net/virtio: support GUEST ANNOUNCE
> 
>   drivers/net/virtio/virtio_ethdev.c      | 118 +++++++++++++++++++++++++++++++-
>   drivers/net/virtio/virtio_ethdev.h      |   6 ++
>   drivers/net/virtio/virtio_pci.h         |   7 ++
>   drivers/net/virtio/virtio_rxtx.c        |   3 +-
>   drivers/net/virtio/virtio_rxtx.h        |   1 +
>   drivers/net/virtio/virtio_rxtx_simple.c |   2 +-
>   drivers/net/virtio/virtqueue.h          |  11 +++
>   lib/Makefile                            |   3 +-
>   lib/librte_net/Makefile                 |   1 +
>   lib/librte_net/rte_arp.c                |  42 ++++++++++++
>   lib/librte_net/rte_arp.h                |  14 ++++
>   lib/librte_net/rte_net_version.map      |   6 ++
>   lib/librte_vhost/Makefile               |   2 +-
>   lib/librte_vhost/virtio_net.c           |  41 +----------
>   14 files changed, 210 insertions(+), 47 deletions(-)
>   create mode 100644 lib/librte_net/rte_arp.c
> 

For the series:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Maxime
  

Patch

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index e8ff1e449..9606df514 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -19,6 +19,8 @@ 
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
 #include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_arp.h>
 #include <rte_common.h>
 #include <rte_errno.h>
 #include <rte_cpuflags.h>
@@ -78,6 +80,11 @@  static int virtio_dev_queue_stats_mapping_set(
 	uint8_t stat_idx,
 	uint8_t is_rx);
 
+static int make_rarp_packet(struct rte_mbuf *rarp_mbuf,
+		const struct ether_addr *mac);
+static void virtio_notify_peers(struct rte_eth_dev *dev);
+static void virtio_ack_link_announce(struct rte_eth_dev *dev);
+
 /*
  * The set of PCI devices this driver supports
  */
@@ -1272,9 +1279,89 @@  virtio_inject_pkts(struct rte_eth_dev *dev, struct rte_mbuf **tx_pkts,
 	return ret;
 }
 
+#define RARP_PKT_SIZE	64
+static int
+make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr *mac)
+{
+	struct ether_hdr *eth_hdr;
+	struct arp_hdr *rarp;
+
+	if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
+		PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
+				rarp_mbuf->buf_len, RARP_PKT_SIZE);
+		return -1;
+	}
+
+	/* Ethernet header. */
+	eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
+	memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
+	ether_addr_copy(mac, &eth_hdr->s_addr);
+	eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
+
+	/* RARP header. */
+	rarp = (struct arp_hdr *)(eth_hdr + 1);
+	rarp->arp_hrd = htons(ARP_HRD_ETHER);
+	rarp->arp_pro = htons(ETHER_TYPE_IPv4);
+	rarp->arp_hln = ETHER_ADDR_LEN;
+	rarp->arp_pln = 4;
+	rarp->arp_op  = htons(ARP_OP_REVREQUEST);
+
+	ether_addr_copy(mac, &rarp->arp_data.arp_sha);
+	ether_addr_copy(mac, &rarp->arp_data.arp_tha);
+	memset(&rarp->arp_data.arp_sip, 0x00, 4);
+	memset(&rarp->arp_data.arp_tip, 0x00, 4);
+
+	rarp_mbuf->data_len = RARP_PKT_SIZE;
+	rarp_mbuf->pkt_len = RARP_PKT_SIZE;
+
+	return 0;
+}
+
+static void
+virtio_notify_peers(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+	struct virtnet_rx *rxvq = dev->data->rx_queues[0];
+	struct rte_mbuf *rarp_mbuf;
+
+	rarp_mbuf = rte_pktmbuf_alloc(rxvq->mpool);
+	if (rarp_mbuf == NULL) {
+		PMD_DRV_LOG(ERR, "mbuf allocate failed");
+		return;
+	}
+
+	if (make_rarp_packet(rarp_mbuf,
+			(struct ether_addr *)hw->mac_addr) < 0) {
+		rte_pktmbuf_free(rarp_mbuf);
+		return;
+	}
+
+	/* If virtio port just stopped, no need to send RARP */
+	if (virtio_dev_pause(dev) < 0) {
+		rte_pktmbuf_free(rarp_mbuf);
+		return;
+	}
+
+	virtio_inject_pkts(dev, &rarp_mbuf, 1);
+	virtio_dev_resume(dev);
+}
+
+static void
+virtio_ack_link_announce(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+	struct virtio_pmd_ctrl ctrl;
+
+	ctrl.hdr.class = VIRTIO_NET_CTRL_ANNOUNCE;
+	ctrl.hdr.cmd = VIRTIO_NET_CTRL_ANNOUNCE_ACK;
+
+	virtio_send_command(hw->cvq, &ctrl, NULL, 0);
+}
+
 /*
- * Process Virtio Config changed interrupt and call the callback
- * if link state changed.
+ * Process virtio config changed interrupt. Call the callback
+ * if link state changed, generate gratuitous RARP packet if
+ * the status indicates an ANNOUNCE.
  */
 void
 virtio_interrupt_handler(void *param)
@@ -1297,6 +1384,10 @@  virtio_interrupt_handler(void *param)
 						      NULL, NULL);
 	}
 
+	if (isr & VIRTIO_NET_S_ANNOUNCE) {
+		virtio_notify_peers(dev);
+		virtio_ack_link_announce(dev);
+	}
 }
 
 /* set rx and tx handlers according to what is supported */
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index 69b30b7e1..09ebc5fb5 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -38,6 +38,7 @@ 
 	 1u << VIRTIO_NET_F_HOST_TSO6	  |	\
 	 1u << VIRTIO_NET_F_MRG_RXBUF	  |	\
 	 1u << VIRTIO_NET_F_MTU	| \
+	 1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE |	\
 	 1u << VIRTIO_RING_F_INDIRECT_DESC |    \
 	 1ULL << VIRTIO_F_VERSION_1       |	\
 	 1ULL << VIRTIO_F_IOMMU_PLATFORM)
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 1482a951d..60df359b3 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -129,6 +129,17 @@  struct virtio_net_ctrl_mac {
 #define VIRTIO_NET_CTRL_VLAN_ADD 0
 #define VIRTIO_NET_CTRL_VLAN_DEL 1
 
+/*
+ * Control link announce acknowledgement
+ *
+ * The command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that
+ * driver has received the notification; device would clear the
+ * VIRTIO_NET_S_ANNOUNCE bit in the status field after it receives
+ * this command.
+ */
+#define VIRTIO_NET_CTRL_ANNOUNCE     3
+#define VIRTIO_NET_CTRL_ANNOUNCE_ACK 0
+
 struct virtio_net_ctrl_hdr {
 	uint8_t class;
 	uint8_t cmd;