From patchwork Thu Jan 4 15:59:38 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiao Wang
X-Patchwork-Id: 32879
X-Patchwork-Delegate: yuanhan.liu@linux.intel.com
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 6673F1B01F;
 Thu, 4 Jan 2018 08:24:09 +0100 (CET)
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
 by dpdk.org (Postfix) with ESMTP id 8E2291B04C
 for ; Thu, 4 Jan 2018 08:24:07 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga008.jf.intel.com ([10.7.209.65])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 03 Jan 2018 23:24:07 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.45,506,1508828400"; d="scan'208";a="7933835"
Received: from dpdk-xiao-1.sh.intel.com ([10.67.110.153])
 by orsmga008.jf.intel.com with ESMTP; 03 Jan 2018 23:24:06 -0800
From: Xiao Wang
To: tiwei.bie@intel.com
Cc: dev@dpdk.org, yliu@fridaylinux.org, stephen@networkplumber.org,
 Xiao Wang
Date: Thu, 4 Jan 2018 07:59:38 -0800
Message-Id: <1515081578-30649-4-git-send-email-xiao.w.wang@intel.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1515081578-30649-1-git-send-email-xiao.w.wang@intel.com>
References: <1515051700-117262-3-git-send-email-xiao.w.wang@intel.com>
 <1515081578-30649-1-git-send-email-xiao.w.wang@intel.com>
Subject: [dpdk-dev] [PATCH v4 3/3] net/virtio: support GUEST ANNOUNCE
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

When live migration is done, for the backup VM, either the virtio
frontend or the vhost backend needs to send out gratuitous RARP packet
to announce its new network location.

This patch enables the VIRTIO_NET_F_GUEST_ANNOUNCE feature to support
live migration scenarios where the vhost backend cannot generate the
RARP packet itself.

Brief introduction of the work flow:
1. QEMU finishes live migration and pokes the backup VM with an
   interrupt.
2. The virtio interrupt handler reads the interrupt status value and
   finds that it needs to send a RARP packet to announce its location.
3. Pause the device to stop the worker thread from touching the queues.
4. Inject a RARP packet into a Tx queue.
5. Resume the device to continue packet processing.
6. Ack the interrupt via the control queue.

Signed-off-by: Xiao Wang
---
 drivers/net/virtio/virtio_ethdev.c | 108 ++++++++++++++++++++++++++++++++++++-
 drivers/net/virtio/virtio_ethdev.h |   1 +
 drivers/net/virtio/virtqueue.h     |  11 ++++
 3 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 6745de7..288a1a7 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -48,6 +48,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -107,6 +109,11 @@ static int virtio_dev_queue_stats_mapping_set(
 	uint8_t stat_idx,
 	uint8_t is_rx);
 
+static int make_rarp_packet(struct rte_mbuf *rarp_mbuf,
+		const struct ether_addr *mac);
+static void virtio_notify_peers(struct rte_eth_dev *dev);
+static void virtio_ack_link_announce(struct rte_eth_dev *dev);
+
 /*
  * The set of PCI devices this driver supports
  */
@@ -1289,9 +1296,102 @@ static int virtio_dev_xstats_get_names(struct rte_eth_dev *dev,
 	dev->tx_pkt_burst(txvq, buf, count);
 }
 
+#define RARP_PKT_SIZE	64
+static int
+make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr *mac)
+{
+	struct ether_hdr *eth_hdr;
+	struct arp_hdr *rarp;
+
+	if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
+		PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
+				rarp_mbuf->buf_len, RARP_PKT_SIZE);
+		return -1;
+	}
+
+	/* Ethernet header. */
+	eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
+	memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
+	ether_addr_copy(mac, &eth_hdr->s_addr);
+	eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
+
+	/* RARP header. */
+	rarp = (struct arp_hdr *)(eth_hdr + 1);
+	rarp->arp_hrd = htons(ARP_HRD_ETHER);
+	rarp->arp_pro = htons(ETHER_TYPE_IPv4);
+	rarp->arp_hln = ETHER_ADDR_LEN;
+	rarp->arp_pln = 4;
+	rarp->arp_op = htons(ARP_OP_REVREQUEST);
+
+	ether_addr_copy(mac, &rarp->arp_data.arp_sha);
+	ether_addr_copy(mac, &rarp->arp_data.arp_tha);
+	memset(&rarp->arp_data.arp_sip, 0x00, 4);
+	memset(&rarp->arp_data.arp_tip, 0x00, 4);
+
+	rarp_mbuf->data_len = RARP_PKT_SIZE;
+	rarp_mbuf->pkt_len = RARP_PKT_SIZE;
+
+	return 0;
+}
+
+static void
+virtio_notify_peers(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+	struct virtnet_tx *txvq = dev->data->tx_queues[0];
+	struct virtnet_rx *rxvq = dev->data->rx_queues[0];
+	struct rte_mbuf **rarp_buf;
+
+	rarp_buf = rte_zmalloc("rarp_buf", sizeof(struct rte_mbuf *), 0);
+	if (!rarp_buf) {
+		PMD_INIT_LOG(ERR, "Failed to allocate rarp pointer");
+		return;
+	}
+
+	rarp_buf[0] = rte_mbuf_raw_alloc(rxvq->mpool);
+	if (rarp_buf[0] == NULL) {
+		PMD_DRV_LOG(ERR, "first mbuf allocate failed");
+		goto free_buf;
+	}
+
+	if (make_rarp_packet(rarp_buf[0],
+				(struct ether_addr *)hw->mac_addr)) {
+		rte_pktmbuf_free(rarp_buf[0]);
+		goto free_buf;
+	}
+
+	/* If virtio port just stopped, no need to send RARP */
+	if (virtio_dev_pause(dev) < 0) {
+		rte_pktmbuf_free(rarp_buf[0]);
+		goto free_buf;
+	}
+
+	virtio_inject_pkts(dev, txvq, rarp_buf, 1);
+	/* Recover the stored hw status to let worker thread continue */
+	virtio_dev_resume(dev);
+
+free_buf:
+	rte_free(rarp_buf);
+}
+
+static void
+virtio_ack_link_announce(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+	struct virtio_pmd_ctrl ctrl;
+	int len;
+
+	ctrl.hdr.class = VIRTIO_NET_CTRL_ANNOUNCE;
+	ctrl.hdr.cmd = VIRTIO_NET_CTRL_ANNOUNCE_ACK;
+	len = 0;
+
+	virtio_send_command(hw->cvq, &ctrl, &len, 0);
+}
+
 /*
- * Process Virtio Config changed interrupt and call the callback
- * if link state changed.
+ * Process virtio config changed interrupt. Call the callback
+ * if link state changed, generate gratuitous RARP packet if
+ * the status indicates an ANNOUNCE.
  */
 void
 virtio_interrupt_handler(void *param)
@@ -1314,6 +1414,10 @@ static int virtio_dev_xstats_get_names(struct rte_eth_dev *dev,
 						      NULL, NULL);
 	}
 
+	if (isr & VIRTIO_NET_S_ANNOUNCE) {
+		virtio_notify_peers(dev);
+		virtio_ack_link_announce(dev);
+	}
 }
 
 /* set rx and tx handlers according to what is supported */
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index e973de3..13a5c86 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -68,6 +68,7 @@
 	 1u << VIRTIO_NET_F_HOST_TSO6	  |	\
 	 1u << VIRTIO_NET_F_MRG_RXBUF	  |	\
 	 1u << VIRTIO_NET_F_MTU	| \
+	 1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE |	\
 	 1u << VIRTIO_RING_F_INDIRECT_DESC |    \
 	 1ULL << VIRTIO_F_VERSION_1       |	\
 	 1ULL << VIRTIO_F_IOMMU_PLATFORM)
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 2305d91..d9045e1 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -158,6 +158,17 @@ struct virtio_net_ctrl_mac {
 #define VIRTIO_NET_CTRL_VLAN_ADD 0
 #define VIRTIO_NET_CTRL_VLAN_DEL 1
 
+/*
+ * Control link announce acknowledgement
+ *
+ * The command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that
+ * driver has received the notification; device would clear the
+ * VIRTIO_NET_S_ANNOUNCE bit in the status field after it receives
+ * this command.
+ */
+#define VIRTIO_NET_CTRL_ANNOUNCE 3
+#define VIRTIO_NET_CTRL_ANNOUNCE_ACK 0
+
 struct virtio_net_ctrl_hdr {
 	uint8_t class;
 	uint8_t cmd;
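
Note for reviewers: the frame layout produced by make_rarp_packet() can
be checked outside of DPDK with a small standalone sketch such as the
one below. It is not part of the patch. All sketch_* and SKETCH_* names
are hypothetical and exist only for this illustration; the constants
mirror the standard Ethernet/RARP values (ethertype 0x8035, opcode 3).
One deliberate difference, which is an assumption on my side: the sketch
zeroes the whole 64-byte buffer first, so the padding after the 42-byte
header region is deterministic, whereas the patch only sets the header
fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch only; values are the standard ones. */
#define SKETCH_ETH_ALEN        6
#define SKETCH_ETHERTYPE_IPV4  0x0800
#define SKETCH_ETHERTYPE_RARP  0x8035
#define SKETCH_ARP_HRD_ETHER   1
#define SKETCH_ARP_OP_REVREQ   3
#define SKETCH_RARP_PKT_SIZE   64

/* Store a 16-bit value in network byte order (big endian). */
static void sketch_put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Fill buf with a broadcast RARP request announcing mac.
 * Returns the frame length, or -1 if buf is too small.
 */
static int sketch_make_rarp(uint8_t *buf, size_t buf_len,
			    const uint8_t mac[SKETCH_ETH_ALEN])
{
	uint8_t *p = buf;

	if (buf_len < SKETCH_RARP_PKT_SIZE)
		return -1;

	/* Zero everything so IP fields and padding are well defined. */
	memset(buf, 0, SKETCH_RARP_PKT_SIZE);

	/* Ethernet header: broadcast destination, our MAC as source. */
	memset(p, 0xff, SKETCH_ETH_ALEN);
	p += SKETCH_ETH_ALEN;
	memcpy(p, mac, SKETCH_ETH_ALEN);
	p += SKETCH_ETH_ALEN;
	sketch_put_be16(p, SKETCH_ETHERTYPE_RARP);
	p += 2;

	/* RARP header: Ethernet hardware, IPv4 protocol, reverse request. */
	sketch_put_be16(p, SKETCH_ARP_HRD_ETHER);
	p += 2;
	sketch_put_be16(p, SKETCH_ETHERTYPE_IPV4);
	p += 2;
	*p++ = SKETCH_ETH_ALEN;	/* hardware address length */
	*p++ = 4;		/* protocol address length */
	sketch_put_be16(p, SKETCH_ARP_OP_REVREQ);
	p += 2;

	/* Sender MAC, sender IP (left zero), target MAC, target IP (zero). */
	memcpy(p, mac, SKETCH_ETH_ALEN);
	p += SKETCH_ETH_ALEN + 4;
	memcpy(p, mac, SKETCH_ETH_ALEN);

	return SKETCH_RARP_PKT_SIZE;
}
```

Calling sketch_make_rarp() with a 64-byte buffer and the port MAC yields
a frame whose bytes 12-13 hold the RARP ethertype and whose sender and
target hardware addresses (offsets 22 and 32) both carry the announced
MAC, which is the same layout the patch writes into the mbuf.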