Show a patch. A GET request to /api/patches/{id}/ returns the full patch detail: project, submitter, current state, mail headers, message content, and the diff itself.
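The request below can be issued with only the Python standard library. A minimal sketch, not part of the API itself — the base URL is taken from the sample response; substitute your own Patchwork instance:

```python
# Sketch: fetch a patch from a Patchwork REST API with the stdlib only.
import json
import urllib.request

API_BASE = "http://patches.dpdk.org/api"  # assumption: the instance shown below

def patch_url(patch_id):
    # Mirrors the "url" field of the response body.
    return "{}/patches/{}/".format(API_BASE, patch_id)

def fetch_patch(patch_id):
    # Issue the GET and decode the JSON body into a dict.
    with urllib.request.urlopen(patch_url(patch_id)) as resp:
        return json.load(resp)
```

For example, fetch_patch(473) would return the response body shown below as a dict.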

GET /api/patches/473/
HTTP 200 OK
Allow: GET, PUT, PATCH, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "id": 473,
    "url": "http://patches.dpdk.org/api/patches/473/",
    "web_url": "http://patches.dpdk.org/patch/473/",
    "project": {
        "id": 1,
        "url": "http://patches.dpdk.org/api/projects/1/",
        "name": "DPDK",
        "link_name": "dpdk",
        "list_id": "dev.dpdk.org",
        "list_email": "dev@dpdk.org",
        "web_url": "http://core.dpdk.org",
        "scm_url": "git://dpdk.org/dpdk",
        "webscm_url": "http://git.dpdk.org/dpdk"
    },
    "msgid": "<6BD6202160B55B409D423293115822625483C6@SHSMSX101.ccr.corp.intel.com>",
    "date": "2014-09-24T09:25:58",
    "name": "[dpdk-dev] examples/vhost: Support jumbo frame in user space vhost",
    "commit_ref": null,
    "pull_url": null,
    "state": "not-applicable",
    "archived": true,
    "hash": "475f021acd788b3a520b97abbccbb8f598056812",
    "submitter": {
        "id": 45,
        "url": "http://patches.dpdk.org/api/people/45/",
        "name": "Fu, JingguoX",
        "email": "jingguox.fu@intel.com"
    },
    "delegate": null,
    "mbox": "http://patches.dpdk.org/patch/473/mbox/",
    "series": [],
    "comments": "http://patches.dpdk.org/api/patches/473/comments/",
    "check": "pending",
    "checks": "http://patches.dpdk.org/api/patches/473/checks/",
    "tags": {},
    "headers": {
        "Return-Path": "<dev-bounces@dpdk.org>",
        "References": "<1408078681-3511-1-git-send-email-changchun.ouyang@intel.com>",
        "X-Mailman-Version": "2.1.15",
        "X-IronPort-AV": "E=Sophos;i=\"5.04,587,1406617200\"; d=\"scan'208\";a=\"604551227\"",
        "Date": "Wed, 24 Sep 2014 09:25:58 +0000",
        "x-originating-ip": "[10.239.127.40]",
        "List-Post": "<mailto:dev@dpdk.org>",
        "Thread-Index": "AQHPuEYmGwUpmTHly0ye3BDP+3zY1ZwQQgNw",
        "Content-Type": "text/plain; charset=\"us-ascii\"",
        "Delivered-To": "patchwork@dpdk.org",
        "X-Original-To": "patchwork@dpdk.org",
        "Received": [
            "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id B17BFB344;\n\tWed, 24 Sep 2014 11:19:53 +0200 (CEST)",
            "from mga11.intel.com (mga11.intel.com [192.55.52.93])\n\tby dpdk.org (Postfix) with ESMTP id 05ADF6885\n\tfor <dev@dpdk.org>; Wed, 24 Sep 2014 11:19:50 +0200 (CEST)",
            "from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby fmsmga102.fm.intel.com with ESMTP; 24 Sep 2014 02:26:02 -0700",
            "from fmsmsx104.amr.corp.intel.com ([10.18.124.202])\n\tby fmsmga002.fm.intel.com with ESMTP; 24 Sep 2014 02:26:01 -0700",
            "from fmsmsx152.amr.corp.intel.com (10.18.125.5) by\n\tfmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Wed, 24 Sep 2014 02:26:01 -0700",
            "from shsmsx152.ccr.corp.intel.com (10.239.6.52) by\n\tFMSMSX152.amr.corp.intel.com (10.18.125.5) with Microsoft SMTP Server\n\t(TLS) id 14.3.195.1; Wed, 24 Sep 2014 02:26:00 -0700",
            "from shsmsx101.ccr.corp.intel.com ([169.254.1.203]) by\n\tSHSMSX152.ccr.corp.intel.com ([169.254.6.190]) with mapi id\n\t14.03.0195.001; Wed, 24 Sep 2014 17:25:58 +0800"
        ],
        "Subject": "Re: [dpdk-dev] [PATCH] examples/vhost: Support jumbo frame in\n\tuser\tspace vhost",
        "Sender": "\"dev\" <dev-bounces@dpdk.org>",
        "List-Help": "<mailto:dev-request@dpdk.org?subject=help>",
        "Content-Language": "en-US",
        "Accept-Language": "en-US",
        "Message-ID": "<6BD6202160B55B409D423293115822625483C6@SHSMSX101.ccr.corp.intel.com>",
        "X-MS-Has-Attach": "",
        "X-BeenThere": "dev@dpdk.org",
        "From": "\"Fu, JingguoX\" <jingguox.fu@intel.com>",
        "List-Archive": "<http://dpdk.org/ml/archives/dev/>",
        "X-ExtLoop1": "1",
        "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>",
        "List-Id": "patches and discussions about DPDK <dev.dpdk.org>",
        "Precedence": "list",
        "Thread-Topic": "[dpdk-dev] [PATCH] examples/vhost: Support jumbo frame in user\n\tspace vhost",
        "In-Reply-To": "<1408078681-3511-1-git-send-email-changchun.ouyang@intel.com>",
        "Errors-To": "dev-bounces@dpdk.org",
        "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>",
        "MIME-Version": "1.0",
        "Content-Transfer-Encoding": "quoted-printable",
        "To": "\"dev@dpdk.org\" <dev@dpdk.org>",
        "X-MS-TNEF-Correlator": ""
    },
    "content": "Tested-by: Jingguo Fu <jingguox.fu at intel.com>\n\nThis patch includes 1 file, and has been tested by Intel.\nPlease see information as the following:\n\nHost:\nFedora 19 x86_64, Linux Kernel 3.9.0, GCC 4.8.2  Intel Xeon CPU E5-2680 v2 @ 2.80GHz\n NIC: Intel Niantic 82599, Intel i350, Intel 82580 and Intel 82576\n\nGuest:\nFedora 16 x86_64, Linux Kernel 3.4.2, GCC 4.6.3 Qemu emulator 1.4.2\n\nWe verified zero copy and one copy functional test and performance test, that is regression test with front end support jumbo frame \nWe verified jumbo frame support on front end, with linux legacy back end.\nWe verified jumbo frame support on front end, with vhost backend\n\n-----Original Message-----\nFrom: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ouyang Changchun\nSent: Friday, August 15, 2014 12:58\nTo: dev@dpdk.org\nSubject: [dpdk-dev] [PATCH] examples/vhost: Support jumbo frame in user space vhost\n\nThis patch support mergeable RX feature and thus support jumbo frame RX and TX\nin user space vhost(as virtio backend).\n \nOn RX, it secures enough room from vring to accommodate one complete scattered\npacket which is received by PMD from physical port, and then copy data from\nmbuf to vring buffer, possibly across a few vring entries and descriptors.\n \nOn TX, it gets a jumbo frame, possibly described by a few vring descriptors which\nare chained together with the flags of 'NEXT', and then copy them into one scattered\npacket and TX it to physical port through PMD.\n\nSigned-off-by: Changchun Ouyang <changchun.ouyang@intel.com>\nAcked-by: Huawei Xie <huawei.xie@intel.com>\n---\n examples/vhost/main.c       | 726 ++++++++++++++++++++++++++++++++++++++++----\n examples/vhost/virtio-net.h |  14 +\n 2 files changed, 687 insertions(+), 53 deletions(-)",
    "diff": "diff --git a/examples/vhost/main.c b/examples/vhost/main.c\nindex 193aa25..7d9e6a2 100644\n--- a/examples/vhost/main.c\n+++ b/examples/vhost/main.c\n@@ -106,6 +106,8 @@\n #define BURST_RX_WAIT_US 15 \t/* Defines how long we wait between retries on RX */\n #define BURST_RX_RETRIES 4\t\t/* Number of retries on RX. */\n \n+#define JUMBO_FRAME_MAX_SIZE    0x2600\n+\n /* State of virtio device. */\n #define DEVICE_MAC_LEARNING 0\n #define DEVICE_RX\t\t\t1\n@@ -676,8 +678,12 @@ us_vhost_parse_args(int argc, char **argv)\n \t\t\t\t\tus_vhost_usage(prgname);\n \t\t\t\t\treturn -1;\n \t\t\t\t} else {\n-\t\t\t\t\tif (ret)\n+\t\t\t\t\tif (ret) {\n+\t\t\t\t\t\tvmdq_conf_default.rxmode.jumbo_frame = 1;\n+\t\t\t\t\t\tvmdq_conf_default.rxmode.max_rx_pkt_len\n+\t\t\t\t\t\t\t= JUMBO_FRAME_MAX_SIZE;\n \t\t\t\t\t\tVHOST_FEATURES = (1ULL << VIRTIO_NET_F_MRG_RXBUF);\n+\t\t\t\t\t}\n \t\t\t\t}\n \t\t\t}\n \n@@ -797,6 +803,14 @@ us_vhost_parse_args(int argc, char **argv)\n \t\treturn -1;\n \t}\n \n+\tif ((zero_copy == 1) && (vmdq_conf_default.rxmode.jumbo_frame == 1)) {\n+\t\tRTE_LOG(INFO, VHOST_PORT,\n+\t\t\t\"Vhost zero copy doesn't support jumbo frame,\"\n+\t\t\t\"please specify '--mergeable 0' to disable the \"\n+\t\t\t\"mergeable feature.\\n\");\n+\t\treturn -1;\n+\t}\n+\n \treturn 0;\n }\n \n@@ -916,7 +930,7 @@ gpa_to_hpa(struct virtio_net *dev, uint64_t guest_pa,\n  * This function adds buffers to the virtio devices RX virtqueue. Buffers can\n  * be received from the physical port or from another virtio device. A packet\n  * count is returned to indicate the number of packets that were succesfully\n- * added to the RX queue.\n+ * added to the RX queue. 
This function works when mergeable is disabled.\n  */\n static inline uint32_t __attribute__((always_inline))\n virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)\n@@ -930,7 +944,6 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)\n \tuint64_t buff_hdr_addr = 0;\n \tuint32_t head[MAX_PKT_BURST], packet_len = 0;\n \tuint32_t head_idx, packet_success = 0;\n-\tuint32_t mergeable, mrg_count = 0;\n \tuint32_t retry = 0;\n \tuint16_t avail_idx, res_cur_idx;\n \tuint16_t res_base_idx, res_end_idx;\n@@ -940,6 +953,7 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)\n \tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") virtio_dev_rx()\\n\", dev->device_fh);\n \tvq = dev->virtqueue[VIRTIO_RXQ];\n \tcount = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;\n+\n \t/* As many data cores may want access to available buffers, they need to be reserved. */\n \tdo {\n \t\tres_base_idx = vq->last_used_idx_res;\n@@ -976,9 +990,6 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)\n \t/* Prefetch available ring to retrieve indexes. */\n \trte_prefetch0(&vq->avail->ring[res_cur_idx & (vq->size - 1)]);\n \n-\t/* Check if the VIRTIO_NET_F_MRG_RXBUF feature is enabled. */\n-\tmergeable = dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF);\n-\n \t/* Retrieve all of the head indexes first to avoid caching issues. */\n \tfor (head_idx = 0; head_idx < count; head_idx++)\n \t\thead[head_idx] = vq->avail->ring[(res_cur_idx + head_idx) & (vq->size - 1)];\n@@ -997,56 +1008,44 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)\n \t\t/* Prefetch buffer address. 
*/\n \t\trte_prefetch0((void*)(uintptr_t)buff_addr);\n \n-\t\tif (mergeable && (mrg_count != 0)) {\n-\t\t\tdesc->len = packet_len = rte_pktmbuf_data_len(buff);\n-\t\t} else {\n-\t\t\t/* Copy virtio_hdr to packet and increment buffer address */\n-\t\t\tbuff_hdr_addr = buff_addr;\n-\t\t\tpacket_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;\n+\t\t/* Copy virtio_hdr to packet and increment buffer address */\n+\t\tbuff_hdr_addr = buff_addr;\n+\t\tpacket_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;\n \n-\t\t\t/*\n-\t\t\t * If the descriptors are chained the header and data are placed in\n-\t\t\t * separate buffers.\n-\t\t\t */\n-\t\t\tif (desc->flags & VRING_DESC_F_NEXT) {\n-\t\t\t\tdesc->len = vq->vhost_hlen;\n-\t\t\t\tdesc = &vq->desc[desc->next];\n-\t\t\t\t/* Buffer address translation. */\n-\t\t\t\tbuff_addr = gpa_to_vva(dev, desc->addr);\n-\t\t\t\tdesc->len = rte_pktmbuf_data_len(buff);\n-\t\t\t} else {\n-\t\t\t\tbuff_addr += vq->vhost_hlen;\n-\t\t\t\tdesc->len = packet_len;\n-\t\t\t}\n+\t\t/*\n+\t\t * If the descriptors are chained the header and data are\n+\t\t * placed in separate buffers.\n+\t\t */\n+\t\tif (desc->flags & VRING_DESC_F_NEXT) {\n+\t\t\tdesc->len = vq->vhost_hlen;\n+\t\t\tdesc = &vq->desc[desc->next];\n+\t\t\t/* Buffer address translation. 
*/\n+\t\t\tbuff_addr = gpa_to_vva(dev, desc->addr);\n+\t\t\tdesc->len = rte_pktmbuf_data_len(buff);\n+\t\t} else {\n+\t\t\tbuff_addr += vq->vhost_hlen;\n+\t\t\tdesc->len = packet_len;\n \t\t}\n \n-\t\tPRINT_PACKET(dev, (uintptr_t)buff_addr, rte_pktmbuf_data_len(buff), 0);\n-\n \t\t/* Update used ring with desc information */\n \t\tvq->used->ring[res_cur_idx & (vq->size - 1)].id = head[packet_success];\n \t\tvq->used->ring[res_cur_idx & (vq->size - 1)].len = packet_len;\n \n \t\t/* Copy mbuf data to buffer */\n-\t\trte_memcpy((void *)(uintptr_t)buff_addr, (const void*)buff->pkt.data, rte_pktmbuf_data_len(buff));\n+\t\trte_memcpy((void *)(uintptr_t)buff_addr,\n+\t\t\t(const void *)buff->pkt.data,\n+\t\t\trte_pktmbuf_data_len(buff));\n+\t\tPRINT_PACKET(dev, (uintptr_t)buff_addr,\n+\t\t\trte_pktmbuf_data_len(buff), 0);\n \n \t\tres_cur_idx++;\n \t\tpacket_success++;\n \n-\t\t/* If mergeable is disabled then a header is required per buffer. */\n-\t\tif (!mergeable) {\n-\t\t\trte_memcpy((void *)(uintptr_t)buff_hdr_addr, (const void*)&virtio_hdr, vq->vhost_hlen);\n-\t\t\tPRINT_PACKET(dev, (uintptr_t)buff_hdr_addr, vq->vhost_hlen, 1);\n-\t\t} else {\n-\t\t\tmrg_count++;\n-\t\t\t/* Merge buffer can only handle so many buffers at a time. Tell the guest if this limit is reached. 
*/\n-\t\t\tif ((mrg_count == MAX_MRG_PKT_BURST) || (res_cur_idx == res_end_idx)) {\n-\t\t\t\tvirtio_hdr.num_buffers = mrg_count;\n-\t\t\t\tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") RX: Num merge buffers %d\\n\", dev->device_fh, virtio_hdr.num_buffers);\n-\t\t\t\trte_memcpy((void *)(uintptr_t)buff_hdr_addr, (const void*)&virtio_hdr, vq->vhost_hlen);\n-\t\t\t\tPRINT_PACKET(dev, (uintptr_t)buff_hdr_addr, vq->vhost_hlen, 1);\n-\t\t\t\tmrg_count = 0;\n-\t\t\t}\n-\t\t}\n+\t\trte_memcpy((void *)(uintptr_t)buff_hdr_addr,\n+\t\t\t(const void *)&virtio_hdr, vq->vhost_hlen);\n+\n+\t\tPRINT_PACKET(dev, (uintptr_t)buff_hdr_addr, vq->vhost_hlen, 1);\n+\n \t\tif (res_cur_idx < res_end_idx) {\n \t\t\t/* Prefetch descriptor index. */\n \t\t\trte_prefetch0(&vq->desc[head[packet_success]]);\n@@ -1068,6 +1067,356 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)\n \treturn count;\n }\n \n+static inline uint32_t __attribute__((always_inline))\n+copy_from_mbuf_to_vring(struct virtio_net *dev,\n+\tuint16_t res_base_idx, uint16_t res_end_idx,\n+\tstruct rte_mbuf *pkt)\n+{\n+\tuint32_t vec_idx = 0;\n+\tuint32_t entry_success = 0;\n+\tstruct vhost_virtqueue *vq;\n+\t/* The virtio_hdr is initialised to 0. */\n+\tstruct virtio_net_hdr_mrg_rxbuf virtio_hdr = {\n+\t\t{0, 0, 0, 0, 0, 0}, 0};\n+\tuint16_t cur_idx = res_base_idx;\n+\tuint64_t vb_addr = 0;\n+\tuint64_t vb_hdr_addr = 0;\n+\tuint32_t seg_offset = 0;\n+\tuint32_t vb_offset = 0;\n+\tuint32_t seg_avail;\n+\tuint32_t vb_avail;\n+\tuint32_t cpy_len, entry_len;\n+\n+\tif (pkt == NULL)\n+\t\treturn 0;\n+\n+\tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") Current Index %d| \"\n+\t\t\"End Index %d\\n\",\n+\t\tdev->device_fh, cur_idx, res_end_idx);\n+\n+\t/*\n+\t * Convert from gpa to vva\n+\t * (guest physical addr -> vhost virtual addr)\n+\t */\n+\tvq = dev->virtqueue[VIRTIO_RXQ];\n+\tvb_addr =\n+\t\tgpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);\n+\tvb_hdr_addr = vb_addr;\n+\n+\t/* Prefetch buffer address. 
*/\n+\trte_prefetch0((void *)(uintptr_t)vb_addr);\n+\n+\tvirtio_hdr.num_buffers = res_end_idx - res_base_idx;\n+\n+\tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") RX: Num merge buffers %d\\n\",\n+\t\tdev->device_fh, virtio_hdr.num_buffers);\n+\n+\trte_memcpy((void *)(uintptr_t)vb_hdr_addr,\n+\t\t(const void *)&virtio_hdr, vq->vhost_hlen);\n+\n+\tPRINT_PACKET(dev, (uintptr_t)vb_hdr_addr, vq->vhost_hlen, 1);\n+\n+\tseg_avail = rte_pktmbuf_data_len(pkt);\n+\tvb_offset = vq->vhost_hlen;\n+\tvb_avail =\n+\t\tvq->buf_vec[vec_idx].buf_len - vq->vhost_hlen;\n+\n+\tentry_len = vq->vhost_hlen;\n+\n+\tif (vb_avail == 0) {\n+\t\tuint32_t desc_idx =\n+\t\t\tvq->buf_vec[vec_idx].desc_idx;\n+\t\tvq->desc[desc_idx].len = vq->vhost_hlen;\n+\n+\t\tif ((vq->desc[desc_idx].flags\n+\t\t\t& VRING_DESC_F_NEXT) == 0) {\n+\t\t\t/* Update used ring with desc information */\n+\t\t\tvq->used->ring[cur_idx & (vq->size - 1)].id\n+\t\t\t\t= vq->buf_vec[vec_idx].desc_idx;\n+\t\t\tvq->used->ring[cur_idx & (vq->size - 1)].len\n+\t\t\t\t= entry_len;\n+\n+\t\t\tentry_len = 0;\n+\t\t\tcur_idx++;\n+\t\t\tentry_success++;\n+\t\t}\n+\n+\t\tvec_idx++;\n+\t\tvb_addr =\n+\t\t\tgpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);\n+\n+\t\t/* Prefetch buffer address. 
*/\n+\t\trte_prefetch0((void *)(uintptr_t)vb_addr);\n+\t\tvb_offset = 0;\n+\t\tvb_avail = vq->buf_vec[vec_idx].buf_len;\n+\t}\n+\n+\tcpy_len = RTE_MIN(vb_avail, seg_avail);\n+\n+\twhile (cpy_len > 0) {\n+\t\t/* Copy mbuf data to vring buffer */\n+\t\trte_memcpy((void *)(uintptr_t)(vb_addr + vb_offset),\n+\t\t\t(const void *)(rte_pktmbuf_mtod(pkt, char*) + seg_offset),\n+\t\t\tcpy_len);\n+\n+\t\tPRINT_PACKET(dev,\n+\t\t\t(uintptr_t)(vb_addr + vb_offset),\n+\t\t\tcpy_len, 0);\n+\n+\t\tseg_offset += cpy_len;\n+\t\tvb_offset += cpy_len;\n+\t\tseg_avail -= cpy_len;\n+\t\tvb_avail -= cpy_len;\n+\t\tentry_len += cpy_len;\n+\n+\t\tif (seg_avail != 0) {\n+\t\t\t/*\n+\t\t\t * The virtio buffer in this vring\n+\t\t\t * entry reach to its end.\n+\t\t\t * But the segment doesn't complete.\n+\t\t\t */\n+\t\t\tif ((vq->desc[vq->buf_vec[vec_idx].desc_idx].flags &\n+\t\t\t\tVRING_DESC_F_NEXT) == 0) {\n+\t\t\t\t/* Update used ring with desc information */\n+\t\t\t\tvq->used->ring[cur_idx & (vq->size - 1)].id\n+\t\t\t\t\t= vq->buf_vec[vec_idx].desc_idx;\n+\t\t\t\tvq->used->ring[cur_idx & (vq->size - 1)].len\n+\t\t\t\t\t= entry_len;\n+\t\t\t\tentry_len = 0;\n+\t\t\t\tcur_idx++;\n+\t\t\t\tentry_success++;\n+\t\t\t}\n+\n+\t\t\tvec_idx++;\n+\t\t\tvb_addr = gpa_to_vva(dev,\n+\t\t\t\tvq->buf_vec[vec_idx].buf_addr);\n+\t\t\tvb_offset = 0;\n+\t\t\tvb_avail = vq->buf_vec[vec_idx].buf_len;\n+\t\t\tcpy_len = RTE_MIN(vb_avail, seg_avail);\n+\t\t} else {\n+\t\t\t/*\n+\t\t\t * This current segment complete, need continue to\n+\t\t\t * check if the whole packet complete or not.\n+\t\t\t */\n+\t\t\tpkt = pkt->pkt.next;\n+\t\t\tif (pkt != NULL) {\n+\t\t\t\t/*\n+\t\t\t\t * There are more segments.\n+\t\t\t\t */\n+\t\t\t\tif (vb_avail == 0) {\n+\t\t\t\t\t/*\n+\t\t\t\t\t * This current buffer from vring is\n+\t\t\t\t\t * used up, need fetch next buffer\n+\t\t\t\t\t * from buf_vec.\n+\t\t\t\t\t */\n+\t\t\t\t\tuint32_t desc_idx 
=\n+\t\t\t\t\t\tvq->buf_vec[vec_idx].desc_idx;\n+\t\t\t\t\tvq->desc[desc_idx].len = vb_offset;\n+\n+\t\t\t\t\tif ((vq->desc[desc_idx].flags &\n+\t\t\t\t\t\tVRING_DESC_F_NEXT) == 0) {\n+\t\t\t\t\t\tuint16_t wrapped_idx =\n+\t\t\t\t\t\t\tcur_idx & (vq->size - 1);\n+\t\t\t\t\t\t/*\n+\t\t\t\t\t\t * Update used ring with the\n+\t\t\t\t\t\t * descriptor information\n+\t\t\t\t\t\t */\n+\t\t\t\t\t\tvq->used->ring[wrapped_idx].id\n+\t\t\t\t\t\t\t= desc_idx;\n+\t\t\t\t\t\tvq->used->ring[wrapped_idx].len\n+\t\t\t\t\t\t\t= entry_len;\n+\t\t\t\t\t\tentry_success++;\n+\t\t\t\t\t\tentry_len = 0;\n+\t\t\t\t\t\tcur_idx++;\n+\t\t\t\t\t}\n+\n+\t\t\t\t\t/* Get next buffer from buf_vec. */\n+\t\t\t\t\tvec_idx++;\n+\t\t\t\t\tvb_addr = gpa_to_vva(dev,\n+\t\t\t\t\t\tvq->buf_vec[vec_idx].buf_addr);\n+\t\t\t\t\tvb_avail =\n+\t\t\t\t\t\tvq->buf_vec[vec_idx].buf_len;\n+\t\t\t\t\tvb_offset = 0;\n+\t\t\t\t}\n+\n+\t\t\t\tseg_offset = 0;\n+\t\t\t\tseg_avail = rte_pktmbuf_data_len(pkt);\n+\t\t\t\tcpy_len = RTE_MIN(vb_avail, seg_avail);\n+\t\t\t} else {\n+\t\t\t\t/*\n+\t\t\t\t * This whole packet completes.\n+\t\t\t\t */\n+\t\t\t\tuint32_t desc_idx =\n+\t\t\t\t\tvq->buf_vec[vec_idx].desc_idx;\n+\t\t\t\tvq->desc[desc_idx].len = vb_offset;\n+\n+\t\t\t\twhile (vq->desc[desc_idx].flags &\n+\t\t\t\t\tVRING_DESC_F_NEXT) {\n+\t\t\t\t\tdesc_idx = vq->desc[desc_idx].next;\n+\t\t\t\t\t vq->desc[desc_idx].len = 0;\n+\t\t\t\t}\n+\n+\t\t\t\t/* Update used ring with desc information */\n+\t\t\t\tvq->used->ring[cur_idx & (vq->size - 1)].id\n+\t\t\t\t\t= vq->buf_vec[vec_idx].desc_idx;\n+\t\t\t\tvq->used->ring[cur_idx & (vq->size - 1)].len\n+\t\t\t\t\t= entry_len;\n+\t\t\t\tentry_len = 0;\n+\t\t\t\tcur_idx++;\n+\t\t\t\tentry_success++;\n+\t\t\t\tseg_avail = 0;\n+\t\t\t\tcpy_len = RTE_MIN(vb_avail, seg_avail);\n+\t\t\t}\n+\t\t}\n+\t}\n+\n+\treturn entry_success;\n+}\n+\n+/*\n+ * This function adds buffers to the virtio devices RX virtqueue. 
Buffers can\n+ * be received from the physical port or from another virtio device. A packet\n+ * count is returned to indicate the number of packets that were succesfully\n+ * added to the RX queue. This function works for mergeable RX.\n+ */\n+static inline uint32_t __attribute__((always_inline))\n+virtio_dev_merge_rx(struct virtio_net *dev, struct rte_mbuf **pkts,\n+\tuint32_t count)\n+{\n+\tstruct vhost_virtqueue *vq;\n+\tuint32_t pkt_idx = 0, entry_success = 0;\n+\tuint32_t retry = 0;\n+\tuint16_t avail_idx, res_cur_idx;\n+\tuint16_t res_base_idx, res_end_idx;\n+\tuint8_t success = 0;\n+\n+\tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") virtio_dev_merge_rx()\\n\",\n+\t\tdev->device_fh);\n+\tvq = dev->virtqueue[VIRTIO_RXQ];\n+\tcount = RTE_MIN((uint32_t)MAX_PKT_BURST, count);\n+\n+\tif (count == 0)\n+\t\treturn 0;\n+\n+\tfor (pkt_idx = 0; pkt_idx < count; pkt_idx++) {\n+\t\tuint32_t secure_len = 0;\n+\t\tuint16_t need_cnt;\n+\t\tuint32_t vec_idx = 0;\n+\t\tuint32_t pkt_len = pkts[pkt_idx]->pkt.pkt_len + vq->vhost_hlen;\n+\t\tuint16_t i, id;\n+\n+\t\tdo {\n+\t\t\t/*\n+\t\t\t * As many data cores may want access to available\n+\t\t\t * buffers, they need to be reserved.\n+\t\t\t */\n+\t\t\tres_base_idx = vq->last_used_idx_res;\n+\t\t\tres_cur_idx = res_base_idx;\n+\n+\t\t\tdo {\n+\t\t\t\tavail_idx = *((volatile uint16_t *)&vq->avail->idx);\n+\t\t\t\tif (unlikely(res_cur_idx == avail_idx)) {\n+\t\t\t\t\t/*\n+\t\t\t\t\t * If retry is enabled and the queue is\n+\t\t\t\t\t * full then we wait and retry to avoid\n+\t\t\t\t\t * packet loss.\n+\t\t\t\t\t */\n+\t\t\t\t\tif (enable_retry) {\n+\t\t\t\t\t\tuint8_t cont = 0;\n+\t\t\t\t\t\tfor (retry = 0; retry < burst_rx_retry_num; retry++) {\n+\t\t\t\t\t\t\trte_delay_us(burst_rx_delay_time);\n+\t\t\t\t\t\t\tavail_idx =\n+\t\t\t\t\t\t\t\t*((volatile uint16_t *)&vq->avail->idx);\n+\t\t\t\t\t\t\tif (likely(res_cur_idx != avail_idx)) {\n+\t\t\t\t\t\t\t\tcont = 
1;\n+\t\t\t\t\t\t\t\tbreak;\n+\t\t\t\t\t\t\t}\n+\t\t\t\t\t\t}\n+\t\t\t\t\t\tif (cont == 1)\n+\t\t\t\t\t\t\tcontinue;\n+\t\t\t\t\t}\n+\n+\t\t\t\t\tLOG_DEBUG(VHOST_DATA,\n+\t\t\t\t\t\t\"(%\"PRIu64\") Failed \"\n+\t\t\t\t\t\t\"to get enough desc from \"\n+\t\t\t\t\t\t\"vring\\n\",\n+\t\t\t\t\t\tdev->device_fh);\n+\t\t\t\t\treturn pkt_idx;\n+\t\t\t\t} else {\n+\t\t\t\t\tuint16_t wrapped_idx =\n+\t\t\t\t\t\t(res_cur_idx) & (vq->size - 1);\n+\t\t\t\t\tuint32_t idx =\n+\t\t\t\t\t\tvq->avail->ring[wrapped_idx];\n+\t\t\t\t\tuint8_t next_desc;\n+\n+\t\t\t\t\tdo {\n+\t\t\t\t\t\tnext_desc = 0;\n+\t\t\t\t\t\tsecure_len += vq->desc[idx].len;\n+\t\t\t\t\t\tif (vq->desc[idx].flags &\n+\t\t\t\t\t\t\tVRING_DESC_F_NEXT) {\n+\t\t\t\t\t\t\tidx = vq->desc[idx].next;\n+\t\t\t\t\t\t\tnext_desc = 1;\n+\t\t\t\t\t\t}\n+\t\t\t\t\t} while (next_desc);\n+\n+\t\t\t\t\tres_cur_idx++;\n+\t\t\t\t}\n+\t\t\t} while (pkt_len > secure_len);\n+\n+\t\t\t/* vq->last_used_idx_res is atomically updated. */\n+\t\t\tsuccess = rte_atomic16_cmpset(&vq->last_used_idx_res,\n+\t\t\t\t\t\t\tres_base_idx,\n+\t\t\t\t\t\t\tres_cur_idx);\n+\t\t} while (success == 0);\n+\n+\t\tid = res_base_idx;\n+\t\tneed_cnt = res_cur_idx - res_base_idx;\n+\n+\t\tfor (i = 0; i < need_cnt; i++, id++) {\n+\t\t\tuint16_t wrapped_idx = id & (vq->size - 1);\n+\t\t\tuint32_t idx = vq->avail->ring[wrapped_idx];\n+\t\t\tuint8_t next_desc;\n+\t\t\tdo {\n+\t\t\t\tnext_desc = 0;\n+\t\t\t\tvq->buf_vec[vec_idx].buf_addr =\n+\t\t\t\t\tvq->desc[idx].addr;\n+\t\t\t\tvq->buf_vec[vec_idx].buf_len =\n+\t\t\t\t\tvq->desc[idx].len;\n+\t\t\t\tvq->buf_vec[vec_idx].desc_idx = idx;\n+\t\t\t\tvec_idx++;\n+\n+\t\t\t\tif (vq->desc[idx].flags & VRING_DESC_F_NEXT) {\n+\t\t\t\t\tidx = vq->desc[idx].next;\n+\t\t\t\t\tnext_desc = 1;\n+\t\t\t\t}\n+\t\t\t} while (next_desc);\n+\t\t}\n+\n+\t\tres_end_idx = res_cur_idx;\n+\n+\t\tentry_success = copy_from_mbuf_to_vring(dev, res_base_idx,\n+\t\t\tres_end_idx, 
pkts[pkt_idx]);\n+\n+\t\trte_compiler_barrier();\n+\n+\t\t/*\n+\t\t * Wait until it's our turn to add our buffer\n+\t\t * to the used ring.\n+\t\t */\n+\t\twhile (unlikely(vq->last_used_idx != res_base_idx))\n+\t\t\trte_pause();\n+\n+\t\t*(volatile uint16_t *)&vq->used->idx += entry_success;\n+\t\tvq->last_used_idx = res_end_idx;\n+\n+\t\t/* Kick the guest if necessary. */\n+\t\tif (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))\n+\t\t\teventfd_write((int)vq->kickfd, 1);\n+\t}\n+\n+\treturn count;\n+}\n+\n /*\n  * Compares a packet destination MAC address to a device MAC address.\n  */\n@@ -1199,8 +1548,17 @@ virtio_tx_local(struct virtio_net *dev, struct rte_mbuf *m)\n \t\t\t\t/*drop the packet if the device is marked for removal*/\n \t\t\t\tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") Device is marked for removal\\n\", dev_ll->dev->device_fh);\n \t\t\t} else {\n+\t\t\t\tuint32_t mergeable =\n+\t\t\t\t\tdev_ll->dev->features &\n+\t\t\t\t\t(1 << VIRTIO_NET_F_MRG_RXBUF);\n+\n \t\t\t\t/*send the packet to the local virtio device*/\n-\t\t\t\tret = virtio_dev_rx(dev_ll->dev, &m, 1);\n+\t\t\t\tif (likely(mergeable == 0))\n+\t\t\t\t\tret = virtio_dev_rx(dev_ll->dev, &m, 1);\n+\t\t\t\telse\n+\t\t\t\t\tret = virtio_dev_merge_rx(dev_ll->dev,\n+\t\t\t\t\t\t&m, 1);\n+\n \t\t\t\tif (enable_stats) {\n \t\t\t\t\trte_atomic64_add(\n \t\t\t\t\t&dev_statistics[dev_ll->dev->device_fh].rx_total_atomic,\n@@ -1231,7 +1589,7 @@ virtio_tx_route(struct virtio_net* dev, struct rte_mbuf *m, struct rte_mempool *\n \tstruct mbuf_table *tx_q;\n \tstruct vlan_ethhdr *vlan_hdr;\n \tstruct rte_mbuf **m_table;\n-\tstruct rte_mbuf *mbuf;\n+\tstruct rte_mbuf *mbuf, *prev;\n \tunsigned len, ret, offset = 0;\n \tconst uint16_t lcore_id = rte_lcore_id();\n \tstruct virtio_net_data_ll *dev_ll = ll_root_used;\n@@ -1284,12 +1642,14 @@ virtio_tx_route(struct virtio_net* dev, struct rte_mbuf *m, struct rte_mempool *\n \t/* Allocate an mbuf and populate the structure. 
*/\n \tmbuf = rte_pktmbuf_alloc(mbuf_pool);\n \tif (unlikely(mbuf == NULL)) {\n-\t\tRTE_LOG(ERR, VHOST_DATA, \"Failed to allocate memory for mbuf.\\n\");\n+\t\tRTE_LOG(ERR, VHOST_DATA,\n+\t\t\t\"Failed to allocate memory for mbuf.\\n\");\n \t\treturn;\n \t}\n \n \tmbuf->pkt.data_len = m->pkt.data_len + VLAN_HLEN + offset;\n-\tmbuf->pkt.pkt_len = mbuf->pkt.data_len;\n+\tmbuf->pkt.pkt_len = m->pkt.pkt_len + VLAN_HLEN + offset;\n+\tmbuf->pkt.nb_segs = m->pkt.nb_segs;\n \n \t/* Copy ethernet header to mbuf. */\n \trte_memcpy((void*)mbuf->pkt.data, (const void*)m->pkt.data, ETH_HLEN);\n@@ -1304,6 +1664,29 @@ virtio_tx_route(struct virtio_net* dev, struct rte_mbuf *m, struct rte_mempool *\n \t/* Copy the remaining packet contents to the mbuf. */\n \trte_memcpy((void*) ((uint8_t*)mbuf->pkt.data + VLAN_ETH_HLEN),\n \t\t(const void*) ((uint8_t*)m->pkt.data + ETH_HLEN), (m->pkt.data_len - ETH_HLEN));\n+\n+\t/* Copy the remaining segments for the whole packet. */\n+\tprev = mbuf;\n+\twhile (m->pkt.next) {\n+\t\t/* Allocate an mbuf and populate the structure. */\n+\t\tstruct rte_mbuf *next_mbuf = rte_pktmbuf_alloc(mbuf_pool);\n+\t\tif (unlikely(next_mbuf == NULL)) {\n+\t\t\trte_pktmbuf_free(mbuf);\n+\t\t\tRTE_LOG(ERR, VHOST_DATA,\n+\t\t\t\t\"Failed to allocate memory for mbuf.\\n\");\n+\t\t\treturn;\n+\t\t}\n+\n+\t\tm = m->pkt.next;\n+\t\tprev->pkt.next = next_mbuf;\n+\t\tprev = next_mbuf;\n+\t\tnext_mbuf->pkt.data_len = m->pkt.data_len;\n+\n+\t\t/* Copy data to next mbuf. */\n+\t\trte_memcpy(rte_pktmbuf_mtod(next_mbuf, void *),\n+\t\t\trte_pktmbuf_mtod(m, const void *), m->pkt.data_len);\n+\t}\n+\n \ttx_q->m_table[len] = mbuf;\n \tlen++;\n \tif (enable_stats) {\n@@ -1394,6 +1777,7 @@ virtio_dev_tx(struct virtio_net* dev, struct rte_mempool *mbuf_pool)\n \n \t\t/* Setup dummy mbuf. This is copied to a real mbuf if transmitted out the physical port. 
*/\n \t\tm.pkt.data_len = desc->len;\n+\t\tm.pkt.pkt_len = desc->len;\n \t\tm.pkt.data = (void*)(uintptr_t)buff_addr;\n \n \t\tPRINT_PACKET(dev, (uintptr_t)buff_addr, desc->len, 0);\n@@ -1420,6 +1804,227 @@ virtio_dev_tx(struct virtio_net* dev, struct rte_mempool *mbuf_pool)\n \t\teventfd_write((int)vq->kickfd, 1);\n }\n \n+/* This function works for TX packets with mergeable feature enabled. */\n+static inline void __attribute__((always_inline))\n+virtio_dev_merge_tx(struct virtio_net *dev, struct rte_mempool *mbuf_pool)\n+{\n+\tstruct rte_mbuf *m, *prev;\n+\tstruct vhost_virtqueue *vq;\n+\tstruct vring_desc *desc;\n+\tuint64_t vb_addr = 0;\n+\tuint32_t head[MAX_PKT_BURST];\n+\tuint32_t used_idx;\n+\tuint32_t i;\n+\tuint16_t free_entries, entry_success = 0;\n+\tuint16_t avail_idx;\n+\tuint32_t buf_size = MBUF_SIZE - (sizeof(struct rte_mbuf)\n+\t\t\t+ RTE_PKTMBUF_HEADROOM);\n+\n+\tvq = dev->virtqueue[VIRTIO_TXQ];\n+\tavail_idx =  *((volatile uint16_t *)&vq->avail->idx);\n+\n+\t/* If there are no available buffers then return. */\n+\tif (vq->last_used_idx == avail_idx)\n+\t\treturn;\n+\n+\tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") virtio_dev_merge_tx()\\n\",\n+\t\tdev->device_fh);\n+\n+\t/* Prefetch available ring to retrieve head indexes. */\n+\trte_prefetch0(&vq->avail->ring[vq->last_used_idx & (vq->size - 1)]);\n+\n+\t/*get the number of free entries in the ring*/\n+\tfree_entries = (avail_idx - vq->last_used_idx);\n+\n+\t/* Limit to MAX_PKT_BURST. */\n+\tfree_entries = RTE_MIN(free_entries, MAX_PKT_BURST);\n+\n+\tLOG_DEBUG(VHOST_DATA, \"(%\"PRIu64\") Buffers available %d\\n\",\n+\t\tdev->device_fh, free_entries);\n+\t/* Retrieve all of the head indexes first to avoid caching issues. */\n+\tfor (i = 0; i < free_entries; i++)\n+\t\thead[i] = vq->avail->ring[(vq->last_used_idx + i) & (vq->size - 1)];\n+\n+\t/* Prefetch descriptor index. 
*/\n+\trte_prefetch0(&vq->desc[head[entry_success]]);\n+\trte_prefetch0(&vq->used->ring[vq->last_used_idx & (vq->size - 1)]);\n+\n+\twhile (entry_success < free_entries) {\n+\t\tuint32_t vb_avail, vb_offset;\n+\t\tuint32_t seg_avail, seg_offset;\n+\t\tuint32_t cpy_len;\n+\t\tuint32_t seg_num = 0;\n+\t\tstruct rte_mbuf *cur;\n+\t\tuint8_t alloc_err = 0;\n+\n+\t\tdesc = &vq->desc[head[entry_success]];\n+\n+\t\t/* Discard first buffer as it is the virtio header */\n+\t\tdesc = &vq->desc[desc->next];\n+\n+\t\t/* Buffer address translation. */\n+\t\tvb_addr = gpa_to_vva(dev, desc->addr);\n+\t\t/* Prefetch buffer address. */\n+\t\trte_prefetch0((void *)(uintptr_t)vb_addr);\n+\n+\t\tused_idx = vq->last_used_idx & (vq->size - 1);\n+\n+\t\tif (entry_success < (free_entries - 1)) {\n+\t\t\t/* Prefetch descriptor index. */\n+\t\t\trte_prefetch0(&vq->desc[head[entry_success+1]]);\n+\t\t\trte_prefetch0(&vq->used->ring[(used_idx + 1) & (vq->size - 1)]);\n+\t\t}\n+\n+\t\t/* Update used index buffer information. */\n+\t\tvq->used->ring[used_idx].id = head[entry_success];\n+\t\tvq->used->ring[used_idx].len = 0;\n+\n+\t\tvb_offset = 0;\n+\t\tvb_avail = desc->len;\n+\t\tseg_offset = 0;\n+\t\tseg_avail = buf_size;\n+\t\tcpy_len = RTE_MIN(vb_avail, seg_avail);\n+\n+\t\tPRINT_PACKET(dev, (uintptr_t)vb_addr, desc->len, 0);\n+\n+\t\t/* Allocate an mbuf and populate the structure. 
*/\n+\t\tm = rte_pktmbuf_alloc(mbuf_pool);\n+\t\tif (unlikely(m == NULL)) {\n+\t\t\tRTE_LOG(ERR, VHOST_DATA,\n+\t\t\t\t\"Failed to allocate memory for mbuf.\\n\");\n+\t\t\treturn;\n+\t\t}\n+\n+\t\tseg_num++;\n+\t\tcur = m;\n+\t\tprev = m;\n+\t\twhile (cpy_len != 0) {\n+\t\t\trte_memcpy((void *)(rte_pktmbuf_mtod(cur, char *) + seg_offset),\n+\t\t\t\t(void *)((uintptr_t)(vb_addr + vb_offset)),\n+\t\t\t\tcpy_len);\n+\n+\t\t\tseg_offset += cpy_len;\n+\t\t\tvb_offset += cpy_len;\n+\t\t\tvb_avail -= cpy_len;\n+\t\t\tseg_avail -= cpy_len;\n+\n+\t\t\tif (vb_avail != 0) {\n+\t\t\t\t/*\n+\t\t\t\t * The segment reachs to its end,\n+\t\t\t\t * while the virtio buffer in TX vring has\n+\t\t\t\t * more data to be copied.\n+\t\t\t\t */\n+\t\t\t\tcur->pkt.data_len = seg_offset;\n+\t\t\t\tm->pkt.pkt_len += seg_offset;\n+\t\t\t\t/* Allocate mbuf and populate the structure. */\n+\t\t\t\tcur = rte_pktmbuf_alloc(mbuf_pool);\n+\t\t\t\tif (unlikely(cur == NULL)) {\n+\t\t\t\t\tRTE_LOG(ERR, VHOST_DATA, \"Failed to \"\n+\t\t\t\t\t\t\"allocate memory for mbuf.\\n\");\n+\t\t\t\t\trte_pktmbuf_free(m);\n+\t\t\t\t\talloc_err = 1;\n+\t\t\t\t\tbreak;\n+\t\t\t\t}\n+\n+\t\t\t\tseg_num++;\n+\t\t\t\tprev->pkt.next = cur;\n+\t\t\t\tprev = cur;\n+\t\t\t\tseg_offset = 0;\n+\t\t\t\tseg_avail = buf_size;\n+\t\t\t} else {\n+\t\t\t\tif (desc->flags & VRING_DESC_F_NEXT) {\n+\t\t\t\t\t/*\n+\t\t\t\t\t * There are more virtio buffers in\n+\t\t\t\t\t * same vring entry need to be copied.\n+\t\t\t\t\t */\n+\t\t\t\t\tif (seg_avail == 0) {\n+\t\t\t\t\t\t/*\n+\t\t\t\t\t\t * The current segment hasn't\n+\t\t\t\t\t\t * room to accomodate more\n+\t\t\t\t\t\t * data.\n+\t\t\t\t\t\t */\n+\t\t\t\t\t\tcur->pkt.data_len = seg_offset;\n+\t\t\t\t\t\tm->pkt.pkt_len += seg_offset;\n+\t\t\t\t\t\t/*\n+\t\t\t\t\t\t * Allocate an mbuf and\n+\t\t\t\t\t\t * populate the structure.\n+\t\t\t\t\t\t */\n+\t\t\t\t\t\tcur = rte_pktmbuf_alloc(mbuf_pool);\n+\t\t\t\t\t\tif (unlikely(cur == NULL)) 
{\n+\t\t\t\t\t\t\tRTE_LOG(ERR,\n+\t\t\t\t\t\t\t\tVHOST_DATA,\n+\t\t\t\t\t\t\t\t\"Failed to \"\n+\t\t\t\t\t\t\t\t\"allocate memory \"\n+\t\t\t\t\t\t\t\t\"for mbuf\\n\");\n+\t\t\t\t\t\t\trte_pktmbuf_free(m);\n+\t\t\t\t\t\t\talloc_err = 1;\n+\t\t\t\t\t\t\tbreak;\n+\t\t\t\t\t\t}\n+\t\t\t\t\t\tseg_num++;\n+\t\t\t\t\t\tprev->pkt.next = cur;\n+\t\t\t\t\t\tprev = cur;\n+\t\t\t\t\t\tseg_offset = 0;\n+\t\t\t\t\t\tseg_avail = buf_size;\n+\t\t\t\t\t}\n+\n+\t\t\t\t\tdesc = &vq->desc[desc->next];\n+\n+\t\t\t\t\t/* Buffer address translation. */\n+\t\t\t\t\tvb_addr = gpa_to_vva(dev, desc->addr);\n+\t\t\t\t\t/* Prefetch buffer address. */\n+\t\t\t\t\trte_prefetch0((void *)(uintptr_t)vb_addr);\n+\t\t\t\t\tvb_offset = 0;\n+\t\t\t\t\tvb_avail = desc->len;\n+\n+\t\t\t\t\tPRINT_PACKET(dev, (uintptr_t)vb_addr,\n+\t\t\t\t\t\tdesc->len, 0);\n+\t\t\t\t} else {\n+\t\t\t\t\t/* The whole packet completes. */\n+\t\t\t\t\tcur->pkt.data_len = seg_offset;\n+\t\t\t\t\tm->pkt.pkt_len += seg_offset;\n+\t\t\t\t\tvb_avail = 0;\n+\t\t\t\t}\n+\t\t\t}\n+\n+\t\t\tcpy_len = RTE_MIN(vb_avail, seg_avail);\n+\t\t}\n+\n+\t\tif (unlikely(alloc_err == 1))\n+\t\t\tbreak;\n+\n+\t\tm->pkt.nb_segs = seg_num;\n+\n+\t\t/*\n+\t\t * If this is the first received packet we need to learn\n+\t\t * the MAC and setup VMDQ\n+\t\t */\n+\t\tif (dev->ready == DEVICE_MAC_LEARNING) {\n+\t\t\tif (dev->remove || (link_vmdq(dev, m) == -1)) {\n+\t\t\t\t/*\n+\t\t\t\t * Discard frame if device is scheduled for\n+\t\t\t\t * removal or a duplicate MAC address is found.\n+\t\t\t\t */\n+\t\t\t\tentry_success = free_entries;\n+\t\t\t\tvq->last_used_idx += entry_success;\n+\t\t\t\trte_pktmbuf_free(m);\n+\t\t\t\tbreak;\n+\t\t\t}\n+\t\t}\n+\n+\t\tvirtio_tx_route(dev, m, mbuf_pool, (uint16_t)dev->device_fh);\n+\t\tvq->last_used_idx++;\n+\t\tentry_success++;\n+\t\trte_pktmbuf_free(m);\n+\t}\n+\n+\trte_compiler_barrier();\n+\tvq->used->idx += entry_success;\n+\t/* Kick guest if required. 
*/\n+\tif (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))\n+\t\teventfd_write((int)vq->kickfd, 1);\n+\n+}\n+\n /*\n  * This function is called by each data core. It handles all RX/TX registered with the\n  * core. For TX the specific lcore linked list is used. For RX, MAC addresses are compared\n@@ -1440,8 +2045,9 @@ switch_worker(__attribute__((unused)) void *arg)\n \tconst uint16_t lcore_id = rte_lcore_id();\n \tconst uint16_t num_cores = (uint16_t)rte_lcore_count();\n \tuint16_t rx_count = 0;\n+\tuint32_t mergeable = 0;\n \n-\tRTE_LOG(INFO, VHOST_DATA, \"Procesing on Core %u started \\n\", lcore_id);\n+\tRTE_LOG(INFO, VHOST_DATA, \"Processing on Core %u started\\n\", lcore_id);\n \tlcore_ll = lcore_info[lcore_id].lcore_ll;\n \tprev_tsc = 0;\n \n@@ -1497,6 +2103,8 @@ switch_worker(__attribute__((unused)) void *arg)\n \t\twhile (dev_ll != NULL) {\n \t\t\t/*get virtio device ID*/\n \t\t\tdev = dev_ll->dev;\n+\t\t\tmergeable =\n+\t\t\t\tdev->features & (1 << VIRTIO_NET_F_MRG_RXBUF);\n \n \t\t\tif (dev->remove) {\n \t\t\t\tdev_ll = dev_ll->next;\n@@ -1510,7 +2118,15 @@ switch_worker(__attribute__((unused)) void *arg)\n \t\t\t\t\t(uint16_t)dev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);\n \n \t\t\t\tif (rx_count) {\n-\t\t\t\t\tret_count = virtio_dev_rx(dev, pkts_burst, rx_count);\n+\t\t\t\t\tif (likely(mergeable == 0))\n+\t\t\t\t\t\tret_count =\n+\t\t\t\t\t\t\tvirtio_dev_rx(dev,\n+\t\t\t\t\t\t\tpkts_burst, rx_count);\n+\t\t\t\t\telse\n+\t\t\t\t\t\tret_count =\n+\t\t\t\t\t\t\tvirtio_dev_merge_rx(dev,\n+\t\t\t\t\t\t\tpkts_burst, rx_count);\n+\n \t\t\t\t\tif (enable_stats) {\n \t\t\t\t\t\trte_atomic64_add(\n \t\t\t\t\t\t&dev_statistics[dev_ll->dev->device_fh].rx_total_atomic,\n@@ -1520,15 +2136,19 @@ switch_worker(__attribute__((unused)) void *arg)\n \t\t\t\t\t}\n \t\t\t\t\twhile (likely(rx_count)) {\n \t\t\t\t\t\trx_count--;\n-\t\t\t\t\t\trte_pktmbuf_free(pkts_burst[rx_count]);\n+\t\t\t\t\t\trte_pktmbuf_free(pkts_burst[rx_count]);\n \t\t\t\t\t}\n \n \t\t\t\t}\n 
\t\t\t}\n \n-\t\t\tif (!dev->remove)\n+\t\t\tif (!dev->remove) {\n \t\t\t\t/*Handle guest TX*/\n-\t\t\t\tvirtio_dev_tx(dev, mbuf_pool);\n+\t\t\t\tif (likely(mergeable == 0))\n+\t\t\t\t\tvirtio_dev_tx(dev, mbuf_pool);\n+\t\t\t\telse\n+\t\t\t\t\tvirtio_dev_merge_tx(dev, mbuf_pool);\n+\t\t\t}\n \n \t\t\t/*move to the next device in the list*/\n \t\t\tdev_ll = dev_ll->next;\ndiff --git a/examples/vhost/virtio-net.h b/examples/vhost/virtio-net.h\nindex 3d1f255..1a2f0dc 100644\n--- a/examples/vhost/virtio-net.h\n+++ b/examples/vhost/virtio-net.h\n@@ -45,6 +45,18 @@\n /* Enum for virtqueue management. */\n enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};\n \n+#define BUF_VECTOR_MAX 256\n+\n+/*\n+ * Structure containing the buffer address, length and descriptor index\n+ * from the vring, used for scatter RX.\n+ */\n+struct buf_vector {\n+\tuint64_t buf_addr;\n+\tuint32_t buf_len;\n+\tuint32_t desc_idx;\n+};\n+\n /*\n  * Structure contains variables relevant to TX/RX virtqueues.\n  */\n@@ -60,6 +72,8 @@ struct vhost_virtqueue\n \tvolatile uint16_t\tlast_used_idx_res;\t/* Used for multiple devices reserving buffers. */\n \teventfd_t\t\t\tcallfd;\t\t\t\t/* Currently unused as polling mode is enabled. */\n \teventfd_t\t\t\tkickfd;\t\t\t\t/* Used to notify the guest (trigger interrupt). */\n+\t/* Used for scatter RX. */\n+\tstruct buf_vector\tbuf_vec[BUF_VECTOR_MAX];\n } __rte_cache_aligned;\n \n /*\n",
    "prefixes": [
        "dpdk-dev"
    ]
}