Show a cover letter.

GET /api/covers/61839/?format=api
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "id": 61839,
    "url": "http://patches.dpdk.org/api/covers/61839/?format=api",
    "web_url": "http://patches.dpdk.org/project/dpdk/cover/20191024160832.14543-1-yong.liu@intel.com/",
    "project": {
        "id": 1,
        "url": "http://patches.dpdk.org/api/projects/1/?format=api",
        "name": "DPDK",
        "link_name": "dpdk",
        "list_id": "dev.dpdk.org",
        "list_email": "dev@dpdk.org",
        "web_url": "http://core.dpdk.org",
        "scm_url": "git://dpdk.org/dpdk",
        "webscm_url": "http://git.dpdk.org/dpdk",
        "list_archive_url": "https://inbox.dpdk.org/dev",
        "list_archive_url_format": "https://inbox.dpdk.org/dev/{}",
        "commit_url_format": ""
    },
    "msgid": "<20191024160832.14543-1-yong.liu@intel.com>",
    "list_archive_url": "https://inbox.dpdk.org/dev/20191024160832.14543-1-yong.liu@intel.com",
    "date": "2019-10-24T16:08:19",
    "name": "[v9,00/13] vhost packed ring performance optimization",
    "submitter": {
        "id": 17,
        "url": "http://patches.dpdk.org/api/people/17/?format=api",
        "name": "Marvin Liu",
        "email": "yong.liu@intel.com"
    },
    "mbox": "http://patches.dpdk.org/project/dpdk/cover/20191024160832.14543-1-yong.liu@intel.com/mbox/",
    "series": [
        {
            "id": 7033,
            "url": "http://patches.dpdk.org/api/series/7033/?format=api",
            "web_url": "http://patches.dpdk.org/project/dpdk/list/?series=7033",
            "date": "2019-10-24T16:08:19",
            "name": "vhost packed ring performance optimization",
            "version": 9,
            "mbox": "http://patches.dpdk.org/series/7033/mbox/"
        }
    ],
    "comments": "http://patches.dpdk.org/api/covers/61839/comments/",
    "headers": {
        "Return-Path": "<dev-bounces@dpdk.org>",
        "X-Original-To": "patchwork@dpdk.org",
        "Delivered-To": "patchwork@dpdk.org",
        "Received": [
            "from [92.243.14.124] (localhost [127.0.0.1])\n\tby dpdk.org (Postfix) with ESMTP id 762C01E536;\n\tThu, 24 Oct 2019 10:28:41 +0200 (CEST)",
            "from mga06.intel.com (mga06.intel.com [134.134.136.31])\n\tby dpdk.org (Postfix) with ESMTP id 97CA31E535\n\tfor <dev@dpdk.org>; Thu, 24 Oct 2019 10:28:38 +0200 (CEST)",
            "from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;\n\t24 Oct 2019 01:28:37 -0700",
            "from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.142])\n\tby fmsmga002.fm.intel.com with ESMTP; 24 Oct 2019 01:28:35 -0700"
        ],
        "X-Amp-Result": "SKIPPED(no attachment in message)",
        "X-Amp-File-Uploaded": "False",
        "X-ExtLoop1": "1",
        "X-IronPort-AV": "E=Sophos;i=\"5.68,224,1569308400\"; d=\"scan'208\";a=\"228431049\"",
        "From": "Marvin Liu <yong.liu@intel.com>",
        "To": "maxime.coquelin@redhat.com, tiwei.bie@intel.com, zhihong.wang@intel.com, \n\tstephen@networkplumber.org, gavin.hu@arm.com",
        "Cc": "dev@dpdk.org,\n\tMarvin Liu <yong.liu@intel.com>",
        "Date": "Fri, 25 Oct 2019 00:08:19 +0800",
        "Message-Id": "<20191024160832.14543-1-yong.liu@intel.com>",
        "X-Mailer": "git-send-email 2.17.1",
        "In-Reply-To": "<20191021220813.55236-1-yong.liu@intel.com>",
        "References": "<20191021220813.55236-1-yong.liu@intel.com>",
        "Subject": "[dpdk-dev] [PATCH v9 00/13] vhost packed ring performance\n\toptimization",
        "X-BeenThere": "dev@dpdk.org",
        "X-Mailman-Version": "2.1.15",
        "Precedence": "list",
        "List-Id": "DPDK patches and discussions <dev.dpdk.org>",
        "List-Unsubscribe": "<https://mails.dpdk.org/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>",
        "List-Archive": "<http://mails.dpdk.org/archives/dev/>",
        "List-Post": "<mailto:dev@dpdk.org>",
        "List-Help": "<mailto:dev-request@dpdk.org?subject=help>",
        "List-Subscribe": "<https://mails.dpdk.org/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>",
        "Errors-To": "dev-bounces@dpdk.org",
        "Sender": "\"dev\" <dev-bounces@dpdk.org>"
    },
    "content": "Packed ring has more compact ring format and thus can significantly\nreduce the number of cache miss. It can lead to better performance.\nThis has been approved in virtio user driver, on normal E5 Xeon cpu\nsingle core performance can raise 12%.\n\nhttp://mails.dpdk.org/archives/dev/2018-April/095470.html\n\nHowever vhost performance with packed ring performance was decreased.\nThrough analysis, mostly extra cost was from the calculating of each\ndescriptor flag which depended on ring wrap counter. Moreover, both\nfrontend and backend need to write same descriptors which will cause\ncache contention. Especially when doing vhost enqueue function, virtio\nrefill packed ring function may write same cache line when vhost doing\nenqueue function. This kind of extra cache cost will reduce the benefit\nof reducing cache misses. \n\nFor optimizing vhost packed ring performance, vhost enqueue and dequeue\nfunction will be split into fast and normal path.\n\nSeveral methods will be taken in fast path:\n  Handle descriptors in one cache line by batch.\n  Split loop function into more pieces and unroll them.\n  Prerequisite check that whether I/O space can copy directly into mbuf\n    space and vice versa. \n  Prerequisite check that whether descriptor mapping is successful.\n  Distinguish vhost used ring update function by enqueue and dequeue\n    function.\n  Buffer dequeue used descriptors as many as possible.\n  Update enqueue used descriptors by cache line.\n\nAfter all these methods done, single core vhost PvP performance with 64B\npacket on Xeon 8180 can boost 35%.\n\nv9:\n- Fix clang build error\n\nv8:\n- Allocate mbuf by virtio_dev_pktmbuf_alloc\n\nv7:\n- Rebase code\n- Rename unroll macro and definitions\n- Calculate flags when doing single dequeue\n\nv6:\n- Fix dequeue zcopy result check\n\nv5:\n- Remove disable sw prefetch as performance impact is small\n- Change unroll pragma macro format\n- Rename shadow counter elements names\n- Clean dequeue update check condition\n- Add inline functions replace of duplicated code\n- Unify code style\n\nv4:\n- Support meson build\n- Remove memory region cache for no clear performance gain and ABI break\n- Not assume ring size is power of two\n\nv3:\n- Check available index overflow\n- Remove dequeue remained descs number check\n- Remove changes in split ring datapath\n- Call memory write barriers once when updating used flags\n- Rename some functions and macros\n- Code style optimization\n\nv2:\n- Utilize compiler's pragma to unroll loop, distinguish clang/icc/gcc\n- Buffered dequeue used desc number changed to (RING_SZ - PKT_BURST)\n- Optimize dequeue used ring update when in_order negotiated\n\n\nMarvin Liu (13):\n  vhost: add packed ring indexes increasing function\n  vhost: add packed ring single enqueue\n  vhost: try to unroll for each loop\n  vhost: add packed ring batch enqueue\n  vhost: add packed ring single dequeue\n  vhost: add packed ring batch dequeue\n  vhost: flush enqueue updates by cacheline\n  vhost: flush batched enqueue descs directly\n  vhost: buffer packed ring dequeue updates\n  vhost: optimize packed ring enqueue\n  vhost: add packed ring zcopy batch and single dequeue\n  vhost: optimize packed ring dequeue\n  vhost: optimize packed ring dequeue when in-order\n\n lib/librte_vhost/Makefile     |  18 +\n lib/librte_vhost/meson.build  |   7 +\n lib/librte_vhost/vhost.h      |  57 ++\n lib/librte_vhost/virtio_net.c | 948 +++++++++++++++++++++++++++-------\n 4 files changed, 837 insertions(+), 193 deletions(-)"
}
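
A response of this shape can be consumed with any JSON parser. The sketch below is a minimal, offline illustration in Python: it works on a trimmed copy of the body above rather than calling the server, and the helper `summarize_cover` and the pattern `NAME_RE` are hypothetical names introduced here, not part of the API. It assumes the "name" field follows the "[vN,II/TT] title" shape seen in this response; other covers may be formatted differently.

```python
import json
import re

# Trimmed copy of the response body above; only the fields used below are kept.
SAMPLE = '''
{
    "id": 61839,
    "name": "[v9,00/13] vhost packed ring performance optimization",
    "submitter": {"name": "Marvin Liu", "email": "yong.liu@intel.com"},
    "series": [{"id": 7033, "version": 9}]
}
'''

# Hypothetical pattern for names shaped like "[v9,00/13] title".
# The version prefix is treated as optional (an assumption).
NAME_RE = re.compile(
    r"^\[(?:v(?P<version>\d+),)?(?P<index>\d+)/(?P<total>\d+)\]\s*(?P<title>.+)$"
)


def summarize_cover(cover):
    """Pull a few fields out of a parsed cover-letter object."""
    match = NAME_RE.match(cover["name"])
    if match is None:
        raise ValueError("unexpected cover name: %r" % cover["name"])
    return {
        "id": cover["id"],
        # Assumption: a name without a "vN," prefix is a first version.
        "version": int(match.group("version") or 1),
        "index": int(match.group("index")),
        "total": int(match.group("total")),
        "title": match.group("title"),
        "submitter": cover["submitter"]["email"],
        "series_ids": [series["id"] for series in cover["series"]],
    }


summary = summarize_cover(json.loads(SAMPLE))
print(summary)
```

The same function could be applied to the full body returned by the GET request above, since it reads only keys that appear in that response.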