Show a patch.
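
The request shown below can also be issued from a script. What follows is a minimal sketch, not part of the API itself. It assumes a placeholder host (`patchwork.example.org`, since the real host is elided here) and assumes the endpoint accepts `?format=json` in addition to the `?format=api` form shown. The helper names `patch_url`, `fetch_patch`, `header_values`, and `summarize_patch` are hypothetical.

```python
import json
import urllib.request

# Hypothetical base URL: the real host is elided in the response below.
BASE_URL = "https://patchwork.example.org"


def patch_url(patch_id, base_url=BASE_URL, fmt="json"):
    """Build the detail URL for one patch, e.g. /api/patches/458/?format=json."""
    return f"{base_url.rstrip('/')}/api/patches/{patch_id}/?format={fmt}"


def fetch_patch(patch_id, base_url=BASE_URL):
    """Fetch and decode one patch. This performs a network request."""
    request = urllib.request.Request(
        patch_url(patch_id, base_url), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def header_values(patch, name):
    """Return a mail header as a list.

    Most headers are plain strings, but repeated ones such as "Received"
    are already lists, so both shapes are normalized here.
    """
    value = patch.get("headers", {}).get(name)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def summarize_patch(patch):
    """Pick out a few commonly used fields; nullable objects map to None."""
    submitter = patch.get("submitter") or {}
    delegate = patch.get("delegate") or {}
    return {
        "id": patch["id"],
        "name": patch["name"],
        "state": patch["state"],
        "archived": patch["archived"],
        "check": patch.get("check"),
        "submitter": submitter.get("name"),
        "delegate": delegate.get("name"),
    }
```

With a reachable server, `summarize_patch(fetch_patch(458))` would reduce the response below to its identifying fields. `delegate` is handled defensively because the sample response returns it as `null`.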

GET /api/patches/458/?format=api
Content-Type: application/json
Vary: Accept

{
    "id": 458,
    "url": "",
    "web_url": "",
    "project": {
        "id": 1,
        "url": "",
        "name": "DPDK",
        "link_name": "dpdk",
        "list_id": "",
        "list_email": "",
        "web_url": "",
        "scm_url": "git://",
        "webscm_url": ""
    },
    "msgid": "<>",
    "date": "2014-09-23T11:08:14",
    "name": "[dpdk-dev,v2,2/5] ixgbe: add prefetch to improve slow-path tx perf",
    "commit_ref": null,
    "pull_url": null,
    "state": "accepted",
    "archived": true,
    "hash": "42ea264c2e05f86a01dfa1dea3930e6c5b5ffac7",
    "submitter": {
        "id": 20,
        "url": "",
        "name": "Bruce Richardson",
        "email": ""
    },
    "delegate": null,
    "mbox": "",
    "series": [],
    "comments": "",
    "check": "pending",
    "checks": "",
    "tags": {},
    "headers": {
        "List-Archive": "<>",
        "Return-Path": "<>",
        "Message-Id": "<>",
        "X-Mailer": "git-send-email",
        "To": "",
        "List-Subscribe": "<>,\n\t<>",
        "Received": [
            "from [] (localhost [IPv6:::1])\n\tby (Postfix) with ESMTP id 50DDBB3AE;\n\tTue, 23 Sep 2014 13:02:42 +0200 (CEST)",
            "from ( [])\n\tby (Postfix) with ESMTP id 69425B3AE\n\tfor <>; Tue, 23 Sep 2014 13:02:40 +0200 (CEST)",
            "from ([])\n\tby with ESMTP; 23 Sep 2014 04:08:47 -0700",
            "from ([])\n\tby with ESMTP; 23 Sep 2014 04:08:19 -0700",
            "from (\n\t[])\n\tby (8.14.3/8.13.6/MailSET/Hub) with ESMTP id\n\ts8NB8Iwn012722; Tue, 23 Sep 2014 12:08:18 +0100",
            "from (localhost [])\n\tby with ESMTP id s8NB8IMK010428;\n\tTue, 23 Sep 2014 12:08:18 +0100",
            "(from bricha3@localhost)\n\tby with  id s8NB8IGs010423;\n\tTue, 23 Sep 2014 12:08:18 +0100"
        ],
        "X-BeenThere": "",
        "List-Unsubscribe": "<>,\n\t<>",
        "Subject": "[dpdk-dev] [PATCH v2 2/5] ixgbe: add prefetch to improve slow-path\n\ttx perf",
        "In-Reply-To": "<>",
        "List-Id": "patches and discussions about DPDK <>",
        "List-Post": "<>",
        "Date": "Tue, 23 Sep 2014 12:08:14 +0100",
        "X-ExtLoop1": "1",
        "Precedence": "list",
        "From": "Bruce Richardson <>",
        "X-IronPort-AV": "E=Sophos;i=\"5.04,579,1406617200\"; d=\"scan'208\";a=\"603959842\"",
        "References": "<>\n\t<>",
        "X-Original-To": "",
        "Sender": "\"dev\" <>",
        "Errors-To": "",
        "List-Help": "<>",
        "Delivered-To": "",
        "X-Mailman-Version": "2.1.15"
    },
    "content": "Make a small improvement to slow path TX performance by adding in a\nprefetch for the second mbuf cache line.\nAlso move assignment of l2/l3 length values only when needed.\n\nWhat I've done with the prefetches is two-fold:\n1) changed it from prefetching the mbuf (first cache line) to prefetching\nthe mbuf pool pointer (second cache line) so that when we go to access\nthe pool pointer to free transmitted mbufs we don't get a cache miss. When\nclearing the ring and freeing mbufs, the pool pointer is the only mbuf\nfield used, so we don't need that first cache line.\n2) changed the code to prefetch earlier - in effect to prefetch one mbuf\nahead. The original code prefetched the mbuf to be freed as soon as it\nstarted processing the mbuf to replace it. Instead now, every time we\ncalculate what the next mbuf position is going to be we prefetch the mbuf\nin that position (i.e. the mbuf pool pointer we are going to free the mbuf\nto), even while we are still updating the previous mbuf slot on the ring.\nThis gives the prefetch much more time to resolve and get the data we need\nin the cache before we need it.\n\nIn terms of performance difference, a quick sanity test using testpmd\non a Xeon (Sandy Bridge uarch) platform showed performance increases\nbetween approx 8-18%, depending on the particular RX path used in\nconjuntion with this TX path code.\n\nChanges in V2:\n* Expanded commit message with extra details of change.\n\nSigned-off-by: Bruce Richardson <>\n---\n lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 12 +++++++-----\n 1 file changed, 7 insertions(+), 5 deletions(-)",
    "diff": "diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c\nindex 6f702b3..c0bb49f 100644\n--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c\n+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c\n@@ -565,25 +565,26 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,\n \t\tixgbe_xmit_cleanup(txq);\n \t}\n \n+\trte_prefetch0(&txe->mbuf->pool);\n+\n \t/* TX loop */\n \tfor (nb_tx = 0; nb_tx < nb_pkts; nb_tx++) {\n \t\tnew_ctx = 0;\n \t\ttx_pkt = *tx_pkts++;\n \t\tpkt_len = tx_pkt->pkt_len;\n \n-\t\tRTE_MBUF_PREFETCH_TO_FREE(txe->mbuf);\n-\n \t\t/*\n \t\t * Determine how many (if any) context descriptors\n \t\t * are needed for offload functionality.\n \t\t */\n \t\tol_flags = tx_pkt->ol_flags;\n-\t\tvlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;\n-\t\tvlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;\n \n \t\t/* If hardware offload required */\n \t\ttx_ol_req = ol_flags & PKT_TX_OFFLOAD_MASK;\n \t\tif (tx_ol_req) {\n+\t\t\tvlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;\n+\t\t\tvlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;\n+\n \t\t\t/* If new context need be built or reuse the exist ctx. */\n \t\t\tctx = what_advctx_update(txq, tx_ol_req,\n \t\t\t\;\n@@ -720,7 +721,7 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,\n \t\t\t\t    &txr[tx_id];\n \n \t\t\t\ttxn = &sw_ring[txe->next_id];\n-\t\t\t\tRTE_MBUF_PREFETCH_TO_FREE(txn->mbuf);\n+\t\t\t\trte_prefetch0(&txn->mbuf->pool);\n \n \t\t\t\tif (txe->mbuf != NULL) {\n \t\t\t\t\trte_pktmbuf_free_seg(txe->mbuf);\n@@ -749,6 +750,7 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,\n \t\tdo {\n \t\t\ttxd = &txr[tx_id];\n \t\t\ttxn = &sw_ring[txe->next_id];\n+\t\t\trte_prefetch0(&txn->mbuf->pool);\n \n \t\t\tif (txe->mbuf != NULL)\n \t\t\t\trte_pktmbuf_free_seg(txe->mbuf);\n",
    "prefixes": [