List comments

GET /api/patches/74479/comments/
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

[
    {
        "id": 116301,
        "web_url": "https://patches.dpdk.org/comment/116301/",
        "msgid": "<CAJFAV8wm2EBusAfWTKRTYQ1tyRYkMhx=DVu8NbPUBvXTbtnjyA@mail.gmail.com>",
        "date": "2020-07-20T12:20:32",
        "subject": "Re: [dpdk-dev] [PATCH v9 1/4] eal: add WC store functions",
        "submitter": {
            "id": 1173,
            "url": "https://patches.dpdk.org/api/people/1173/",
            "name": "David Marchand",
            "email": "david.marchand@redhat.com"
        },
        "content": "On Mon, Jul 20, 2020 at 11:12 AM Radu Nicolau <radu.nicolau@intel.com> wrote:\n>\n> Add rte_write32_wc and rte_write32_wc_relaxed functions\n> that implement 32bit stores using write combining memory protocol.\n> Provided generic stubs and x86 implementation.\n\nWhat is the difference of using this new API when compared to the\nexisting pci driver flag RTE_PCI_DRV_WC_ACTIVATE?\nDo we have some overlap between the two?\n\nThis commitlog is quite short for something that touches performance.\nI saw a question from Ruifeng, it is worth adding this to the commitlog.\n\nWhich x86 platforms will benefit from it?\nWhat is the impact on performance for existing platforms that have no\nMOVDIRI support?\n\n\n>\n> Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>\n> Acked-by: Bruce Richardson <bruce.richardson@intel.com>\n> ---\n>  lib/librte_eal/arm/include/rte_io_64.h  | 12 +++++++\n>  lib/librte_eal/include/generic/rte_io.h | 48 ++++++++++++++++++++++++++++\n>  lib/librte_eal/x86/include/rte_io.h     | 56 +++++++++++++++++++++++++++++++++\n>  3 files changed, 116 insertions(+)\n>\n> diff --git a/lib/librte_eal/arm/include/rte_io_64.h b/lib/librte_eal/arm/include/rte_io_64.h\n> index e534624..d07d9cb 100644\n> --- a/lib/librte_eal/arm/include/rte_io_64.h\n> +++ b/lib/librte_eal/arm/include/rte_io_64.h\n> @@ -164,6 +164,18 @@ rte_write64(uint64_t value, volatile void *addr)\n>         rte_write64_relaxed(value, addr);\n>  }\n>\n> +static __rte_always_inline void\n> +rte_write32_wc(uint32_t value, volatile void *addr)\n> +{\n> +       rte_write32(value, addr);\n> +}\n> +\n> +static __rte_always_inline void\n> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n> +{\n> +       rte_write32_relaxed(value, addr);\n> +}\n> +\n\nWe were using a single knob RTE_OVERRIDE_IO_H for overriding the whole rte_io.h.\nNow we would have a special case for an API for x86 and the code is\ncopy/pasted in the ARM header and keeping the \"whole\" override mode.\n\nThis leaves an unfinished taste.\n\nWhy did you not flag all relevant \"native\" helpers?\nThis would factor some code from the ARM header.\n\n\n>  #ifdef __cplusplus\n>  }\n>  #endif\n> diff --git a/lib/librte_eal/include/generic/rte_io.h b/lib/librte_eal/include/generic/rte_io.h\n> index da457f7..0669baa 100644\n> --- a/lib/librte_eal/include/generic/rte_io.h\n> +++ b/lib/librte_eal/include/generic/rte_io.h\n> @@ -229,6 +229,40 @@ rte_write32(uint32_t value, volatile void *addr);\n>  static inline void\n>  rte_write64(uint64_t value, volatile void *addr);\n>\n> +/**\n> + * Write a 32-bit value to I/O device memory address addr using write\n> + * combining memory write protocol. Depending on the platform write combining\n> + * may not be available and/or may be treated as a hint and the behavior may\n> + * fallback to a regular store.\n> + *\n> + * @param value\n> + *  Value to write\n> + * @param addr\n> + *  I/O memory address to write the value to\n> + */\n> +__rte_experimental\n> +static inline void\n> +rte_write32_wc(uint32_t value, volatile void *addr);\n> +\n> +/**\n> + * Write a 32-bit value to I/O device memory address addr using write\n> + * combining memory write protocol. Depending on the platform write combining\n> + * may not be available and/or may be treated as a hint and the behavior may\n> + * fallback to a regular store.\n> + *\n> + * The relaxed version does not have additional I/O memory barrier, useful in\n> + * accessing the device registers of integrated controllers which implicitly\n> + * strongly ordered with respect to memory access.\n\nIt might be just me, but I have trouble reading the last part of this sentence.\nMaybe remove \"with respect to\"?\n\n\n> + *\n> + * @param value\n> + *  Value to write\n> + * @param addr\n> + *  I/O memory address to write the value to\n> + */\n> +__rte_experimental\n> +static inline void\n> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr);\n> +\n>  #endif /* __DOXYGEN__ */\n>\n>  #ifndef RTE_OVERRIDE_IO_H\n> @@ -345,6 +379,20 @@ rte_write64(uint64_t value, volatile void *addr)\n>         rte_write64_relaxed(value, addr);\n>  }\n>\n> +#ifndef RTE_NATIVE_WRITE32_WC\n> +static __rte_always_inline void\n> +rte_write32_wc(uint32_t value, volatile void *addr)\n> +{\n> +       rte_write32(value, addr);\n> +}\n> +\n> +static __rte_always_inline void\n> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n> +{\n> +       rte_write32_relaxed(value, addr);\n> +}\n> +#endif /* RTE_NATIVE_WRITE32_WC */\n> +\n>  #endif /* RTE_OVERRIDE_IO_H */\n>\n>  #endif /* _RTE_IO_H_ */\n> diff --git a/lib/librte_eal/x86/include/rte_io.h b/lib/librte_eal/x86/include/rte_io.h\n> index 2db71b1..c95ed67 100644\n> --- a/lib/librte_eal/x86/include/rte_io.h\n> +++ b/lib/librte_eal/x86/include/rte_io.h\n> @@ -9,8 +9,64 @@\n>  extern \"C\" {\n>  #endif\n>\n> +#include \"rte_cpuflags.h\"\n> +\n> +#define RTE_NATIVE_WRITE32_WC\n>  #include \"generic/rte_io.h\"\n>\n> +/**\n> + * @internal\n> + * MOVDIRI wrapper.\n> + */\n> +static __rte_always_inline void\n> +_rte_x86_movdiri(uint32_t value, volatile void *addr)\n> +{\n> +       asm volatile(\n> +               /* MOVDIRI */\n> +               \".byte 0x40, 0x0f, 0x38, 0xf9, 0x02\"\n> +               :\n> +               : \"a\" (value), \"d\" (addr));\n> +}\n> +\n> +static __rte_always_inline void\n> +rte_write32_wc(uint32_t value, volatile void *addr)\n> +{\n> +       static int _x86_movdiri_flag = -1;\n> +       if (_x86_movdiri_flag == 1) {\n> +               rte_wmb();\n> +               _rte_x86_movdiri(value, addr);\n> +       } else if (_x86_movdiri_flag == 0) {\n> +               rte_write32(value, addr);\n> +       } else {\n> +               _x86_movdiri_flag =\n> +                       (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);\n> +               if (_x86_movdiri_flag == 1) {\n> +                       rte_wmb();\n> +                       _rte_x86_movdiri(value, addr);\n> +               } else {\n> +                       rte_write32(value, addr);\n> +               }\n> +       }\n> +}\n> +\n> +static __rte_always_inline void\n> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n> +{\n> +       static int _x86_movdiri_flag = -1;\n> +       if (_x86_movdiri_flag == 1) {\n> +               _rte_x86_movdiri(value, addr);\n> +       } else if (_x86_movdiri_flag == 0) {\n> +               rte_write32_relaxed(value, addr);\n> +       } else {\n> +               _x86_movdiri_flag =\n> +                       (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);\n> +               if (_x86_movdiri_flag == 1)\n> +                       _rte_x86_movdiri(value, addr);\n> +               else\n> +                       rte_write32_relaxed(value, addr);\n> +       }\n> +}\n> +\n\nRepeating some comments I made earlier.\n\n- If a single helper called by both rte_write32_wc and\nrte_write32_wc_relaxed with a _constant_ flag is not to your liking (I\ndon't see where it would have an impact on performance), then maybe\nrte_write32_wc() can be simply implemented as:\n\n+static __rte_always_inline void\n+rte_write32_wc(uint32_t value, volatile void *addr)\n+{\n+    rte_wmb();\n+    rte_write32_wc_relaxed(value, addr);\n+}\n+\n\n- Looking at this above suggestion, I wonder about the non-relaxed case.\nIs rte_io_wmb() not enough?\n\n\n- The cpuflag check can be resolved at init once and for all.\nBy this, I mean in lib/librte_eal/x86/rte_cpuflags.c:\n\n+int rte_x86_movdiri_flag = -1;\n+\n+RTE_INIT(rte_x86_movdiri_init)\n+{\n+       rte_x86_movdiri_flag =\n+               (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);\n+}\n\nThe variable can be exported in lib/librte_eal/x86/include/rte_cpuflags.h.\n\nThen rte_write32_wc_relaxed() becomes:\n\n+static __rte_always_inline void\n+rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n+{\n+    if (rte_x86_movdiri_flag == 1) {\n+        asm volatile(\n+            /* MOVDIRI */\n+            \".byte 0x40, 0x0f, 0x38, 0xf9, 0x02\"\n+            :\n+            : \"a\" (value), \"d\" (addr));\n+        return;\n+    }\n+\n+    rte_write32_relaxed(value, addr);\n+}\n+\n\n\nThanks.",
        "headers": {
            "List-Subscribe": "<https://mails.dpdk.org/listinfo/dev>,\n <mailto:dev-request@dpdk.org?subject=subscribe>",
            "X-Mailman-Version": "2.1.15",
            "X-Mimecast-Originator": "redhat.com",
            "Authentication-Results": "relay.mimecast.com;\n auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dmarchan@redhat.com",
            "X-Mimecast-Spam-Score": "0",
            "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n s=mimecast20190719; t=1595247646;\n h=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n to:to:cc:cc:mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=wgWTy+6gNrx5yYrZL8b1U8v8aq3jlvfXwfraCjMND5o=;\n b=ZiBYgcKwEOvqPpe4LQm9CPN4WFMp8rGGsiA1UqkKy9rSVWhB2wxVCmw23owAqGIn3QLMUZ\n 5ChoxEorXUo34YdIRuvzcImI8kefxMJ7cYX0Mha2xQvYo4Ew7fWHO61A3PhHw2kNr3jLh1\n +OPqUSMp1Agzkc3lnaitC49XLRZaZt0=",
            "Precedence": "list",
            "X-Gm-Message-State": "AOAM5339JDWivf4sRWo3HxkYw/sUO2ye5xaiNJRYoADVKLCTE1gSjZaO\n /r6I9vjU3fYZaHBiSDK5fMGiubA/OUE4pUJQRPHKOYhUwKBrA70bNDdRh8Z4L50ispVqG2rsjeO\n nbl8BrejXVukrtN26fUQ=",
            "X-Google-Smtp-Source": "\n ABdhPJxtHld3fN23mlsGiFhVNHOZORzB47OAdskT85OfFnIPWUgGjzra7iL7w96zQleJOJbzVWkrK1DY9gAjvpCRj1k=",
            "List-Post": "<mailto:dev@dpdk.org>",
            "MIME-Version": "1.0",
            "List-Id": "DPDK patches and discussions <dev.dpdk.org>",
            "X-BeenThere": "dev@dpdk.org",
            "References": "<1591870283-7776-1-git-send-email-radu.nicolau@intel.com>\n <1595236337-28230-1-git-send-email-radu.nicolau@intel.com>\n <1595236337-28230-2-git-send-email-radu.nicolau@intel.com>",
            "Subject": "Re: [dpdk-dev] [PATCH v9 1/4] eal: add WC store functions",
            "Content-Type": "text/plain; charset=\"UTF-8\"",
            "From": "David Marchand <david.marchand@redhat.com>",
            "Received": [
                "from dpdk.org (dpdk.org [92.243.14.124])\n\tby inbox.dpdk.org (Postfix) with ESMTP id B8E60A0540;\n\tMon, 20 Jul 2020 14:20:49 +0200 (CEST)",
                "from [92.243.14.124] (localhost [127.0.0.1])\n\tby dpdk.org (Postfix) with ESMTP id 8C6FE1DBB;\n\tMon, 20 Jul 2020 14:20:48 +0200 (CEST)",
                "from us-smtp-delivery-1.mimecast.com (us-smtp-2.mimecast.com\n [207.211.31.81]) by dpdk.org (Postfix) with ESMTP id 3B4E01023\n for <dev@dpdk.org>; Mon, 20 Jul 2020 14:20:47 +0200 (CEST)",
                "from mail-vs1-f70.google.com (mail-vs1-f70.google.com\n [209.85.217.70]) (Using TLS) by relay.mimecast.com with ESMTP id\n us-mta-63-zBP9XzSDNSCcExqZ6JyUdQ-1; Mon, 20 Jul 2020 08:20:44 -0400",
                "by mail-vs1-f70.google.com with SMTP id d15so2890538vsr.14\n for <dev@dpdk.org>; Mon, 20 Jul 2020 05:20:44 -0700 (PDT)"
            ],
            "List-Archive": "<http://mails.dpdk.org/archives/dev/>",
            "X-Original-To": "patchwork@inbox.dpdk.org",
            "List-Unsubscribe": "<https://mails.dpdk.org/options/dev>,\n <mailto:dev-request@dpdk.org?subject=unsubscribe>",
            "X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20161025;\n h=x-gm-message-state:mime-version:references:in-reply-to:from:date\n :message-id:subject:to:cc;\n bh=wgWTy+6gNrx5yYrZL8b1U8v8aq3jlvfXwfraCjMND5o=;\n b=bkUh9kNREO3wRoF/orclreQJeFIf32IlvqwvIiw6QPyhjJk9/ztOJfsWnBINUM3FUR\n 4p+WKpZ1/NyWr5IMM5E9JQPRjmaQ5MoteavKD0dny7f2UCVlLzvvdqs/oJOFHIibF2GO\n rDNbBIYFOC/U5NZgYixusl+bsEMG/O7UjRmMi2SZaT4YVCNfABBHaA9BZN4EyaKfWGgg\n JRwnWmfKTLgXZLSn55x6hgKPXznBakybJ33jo4tZ9s5k5xAmRlVHlE3PIDZG5rc0tkYr\n iny0lMdXqN3JNAezaf2JirHi9fjCrfJczpiHF59hHJ7qmVzF5R6+zQy0kfQZycptXfRJ\n 3QAw==",
            "X-MC-Unique": "zBP9XzSDNSCcExqZ6JyUdQ-1",
            "Message-ID": "\n <CAJFAV8wm2EBusAfWTKRTYQ1tyRYkMhx=DVu8NbPUBvXTbtnjyA@mail.gmail.com>",
            "Date": "Mon, 20 Jul 2020 14:20:32 +0200",
            "Sender": "\"dev\" <dev-bounces@dpdk.org>",
            "X-Received": [
                "by 2002:ab0:6950:: with SMTP id c16mr15585460uas.53.1595247644196;\n Mon, 20 Jul 2020 05:20:44 -0700 (PDT)",
                "by 2002:ab0:6950:: with SMTP id c16mr15585434uas.53.1595247643816;\n Mon, 20 Jul 2020 05:20:43 -0700 (PDT)"
            ],
            "To": "Radu Nicolau <radu.nicolau@intel.com>",
            "Delivered-To": "patchwork@inbox.dpdk.org",
            "In-Reply-To": "<1595236337-28230-2-git-send-email-radu.nicolau@intel.com>",
            "Cc": "dev <dev@dpdk.org>, Beilei Xing <beilei.xing@intel.com>,\n Jeff Guo <jia.guo@intel.com>,\n Bruce Richardson <bruce.richardson@intel.com>,\n \"Ananyev, Konstantin\" <konstantin.ananyev@intel.com>,\n Jerin Jacob <jerinjacobk@gmail.com>,\n \"Trahe, Fiona\" <fiona.trahe@intel.com>, Wei Zhao <wei.zhao1@intel.com>,\n \"Ruifeng Wang (Arm Technology China)\" <ruifeng.wang@arm.com>",
            "List-Help": "<mailto:dev-request@dpdk.org?subject=help>",
            "Errors-To": "dev-bounces@dpdk.org",
            "Return-Path": "<dev-bounces@dpdk.org>"
        }
    },
    {
        "id": 116393,
        "web_url": "https://patches.dpdk.org/comment/116393/",
        "msgid": "<b182a9f2-dee5-d3b9-a80b-ef97cc28766b@intel.com>",
        "date": "2020-07-21T08:56:53",
        "subject": "Re: [dpdk-dev] [PATCH v9 1/4] eal: add WC store functions",
        "submitter": {
            "id": 743,
            "url": "https://patches.dpdk.org/api/people/743/",
            "name": "Nicolau, Radu",
            "email": "radu.nicolau@intel.com"
        },
        "content": "On 7/20/2020 1:20 PM, David Marchand wrote:\n> On Mon, Jul 20, 2020 at 11:12 AM Radu Nicolau <radu.nicolau@intel.com> wrote:\n>> Add rte_write32_wc and rte_write32_wc_relaxed functions\n>> that implement 32bit stores using write combining memory protocol.\n>> Provided generic stubs and x86 implementation.\n> What is the difference of using this new API when compared to the\n> existing pci driver flag RTE_PCI_DRV_WC_ACTIVATE?\n> Do we have some overlap between the two?\nNo, the RTE_PCI_DRV_WC_ACTIVATE define will map the BARs as write \ncombining, whereas these functions will use the WC regardless of the \nmapping, and they can be used for memory areas that won't be mapped \nusing the PCI infrastructure.\n>\n> This commitlog is quite short for something that touches performance.\n> I saw a question from Ruifeng, it is worth adding this to the commitlog.\n\nStrictly speaking, this patch only enables support for WC stores, and it \ndoes not have any performance implications on its own. I think a fair \nassumption is that if someone is looking to use WC stores they will know \nexactly how and where they can be used.\n\n>\n> Which x86 platforms will benefit from it?\n> What is the impact on performance for existing platforms that have no\n> MOVDIRI support?\n\nx86 platforms supporting MOVDIRI instruction with this particular patch, \nand any other platform that have similar instructions if they were to be \nenabled.\n\nThere is no impact for the ones that don't support it.\n\n>\n>\n>> Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>\n>> Acked-by: Bruce Richardson <bruce.richardson@intel.com>\n>> ---\n>>   lib/librte_eal/arm/include/rte_io_64.h  | 12 +++++++\n>>   lib/librte_eal/include/generic/rte_io.h | 48 ++++++++++++++++++++++++++++\n>>   lib/librte_eal/x86/include/rte_io.h     | 56 +++++++++++++++++++++++++++++++++\n>>   3 files changed, 116 insertions(+)\n>>\n>> diff --git a/lib/librte_eal/arm/include/rte_io_64.h b/lib/librte_eal/arm/include/rte_io_64.h\n>> index e534624..d07d9cb 100644\n>> --- a/lib/librte_eal/arm/include/rte_io_64.h\n>> +++ b/lib/librte_eal/arm/include/rte_io_64.h\n>> @@ -164,6 +164,18 @@ rte_write64(uint64_t value, volatile void *addr)\n>>          rte_write64_relaxed(value, addr);\n>>   }\n>>\n>> +static __rte_always_inline void\n>> +rte_write32_wc(uint32_t value, volatile void *addr)\n>> +{\n>> +       rte_write32(value, addr);\n>> +}\n>> +\n>> +static __rte_always_inline void\n>> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n>> +{\n>> +       rte_write32_relaxed(value, addr);\n>> +}\n>> +\n> We were using a single knob RTE_OVERRIDE_IO_H for overriding the whole rte_io.h.\n> Now we would have a special case for an API for x86 and the code is\n> copy/pasted in the ARM header and keeping the \"whole\" override mode.\n>\n> This leaves an unfinished taste.\n>\n> Why did you not flag all relevant \"native\" helpers?\n> This would factor some code from the ARM header.\nI agree that having a more granular approach is better, having a single \nknob is why ARM header had about half of the functions overridden and \nhalf copied and pasted before this patch. But this is outside the scope \nof this patch.\n>\n>\n>>   #ifdef __cplusplus\n>>   }\n>>   #endif\n>> diff --git a/lib/librte_eal/include/generic/rte_io.h b/lib/librte_eal/include/generic/rte_io.h\n>> index da457f7..0669baa 100644\n>> --- a/lib/librte_eal/include/generic/rte_io.h\n>> +++ b/lib/librte_eal/include/generic/rte_io.h\n>> @@ -229,6 +229,40 @@ rte_write32(uint32_t value, volatile void *addr);\n>>   static inline void\n>>   rte_write64(uint64_t value, volatile void *addr);\n>>\n>> +/**\n>> + * Write a 32-bit value to I/O device memory address addr using write\n>> + * combining memory write protocol. Depending on the platform write combining\n>> + * may not be available and/or may be treated as a hint and the behavior may\n>> + * fallback to a regular store.\n>> + *\n>> + * @param value\n>> + *  Value to write\n>> + * @param addr\n>> + *  I/O memory address to write the value to\n>> + */\n>> +__rte_experimental\n>> +static inline void\n>> +rte_write32_wc(uint32_t value, volatile void *addr);\n>> +\n>> +/**\n>> + * Write a 32-bit value to I/O device memory address addr using write\n>> + * combining memory write protocol. Depending on the platform write combining\n>> + * may not be available and/or may be treated as a hint and the behavior may\n>> + * fallback to a regular store.\n>> + *\n>> + * The relaxed version does not have additional I/O memory barrier, useful in\n>> + * accessing the device registers of integrated controllers which implicitly\n>> + * strongly ordered with respect to memory access.\n> It might be just me, but I have trouble reading the last part of this sentence.\n> Maybe remove \"with respect to\"?\nIt was copied and pasted from the other _relaxed functions for \nconsistency reasons.\n>\n>\n>> + *\n>> + * @param value\n>> + *  Value to write\n>> + * @param addr\n>> + *  I/O memory address to write the value to\n>> + */\n>> +__rte_experimental\n>> +static inline void\n>> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr);\n>> +\n>>   #endif /* __DOXYGEN__ */\n>>\n>>   #ifndef RTE_OVERRIDE_IO_H\n>> @@ -345,6 +379,20 @@ rte_write64(uint64_t value, volatile void *addr)\n>>          rte_write64_relaxed(value, addr);\n>>   }\n>>\n>> +#ifndef RTE_NATIVE_WRITE32_WC\n>> +static __rte_always_inline void\n>> +rte_write32_wc(uint32_t value, volatile void *addr)\n>> +{\n>> +       rte_write32(value, addr);\n>> +}\n>> +\n>> +static __rte_always_inline void\n>> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n>> +{\n>> +       rte_write32_relaxed(value, addr);\n>> +}\n>> +#endif /* RTE_NATIVE_WRITE32_WC */\n>> +\n>>   #endif /* RTE_OVERRIDE_IO_H */\n>>\n>>   #endif /* _RTE_IO_H_ */\n>> diff --git a/lib/librte_eal/x86/include/rte_io.h b/lib/librte_eal/x86/include/rte_io.h\n>> index 2db71b1..c95ed67 100644\n>> --- a/lib/librte_eal/x86/include/rte_io.h\n>> +++ b/lib/librte_eal/x86/include/rte_io.h\n>> @@ -9,8 +9,64 @@\n>>   extern \"C\" {\n>>   #endif\n>>\n>> +#include \"rte_cpuflags.h\"\n>> +\n>> +#define RTE_NATIVE_WRITE32_WC\n>>   #include \"generic/rte_io.h\"\n>>\n>> +/**\n>> + * @internal\n>> + * MOVDIRI wrapper.\n>> + */\n>> +static __rte_always_inline void\n>> +_rte_x86_movdiri(uint32_t value, volatile void *addr)\n>> +{\n>> +       asm volatile(\n>> +               /* MOVDIRI */\n>> +               \".byte 0x40, 0x0f, 0x38, 0xf9, 0x02\"\n>> +               :\n>> +               : \"a\" (value), \"d\" (addr));\n>> +}\n>> +\n>> +static __rte_always_inline void\n>> +rte_write32_wc(uint32_t value, volatile void *addr)\n>> +{\n>> +       static int _x86_movdiri_flag = -1;\n>> +       if (_x86_movdiri_flag == 1) {\n>> +               rte_wmb();\n>> +               _rte_x86_movdiri(value, addr);\n>> +       } else if (_x86_movdiri_flag == 0) {\n>> +               rte_write32(value, addr);\n>> +       } else {\n>> +               _x86_movdiri_flag =\n>> +                       (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);\n>> +               if (_x86_movdiri_flag == 1) {\n>> +                       rte_wmb();\n>> +                       _rte_x86_movdiri(value, addr);\n>> +               } else {\n>> +                       rte_write32(value, addr);\n>> +               }\n>> +       }\n>> +}\n>> +\n>> +static __rte_always_inline void\n>> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n>> +{\n>> +       static int _x86_movdiri_flag = -1;\n>> +       if (_x86_movdiri_flag == 1) {\n>> +               _rte_x86_movdiri(value, addr);\n>> +       } else if (_x86_movdiri_flag == 0) {\n>> +               rte_write32_relaxed(value, addr);\n>> +       } else {\n>> +               _x86_movdiri_flag =\n>> +                       (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);\n>> +               if (_x86_movdiri_flag == 1)\n>> +                       _rte_x86_movdiri(value, addr);\n>> +               else\n>> +                       rte_write32_relaxed(value, addr);\n>> +       }\n>> +}\n>> +\n> Repeating some comments I made earlier.\n>\n> - If a single helper called by both rte_write32_wc and\n> rte_write32_wc_relaxed with a _constant_ flag is not to your liking (I\n> don't see where it would have an impact on performance), then maybe\n> rte_write32_wc() can be simply implemented as:\n>\n> +static __rte_always_inline void\n> +rte_write32_wc(uint32_t value, volatile void *addr)\n> +{\n> +    rte_wmb();\n> +    rte_write32_wc_relaxed(value, addr);\n> +}\n> +\nYes, it can be written as such, I will update the patch.\n>\n> - Looking at this above suggestion, I wonder about the non-relaxed case.\n> Is rte_io_wmb() not enough?\nNo, we need an actual memory fence, on x86 rte_io_wmb() is defined as \nrte_compiler_barrier()\n>\n>\n> - The cpuflag check can be resolved at init once and for all.\n> By this, I mean in lib/librte_eal/x86/rte_cpuflags.c:\n>\n> +int rte_x86_movdiri_flag = -1;\n> +\n> +RTE_INIT(rte_x86_movdiri_init)\n> +{\n> +       rte_x86_movdiri_flag =\n> +               (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MOVDIRI) > 0);\n> +}\n>\n> The variable can be exported in lib/librte_eal/x86/include/rte_cpuflags.h.\n>\n> Then rte_write32_wc_relaxed() becomes:\n>\n> +static __rte_always_inline void\n> +rte_write32_wc_relaxed(uint32_t value, volatile void *addr)\n> +{\n> +    if (rte_x86_movdiri_flag == 1) {\n> +        asm volatile(\n> +            /* MOVDIRI */\n> +            \".byte 0x40, 0x0f, 0x38, 0xf9, 0x02\"\n> +            :\n> +            : \"a\" (value), \"d\" (addr));\n> +        return;\n> +    }\n> +\n> +    rte_write32_relaxed(value, addr);\n> +}\n> +\nI tried this before sending the patch, there is an issue when linking \nthe shared library version with the --no-undefined flag. This will \nrequire to have the variable exported in the overall map file, requiring \na change across all platforms for something that is very x86 specific.\n>\n> Thanks.\n>",
        "headers": {
            "List-Subscribe": "<https://mails.dpdk.org/listinfo/dev>,\n <mailto:dev-request@dpdk.org?subject=subscribe>",
            "X-IronPort-AV": [
                "E=McAfee;i=\"6000,8403,9688\"; a=\"130164930\"",
                "E=Sophos;i=\"5.75,378,1589266800\"; d=\"scan'208\";a=\"130164930\"",
                "E=Sophos;i=\"5.75,378,1589266800\"; d=\"scan'208\";a=\"431912343\""
            ],
            "IronPort-SDR": [
                "\n Phendh6LRZZHqQDI9avcD6ihrHpmo2tWOF1yggEQbmwelGvbb8TxcKH/T2OIcK9tWkw9qsNw6w\n 2qajvt28RYGg==",
                "\n mUKReZVSgk5KbnqvKFw5qVMv90YLn8VqbE4eYtJixFUQrDNrJFw7vjJxcyzxNq4sSXhghIWO7N\n /X/8RXxPTMBw=="
            ],
            "X-Amp-File-Uploaded": "False",
            "Precedence": "list",
            "X-Mailman-Version": "2.1.15",
            "X-Original-To": "patchwork@inbox.dpdk.org",
            "Sender": "\"dev\" <dev-bounces@dpdk.org>",
            "List-Post": "<mailto:dev@dpdk.org>",
            "MIME-Version": "1.0",
            "List-Id": "DPDK patches and discussions <dev.dpdk.org>",
            "X-BeenThere": "dev@dpdk.org",
            "References": "<1591870283-7776-1-git-send-email-radu.nicolau@intel.com>\n <1595236337-28230-1-git-send-email-radu.nicolau@intel.com>\n <1595236337-28230-2-git-send-email-radu.nicolau@intel.com>\n <CAJFAV8wm2EBusAfWTKRTYQ1tyRYkMhx=DVu8NbPUBvXTbtnjyA@mail.gmail.com>",
            "Subject": "Re: [dpdk-dev] [PATCH v9 1/4] eal: add WC store functions",
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101\n Thunderbird/68.10.0",
            "Content-Type": "text/plain; charset=utf-8; format=flowed",
            "From": "\"Nicolau, Radu\" <radu.nicolau@intel.com>",
            "Received": [
                "from dpdk.org (dpdk.org [92.243.14.124])\n\tby inbox.dpdk.org (Postfix) with ESMTP id 0152BA0526;\n\tTue, 21 Jul 2020 10:57:01 +0200 (CEST)",
                "from [92.243.14.124] (localhost [127.0.0.1])\n\tby dpdk.org (Postfix) with ESMTP id C4FA41BFEF;\n\tTue, 21 Jul 2020 10:57:00 +0200 (CEST)",
                "from mga17.intel.com (mga17.intel.com [192.55.52.151])\n by dpdk.org (Postfix) with ESMTP id 92CE71BFE4\n for <dev@dpdk.org>; Tue, 21 Jul 2020 10:56:58 +0200 (CEST)",
                "from orsmga004.jf.intel.com ([10.7.209.38])\n by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;\n 21 Jul 2020 01:56:57 -0700",
                "from rnicolau-mobl1.ger.corp.intel.com (HELO [10.213.251.241])\n ([10.213.251.241])\n by orsmga004.jf.intel.com with ESMTP; 21 Jul 2020 01:56:54 -0700"
            ],
            "List-Help": "<mailto:dev-request@dpdk.org?subject=help>",
            "X-Amp-Result": "SKIPPED(no attachment in message)",
            "List-Archive": "<http://mails.dpdk.org/archives/dev/>",
            "List-Unsubscribe": "<https://mails.dpdk.org/options/dev>,\n <mailto:dev-request@dpdk.org?subject=unsubscribe>",
            "X-ExtLoop1": "1",
            "Message-ID": "<b182a9f2-dee5-d3b9-a80b-ef97cc28766b@intel.com>",
            "Date": "Tue, 21 Jul 2020 09:56:53 +0100",
            "Content-Transfer-Encoding": "7bit",
            "To": "David Marchand <david.marchand@redhat.com>",
            "Delivered-To": "patchwork@inbox.dpdk.org",
            "In-Reply-To": "\n <CAJFAV8wm2EBusAfWTKRTYQ1tyRYkMhx=DVu8NbPUBvXTbtnjyA@mail.gmail.com>",
            "Cc": "dev <dev@dpdk.org>, Beilei Xing <beilei.xing@intel.com>,\n Jeff Guo <jia.guo@intel.com>, Bruce Richardson <bruce.richardson@intel.com>,\n \"Ananyev, Konstantin\" <konstantin.ananyev@intel.com>,\n Jerin Jacob <jerinjacobk@gmail.com>, \"Trahe, Fiona\" <fiona.trahe@intel.com>,\n Wei Zhao <wei.zhao1@intel.com>,\n \"Ruifeng Wang (Arm Technology China)\" <ruifeng.wang@arm.com>",
            "Return-Path": "<dev-bounces@dpdk.org>",
            "Errors-To": "dev-bounces@dpdk.org",
            "Content-Language": "en-GB"
        }
    }
]