Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/patches/563/?format=api
{ "id": 563, "url": "
https://patches.dpdk.org/api/patches/563/?format=api", "web_url": "https://patches.dpdk.org/project/dpdk/patch/1411724018-7738-7-git-send-email-bjzhuc@cn.ibm.com/", "project": { "id": 1, "url": "https://patches.dpdk.org/api/projects/1/?format=api", "name": "DPDK", "link_name": "dpdk", "list_id": "dev.dpdk.org", "list_email": "dev@dpdk.org", "web_url": "http://core.dpdk.org", "scm_url": "git://dpdk.org/dpdk", "webscm_url": "http://git.dpdk.org/dpdk", "list_archive_url": "https://inbox.dpdk.org/dev", "list_archive_url_format": "https://inbox.dpdk.org/dev/{}", "commit_url_format": "" }, "msgid": "<1411724018-7738-7-git-send-email-bjzhuc@cn.ibm.com>", "list_archive_url": "https://inbox.dpdk.org/dev/1411724018-7738-7-git-send-email-bjzhuc@cn.ibm.com", "date": "2014-09-26T09:33:37", "name": "[dpdk-dev,6/7] Split memcpy operation to architecture specific", "commit_ref": null, "pull_url": null, "state": "superseded", "archived": true, "hash": "e502fe65be6d40a3c87808c6d7472d81a15a29da", "submitter": { "id": 80, "url": "https://patches.dpdk.org/api/people/80/?format=api", "name": "Chao Zhu", "email": "bjzhuc@cn.ibm.com" }, "delegate": null, "mbox": "https://patches.dpdk.org/project/dpdk/patch/1411724018-7738-7-git-send-email-bjzhuc@cn.ibm.com/mbox/", "series": [], "comments": "https://patches.dpdk.org/api/patches/563/comments/", "check": "pending", "checks": "https://patches.dpdk.org/api/patches/563/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 33B497E04;\n\tFri, 26 Sep 2014 11:27:50 +0200 (CEST)", "from e9.ny.us.ibm.com (e9.ny.us.ibm.com [32.97.182.139])\n\tby dpdk.org (Postfix) with ESMTP id 8F5237DEC\n\tfor <dev@dpdk.org>; Fri, 26 Sep 2014 11:27:46 +0200 (CEST)", "from /spool/local\n\tby e9.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use 
Only!\n\tViolators will be prosecuted\n\tfor <dev@dpdk.org> from <bjzhuc@cn.ibm.com>;\n\tFri, 26 Sep 2014 05:34:07 -0400", "from d01dlp01.pok.ibm.com (9.56.250.166)\n\tby e9.ny.us.ibm.com (192.168.1.109) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tFri, 26 Sep 2014 05:34:05 -0400", "from b01cxnp22033.gho.pok.ibm.com (b01cxnp22033.gho.pok.ibm.com\n\t[9.57.198.23])\n\tby d01dlp01.pok.ibm.com (Postfix) with ESMTP id 5710B38C8039\n\tfor <dev@dpdk.org>; Fri, 26 Sep 2014 05:34:05 -0400 (EDT)", "from d01av05.pok.ibm.com (d01av05.pok.ibm.com [9.56.224.195])\n\tby b01cxnp22033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP\n\tid\n\ts8Q9XvrA4129098 for <dev@dpdk.org>; Fri, 26 Sep 2014 09:34:05 GMT", "from d01av05.pok.ibm.com (localhost [127.0.0.1])\n\tby d01av05.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id\n\ts8Q9XW6k000947 for <dev@dpdk.org>; Fri, 26 Sep 2014 05:33:32 -0400", "from d01hub02.pok.ibm.com (d01hub02.pok.ibm.com [9.63.10.236])\n\tby d01av05.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id\n\ts8Q9XWEX000704 for <dev@dpdk.org>; Fri, 26 Sep 2014 05:33:32 -0400", "from localhost.localdomain ([9.186.57.14])\n\tby rescrl1.research.ibm.com (IBM Domino Release 9.0.1)\n\twith ESMTP id 2014092617324764-312540 ;\n\tFri, 26 Sep 2014 17:32:47 +0800 " ], "From": "Chao Zhu <bjzhuc@cn.ibm.com>", "To": "dev@dpdk.org", "Date": "Fri, 26 Sep 2014 05:33:37 -0400", "Message-Id": "<1411724018-7738-7-git-send-email-bjzhuc@cn.ibm.com>", "X-Mailer": "git-send-email 1.7.1", "In-Reply-To": "<1411724018-7738-1-git-send-email-bjzhuc@cn.ibm.com>", "References": "<1411724018-7738-1-git-send-email-bjzhuc@cn.ibm.com>", "X-MIMETrack": "Itemize by SMTP Server on\n\trescrl1/Research/Affiliated/IBM(Release\n\t9.0.1|October 14, 2013) at 2014/09/26 17:32:47,\n\tSerialize by Router on D01HUB02/01/H/IBM(Release 8.5.3FP2\n\tZX853FP2HF5|February, 2013) at 09/26/2014 05:33:32,\n\tSerialize complete at 09/26/2014 05:33:32", "X-TM-AS-MML": 
"disable", "X-Content-Scanned": "Fidelis XPS MAILER", "x-cbid": "14092609-7182-0000-0000-0000008E3761", "Subject": "[dpdk-dev] [PATCH 6/7] Split memcpy operation to architecture\n\tspecific", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "content": "This patch splits the vector instruction based memory copy from DPDK and\npush them to architecture specific arch directories, so that other\nprocessor architecture to support DPDK can be easily adopted.\n\nSigned-off-by: Chao Zhu <bjzhuc@cn.ibm.com>\n---\n lib/librte_eal/common/Makefile | 2 +-\n .../common/include/i686/arch/rte_memcpy_arch.h | 199 ++++++++++++++++++++\n lib/librte_eal/common/include/rte_memcpy.h | 95 +---------\n .../common/include/x86_64/arch/rte_memcpy_arch.h | 199 ++++++++++++++++++++\n 4 files changed, 406 insertions(+), 89 deletions(-)\n create mode 100644 lib/librte_eal/common/include/i686/arch/rte_memcpy_arch.h\n create mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_memcpy_arch.h\n\n\\ No newline at end of file", "diff": "diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile\nindex 249ea2f..4add1c1 100644\n--- a/lib/librte_eal/common/Makefile\n+++ b/lib/librte_eal/common/Makefile\n@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)\n INC += rte_warnings.h\n endif\n \n-ARCH_INC := rte_atomic.h rte_atomic_arch.h rte_byteorder_arch.h rte_cycles_arch.h rte_prefetch_arch.h rte_spinlock_arch.h\n+ARCH_INC := 
rte_atomic.h rte_atomic_arch.h rte_byteorder_arch.h rte_cycles_arch.h rte_prefetch_arch.h rte_spinlock_arch.h rte_memcpy_arch.h \n \n SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))\n SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/arch := \\\ndiff --git a/lib/librte_eal/common/include/i686/arch/rte_memcpy_arch.h b/lib/librte_eal/common/include/i686/arch/rte_memcpy_arch.h\nnew file mode 100644\nindex 0000000..44f7760\n--- /dev/null\n+++ b/lib/librte_eal/common/include/i686/arch/rte_memcpy_arch.h\n@@ -0,0 +1,199 @@\n+/*-\n+ * BSD LICENSE\n+ *\n+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n+ * All rights reserved.\n+ *\n+ * Redistribution and use in source and binary forms, with or without\n+ * modification, are permitted provided that the following conditions\n+ * are met:\n+ *\n+ * * Redistributions of source code must retain the above copyright\n+ * notice, this list of conditions and the following disclaimer.\n+ * * Redistributions in binary form must reproduce the above copyright\n+ * notice, this list of conditions and the following disclaimer in\n+ * the documentation and/or other materials provided with the\n+ * distribution.\n+ * * Neither the name of Intel Corporation nor the names of its\n+ * contributors may be used to endorse or promote products derived\n+ * from this software without specific prior written permission.\n+ *\n+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n+ * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n+ * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n+ */\n+\n+#ifndef _RTE_MEMCPY_ARCH_H_\n+#define _RTE_MEMCPY_ARCH_H_\n+\n+#include <stdint.h>\n+#include <string.h>\n+#include <emmintrin.h>\n+\n+#ifdef __INTEL_COMPILER\n+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */\n+#endif\n+\n+/**\n+ * Copy 16 bytes from one location to another using optimised SSE\n+ * instructions. The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov16(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 32 bytes from one location to another using optimised SSE\n+ * instructions. 
The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov32(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t [reg_b] \"=x\" (reg_b)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 48 bytes from one location to another using optimised SSE\n+ * instructions. The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov48(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b, reg_c;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t [reg_b] \"=x\" (reg_b),\n+\t\t [reg_c] \"=x\" (reg_c)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 64 bytes from one location to another using optimised SSE\n+ * instructions. 
The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov64(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b, reg_c, reg_d;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n+\t\t\"movdqu 48(%[src]), %[reg_d]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_d], 48(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t [reg_b] \"=x\" (reg_b),\n+\t\t [reg_c] \"=x\" (reg_c),\n+\t\t [reg_d] \"=x\" (reg_d)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 128 bytes from one location to another using optimised SSE\n+ * instructions. The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov128(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n+\t\t\"movdqu 48(%[src]), %[reg_d]\\n\\t\"\n+\t\t\"movdqu 64(%[src]), %[reg_e]\\n\\t\"\n+\t\t\"movdqu 80(%[src]), %[reg_f]\\n\\t\"\n+\t\t\"movdqu 96(%[src]), %[reg_g]\\n\\t\"\n+\t\t\"movdqu 112(%[src]), %[reg_h]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_d], 48(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_e], 64(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_f], 80(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_g], 96(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_h], 112(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t 
[reg_b] \"=x\" (reg_b),\n+\t\t [reg_c] \"=x\" (reg_c),\n+\t\t [reg_d] \"=x\" (reg_d),\n+\t\t [reg_e] \"=x\" (reg_e),\n+\t\t [reg_f] \"=x\" (reg_f),\n+\t\t [reg_g] \"=x\" (reg_g),\n+\t\t [reg_h] \"=x\" (reg_h)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+#endif /* _RTE_MEMCPY_ARCH_H_ */\n\\ No newline at end of file\ndiff --git a/lib/librte_eal/common/include/rte_memcpy.h b/lib/librte_eal/common/include/rte_memcpy.h\nindex 131b196..11a099e 100644\n--- a/lib/librte_eal/common/include/rte_memcpy.h\n+++ b/lib/librte_eal/common/include/rte_memcpy.h\n@@ -37,12 +37,10 @@\n /**\n * @file\n *\n- * Functions for SSE implementation of memcpy().\n+ * Functions for vector instruction implementation of memcpy().\n */\n \n-#include <stdint.h>\n-#include <string.h>\n-#include <emmintrin.h>\n+#include \"arch/rte_memcpy_arch.h\"\n \n #ifdef __cplusplus\n extern \"C\" {\n@@ -64,15 +62,7 @@ extern \"C\" {\n static inline void\n rte_mov16(uint8_t *dst, const uint8_t *src)\n {\n-\t__m128i reg_a;\n-\tasm volatile (\n-\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n-\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n-\t\t: [reg_a] \"=x\" (reg_a)\n-\t\t: [src] \"r\" (src),\n-\t\t [dst] \"r\"(dst)\n-\t\t: \"memory\"\n-\t);\n+\trte_arch_mov16(dst, src);\n }\n \n /**\n@@ -87,18 +77,7 @@ rte_mov16(uint8_t *dst, const uint8_t *src)\n static inline void\n rte_mov32(uint8_t *dst, const uint8_t *src)\n {\n-\t__m128i reg_a, reg_b;\n-\tasm volatile (\n-\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n-\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n-\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n-\t\t: [reg_a] \"=x\" (reg_a),\n-\t\t [reg_b] \"=x\" (reg_b)\n-\t\t: [src] \"r\" (src),\n-\t\t [dst] \"r\"(dst)\n-\t\t: \"memory\"\n-\t);\n+\trte_arch_mov32(dst, src);\n }\n \n /**\n@@ -113,21 +92,7 @@ rte_mov32(uint8_t *dst, const uint8_t *src)\n static inline void\n rte_mov48(uint8_t *dst, const uint8_t *src)\n {\n-\t__m128i reg_a, reg_b, reg_c;\n-\tasm 
volatile (\n-\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n-\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n-\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n-\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n-\t\t: [reg_a] \"=x\" (reg_a),\n-\t\t [reg_b] \"=x\" (reg_b),\n-\t\t [reg_c] \"=x\" (reg_c)\n-\t\t: [src] \"r\" (src),\n-\t\t [dst] \"r\"(dst)\n-\t\t: \"memory\"\n-\t);\n+\trte_arch_mov48(dst, src);\n }\n \n /**\n@@ -142,24 +107,7 @@ rte_mov48(uint8_t *dst, const uint8_t *src)\n static inline void\n rte_mov64(uint8_t *dst, const uint8_t *src)\n {\n-\t__m128i reg_a, reg_b, reg_c, reg_d;\n-\tasm volatile (\n-\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n-\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n-\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n-\t\t\"movdqu 48(%[src]), %[reg_d]\\n\\t\"\n-\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_d], 48(%[dst])\\n\\t\"\n-\t\t: [reg_a] \"=x\" (reg_a),\n-\t\t [reg_b] \"=x\" (reg_b),\n-\t\t [reg_c] \"=x\" (reg_c),\n-\t\t [reg_d] \"=x\" (reg_d)\n-\t\t: [src] \"r\" (src),\n-\t\t [dst] \"r\"(dst)\n-\t\t: \"memory\"\n-\t);\n+\trte_arch_mov64(dst, src);\n }\n \n /**\n@@ -174,36 +122,7 @@ rte_mov64(uint8_t *dst, const uint8_t *src)\n static inline void\n rte_mov128(uint8_t *dst, const uint8_t *src)\n {\n-\t__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;\n-\tasm volatile (\n-\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n-\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n-\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n-\t\t\"movdqu 48(%[src]), %[reg_d]\\n\\t\"\n-\t\t\"movdqu 64(%[src]), %[reg_e]\\n\\t\"\n-\t\t\"movdqu 80(%[src]), %[reg_f]\\n\\t\"\n-\t\t\"movdqu 96(%[src]), %[reg_g]\\n\\t\"\n-\t\t\"movdqu 112(%[src]), %[reg_h]\\n\\t\"\n-\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n-\t\t\"movdqu 
%[reg_d], 48(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_e], 64(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_f], 80(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_g], 96(%[dst])\\n\\t\"\n-\t\t\"movdqu %[reg_h], 112(%[dst])\\n\\t\"\n-\t\t: [reg_a] \"=x\" (reg_a),\n-\t\t [reg_b] \"=x\" (reg_b),\n-\t\t [reg_c] \"=x\" (reg_c),\n-\t\t [reg_d] \"=x\" (reg_d),\n-\t\t [reg_e] \"=x\" (reg_e),\n-\t\t [reg_f] \"=x\" (reg_f),\n-\t\t [reg_g] \"=x\" (reg_g),\n-\t\t [reg_h] \"=x\" (reg_h)\n-\t\t: [src] \"r\" (src),\n-\t\t [dst] \"r\"(dst)\n-\t\t: \"memory\"\n-\t);\n+\trte_arch_mov128(dst, src);\n }\n \n #ifdef __INTEL_COMPILER\ndiff --git a/lib/librte_eal/common/include/x86_64/arch/rte_memcpy_arch.h b/lib/librte_eal/common/include/x86_64/arch/rte_memcpy_arch.h\nnew file mode 100644\nindex 0000000..44f7760\n--- /dev/null\n+++ b/lib/librte_eal/common/include/x86_64/arch/rte_memcpy_arch.h\n@@ -0,0 +1,199 @@\n+/*-\n+ * BSD LICENSE\n+ *\n+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n+ * All rights reserved.\n+ *\n+ * Redistribution and use in source and binary forms, with or without\n+ * modification, are permitted provided that the following conditions\n+ * are met:\n+ *\n+ * * Redistributions of source code must retain the above copyright\n+ * notice, this list of conditions and the following disclaimer.\n+ * * Redistributions in binary form must reproduce the above copyright\n+ * notice, this list of conditions and the following disclaimer in\n+ * the documentation and/or other materials provided with the\n+ * distribution.\n+ * * Neither the name of Intel Corporation nor the names of its\n+ * contributors may be used to endorse or promote products derived\n+ * from this software without specific prior written permission.\n+ *\n+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n+ * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n+ * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n+ */\n+\n+#ifndef _RTE_MEMCPY_ARCH_H_\n+#define _RTE_MEMCPY_ARCH_H_\n+\n+#include <stdint.h>\n+#include <string.h>\n+#include <emmintrin.h>\n+\n+#ifdef __INTEL_COMPILER\n+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */\n+#endif\n+\n+/**\n+ * Copy 16 bytes from one location to another using optimised SSE\n+ * instructions. The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov16(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 32 bytes from one location to another using optimised SSE\n+ * instructions. 
The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov32(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t [reg_b] \"=x\" (reg_b)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 48 bytes from one location to another using optimised SSE\n+ * instructions. The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov48(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b, reg_c;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t [reg_b] \"=x\" (reg_b),\n+\t\t [reg_c] \"=x\" (reg_c)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 64 bytes from one location to another using optimised SSE\n+ * instructions. 
The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov64(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b, reg_c, reg_d;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n+\t\t\"movdqu 48(%[src]), %[reg_d]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_d], 48(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t [reg_b] \"=x\" (reg_b),\n+\t\t [reg_c] \"=x\" (reg_c),\n+\t\t [reg_d] \"=x\" (reg_d)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+/**\n+ * Copy 128 bytes from one location to another using optimised SSE\n+ * instructions. The locations should not overlap.\n+ *\n+ * @param dst\n+ * Pointer to the destination of the data.\n+ * @param src\n+ * Pointer to the source data.\n+ */\n+static inline void\n+rte_arch_mov128(uint8_t *dst, const uint8_t *src)\n+{\n+\t__m128i reg_a, reg_b, reg_c, reg_d, reg_e, reg_f, reg_g, reg_h;\n+\tasm volatile (\n+\t\t\"movdqu (%[src]), %[reg_a]\\n\\t\"\n+\t\t\"movdqu 16(%[src]), %[reg_b]\\n\\t\"\n+\t\t\"movdqu 32(%[src]), %[reg_c]\\n\\t\"\n+\t\t\"movdqu 48(%[src]), %[reg_d]\\n\\t\"\n+\t\t\"movdqu 64(%[src]), %[reg_e]\\n\\t\"\n+\t\t\"movdqu 80(%[src]), %[reg_f]\\n\\t\"\n+\t\t\"movdqu 96(%[src]), %[reg_g]\\n\\t\"\n+\t\t\"movdqu 112(%[src]), %[reg_h]\\n\\t\"\n+\t\t\"movdqu %[reg_a], (%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_b], 16(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_c], 32(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_d], 48(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_e], 64(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_f], 80(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_g], 96(%[dst])\\n\\t\"\n+\t\t\"movdqu %[reg_h], 112(%[dst])\\n\\t\"\n+\t\t: [reg_a] \"=x\" (reg_a),\n+\t\t 
[reg_b] \"=x\" (reg_b),\n+\t\t [reg_c] \"=x\" (reg_c),\n+\t\t [reg_d] \"=x\" (reg_d),\n+\t\t [reg_e] \"=x\" (reg_e),\n+\t\t [reg_f] \"=x\" (reg_f),\n+\t\t [reg_g] \"=x\" (reg_g),\n+\t\t [reg_h] \"=x\" (reg_h)\n+\t\t: [src] \"r\" (src),\n+\t\t [dst] \"r\"(dst)\n+\t\t: \"memory\"\n+\t);\n+}\n+\n+#endif /* _RTE_MEMCPY_ARCH_H_ */\n", "prefixes": [ "dpdk-dev", "6/7" ] }
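The patch in the response above moves each SSE copy routine into a per-architecture `rte_memcpy_arch.h` and leaves `rte_memcpy.h` with thin wrappers such as `rte_mov16()` calling `rte_arch_mov16()`. The following is a minimal sketch of that layering, not the code from the patch: the `sketch_*` names are hypothetical, the x86 path uses the SSE2 intrinsics `_mm_loadu_si128` / `_mm_storeu_si128` instead of the patch's `movdqu` inline assembly, and the `memcpy` fallback is only an assumption about what a non-x86 architecture header might provide.

```c
#include <stdint.h>
#include <string.h>

/* "Architecture header" part: exactly one definition is selected per build. */
#if defined(__SSE2__)
#include <emmintrin.h>

/* x86 path: one unaligned 16-byte load followed by one unaligned store.
 * Intrinsics are used here in place of the patch's inline assembly. */
static inline void
sketch_arch_mov16(uint8_t *dst, const uint8_t *src)
{
	__m128i reg_a = _mm_loadu_si128((const __m128i *)src);
	_mm_storeu_si128((__m128i *)dst, reg_a);
}
#else
/* Assumed fallback for other architectures: plain memcpy of 16 bytes. */
static inline void
sketch_arch_mov16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}
#endif

/* "Generic header" part: the public wrapper only delegates, which is the
 * shape rte_mov16() takes after the patch. The locations should not overlap. */
static inline void
sketch_mov16(uint8_t *dst, const uint8_t *src)
{
	sketch_arch_mov16(dst, src);
}

/* Small self-check: copy a known 16-byte pattern and compare.
 * Returns 1 when the destination matches the source, 0 otherwise. */
static int
sketch_selfcheck(void)
{
	uint8_t src[16];
	uint8_t dst[16];
	int i;

	for (i = 0; i < 16; i++) {
		src[i] = (uint8_t)(i * 7 + 1);
		dst[i] = 0;
	}
	sketch_mov16(dst, src);
	return memcmp(dst, src, 16) == 0;
}
```

Callers only ever see `sketch_mov16()`, so supporting another processor would mean supplying a different `sketch_arch_mov16()` body without touching the wrapper, which is the stated goal of the patch description.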