Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/patches/2261/?format=api
https://patches.dpdk.org/api/patches/2261/?format=api", "web_url": "https://patches.dpdk.org/project/dpdk/patch/1421090181-17150-11-git-send-email-konstantin.ananyev@intel.com/", "project": { "id": 1, "url": "https://patches.dpdk.org/api/projects/1/?format=api", "name": "DPDK", "link_name": "dpdk", "list_id": "dev.dpdk.org", "list_email": "dev@dpdk.org", "web_url": "http://core.dpdk.org", "scm_url": "git://dpdk.org/dpdk", "webscm_url": "http://git.dpdk.org/dpdk", "list_archive_url": "https://inbox.dpdk.org/dev", "list_archive_url_format": "https://inbox.dpdk.org/dev/{}", "commit_url_format": "" }, "msgid": "<1421090181-17150-11-git-send-email-konstantin.ananyev@intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/1421090181-17150-11-git-send-email-konstantin.ananyev@intel.com", "date": "2015-01-12T19:16:14", "name": "[dpdk-dev,v2,10/17] EAL: introduce rte_ymm and relatives in rte_common_vect.h.", "commit_ref": null, "pull_url": null, "state": "superseded", "archived": true, "hash": "dc59341676cee27a173c26bb76865e6a2bc95165", "submitter": { "id": 33, "url": "https://patches.dpdk.org/api/people/33/?format=api", "name": "Ananyev, Konstantin", "email": "konstantin.ananyev@intel.com" }, "delegate": null, "mbox": "https://patches.dpdk.org/project/dpdk/patch/1421090181-17150-11-git-send-email-konstantin.ananyev@intel.com/mbox/", "series": [], "comments": "https://patches.dpdk.org/api/patches/2261/comments/", "check": "pending", "checks": "https://patches.dpdk.org/api/patches/2261/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id E13385B00;\n\tMon, 12 Jan 2015 20:17:08 +0100 (CET)", "from mga02.intel.com (mga02.intel.com [134.134.136.20])\n\tby dpdk.org (Postfix) with ESMTP id A74335A96\n\tfor <dev@dpdk.org>; Mon, 12 Jan 2015 20:16:39 +0100 (CET)", "from fmsmga001.fm.intel.com ([10.253.24.23])\n\tby orsmga101.jf.intel.com with ESMTP; 12 Jan 2015 11:16:37 -0800", "from irvmail001.ir.intel.com ([163.33.26.43])\n\tby fmsmga001.fm.intel.com with ESMTP; 12 Jan 2015 11:16:36 -0800", "from sivswdev02.ir.intel.com (sivswdev02.ir.intel.com\n\t[10.237.217.46])\n\tby irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id\n\tt0CJGZZO008637; Mon, 12 Jan 2015 19:16:36 GMT", "from sivswdev02.ir.intel.com (localhost [127.0.0.1])\n\tby sivswdev02.ir.intel.com with ESMTP id t0CJGZc0017271;\n\tMon, 12 Jan 2015 19:16:35 GMT", "(from kananye1@localhost)\n\tby sivswdev02.ir.intel.com with id t0CJGZD8017266;\n\tMon, 12 Jan 2015 19:16:35 GMT" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.07,745,1413270000\"; d=\"scan'208\";a=\"649962540\"", "From": "Konstantin Ananyev <konstantin.ananyev@intel.com>", "To": "dev@dpdk.org", "Date": "Mon, 12 Jan 2015 19:16:14 +0000", "Message-Id": "<1421090181-17150-11-git-send-email-konstantin.ananyev@intel.com>", "X-Mailer": "git-send-email 1.7.4.1", "In-Reply-To": "<1421090181-17150-1-git-send-email-konstantin.ananyev@intel.com>", "References": "<1421090181-17150-1-git-send-email-konstantin.ananyev@intel.com>", "Subject": "[dpdk-dev] [PATCH v2 10/17] EAL: introduce rte_ymm and relatives in\n\trte_common_vect.h.", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "content": "New data type to manipulate 256 bit AVX values.\nRename field in the rte_xmm to keep common naming accross SSE/AVX fields.\n\nSigned-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>\n---\n examples/l3fwd/main.c | 2 +-\n lib/librte_acl/acl_run_sse.c | 88 ++++++++++++-------------\n lib/librte_acl/rte_acl_osdep_alone.h | 35 +++++++++-\n lib/librte_eal/common/include/rte_common_vect.h | 27 +++++++-\n lib/librte_lpm/rte_lpm.h | 2 +-\n 5 files changed, 104 insertions(+), 50 deletions(-)", "diff": "diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c\nindex 918f2cb..6f7d7d4 100644\n--- a/examples/l3fwd/main.c\n+++ b/examples/l3fwd/main.c\n@@ -1170,7 +1170,7 @@ processx4_step2(const struct lcore_conf *qconf, __m128i dip, uint32_t flag,\n \tif (likely(flag != 0)) {\n \t\trte_lpm_lookupx4(qconf->ipv4_lookup_struct, dip, dprt, portid);\n \t} else {\n-\t\tdst.m = dip;\n+\t\tdst.x = dip;\n \t\tdprt[0] = get_dst_port(qconf, pkt[0], dst.u32[0], portid);\n \t\tdprt[1] = get_dst_port(qconf, pkt[1], dst.u32[1], portid);\n \t\tdprt[2] = get_dst_port(qconf, pkt[2], dst.u32[2], portid);\ndiff --git a/lib/librte_acl/acl_run_sse.c b/lib/librte_acl/acl_run_sse.c\nindex 09e32be..4605b58 100644\n--- a/lib/librte_acl/acl_run_sse.c\n+++ b/lib/librte_acl/acl_run_sse.c\n@@ -359,16 +359,16 @@ search_sse_8(const struct rte_acl_ctx *ctx, const uint8_t **data,\n \n \t /* Check for any matches. */\n \tacl_match_check_x4(0, ctx, parms, &flows,\n-\t\t&indices1, &indices2, mm_match_mask.m);\n+\t\t&indices1, &indices2, mm_match_mask.x);\n \tacl_match_check_x4(4, ctx, parms, &flows,\n-\t\t&indices3, &indices4, mm_match_mask.m);\n+\t\t&indices3, &indices4, mm_match_mask.x);\n \n \twhile (flows.started > 0) {\n \n \t\t/* Gather 4 bytes of input data for each stream. */\n-\t\tinput0 = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0),\n+\t\tinput0 = MM_INSERT32(mm_ones_16.x, GET_NEXT_4BYTES(parms, 0),\n \t\t\t0);\n-\t\tinput1 = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 4),\n+\t\tinput1 = MM_INSERT32(mm_ones_16.x, GET_NEXT_4BYTES(parms, 4),\n \t\t\t0);\n \n \t\tinput0 = MM_INSERT32(input0, GET_NEXT_4BYTES(parms, 1), 1);\n@@ -382,43 +382,43 @@ search_sse_8(const struct rte_acl_ctx *ctx, const uint8_t **data,\n \n \t\t /* Process the 4 bytes of input on each stream. */\n \n-\t\tinput0 = transition4(mm_index_mask.m, input0,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput0 = transition4(mm_index_mask.x, input0,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n-\t\tinput1 = transition4(mm_index_mask.m, input1,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput1 = transition4(mm_index_mask.x, input1,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices3, &indices4);\n \n-\t\tinput0 = transition4(mm_index_mask.m, input0,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput0 = transition4(mm_index_mask.x, input0,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n-\t\tinput1 = transition4(mm_index_mask.m, input1,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput1 = transition4(mm_index_mask.x, input1,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices3, &indices4);\n \n-\t\tinput0 = transition4(mm_index_mask.m, input0,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput0 = transition4(mm_index_mask.x, input0,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n-\t\tinput1 = transition4(mm_index_mask.m, input1,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput1 = transition4(mm_index_mask.x, input1,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices3, &indices4);\n \n-\t\tinput0 = transition4(mm_index_mask.m, input0,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput0 = transition4(mm_index_mask.x, input0,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n-\t\tinput1 = transition4(mm_index_mask.m, input1,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput1 = transition4(mm_index_mask.x, input1,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices3, &indices4);\n \n \t\t /* Check for any matches. */\n \t\tacl_match_check_x4(0, ctx, parms, &flows,\n-\t\t\t&indices1, &indices2, mm_match_mask.m);\n+\t\t\t&indices1, &indices2, mm_match_mask.x);\n \t\tacl_match_check_x4(4, ctx, parms, &flows,\n-\t\t\t&indices3, &indices4, mm_match_mask.m);\n+\t\t\t&indices3, &indices4, mm_match_mask.x);\n \t}\n \n \treturn 0;\n@@ -451,36 +451,36 @@ search_sse_4(const struct rte_acl_ctx *ctx, const uint8_t **data,\n \n \t/* Check for any matches. */\n \tacl_match_check_x4(0, ctx, parms, &flows,\n-\t\t&indices1, &indices2, mm_match_mask.m);\n+\t\t&indices1, &indices2, mm_match_mask.x);\n \n \twhile (flows.started > 0) {\n \n \t\t/* Gather 4 bytes of input data for each stream. */\n-\t\tinput = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0), 0);\n+\t\tinput = MM_INSERT32(mm_ones_16.x, GET_NEXT_4BYTES(parms, 0), 0);\n \t\tinput = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 1), 1);\n \t\tinput = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 2), 2);\n \t\tinput = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 3), 3);\n \n \t\t/* Process the 4 bytes of input on each stream. */\n-\t\tinput = transition4(mm_index_mask.m, input,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\tinput = transition4(mm_index_mask.x, input,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n-\t\t input = transition4(mm_index_mask.m, input,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\t input = transition4(mm_index_mask.x, input,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n-\t\t input = transition4(mm_index_mask.m, input,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\t input = transition4(mm_index_mask.x, input,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n-\t\t input = transition4(mm_index_mask.m, input,\n-\t\t\tmm_shuffle_input.m, mm_ones_16.m,\n+\t\t input = transition4(mm_index_mask.x, input,\n+\t\t\tmm_shuffle_input.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices1, &indices2);\n \n \t\t/* Check for any matches. */\n \t\tacl_match_check_x4(0, ctx, parms, &flows,\n-\t\t\t&indices1, &indices2, mm_match_mask.m);\n+\t\t\t&indices1, &indices2, mm_match_mask.x);\n \t}\n \n \treturn 0;\n@@ -534,35 +534,35 @@ search_sse_2(const struct rte_acl_ctx *ctx, const uint8_t **data,\n \tindices = MM_LOADU((xmm_t *) &index_array[0]);\n \n \t/* Check for any matches. */\n-\tacl_match_check_x2(0, ctx, parms, &flows, &indices, mm_match_mask64.m);\n+\tacl_match_check_x2(0, ctx, parms, &flows, &indices, mm_match_mask64.x);\n \n \twhile (flows.started > 0) {\n \n \t\t/* Gather 4 bytes of input data for each stream. */\n-\t\tinput = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0), 0);\n+\t\tinput = MM_INSERT32(mm_ones_16.x, GET_NEXT_4BYTES(parms, 0), 0);\n \t\tinput = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 1), 1);\n \n \t\t/* Process the 4 bytes of input on each stream. */\n \n-\t\tinput = transition2(mm_index_mask64.m, input,\n-\t\t\tmm_shuffle_input64.m, mm_ones_16.m,\n+\t\tinput = transition2(mm_index_mask64.x, input,\n+\t\t\tmm_shuffle_input64.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices);\n \n-\t\tinput = transition2(mm_index_mask64.m, input,\n-\t\t\tmm_shuffle_input64.m, mm_ones_16.m,\n+\t\tinput = transition2(mm_index_mask64.x, input,\n+\t\t\tmm_shuffle_input64.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices);\n \n-\t\tinput = transition2(mm_index_mask64.m, input,\n-\t\t\tmm_shuffle_input64.m, mm_ones_16.m,\n+\t\tinput = transition2(mm_index_mask64.x, input,\n+\t\t\tmm_shuffle_input64.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices);\n \n-\t\tinput = transition2(mm_index_mask64.m, input,\n-\t\t\tmm_shuffle_input64.m, mm_ones_16.m,\n+\t\tinput = transition2(mm_index_mask64.x, input,\n+\t\t\tmm_shuffle_input64.x, mm_ones_16.x,\n \t\t\tflows.trans, &indices);\n \n \t\t/* Check for any matches. */\n \t\tacl_match_check_x2(0, ctx, parms, &flows, &indices,\n-\t\t\tmm_match_mask64.m);\n+\t\t\tmm_match_mask64.x);\n \t}\n \n \treturn 0;\ndiff --git a/lib/librte_acl/rte_acl_osdep_alone.h b/lib/librte_acl/rte_acl_osdep_alone.h\nindex 2a99860..58c4f6a 100644\n--- a/lib/librte_acl/rte_acl_osdep_alone.h\n+++ b/lib/librte_acl/rte_acl_osdep_alone.h\n@@ -57,6 +57,10 @@\n #include <smmintrin.h>\n #endif\n \n+#if defined(__AVX__)\n+#include <immintrin.h>\n+#endif\n+\n #else\n \n #include <x86intrin.h>\n@@ -128,8 +132,8 @@ typedef __m128i xmm_t;\n #define\tXMM_SIZE\t(sizeof(xmm_t))\n #define\tXMM_MASK\t(XMM_SIZE - 1)\n \n-typedef union rte_mmsse {\n-\txmm_t m;\n+typedef union rte_xmm {\n+\txmm_t x;\n \tuint8_t u8[XMM_SIZE / sizeof(uint8_t)];\n \tuint16_t u16[XMM_SIZE / sizeof(uint16_t)];\n \tuint32_t u32[XMM_SIZE / sizeof(uint32_t)];\n@@ -137,6 +141,33 @@ typedef union rte_mmsse {\n \tdouble pd[XMM_SIZE / sizeof(double)];\n } rte_xmm_t;\n \n+#ifdef __AVX__\n+\n+typedef __m256i ymm_t;\n+\n+#define\tYMM_SIZE\t(sizeof(ymm_t))\n+#define\tYMM_MASK\t(YMM_SIZE - 1)\n+\n+typedef union rte_ymm {\n+\tymm_t y;\n+\txmm_t x[YMM_SIZE / sizeof(xmm_t)];\n+\tuint8_t u8[YMM_SIZE / sizeof(uint8_t)];\n+\tuint16_t u16[YMM_SIZE / sizeof(uint16_t)];\n+\tuint32_t u32[YMM_SIZE / sizeof(uint32_t)];\n+\tuint64_t u64[YMM_SIZE / sizeof(uint64_t)];\n+\tdouble pd[YMM_SIZE / sizeof(double)];\n+} rte_ymm_t;\n+\n+#endif /* __AVX__ */\n+\n+#ifdef RTE_ARCH_I686\n+#define _mm_cvtsi128_si64(a) ({ \\\n+\trte_xmm_t m; \\\n+\tm.x = (a); \\\n+\t(m.u64[0]); \\\n+})\n+#endif\n+\n /*\n * rte_cycles related.\n */\ndiff --git a/lib/librte_eal/common/include/rte_common_vect.h b/lib/librte_eal/common/include/rte_common_vect.h\nindex 95bf4b1..617470b 100644\n--- a/lib/librte_eal/common/include/rte_common_vect.h\n+++ b/lib/librte_eal/common/include/rte_common_vect.h\n@@ -54,6 +54,10 @@\n #include <smmintrin.h>\n #endif\n \n+#if defined(__AVX__)\n+#include <immintrin.h>\n+#endif\n+\n #else\n \n #include <x86intrin.h>\n@@ -70,7 +74,7 @@ typedef __m128i xmm_t;\n #define\tXMM_MASK\t(XMM_SIZE - 1)\n \n typedef union rte_xmm {\n-\txmm_t m;\n+\txmm_t x;\n \tuint8_t u8[XMM_SIZE / sizeof(uint8_t)];\n \tuint16_t u16[XMM_SIZE / sizeof(uint16_t)];\n \tuint32_t u32[XMM_SIZE / sizeof(uint32_t)];\n@@ -78,10 +82,29 @@ typedef union rte_xmm {\n \tdouble pd[XMM_SIZE / sizeof(double)];\n } rte_xmm_t;\n \n+#ifdef __AVX__\n+\n+typedef __m256i ymm_t;\n+\n+#define\tYMM_SIZE\t(sizeof(ymm_t))\n+#define\tYMM_MASK\t(YMM_SIZE - 1)\n+\n+typedef union rte_ymm {\n+\tymm_t y;\n+\txmm_t x[YMM_SIZE / sizeof(xmm_t)];\n+\tuint8_t u8[YMM_SIZE / sizeof(uint8_t)];\n+\tuint16_t u16[YMM_SIZE / sizeof(uint16_t)];\n+\tuint32_t u32[YMM_SIZE / sizeof(uint32_t)];\n+\tuint64_t u64[YMM_SIZE / sizeof(uint64_t)];\n+\tdouble pd[YMM_SIZE / sizeof(double)];\n+} rte_ymm_t;\n+\n+#endif /* __AVX__ */\n+\n #ifdef RTE_ARCH_I686\n #define _mm_cvtsi128_si64(a) ({ \\\n \trte_xmm_t m; \\\n-\tm.m = (a); \\\n+\tm.x = (a); \\\n \t(m.u64[0]); \\\n })\n #endif\ndiff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h\nindex 62d7736..586300b 100644\n--- a/lib/librte_lpm/rte_lpm.h\n+++ b/lib/librte_lpm/rte_lpm.h\n@@ -420,7 +420,7 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],\n \ttbl[3] = *(const uint16_t *)&lpm->tbl24[idx >> 32];\n \n \t/* get 4 indexes for tbl8[]. */\n-\ti8.m = _mm_and_si128(ip, mask8);\n+\ti8.x = _mm_and_si128(ip, mask8);\n \n \tpt = (uint64_t)tbl[0] |\n \t\t(uint64_t)tbl[1] << 16 |\n", "prefixes": [ "dpdk-dev", "v2", "10/17" ] }{ "id": 2261, "url": "