From patchwork Tue Oct 27 19:13:48 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Viktorin
X-Patchwork-Id: 8104
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [IPv6:::1])
 by dpdk.org (Postfix) with ESMTP id B71E28E9A;
 Tue, 27 Oct 2015 20:16:33 +0100 (CET)
Received: from wes1-so1.wedos.net (wes1-so1.wedos.net [46.28.106.15])
 by dpdk.org (Postfix) with ESMTP id E6AA38DA4
 for ; Tue, 27 Oct 2015 20:16:16 +0100 (CET)
Received: from pcviktorin.fit.vutbr.cz (pcviktorin.fit.vutbr.cz
 [147.229.13.147])
 by wes1-so1.wedos.net (Postfix) with ESMTPSA id 3nljSJ51SNz59H;
 Tue, 27 Oct 2015 20:16:16 +0100 (CET)
From: Jan Viktorin
To: dev@dpdk.org, David Hunt, David Marchand, "Ananyev, Konstantin"
Date: Tue, 27 Oct 2015 20:13:48 +0100
Message-Id: <1445973229-22058-17-git-send-email-viktorin@rehivetech.com>
X-Mailer: git-send-email 2.6.1
In-Reply-To: <1445973229-22058-1-git-send-email-viktorin@rehivetech.com>
References: <1445877458-31052-1-git-send-email-viktorin@rehivetech.com>
 <1445973229-22058-1-git-send-email-viktorin@rehivetech.com>
Cc: Vlastimil Kosar
Subject: [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4
 using rte_lpm_lookup_bulk for non-x86
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

From: Vlastimil Kosar

LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
the function is reimplemented using non-vector operations for non-x86
architectures. LPM now builds for ARM.
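
For illustration only, the scalar fallback described above can be sketched
outside of DPDK. This is a minimal sketch under stated assumptions, not the
library code: toy_lookup_bulk(), toy_lookupx4() and TOY_LOOKUP_SUCCESS are
hypothetical stand-ins for rte_lpm_lookup_bulk(), rte_lpm_lookupx4() and
RTE_LPM_LOOKUP_SUCCESS, and the "table" is a toy rule on the top byte of the
address rather than a real LPM table.

```c
#include <stdint.h>

/* Hypothetical stand-in for RTE_LPM_LOOKUP_SUCCESS: a flag bit that sits
 * above the 8-bit next hop in each 16-bit result. */
#define TOY_LOOKUP_SUCCESS 0x0100

/* Hypothetical stand-in for rte_lpm_lookup_bulk(). Toy rule (assumption):
 * addresses in 10.0.0.0/8 hit with next hop 7, everything else misses. */
static void
toy_lookup_bulk(const uint32_t *ips, uint16_t *res, unsigned n)
{
	unsigned i;

	for (i = 0; i < n; i++)
		res[i] = ((ips[i] >> 24) == 10) ? (TOY_LOOKUP_SUCCESS | 7) : 0;
}

/* Scalar four-way lookup: run the bulk lookup, then replace each raw
 * result by its 8-bit next hop on a hit, or by defv on a miss. */
static void
toy_lookupx4(const uint32_t ips[4], uint16_t hop[4], uint16_t defv)
{
	unsigned i;

	toy_lookup_bulk(ips, hop, 4);

	for (i = 0; i < 4; i++)
		hop[i] = (hop[i] & TOY_LOOKUP_SUCCESS) ? (uint8_t)hop[i] : defv;
}
```

With the addresses 10.0.0.1, 192.168.0.1, 10.0.0.255 and 0.0.0.0 and a defv
of 0xffff, this sketch leaves hop[] as { 7, 0xffff, 7, 0xffff }.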

Signed-off-by: Vlastimil Kosar
Signed-off-by: Jan Viktorin
---
v2 -> v3:
as SIMD operations have been moved to rte_vect.h, this patch is now quite
clear and just defines the non-x86 version of rte_lpm_lookupx4
---
 config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
 lib/librte_lpm/rte_lpm.h                  | 24 +++++++++++++++++++++---
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
index 5a778cf..a2c8b95 100644
--- a/config/defconfig_arm-armv7-a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -55,7 +55,6 @@ CONFIG_RTE_EAL_IGB_UIO=n
 
 # fails to compile on ARM
 CONFIG_RTE_LIBRTE_ACL=n
-CONFIG_RTE_LIBRTE_LPM=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index c299ce2..c02b355 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -358,9 +358,6 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
 	return 0;
 }
 
-/* Mask four results. */
-#define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
-
 /**
  * Lookup four IP addresses in an LPM table.
  *
@@ -382,6 +379,14 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
  */
 static inline void
 rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+	uint16_t defv);
+
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
+/* Mask four results. */
+#define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
+
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
 	uint16_t defv)
 {
 	__m128i i24;
@@ -472,6 +477,19 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
 	hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
 	hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
 }
+#else
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+	uint16_t defv)
+{
+	rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
+
+	hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
+	hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
+	hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
+	hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
+}
+#endif
 
 #ifdef __cplusplus
 }