From patchwork Thu Oct 22 18:50:51 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lance Richardson
X-Patchwork-Id: 81821
X-Patchwork-Delegate: ajit.khaparde@broadcom.com
From: Lance Richardson
To: Ajit Khaparde, Somnath Kotur
Cc: dev@dpdk.org
Date: Thu, 22 Oct 2020 14:50:51 -0400
Message-Id: <20201022185051.183164-1-lance.richardson@broadcom.com>
X-Mailer: git-send-email 2.25.1
Subject: [dpdk-dev] [PATCH] net/bnxt: use shorter SIMD initializers

Make SIMD initialization code less verbose by using appropriate
intrinsics when all lanes of a vector are initialized to the same
value.

Signed-off-by: Lance Richardson
Reviewed-by: Ajit Khaparde
---
 drivers/net/bnxt/bnxt_rxtx_vec_neon.c | 58 +++++++--------------------
 drivers/net/bnxt/bnxt_rxtx_vec_sse.c  | 37 +++++------------
 2 files changed, 23 insertions(+), 72 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c
index f49e29ccb..de1d96570 100644
--- a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c
+++ b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c
@@ -67,40 +67,17 @@ descs_to_mbufs(uint32x4_t mm_rxcmp[4], uint32x4_t mm_rxcmp1[4],
 		0xFF, 0xFF,	/* vlan_tci (zeroes) */
 		12, 13, 14, 15	/* rss hash */
 	};
-	const uint32x4_t flags_type_mask = {
-		RX_PKT_CMPL_FLAGS_ITYPE_MASK,
-		RX_PKT_CMPL_FLAGS_ITYPE_MASK,
-		RX_PKT_CMPL_FLAGS_ITYPE_MASK,
-		RX_PKT_CMPL_FLAGS_ITYPE_MASK
-	};
-	const uint32x4_t flags2_mask1 = {
-		RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-			RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC,
-		RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-			RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC,
-		RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-			RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC,
-		RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-			RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC
-	};
-	const uint32x4_t flags2_mask2 = {
-		RX_PKT_CMPL_FLAGS2_IP_TYPE,
-		RX_PKT_CMPL_FLAGS2_IP_TYPE,
-		RX_PKT_CMPL_FLAGS2_IP_TYPE,
-		RX_PKT_CMPL_FLAGS2_IP_TYPE
-	};
-	const uint32x4_t rss_mask = {
-		RX_PKT_CMPL_FLAGS_RSS_VALID,
-		RX_PKT_CMPL_FLAGS_RSS_VALID,
-		RX_PKT_CMPL_FLAGS_RSS_VALID,
-		RX_PKT_CMPL_FLAGS_RSS_VALID
-	};
-	const uint32x4_t flags2_index_mask = {
-		0x1F, 0x1F, 0x1F, 0x1F
-	};
-	const uint32x4_t flags2_error_mask = {
-		0xF, 0xF, 0xF, 0xF
-	};
+	const uint32x4_t flags_type_mask =
+		vdupq_n_u32(RX_PKT_CMPL_FLAGS_ITYPE_MASK);
+	const uint32x4_t flags2_mask1 =
+		vdupq_n_u32(RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
+			    RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC);
+	const uint32x4_t flags2_mask2 =
+		vdupq_n_u32(RX_PKT_CMPL_FLAGS2_IP_TYPE);
+	const uint32x4_t rss_mask =
+		vdupq_n_u32(RX_PKT_CMPL_FLAGS_RSS_VALID);
+	const uint32x4_t flags2_index_mask = vdupq_n_u32(0x1F);
+	const uint32x4_t flags2_error_mask = vdupq_n_u32(0x0F);
 	uint32x4_t flags_type, flags2, index, errors, rss_flags;
 	uint32x4_t tmp, ptype_idx;
 	uint64x2_t t0, t1;
@@ -180,20 +157,13 @@ bnxt_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 	uint16_t rx_ring_size = rxr->rx_ring_struct->ring_size;
 	struct cmpl_base *cp_desc_ring = cpr->cp_desc_ring;
 	uint64_t valid, desc_valid_mask = ~0UL;
-	const uint32x4_t info3_v_mask = {
-		CMPL_BASE_V, CMPL_BASE_V,
-		CMPL_BASE_V, CMPL_BASE_V
-	};
+	const uint32x4_t info3_v_mask = vdupq_n_u32(CMPL_BASE_V);
 	uint32_t raw_cons = cpr->cp_raw_cons;
 	uint32_t cons, mbcons;
 	int nb_rx_pkts = 0;
 	const uint64x2_t mb_init = {rxq->mbuf_initializer, 0};
-	const uint32x4_t valid_target = {
-		!!(raw_cons & cp_ring_size),
-		!!(raw_cons & cp_ring_size),
-		!!(raw_cons & cp_ring_size),
-		!!(raw_cons & cp_ring_size)
-	};
+	const uint32x4_t valid_target =
+		vdupq_n_u32(!!(raw_cons & cp_ring_size));
 	int i;
 
 	/* If Rx Q was stopped return */
diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_sse.c b/drivers/net/bnxt/bnxt_rxtx_vec_sse.c
index e4ba63551..e12bf8bb7 100644
--- a/drivers/net/bnxt/bnxt_rxtx_vec_sse.c
+++ b/drivers/net/bnxt/bnxt_rxtx_vec_sse.c
@@ -63,29 +63,14 @@ descs_to_mbufs(__m128i mm_rxcmp[4], __m128i mm_rxcmp1[4],
 			     0xFF, 0xFF, 3, 2,        /* pkt_len */
 			     0xFF, 0xFF, 0xFF, 0xFF); /* pkt_type (zeroes) */
 	const __m128i flags_type_mask =
-		_mm_set_epi32(RX_PKT_CMPL_FLAGS_ITYPE_MASK,
-			      RX_PKT_CMPL_FLAGS_ITYPE_MASK,
-			      RX_PKT_CMPL_FLAGS_ITYPE_MASK,
-			      RX_PKT_CMPL_FLAGS_ITYPE_MASK);
+		_mm_set1_epi32(RX_PKT_CMPL_FLAGS_ITYPE_MASK);
 	const __m128i flags2_mask1 =
-		_mm_set_epi32(RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-				RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC,
-			      RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-				RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC,
-			      RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-				RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC,
-			      RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
-				RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC);
+		_mm_set1_epi32(RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN |
+			       RX_PKT_CMPL_FLAGS2_T_IP_CS_CALC);
 	const __m128i flags2_mask2 =
-		_mm_set_epi32(RX_PKT_CMPL_FLAGS2_IP_TYPE,
-			      RX_PKT_CMPL_FLAGS2_IP_TYPE,
-			      RX_PKT_CMPL_FLAGS2_IP_TYPE,
-			      RX_PKT_CMPL_FLAGS2_IP_TYPE);
+		_mm_set1_epi32(RX_PKT_CMPL_FLAGS2_IP_TYPE);
 	const __m128i rss_mask =
-		_mm_set_epi32(RX_PKT_CMPL_FLAGS_RSS_VALID,
-			      RX_PKT_CMPL_FLAGS_RSS_VALID,
-			      RX_PKT_CMPL_FLAGS_RSS_VALID,
-			      RX_PKT_CMPL_FLAGS_RSS_VALID);
+		_mm_set1_epi32(RX_PKT_CMPL_FLAGS_RSS_VALID);
 	__m128i t0, t1, flags_type, flags2, index, errors, rss_flags;
 	__m128i ptype_idx;
 	uint32_t ol_flags;
@@ -114,10 +99,10 @@ descs_to_mbufs(__m128i mm_rxcmp[4], __m128i mm_rxcmp1[4],
 	t1 = _mm_unpackhi_epi32(mm_rxcmp1[2], mm_rxcmp1[3]);
 
 	/* Compute ol_flags and checksum error indexes for four packets. */
-	flags2 = _mm_and_si128(flags2, _mm_set_epi32(0x1F, 0x1F, 0x1F, 0x1F));
+	flags2 = _mm_and_si128(flags2, _mm_set1_epi32(0x1F));
 
 	errors = _mm_srli_epi32(_mm_unpacklo_epi64(t0, t1), 4);
-	errors = _mm_and_si128(errors, _mm_set_epi32(0xF, 0xF, 0xF, 0xF));
+	errors = _mm_and_si128(errors, _mm_set1_epi32(0xF));
 	errors = _mm_and_si128(errors, flags2);
 
 	index = _mm_andnot_si128(errors, flags2);
@@ -165,16 +150,12 @@ bnxt_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 	uint16_t rx_ring_size = rxr->rx_ring_struct->ring_size;
 	struct cmpl_base *cp_desc_ring = cpr->cp_desc_ring;
 	uint64_t valid, desc_valid_mask = ~0ULL;
-	const __m128i info3_v_mask = _mm_set_epi32(CMPL_BASE_V, CMPL_BASE_V,
-						   CMPL_BASE_V, CMPL_BASE_V);
+	const __m128i info3_v_mask = _mm_set1_epi32(CMPL_BASE_V);
 	uint32_t raw_cons = cpr->cp_raw_cons;
 	uint32_t cons, mbcons;
 	int nb_rx_pkts = 0;
 	const __m128i valid_target =
-		_mm_set_epi32(!!(raw_cons & cp_ring_size),
-			      !!(raw_cons & cp_ring_size),
-			      !!(raw_cons & cp_ring_size),
-			      !!(raw_cons & cp_ring_size));
+		_mm_set1_epi32(!!(raw_cons & cp_ring_size));
 	int i;
 
 	/* If Rx Q was stopped return */
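
Illustration only, not part of the patch: a minimal standalone sketch of the equivalence the change relies on, namely that a per-lane initializer with four identical values and a single broadcast intrinsic produce the same 128-bit vector. The helper names splat_verbose, splat_short, splat_forms_match and splat_self_check are hypothetical and do not exist in the bnxt driver; 0x1F and 0xF are the literal mask values that appear in the diff. The sketch assumes a compiler targeting SSE2 or NEON and will not build elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i vec128_t;

/* Verbose form (before the patch): the same value repeated per lane. */
static inline vec128_t splat_verbose(uint32_t v)
{
	return _mm_set_epi32((int)v, (int)v, (int)v, (int)v);
}

/* Short form (after the patch): one broadcast intrinsic. */
static inline vec128_t splat_short(uint32_t v)
{
	return _mm_set1_epi32((int)v);
}
#elif defined(__ARM_NEON)
#include <arm_neon.h>
typedef uint32x4_t vec128_t;

/* Verbose form (before the patch): brace initializer with four lanes. */
static inline vec128_t splat_verbose(uint32_t v)
{
	const uint32x4_t r = { v, v, v, v };
	return r;
}

/* Short form (after the patch): one broadcast intrinsic. */
static inline vec128_t splat_short(uint32_t v)
{
	return vdupq_n_u32(v);
}
#else
#error "This sketch assumes an SSE2 or NEON target."
#endif

/* Returns 1 when both forms yield bit-identical vectors, 0 otherwise. */
static inline int splat_forms_match(uint32_t v)
{
	vec128_t a = splat_verbose(v);
	vec128_t b = splat_short(v);

	return memcmp(&a, &b, sizeof(a)) == 0;
}

/* Hypothetical self-check using the two literal masks from the diff. */
static inline void splat_self_check(void)
{
	assert(splat_forms_match(0x1F));
	assert(splat_forms_match(0xF));
}
```

Calling splat_self_check() from a test harness should pass silently on either architecture, which is consistent with the patch being a readability change rather than a behavioral one.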