From patchwork Thu Jul 7 18:34:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Mattias_R=C3=B6nnblom?=
X-Patchwork-Id: 113791
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 45A5FA0543;
 Thu, 7 Jul 2022 20:35:17 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id 097DD42836;
 Thu, 7 Jul 2022 20:35:12 +0200 (CEST)
Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3])
 by mails.dpdk.org (Postfix) with ESMTP id AC21840A7B;
 Thu, 7 Jul 2022 20:35:10 +0200 (CEST)
Received: from mail.lysator.liu.se (localhost [127.0.0.1])
 by mail.lysator.liu.se (Postfix) with ESMTP id 90CC4113BA;
 Thu, 7 Jul 2022 20:35:10 +0200 (CEST)
Received: by mail.lysator.liu.se (Postfix, from userid 1004)
 id 8F855112BF; Thu, 7 Jul 2022 20:35:10 +0200 (CEST)
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 hermod.lysator.liu.se
X-Spam-Level:
X-Spam-Status: No, score=-1.8 required=5.0 tests=ALL_TRUSTED,AWL,
 T_SCC_BODY_TEXT_LINE autolearn=disabled version=3.4.6
X-Spam-Score: -1.8
Received: from isengard.friendlyfire.se (unknown [62.63.215.114])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits)
 server-digest SHA256) (No client certificate requested)
 by mail.lysator.liu.se (Postfix) with ESMTPSA id 3C3F811345;
 Thu, 7 Jul 2022 20:35:09 +0200 (CEST)
From: =?utf-8?q?Mattias_R=C3=B6nnblom?=
To: olivier.matz@6wind.com
Cc: Emil Berg, bruce.richardson@intel.com, stephen@networkplumber.org,
 stable@dpdk.org, bugzilla@dpdk.org, dev@dpdk.org,
 onar.olsen@ericsson.com, =?utf-8?q?Morten_Br=C3=B8rup?=,
 =?utf-8?q?Mattia?= =?utf-8?q?s_R=C3=B6nnblom?=
Subject: [PATCH 2/2] net: have
 checksum routines accept unaligned data
Date: Thu, 7 Jul 2022 20:34:50 +0200
Message-Id: <20220707183450.3203361-2-hofors@lysator.liu.se>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220707183450.3203361-1-hofors@lysator.liu.se>
References: <98CBD80474FA8B44BF855DF32C47DC35D87189@smartserver.smartshare.dk>
 <20220707183450.3203361-1-hofors@lysator.liu.se>
MIME-Version: 1.0
X-Virus-Scanned: ClamAV using ClamSMTP
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

From: Mattias Rönnblom

__rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
data through a uint16_t pointer, which allowed the compiler to assume
the data was 16-bit aligned. With certain architectures and compiler
flag combinations, this in turn resulted in code using SIMD load or
store instructions that place restrictions on data alignment.

This patch keeps the old algorithm, but data is read using memcpy()
instead of direct pointer access, forcing the compiler to always
generate code that handles unaligned input. The __may_alias__ GCC
attribute is no longer needed.

The data on which the Internet checksum functions operate is almost
always 16-bit aligned, but there are exceptions. In particular, the
PDCP protocol header may (literally) have an odd size.

Performance impact seems to range from none to a very slight
regression.

Bugzilla ID: 1035
Cc: stable@dpdk.org

Signed-off-by: Mattias Rönnblom
Reviewed-by: Morten Brørup
---
 lib/net/rte_ip.h | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
index b502481670..a9e6251f14 100644
--- a/lib/net/rte_ip.h
+++ b/lib/net/rte_ip.h
@@ -160,18 +160,23 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
 static inline uint32_t
 __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
 {
-	/* extend strict-aliasing rules */
-	typedef uint16_t __attribute__((__may_alias__)) u16_p;
-	const u16_p *u16_buf = (const u16_p *)buf;
-	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
+	const void *end;
 
-	for (; u16_buf != end; ++u16_buf)
-		sum += *u16_buf;
+	for (end = RTE_PTR_ADD(buf, (len/sizeof(uint16_t)) * sizeof(uint16_t));
+	     buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
+		uint16_t v;
+
+		memcpy(&v, buf, sizeof(uint16_t));
+		sum += v;
+	}
 
 	/* if length is odd, keeping it byte order independent */
 	if (unlikely(len % 2)) {
+		uint8_t last;
 		uint16_t left = 0;
-		*(unsigned char *)&left = *(const unsigned char *)end;
+
+		memcpy(&last, end, 1);
+		*(unsigned char *)&left = last;
 		sum += left;
 	}
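
For readers who want to try the memcpy()-based access pattern outside of
DPDK, the following is a minimal standalone sketch. It is not the DPDK
code: raw_cksum_sketch() and raw_cksum_sketch_selfcheck() are
hypothetical names, and RTE_PTR_ADD() and unlikely() are replaced with
plain byte-pointer arithmetic and an ordinary branch. The self-check
only asserts that the same bytes produce the same sum at two different
buffer offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for __rte_raw_cksum(): sums the buffer as
 * native-endian 16-bit words without assuming any alignment.
 */
static uint32_t
raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	/* end of the last complete 16-bit word */
	const unsigned char *end =
		p + (len / sizeof(uint16_t)) * sizeof(uint16_t);

	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		/* memcpy() makes no alignment assumption about p */
		memcpy(&v, p, sizeof(v));
		sum += v;
	}

	/* odd length: the trailing byte becomes the first byte of a word */
	if (len % 2) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}

	return sum;
}

/* The same bytes must yield the same sum regardless of buffer offset. */
static void
raw_cksum_sketch_selfcheck(void)
{
	static const unsigned char data[] = {
		0x45, 0x00, 0x00, 0x1c, 0xab, 0xcd, 0x7f
	};
	unsigned char shifted[sizeof(data) + 1] = { 0 };

	/* place a copy one byte further in, changing its alignment */
	memcpy(shifted + 1, data, sizeof(data));

	assert(raw_cksum_sketch(data, sizeof(data), 0) ==
	       raw_cksum_sketch(shifted + 1, sizeof(data), 0));
}
```

Calling raw_cksum_sketch_selfcheck() from a test harness exercises both
the word loop and the odd-length tail. Compilers typically lower a
fixed-size two-byte memcpy() to a plain load where the target allows
unaligned access, which would be consistent with the small performance
impact reported above, though that is an expectation and not something
measured here.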