From patchwork Tue May 5 15:38:27 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: WangDong
X-Patchwork-Id: 4634
From: WangDong
To: dev@dpdk.org
Date: Tue, 5 May 2015 23:38:27 +0800
X-Mailer: git-send-email 1.9.1
Subject: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA processor's rte_wmb/rte_rmb.
List-Id: patches and discussions about DPDK

The current x86 implementation of rte_wmb() and rte_rmb() uses processor
memory barriers (sfence and lfence). This is unnecessary on IA (Intel
Architecture) processors, where the hardware already keeps ordinary
(write-back cached) stores ordered with other stores and loads ordered with
other loads, so a compiler memory barrier is enough. If DPDK runs on an AMD
processor, however, a processor memory barrier may still be needed.

This patch adds a macro to distinguish the two cases. When DPDK is compiled
for an IA processor, defining RTE_ARCH_X86_IA replaces the processor barriers
with compiler barriers, which can improve performance. Alternatively, we
could add RTE_ARCH_X86_AMD to select the processor barriers; in that case, a
build that omits the macro would have no guaranteed memory ordering on AMD.
Which macro is better?

If this patch is applied, the PMDs' old way of obtaining a compiler memory
barrier (declaring some variables volatile) can be replaced with rte_rmb()
and rte_wmb() on any architecture.
---
 lib/librte_eal/common/include/arch/x86/rte_atomic.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
index e93e8ee..52b1e81 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
@@ -49,10 +49,20 @@ extern "C" {
 
 #define rte_mb() _mm_mfence()
 
+#ifdef RTE_ARCH_X86_IA
+
+#define rte_wmb() rte_compiler_barrier()
+
+#define rte_rmb() rte_compiler_barrier()
+
+#else
+
 #define rte_wmb() _mm_sfence()
 
 #define rte_rmb() _mm_lfence()
 
+#endif
+
 /*------------------------- 16 bit atomic operations -------------------------*/
 
 #ifndef RTE_FORCE_INTRINSICS
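
For illustration only, here is a minimal sketch of the write-then-flag ordering that rte_wmb()/rte_rmb() protect in a PMD. It is not part of the patch. Every demo_* name is hypothetical, DEMO_X86_TSO is an assumed stand-in for RTE_ARCH_X86_IA, and the fallback path uses the GCC/Clang __atomic_thread_fence builtin instead of _mm_sfence()/_mm_lfence() so that the sketch also builds on non-x86 hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction, but the
 * compiler may not move memory accesses across it. */
#define demo_compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* DEMO_X86_TSO is a hypothetical switch standing in for RTE_ARCH_X86_IA. */
#ifdef DEMO_X86_TSO
#define demo_wmb() demo_compiler_barrier()
#define demo_rmb() demo_compiler_barrier()
#else
/* Portable stand-in for the hardware fences used by the real header. */
#define demo_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#define demo_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)
#endif

struct demo_mailbox {
	uint32_t payload;        /* data written before the flag */
	volatile uint32_t ready; /* flag polled by the consumer */
};

/* Producer: the payload store must stay ahead of the flag store. */
static inline void demo_publish(struct demo_mailbox *mb, uint32_t value)
{
	mb->payload = value;
	demo_wmb();
	mb->ready = 1;
}

/* Consumer: the flag load must stay ahead of the payload load.
 * Returns 1 and fills *out if a value was published, 0 otherwise. */
static inline int demo_consume(struct demo_mailbox *mb, uint32_t *out)
{
	if (!mb->ready)
		return 0;
	demo_rmb();
	*out = mb->payload;
	return 1;
}

/* Single-threaded round trip that exercises both sides of the mailbox. */
static inline uint32_t demo_roundtrip(uint32_t value)
{
	struct demo_mailbox mb = { 0, 0 };
	uint32_t out = 0;
	int got;

	got = demo_consume(&mb, &out); /* nothing published yet */
	assert(got == 0);
	demo_publish(&mb, value);
	got = demo_consume(&mb, &out);
	assert(got == 1);
	return out;
}
```

Building with -DDEMO_X86_TSO selects the compiler-only barriers, which mirrors what the patch proposes for IA builds. The sketch only shows where the barriers sit; the round trip runs in one thread and does not by itself demonstrate cross-core ordering.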