From patchwork Fri Apr 10 16:41:21 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gavin Hu
X-Patchwork-Id: 68160
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id BB67EA0598;
 Fri, 10 Apr 2020 18:41:51 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 6CF371D5D0;
 Fri, 10 Apr 2020 18:41:48 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by dpdk.org (Postfix) with ESMTP id DE2F41D5CF
 for ; Fri, 10 Apr 2020 18:41:46 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5755930E;
 Fri, 10 Apr 2020 09:41:46 -0700 (PDT)
Received: from net-arm-thunderx2-01.shanghai.arm.com
 (net-arm-thunderx2-01.shanghai.arm.com [10.169.41.214])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 26E633F52E;
 Fri, 10 Apr 2020 09:41:41 -0700 (PDT)
From: Gavin Hu
To: dev@dpdk.org
Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net,
 rasland@mellanox.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, matan@mellanox.com, shahafs@mellanox.com,
 viacheslavo@mellanox.com, jerinj@marvell.com, Honnappa.Nagarahalli@arm.com,
 ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com,
 steve.capper@arm.com
Date: Sat, 11 Apr 2020 00:41:21 +0800
Message-Id: <20200410164127.54229-2-gavin.hu@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200410164127.54229-1-gavin.hu@arm.com>
References: <20200410164127.54229-1-gavin.hu@arm.com>
In-Reply-To: <20200213123854.203566-1-gavin.hu@arm.com>
References: <20200213123854.203566-1-gavin.hu@arm.com>
Subject: [dpdk-dev] [PATCH RFC v2 1/7] eal: introduce new class of barriers for DMA use cases
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

In DPDK we use rte_*mb barriers to ensure that memory accesses to DMA
regions are observed before MMIO accesses to hardware registers.

On AArch64, the rte_*mb barriers are implemented by "DSB" (Data
Synchronisation Barrier) style instructions, which are the strongest
barriers possible.

Recently, however, it has been realised [1] that, for devices where the
MMIO regions are shared between all CPUs, it is possible to relax this
memory barrier.

There are cases where we wish to retain the strength of the rte_*mb
memory barriers; thus, rather than relax rte_*mb, we opt instead to
introduce a new class of barrier, rte_dma_*mb.

For AArch64, rte_dma_*mb will be implemented by a relaxed "DMB OSH"
style of barrier.

For other architectures, we implement rte_dma_*mb as rte_*mb, so this
should not result in any functional changes.
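For illustration only (this sketch is not part of the patch), the usage
pattern described above might look roughly as follows. The names
demo_dma_wmb(), struct demo_desc and demo_post() are hypothetical, and
the non-AArch64 fallback is just a stand-in release fence, not the
actual definition of rte_wmb() in DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_dma_wmb(); NOT the DPDK definition. */
#if defined(__aarch64__)
/* Store barrier limited to the outer shareable domain. */
#define demo_dma_wmb() __asm__ volatile("dmb oshst" : : : "memory")
#else
/* Assumption: a release fence stands in for rte_wmb() elsewhere. */
#define demo_dma_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

/* Hypothetical descriptor living in DMA-visible host memory. */
struct demo_desc {
	uint64_t addr;
	uint32_t len;
	uint32_t flags;
};

/*
 * Fill a descriptor, then ring the doorbell. The barrier orders the
 * descriptor stores before the doorbell store, so a device that reacts
 * to the doorbell should not observe a half-written descriptor.
 */
static inline void
demo_post(volatile struct demo_desc *desc, volatile uint32_t *doorbell,
	  uint64_t addr, uint32_t len, uint32_t index)
{
	assert(desc != NULL && doorbell != NULL);
	desc->addr = addr;
	desc->len = len;
	desc->flags = 1;	/* mark the descriptor as valid */
	demo_dma_wmb();		/* descriptor stores before doorbell store */
	*doorbell = index;
}
```

In a real driver the doorbell would be an MMIO register and the
descriptor would sit in a ring shared with the device; plain variables
are used here only so that the sketch compiles on its own.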
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=22ec71615d824f4f11d38d0e55a88d8956b7e45f

Signed-off-by: Gavin Hu
Reviewed-by: Steve Capper
---
 lib/librte_eal/arm/include/rte_atomic_32.h  |  6 ++++
 lib/librte_eal/arm/include/rte_atomic_64.h  |  6 ++++
 lib/librte_eal/include/generic/rte_atomic.h | 31 +++++++++++++++++++++
 lib/librte_eal/ppc/include/rte_atomic.h     |  6 ++++
 lib/librte_eal/x86/include/rte_atomic.h     |  6 ++++
 5 files changed, 55 insertions(+)

diff --git a/lib/librte_eal/arm/include/rte_atomic_32.h b/lib/librte_eal/arm/include/rte_atomic_32.h
index 7dc0d06d1..80208467e 100644
--- a/lib/librte_eal/arm/include/rte_atomic_32.h
+++ b/lib/librte_eal/arm/include/rte_atomic_32.h
@@ -33,6 +33,12 @@ extern "C" {
 
 #define rte_io_rmb() rte_rmb()
 
+#define rte_dma_mb() rte_mb()
+
+#define rte_dma_wmb() rte_wmb()
+
+#define rte_dma_rmb() rte_rmb()
+
 #define rte_cio_wmb() rte_wmb()
 
 #define rte_cio_rmb() rte_rmb()
diff --git a/lib/librte_eal/arm/include/rte_atomic_64.h b/lib/librte_eal/arm/include/rte_atomic_64.h
index 7b7099cdc..608726c29 100644
--- a/lib/librte_eal/arm/include/rte_atomic_64.h
+++ b/lib/librte_eal/arm/include/rte_atomic_64.h
@@ -37,6 +37,12 @@ extern "C" {
 
 #define rte_io_rmb() rte_rmb()
 
+#define rte_dma_mb() asm volatile("dmb osh" : : : "memory")
+
+#define rte_dma_wmb() asm volatile("dmb oshst" : : : "memory")
+
+#define rte_dma_rmb() asm volatile("dmb oshld" : : : "memory")
+
 #define rte_cio_wmb() asm volatile("dmb oshst" : : : "memory")
 
 #define rte_cio_rmb() asm volatile("dmb oshld" : : : "memory")
diff --git a/lib/librte_eal/include/generic/rte_atomic.h b/lib/librte_eal/include/generic/rte_atomic.h
index e6ab15a97..042264c7e 100644
--- a/lib/librte_eal/include/generic/rte_atomic.h
+++ b/lib/librte_eal/include/generic/rte_atomic.h
@@ -107,6 +107,37 @@ static inline void rte_io_wmb(void);
 static inline void rte_io_rmb(void);
 ///@}
 
+/** @name DMA Memory Barrier
+ */
+///@{
+/**
+ * memory barrier for DMA use cases
+ *
+ * Guarantees that the LOAD and STORE operations that precede the rte_dma_mb()
+ * call are visible to CPU and I/O device that is shared between all CPUs
+ * before the LOAD and STORE operations that follow it.
+ */
+static inline void rte_dma_mb(void);
+
+/**
+ * Write memory barrier for DMA use cases
+ *
+ * Guarantees that the STORE operations that precede the rte_dma_wmb() call are
+ * visible to CPU and I/O device that is shared between all CPUs before the
+ * STORE operations that follow it.
+ */
+static inline void rte_dma_wmb(void);
+
+/**
+ * Read memory barrier for DMA use cases
+ *
+ * Guarantees that the LOAD operations that precede the rte_dma_rmb() call are
+ * visible to CPU and IO device that is shared between all CPUs before the LOAD
+ * operations that follow it.
+ */
+static inline void rte_dma_rmb(void);
+///@}
+
 /** @name Coherent I/O Memory Barrier
  *
  * Coherent I/O memory barrier is a lightweight version of I/O memory
diff --git a/lib/librte_eal/ppc/include/rte_atomic.h b/lib/librte_eal/ppc/include/rte_atomic.h
index 7e3e13118..faa36bb76 100644
--- a/lib/librte_eal/ppc/include/rte_atomic.h
+++ b/lib/librte_eal/ppc/include/rte_atomic.h
@@ -36,6 +36,12 @@ extern "C" {
 
 #define rte_io_rmb() rte_rmb()
 
+#define rte_dma_mb() rte_mb()
+
+#define rte_dma_wmb() rte_wmb()
+
+#define rte_dma_rmb() rte_rmb()
+
 #define rte_cio_wmb() rte_wmb()
 
 #define rte_cio_rmb() rte_rmb()
diff --git a/lib/librte_eal/x86/include/rte_atomic.h b/lib/librte_eal/x86/include/rte_atomic.h
index 148398f50..0b1d452f3 100644
--- a/lib/librte_eal/x86/include/rte_atomic.h
+++ b/lib/librte_eal/x86/include/rte_atomic.h
@@ -79,6 +79,12 @@ rte_smp_mb(void)
 
 #define rte_io_rmb() rte_compiler_barrier()
 
+#define rte_dma_mb() rte_mb()
+
+#define rte_dma_wmb() rte_wmb()
+
+#define rte_dma_rmb() rte_rmb()
+
 #define rte_cio_wmb() rte_compiler_barrier()
 
 #define rte_cio_rmb() rte_compiler_barrier()

From patchwork Fri Apr 10 16:41:22 2020
X-Patchwork-Id: 68161
From: Gavin Hu
To: dev@dpdk.org
Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net,
 rasland@mellanox.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, matan@mellanox.com, shahafs@mellanox.com,
 viacheslavo@mellanox.com, jerinj@marvell.com, Honnappa.Nagarahalli@arm.com,
 ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com,
 steve.capper@arm.com
Date: Sat, 11 Apr 2020 00:41:22 +0800
Message-Id: <20200410164127.54229-3-gavin.hu@arm.com>
In-Reply-To: <20200410164127.54229-1-gavin.hu@arm.com>
References: <20200410164127.54229-1-gavin.hu@arm.com>
In-Reply-To: <20200213123854.203566-1-gavin.hu@arm.com>
References: <20200213123854.203566-1-gavin.hu@arm.com>
Subject: [dpdk-dev] [PATCH RFC v2 2/7] net/mlx5: dmb for immediate doorbell ring on aarch64

A 'DMB' is enough to evict the merge buffer on aarch64 when the
doorbell register is mapped as 'Normal-NC', the counterpart of WC on
x86. Otherwise, the register is mapped as Device memory and no barriers
are required at all.

Signed-off-by: Gavin Hu
---
 drivers/net/mlx5/mlx5_rxtx.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 939778aa5..e509f3b88 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -661,7 +661,7 @@ mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe,
 	rte_wmb();
 	mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
 	if (cond)
-		rte_wmb();
+		rte_dma_wmb();
 }
 
 /**

From patchwork Fri Apr 10 16:41:23 2020
X-Patchwork-Id: 68162
From: Gavin Hu
To:
 dev@dpdk.org
Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net,
 rasland@mellanox.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, matan@mellanox.com, shahafs@mellanox.com,
 viacheslavo@mellanox.com, jerinj@marvell.com, Honnappa.Nagarahalli@arm.com,
 ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com,
 steve.capper@arm.com, stable@dpdk.org
Date: Sat, 11 Apr 2020 00:41:23 +0800
Message-Id: <20200410164127.54229-4-gavin.hu@arm.com>
In-Reply-To: <20200410164127.54229-1-gavin.hu@arm.com>
References: <20200410164127.54229-1-gavin.hu@arm.com>
In-Reply-To: <20200213123854.203566-1-gavin.hu@arm.com>
References: <20200213123854.203566-1-gavin.hu@arm.com>
Subject: [dpdk-dev] [PATCH RFC v2 3/7] net/mlx5: relax barrier to order UAR writes on aarch64

To order the writes to host memory and to the MMIO device memory, a
'DMB' is sufficient on aarch64, as it is an 'other-multi-copy atomic'
architecture. A 'DSB' is overkill, especially in the fast path. Using
rte_dma_wmb takes advantage of this on aarch64 without impacting x86
and ppc.

Fixes: 6bf10ab69be0 ("net/mlx5: support 32-bit systems")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu
Reviewed-by: Phil Yang
---
 drivers/net/mlx5/mlx5_rxtx.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e509f3b88..da5d81350 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -546,7 +546,7 @@ __mlx5_uar_write64_relaxed(uint64_t val, void *addr,
 static __rte_always_inline void
 __mlx5_uar_write64(uint64_t val, void *addr, rte_spinlock_t *lock)
 {
-	rte_io_wmb();
+	rte_dma_wmb();
 	__mlx5_uar_write64_relaxed(val, addr, lock);
 }
 

From patchwork Fri Apr 10 16:41:24 2020
X-Patchwork-Id: 68163
From: Gavin Hu
To: dev@dpdk.org
Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net,
 rasland@mellanox.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com,
 matan@mellanox.com, shahafs@mellanox.com, viacheslavo@mellanox.com,
 jerinj@marvell.com, Honnappa.Nagarahalli@arm.com, ruifeng.wang@arm.com,
 phil.yang@arm.com, joyce.kong@arm.com, steve.capper@arm.com,
 stable@dpdk.org
Date: Sat, 11 Apr 2020 00:41:24 +0800
Message-Id: <20200410164127.54229-5-gavin.hu@arm.com>
In-Reply-To: <20200410164127.54229-1-gavin.hu@arm.com>
References: <20200410164127.54229-1-gavin.hu@arm.com>
In-Reply-To: <20200213123854.203566-1-gavin.hu@arm.com>
References: <20200213123854.203566-1-gavin.hu@arm.com>
Subject: [dpdk-dev] [PATCH RFC v2 4/7] net/mlx5: relax barrier for aarch64

To ensure that the WQE and the doorbell record, which reside in host
memory, are visible to HW before the blue frame, an ordered
mlx5_uar_write call is sufficient; a rte_wmb is overkill for aarch64.

Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu
Reviewed-by: Phil Yang
---
 drivers/net/mlx5/mlx5_rxtx.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index da5d81350..228e37de5 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -658,8 +658,7 @@ mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe,
 	rte_cio_wmb();
 	*txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
 	/* Ensure ordering between DB record and BF copy. */
-	rte_wmb();
-	mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
+	mlx5_uar_write64(*src, dst, txq->uar_lock);
 	if (cond)
 		rte_dma_wmb();
 }

From patchwork Fri Apr 10 16:41:25 2020
X-Patchwork-Id: 68164
From: Gavin Hu
To: dev@dpdk.org
Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net,
 rasland@mellanox.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, matan@mellanox.com, shahafs@mellanox.com,
 viacheslavo@mellanox.com, jerinj@marvell.com, Honnappa.Nagarahalli@arm.com,
 ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com,
 steve.capper@arm.com
Date: Sat, 11 Apr 2020 00:41:25 +0800
Message-Id: <20200410164127.54229-6-gavin.hu@arm.com>
In-Reply-To: <20200410164127.54229-1-gavin.hu@arm.com>
References: <20200410164127.54229-1-gavin.hu@arm.com>
In-Reply-To: <20200213123854.203566-1-gavin.hu@arm.com>
References: <20200213123854.203566-1-gavin.hu@arm.com>
Subject: [dpdk-dev] [PATCH RFC v2 5/7] net/mlx5: add descriptive comment for a barrier

The barrier would not be required, or could be moved down, if HW waited
for the doorbell ring before executing the WQE. This is not the case:
HW can start executing the WQE as soon as it gets the ownership (passed
by SW writing the doorbell record). Add a descriptive comment for this
HW-specific behavior.

Signed-off-by: Gavin Hu
Reviewed-by: Phil Yang
---
 drivers/net/mlx5/mlx5_rxtx.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 228e37de5..737d5716d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -655,6 +655,11 @@ mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe *wqe,
 	uint64_t *dst = MLX5_TX_BFREG(txq);
 	volatile uint64_t *src = ((volatile uint64_t *)wqe);
 
+	/* The ownership of WQE is passed to HW by updating the doorbell
+	 * record. Once WQE ownership has been passed to the HCA, HW can
+	 * execute the WQE. The barrier is to ensure the WQE are visible
+	 * to HW before HW execute it.
+	 */
 	rte_cio_wmb();
 	*txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
 	/* Ensure ordering between DB record and BF copy. */

From patchwork Fri Apr 10 16:41:26 2020
X-Patchwork-Id: 68165
From: Gavin Hu
To: dev@dpdk.org
Cc: nd@arm.com, Phil Yang, david.marchand@redhat.com, thomas@monjalon.net,
 rasland@mellanox.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, matan@mellanox.com, shahafs@mellanox.com,
 viacheslavo@mellanox.com, jerinj@marvell.com, Honnappa.Nagarahalli@arm.com,
 ruifeng.wang@arm.com, joyce.kong@arm.com, steve.capper@arm.com,
 stable@dpdk.org
Date: Sat, 11 Apr 2020 00:41:26 +0800
Message-Id: <20200410164127.54229-7-gavin.hu@arm.com>
In-Reply-To: <20200410164127.54229-1-gavin.hu@arm.com>
References: <20200410164127.54229-1-gavin.hu@arm.com>
In-Reply-To: <20200213123854.203566-1-gavin.hu@arm.com>
References: <20200213123854.203566-1-gavin.hu@arm.com>
Subject: [dpdk-dev] [PATCH RFC v2 6/7] net/mlx5: relax ordering for multi-packet RQ buffer
 refcnt

From: Phil Yang

The PMD Rx queue descriptor contains two mlx5_mprq_buf fields, which
are the multi-packet RQ buffer header pointers. It uses the common
rte_atomic_XXX functions to make sure that accesses to refcnt are
atomic.

The common rte_atomic_XXX functions are full barriers on aarch64.
Replace them with one-way barriers to improve performance.

Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org

Suggested-by: Gavin Hu
Signed-off-by: Phil Yang
Reviewed-by: Gavin Hu
---
 drivers/net/mlx5/mlx5_rxq.c  |  2 +-
 drivers/net/mlx5/mlx5_rxtx.c | 16 +++++++++-------
 drivers/net/mlx5/mlx5_rxtx.h |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 8a6b410ef..834057c3b 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1539,7 +1539,7 @@ mlx5_mprq_buf_init(struct rte_mempool *mp, void *opaque_arg,
 
 	memset(_m, 0, sizeof(*buf));
 	buf->mp = mp;
-	rte_atomic16_set(&buf->refcnt, 1);
+	__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
 	for (j = 0; j != strd_n; ++j) {
 		shinfo = &buf->shinfos[j];
 		shinfo->free_cb = mlx5_mprq_buf_free_cb;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f3bf76376..039dd0a05 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1592,10 +1592,11 @@ mlx5_mprq_buf_free_cb(void *addr __rte_unused, void *opaque)
 {
 	struct mlx5_mprq_buf *buf = opaque;
 
-	if (rte_atomic16_read(&buf->refcnt) == 1) {
+	if (__atomic_load_n(&buf->refcnt, __ATOMIC_RELAXED) == 1) {
 		rte_mempool_put(buf->mp, buf);
-	} else if (rte_atomic16_add_return(&buf->refcnt, -1) == 0) {
-		rte_atomic16_set(&buf->refcnt, 1);
+	} else if (unlikely(__atomic_sub_fetch(&buf->refcnt, 1,
+					       __ATOMIC_RELAXED) == 0)) {
+		__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
 		rte_mempool_put(buf->mp, buf);
 	}
 }
@@ -1676,7 +1677,8 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 
 		if (consumed_strd == strd_n) {
 			/* Replace WQE only if the buffer is still in use. */
-			if (rte_atomic16_read(&buf->refcnt) > 1) {
+			if (__atomic_load_n(&buf->refcnt,
+					    __ATOMIC_RELAXED) > 1) {
 				mprq_buf_replace(rxq, rq_ci & wq_mask, strd_n);
 				/* Release the old buffer. */
 				mlx5_mprq_buf_free(buf);
@@ -1766,9 +1768,9 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			void *buf_addr;
 
 			/* Increment the refcnt of the whole chunk. */
-			rte_atomic16_add_return(&buf->refcnt, 1);
-			MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <=
-				    strd_n + 1);
+			__atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_ACQUIRE);
+			MLX5_ASSERT(__atomic_load_n(&buf->refcnt,
+				    __ATOMIC_RELAXED) <= strd_n + 1);
 			buf_addr = RTE_PTR_SUB(addr, headroom_sz);
 			/*
 			 * MLX5 device doesn't use iova but it is necessary in a
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 737d5716d..d0a1bffa5 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -78,7 +78,7 @@ struct rxq_zip {
 /* Multi-Packet RQ buffer header. */
 struct mlx5_mprq_buf {
 	struct rte_mempool *mp;
-	rte_atomic16_t refcnt; /* Atomically accessed refcnt. */
+	uint16_t refcnt; /* Atomically accessed refcnt. */
 	uint8_t pad[RTE_PKTMBUF_HEADROOM]; /* Headroom for the first packet. */
 	struct rte_mbuf_ext_shared_info shinfos[];
 	/*

From patchwork Fri Apr 10 16:41:27 2020
X-Patchwork-Id: 68166
From: Gavin Hu
To: dev@dpdk.org
Cc: nd@arm.com, david.marchand@redhat.com, thomas@monjalon.net,
 rasland@mellanox.com, drc@linux.vnet.ibm.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, matan@mellanox.com, shahafs@mellanox.com,
 viacheslavo@mellanox.com, jerinj@marvell.com, Honnappa.Nagarahalli@arm.com,
 ruifeng.wang@arm.com, phil.yang@arm.com, joyce.kong@arm.com,
 steve.capper@arm.com
Date: Sat, 11 Apr 2020 00:41:27 +0800
Message-Id: <20200410164127.54229-8-gavin.hu@arm.com>
In-Reply-To: <20200410164127.54229-1-gavin.hu@arm.com>
References: <20200410164127.54229-1-gavin.hu@arm.com>
In-Reply-To: <20200213123854.203566-1-gavin.hu@arm.com>
References: <20200213123854.203566-1-gavin.hu@arm.com>
Subject: [dpdk-dev] [PATCH RFC v2 7/7] doc:
 clarify one configuration in mlx5 guide

The 'tx_db_nc' parameter is used to differentiate two mapping types,
WC and non-WC, both of which are actually non-cacheable.

Write-Combining memory on x86 is non-cacheable. Normal-NC, its
counterpart on aarch64, is non-cacheable too, as its name suggests: the
cache hierarchy is bypassed for accesses to both kinds of memory
region.

Interpreting 'nc' as 'non-cacheable' therefore makes no distinction and
is confusing. Re-interpreting it as 'non-combining' makes the
distinction clear.

Also explain why the aarch64 default setting for this parameter is
different.

Signed-off-by: Gavin Hu
---
 doc/guides/nics/mlx5.rst | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index afd11cd83..addec9f12 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -610,9 +610,9 @@ Run-time configuration
   The rdma core library can map doorbell register in two ways, depending on the
   environment variable "MLX5_SHUT_UP_BF":
 
-  - As regular cached memory (usually with write combining attribute), if the
+  - As regular memory (usually with write combining attribute), if the
     variable is either missing or set to zero.
-  - As non-cached memory, if the variable is present and set to not "0" value.
+  - As non-combining memory, if the variable is present and set to not "0" value.
 
   The type of mapping may slightly affect the Tx performance, the optimal choice
   is strongly relied on the host architecture and should be deduced practically.
@@ -638,6 +638,8 @@ Run-time configuration
   If ``tx_db_nc`` is omitted or set to zero, the preset (if any) environment
   variable "MLX5_SHUT_UP_BF" value is used. If there is no "MLX5_SHUT_UP_BF",
   the default ``tx_db_nc`` value is zero for ARM64 hosts and one for others.
+  ARM64 is different because it has a gathering buffer, the enforced barrier
+  can evict the doorbell ring immediately.
 
 - ``tx_vec_en`` parameter [int]