[dpdk-dev] net/mlx5: fix Memory Region lookup

Message ID 20180119075255.2542-1-yskoh@mellanox.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Yongseok Koh Jan. 19, 2018, 7:52 a.m. UTC
  This patch reverts:
	commit 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")

Although the granularity of chunks in a mempool is a cacheline, addresses are
extended to align to a page boundary, for performance reasons in the device,
when registering an MR (Memory Region). This can make some regions overlap,
which in turn can cause a Tx completion error due to an incorrect LKEY search.
If the error occurs, the Tx queue gets stuck. This happens because the buffer
address is compared against the page-aligned addresses of the Memory Region.
Saving the original addresses of the mempool for comparison does not create
any overlap.

Fixes: b0b093845793 ("net/mlx5: use buffer address for LKEY search")
Fixes: 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
Cc: stable@dpdk.org

Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5_mr.c   | 5 +++--
 drivers/net/mlx5/mlx5_rxtx.h | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)
  

Comments

Nélio Laranjeiro Jan. 19, 2018, 8:37 a.m. UTC | #1
On Thu, Jan 18, 2018 at 11:52:55PM -0800, Yongseok Koh wrote:
> This patch reverts:
> 	commit 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
> 
> Although granularity of chunks in a mempool is a cacheline, addresses are
> extended to align to page boundary for performance reason in device when
> registering a MR (Memory Region). This could make some regions overlap,
> then can cause Tx completion error due to incorrect LKEY search. If the
> error occurs, the Tx queue will get stuck. It is because buffer address is
> compared against aligned addresses for Memory Region. Saving original
> addresses of mempool for comparison doesn't create any overlap.
> 
> Fixes: b0b093845793 ("net/mlx5: use buffer address for LKEY search")
> Fixes: 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
> Cc: stable@dpdk.org
> 
> Reported-by: Xueming Li <xuemingl@mellanox.com>
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
  
Shahaf Shuler Jan. 21, 2018, 6:37 a.m. UTC | #2
Friday, January 19, 2018 10:37 AM, Nélio Laranjeiro:
> On Thu, Jan 18, 2018 at 11:52:55PM -0800, Yongseok Koh wrote:
> > This patch reverts:
> > 	commit 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
> >
> > Although granularity of chunks in a mempool is a cacheline, addresses
> > are extended to align to page boundary for performance reason in
> > device when registering a MR (Memory Region). This could make some
> > regions overlap, then can cause Tx completion error due to incorrect
> > LKEY search. If the error occurs, the Tx queue will get stuck. It is
> > because buffer address is compared against aligned addresses for
> > Memory Region. Saving original addresses of mempool for comparison
> > doesn't create any overlap.
> >
> > Fixes: b0b093845793 ("net/mlx5: use buffer address for LKEY search")
> > Fixes: 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
> > Cc: stable@dpdk.org
> >
> > Reported-by: Xueming Li <xuemingl@mellanox.com>
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

Applied to next-net-mlx, thanks. 

> 
> --
> Nélio Laranjeiro
> 6WIND
  

Patch

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 6b29eed55..2776dc700 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -291,6 +291,9 @@  priv_mr_new(struct priv *priv, struct rte_mempool *mp)
 	DEBUG("mempool %p area start=%p end=%p size=%zu",
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
+	/* Save original addresses for exact MR lookup. */
+	mr->start = start;
+	mr->end = end;
 	/* Round start and end to page boundary if found in memory segments. */
 	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
 		uintptr_t addr = (uintptr_t)ms[i].addr;
@@ -309,8 +312,6 @@  priv_mr_new(struct priv *priv, struct rte_mempool *mp)
 			    IBV_ACCESS_LOCAL_WRITE);
 	mr->mp = mp;
 	mr->lkey = rte_cpu_to_be_32(mr->mr->lkey);
-	mr->start = start;
-	mr->end = (uintptr_t)mr->mr->addr + mr->mr->length;
 	rte_atomic32_inc(&mr->refcnt);
 	DEBUG("%p: new Memory Region %p refcnt: %d", (void *)priv,
 	      (void *)mr, rte_atomic32_read(&mr->refcnt));
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index a239642ac..2eb2f0506 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -548,7 +548,7 @@  static __rte_always_inline uint32_t
 mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct rte_mbuf *mb)
 {
 	uint16_t i = txq->mr_cache_idx;
-	uintptr_t addr = rte_pktmbuf_mtod_offset(mb, uintptr_t, DATA_LEN(mb));
+	uintptr_t addr = rte_pktmbuf_mtod(mb, uintptr_t);
 	struct mlx5_mr *mr;
 
 	assert(i < RTE_DIM(txq->mp2mr));