Message ID | 20220114105217.343139-1-dkozlyuk@nvidia.com (mailing list archive) |
---|---|
State | Accepted, archived |
Delegated to: | Raslan Darawsheh |
Series | common/mlx5: fix MR lookup for non-contiguous mempool |

Context | Check | Description |
---|---|---|
ci/iol-abi-testing | success | Testing PASS |
ci/iol-aarch64-compile-testing | success | Testing PASS |
ci/iol-aarch64-unit-testing | success | Testing PASS |
ci/iol-x86_64-compile-testing | success | Testing PASS |
ci/iol-intel-Functional | success | Functional Testing PASS |
ci/iol-x86_64-unit-testing | success | Testing PASS |
ci/iol-intel-Performance | success | Performance Testing PASS |
ci/iol-broadcom-Performance | success | Performance Testing PASS |
ci/iol-broadcom-Functional | success | Functional Testing PASS |
ci/iol-mellanox-Performance | success | Performance Testing PASS |
ci/intel-Testing | success | Testing PASS |
ci/Intel-compilation | success | Compilation OK |
ci/github-robot: build | success | github build: passed |
ci/checkpatch | success | coding style OK |

Hi,

> -----Original Message-----
> From: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Sent: Friday, January 14, 2022 12:52 PM
> To: dev@dpdk.org
> Cc: stable@dpdk.org; Wang Yunjian <wangyunjian@huawei.com>; Slava
> Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad <matan@nvidia.com>
> Subject: [PATCH] common/mlx5: fix MR lookup for non-contiguous mempool
>
> Memory region (MR) lookup by address inside mempool MRs was not
> accounting for the upper bound of an MR.
> For mempools covered by multiple MRs this could return a wrong MR LKey,
> typically resulting in an unrecoverable TxQ failure:
>
>     mlx5_net: Cannot change Tx QP state to INIT Invalid argument
>
> Corresponding message from /var/log/dpdk_mlx5_port_X_txq_Y_index_Z*:
>
>     Unexpected CQE error syndrome 0x04 CQN = 128 SQN = 4848
>         wqe_counter = 0 wq_ci = 9 cq_ci = 122
>
> This is likely to happen with --legacy-mem and IOVA-as-PA, because EAL
> intentionally maps pages at non-adjacent PA to non-adjacent VA in this
> mode, and MLX5 PMD works with VA.
>
> Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities")
> Cc: stable@dpdk.org
>
> Reported-by: Wang Yunjian <wangyunjian@huawei.com>
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh

diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 1537b5d428..5f7e4f6734 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1834,12 +1834,13 @@ mlx5_mempool_reg_addr2mr(struct mlx5_mempool_reg *mpr, uintptr_t addr,
 
 	for (i = 0; i < mpr->mrs_n; i++) {
 		const struct mlx5_pmd_mr *mr = &mpr->mrs[i].pmd_mr;
-		uintptr_t mr_addr = (uintptr_t)mr->addr;
+		uintptr_t mr_start = (uintptr_t)mr->addr;
+		uintptr_t mr_end = mr_start + mr->len;
 
-		if (mr_addr <= addr) {
+		if (mr_start <= addr && addr < mr_end) {
 			lkey = rte_cpu_to_be_32(mr->lkey);
-			entry->start = mr_addr;
-			entry->end = mr_addr + mr->len;
+			entry->start = mr_start;
+			entry->end = mr_end;
 			entry->lkey = lkey;
 			break;
 		}
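
The effect of the changed condition can be reproduced outside the driver. The following is a minimal standalone sketch, not the PMD code: `struct region`, `LKEY_INVALID`, both lookup functions, and the example addresses are hypothetical stand-ins chosen for illustration. It models a mempool covered by two regions with a gap between them and compares a lookup that checks only the lower bound with one that checks the half-open range `[start, start + len)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a registered MR: a VA range and its key. */
struct region {
	uintptr_t start; /* first address covered by the region */
	size_t len;      /* region length in bytes */
	uint32_t lkey;   /* key associated with the region */
};

/* Hypothetical "not found" marker used only by this sketch. */
#define LKEY_INVALID UINT32_MAX

/* Lookup that checks only the lower bound, like the condition removed by the patch. */
static uint32_t
lookup_lower_bound_only(const struct region *regions, size_t n, uintptr_t addr)
{
	size_t i;

	assert(regions != NULL);
	for (i = 0; i < n; i++) {
		if (regions[i].start <= addr)
			return regions[i].lkey;
	}
	return LKEY_INVALID;
}

/* Lookup that checks the half-open range, like the condition added by the patch. */
static uint32_t
lookup_in_range(const struct region *regions, size_t n, uintptr_t addr)
{
	size_t i;

	assert(regions != NULL);
	for (i = 0; i < n; i++) {
		uintptr_t start = regions[i].start;
		uintptr_t end = start + regions[i].len;

		if (start <= addr && addr < end)
			return regions[i].lkey;
	}
	return LKEY_INVALID;
}

/* Two regions with a gap between them, as with non-adjacent VA ranges. */
static const struct region example_regions[] = {
	{ .start = 0x1000, .len = 0x1000, .lkey = 0xA },
	{ .start = 0x8000, .len = 0x1000, .lkey = 0xB },
};
```

For address `0x8010`, which lies in the second region, `lookup_lower_bound_only()` stops at the first region and returns key `0xA`, while `lookup_in_range()` returns `0xB`. An address in the gap, such as `0x4000`, yields `LKEY_INVALID` from the range-checked lookup instead of a key for a region that does not contain it.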