[v3,07/17] net/mlx5: create Tx queues with DevX

Message ID 1594887800-6563-8-git-send-email-viacheslavo@mellanox.com (mailing list archive)
State Accepted, archived
Delegated to: Raslan Darawsheh
Series net/mlx5: introduce accurate packet Tx scheduling

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Slava Ovsiienko July 16, 2020, 8:23 a.m. UTC
To schedule packet sending on the mbuf timestamp, the Tx queue
must be attached to the same UAR as the Clock Queue.
The UAR is a special hardware-related resource that is mapped to
host memory and provides the doorbell registers; assigning a UAR
to a queue being created is possible via the DevX API only.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxtx.c    | 108 +++++++++-----
 drivers/net/mlx5/mlx5_rxtx.h    |  14 ++
 drivers/net/mlx5/mlx5_trigger.c |   4 +-
 drivers/net/mlx5/mlx5_txq.c     | 306 +++++++++++++++++++++++++++++++++++++++-
 4 files changed, 392 insertions(+), 40 deletions(-)
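
For illustration only, the UAR sharing described in the commit message can be
sketched as below. This is a minimal sketch, not the driver code: the names
`uar_sketch`, `cq_attr_sketch`, `sq_attr_sketch` and `attach_queue_to_uar` are
hypothetical, reduced stand-ins for `sh->tx_uar`, `struct mlx5_devx_cq_attr`
and `struct mlx5_devx_create_sq_attr`. The only idea taken from the patch is
that the CQ and SQ attributes take their UAR page from one shared object.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared UAR object (sh->tx_uar). */
struct uar_sketch {
	uint32_t page_id;
};

/* Hypothetical, reduced versions of the DevX CQ/SQ creation attributes. */
struct cq_attr_sketch {
	uint32_t uar_page_id;
};

struct sq_attr_sketch {
	uint32_t uar_page;
};

/* Point both queue attribute sets at the same UAR page. */
static inline void
attach_queue_to_uar(const struct uar_sketch *uar,
		    struct cq_attr_sketch *cq, struct sq_attr_sketch *sq)
{
	assert(uar && cq && sq);
	cq->uar_page_id = uar->page_id;
	sq->uar_page = uar->page_id;
}
```

A caller would fill both attribute structures from the same `uar_sketch`
before creating the queues, so that the Tx queue and the Clock Queue end up
using the same doorbell page.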
  

Comments

Ferruh Yigit July 20, 2020, 2:18 p.m. UTC | #1
On 7/16/2020 9:23 AM, Viacheslav Ovsiienko wrote:
> To provide the packet send schedule on mbuf timestamp the Tx
> queue must be attached to the same UAR as Clock Queue is.
> UAR is special hardware related resource mapped to the host
> memory and provides doorbell registers, the assigning UAR
> to the queue being created is provided via DevX API only.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Matan Azrad <matan@mellanox.com>

<...>

> +	MLX5_ASSERT(sh->tx_uar);
> +	MLX5_ASSERT(sh->tx_uar->reg_addr);
> +	txq_ctrl->bf_reg = sh->tx_uar->reg_addr;
> +	txq_ctrl->uar_mmap_offset = sh->tx_uar->mmap_off;
> +	rte_atomic32_set(&txq_obj->refcnt, 1);

This is using a function we plan to deprecate in the long term, and checkpatch
has a warning for it [1].

So that this does not become a blocker, I will proceed with the patchset. Can
you please send an incremental patch to fix it? I can squash it before -rc2.

Thanks,
ferruh


[1]
Warning in drivers/net/mlx5/mlx5_txq.c:
Using rte_atomicNN_xxx
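
As a rough illustration of the kind of change being asked for, the legacy call
can be replaced with the GCC/Clang `__atomic` built-ins. This is a minimal
sketch under assumptions, not the fix that was eventually applied:
`txq_obj_sketch` and the `txq_obj_ref_*` helpers are hypothetical names, the
counter is a plain `uint32_t` rather than `rte_atomic32_t`, and the memory
orders are assumptions that a real driver would have to justify for its own
access pattern.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct mlx5_txq_obj; only the counter is kept. */
struct txq_obj_sketch {
	uint32_t refcnt; /* plain integer instead of rte_atomic32_t */
};

/* Replaces the legacy form: rte_atomic32_set(&obj->refcnt, 1); */
static inline void
txq_obj_ref_init(struct txq_obj_sketch *obj)
{
	__atomic_store_n(&obj->refcnt, 1, __ATOMIC_RELAXED);
}

/* Take one more reference and return the new count. */
static inline uint32_t
txq_obj_ref_get(struct txq_obj_sketch *obj)
{
	return __atomic_add_fetch(&obj->refcnt, 1, __ATOMIC_RELAXED);
}

/* Drop one reference and return the new count; 0 means no users remain. */
static inline uint32_t
txq_obj_ref_put(struct txq_obj_sketch *obj)
{
	return __atomic_sub_fetch(&obj->refcnt, 1, __ATOMIC_ACQ_REL);
}

/* Read the current count. */
static inline uint32_t
txq_obj_ref_read(struct txq_obj_sketch *obj)
{
	return __atomic_load_n(&obj->refcnt, __ATOMIC_RELAXED);
}
```

With helpers of this shape, the object would be released only when
`txq_obj_ref_put()` returns 0.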
  
Ferruh Yigit July 20, 2020, 3:25 p.m. UTC | #2

cc'ed Honnappa too. From the techboard discussion [2] I understand we won't
accept new code with the old API. But to be fair, the check was not there until
last week, so this was easy for developers to miss.

@Slava, can you please do your best to replace them before the release? Perhaps
@Honnappa can support the effort?

And if this can't be done within the release, I am not sure what to do. One
option is to accept them with a commitment from Mellanox to fix this in 20.11?

[2]
https://mails.dpdk.org/archives/dev/2020-April/165143.html
  
Slava Ovsiienko July 21, 2020, 11:35 a.m. UTC | #3
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Monday, July 20, 2020 18:26
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; olivier.matz@6wind.com; Thomas Monjalon
> <thomas@monjalon.net>; David Marchand <david.marchand@redhat.com>;
> Phil Yang <phil.yang@arm.com>; Ruifeng Wang <ruifeng.wang@arm.com>;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v3 07/17] net/mlx5: create Tx queues with
> DevX
> 
> cc'ed Honnappa too, from techboard discussion [2] I understand we won't
> accept new code with old API. But also to be fair the check was not there
> until last week so this was easy to miss by developers.
> 
> @Slava, can you please do your best to replace them before the release?
> Perhaps @Honnappa can support on the effort?


> And if this can't be done within the release not sure what to do, one option
> is to get them with the commitment from Mellanox to fix this on 20.11?

I will personally do my best before the release, but I am not sure we can pass
the full testing/verification cycle in time. So I propose that we commit to
fixing it in 20.11 (there is no other way, given the "atomic" deprecation). If
we have the update before 20.08-rc3, we'll push it.

With best regards, Slava

> 
> [2]
> https://mails.dpdk.org/archives/dev/2020-April/165143.html
  

Patch

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 894f441..227289e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -944,43 +944,79 @@  enum mlx5_txcmp_code {
 		struct mlx5_txq_data *txq = (*priv->txqs)[sm->queue_id];
 		struct mlx5_txq_ctrl *txq_ctrl =
 			container_of(txq, struct mlx5_txq_ctrl, txq);
-		struct ibv_qp_attr mod = {
-			.qp_state = IBV_QPS_RESET,
-			.port_num = (uint8_t)priv->dev_port,
-		};
-		struct ibv_qp *qp = txq_ctrl->obj->qp;
 
-		ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
-		if (ret) {
-			DRV_LOG(ERR, "Cannot change the Tx QP state to RESET "
-				"%s", strerror(errno));
-			rte_errno = errno;
-			return ret;
-		}
-		mod.qp_state = IBV_QPS_INIT;
-		ret = mlx5_glue->modify_qp(qp, &mod,
-					   (IBV_QP_STATE | IBV_QP_PORT));
-		if (ret) {
-			DRV_LOG(ERR, "Cannot change Tx QP state to INIT %s",
-				strerror(errno));
-			rte_errno = errno;
-			return ret;
-		}
-		mod.qp_state = IBV_QPS_RTR;
-		ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
-		if (ret) {
-			DRV_LOG(ERR, "Cannot change Tx QP state to RTR %s",
-				strerror(errno));
-			rte_errno = errno;
-			return ret;
-		}
-		mod.qp_state = IBV_QPS_RTS;
-		ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
-		if (ret) {
-			DRV_LOG(ERR, "Cannot change Tx QP state to RTS %s",
-				strerror(errno));
-			rte_errno = errno;
-			return ret;
+		if (txq_ctrl->obj->type == MLX5_TXQ_OBJ_TYPE_DEVX_SQ) {
+			struct mlx5_devx_modify_sq_attr msq_attr = { 0 };
+
+			/* Change queue state to reset. */
+			msq_attr.sq_state = MLX5_SQC_STATE_ERR;
+			msq_attr.state = MLX5_SQC_STATE_RST;
+			ret = mlx5_devx_cmd_modify_sq(txq_ctrl->obj->sq_devx,
+						      &msq_attr);
+			if (ret) {
+				DRV_LOG(ERR, "Cannot change the "
+					"Tx QP state to RESET %s",
+					strerror(errno));
+				rte_errno = errno;
+				return ret;
+			}
+			/* Change queue state to ready. */
+			msq_attr.sq_state = MLX5_SQC_STATE_RST;
+			msq_attr.state = MLX5_SQC_STATE_RDY;
+			ret = mlx5_devx_cmd_modify_sq(txq_ctrl->obj->sq_devx,
+						      &msq_attr);
+			if (ret) {
+				DRV_LOG(ERR, "Cannot change the "
+					"Tx QP state to READY %s",
+					strerror(errno));
+				rte_errno = errno;
+				return ret;
+			}
+		} else {
+			struct ibv_qp_attr mod = {
+				.qp_state = IBV_QPS_RESET,
+				.port_num = (uint8_t)priv->dev_port,
+			};
+			struct ibv_qp *qp = txq_ctrl->obj->qp;
+
+			MLX5_ASSERT
+				(txq_ctrl->obj->type == MLX5_TXQ_OBJ_TYPE_IBV);
+
+			ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
+			if (ret) {
+				DRV_LOG(ERR, "Cannot change the "
+					"Tx QP state to RESET %s",
+					strerror(errno));
+				rte_errno = errno;
+				return ret;
+			}
+			mod.qp_state = IBV_QPS_INIT;
+			ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
+			if (ret) {
+				DRV_LOG(ERR, "Cannot change the "
+					"Tx QP state to INIT %s",
+					strerror(errno));
+				rte_errno = errno;
+				return ret;
+			}
+			mod.qp_state = IBV_QPS_RTR;
+			ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
+			if (ret) {
+				DRV_LOG(ERR, "Cannot change the "
+					"Tx QP state to RTR %s",
+					strerror(errno));
+				rte_errno = errno;
+				return ret;
+			}
+			mod.qp_state = IBV_QPS_RTS;
+			ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
+			if (ret) {
+				DRV_LOG(ERR, "Cannot change the "
+					"Tx QP state to RTS %s",
+					strerror(errno));
+				rte_errno = errno;
+				return ret;
+			}
 		}
 	}
 	return 0;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 26621ff..1b797da 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -323,6 +323,7 @@  struct mlx5_txq_data {
 
 enum mlx5_txq_obj_type {
 	MLX5_TXQ_OBJ_TYPE_IBV,		/* mlx5_txq_obj with ibv_wq. */
+	MLX5_TXQ_OBJ_TYPE_DEVX_SQ,	/* mlx5_txq_obj with mlx5_devx_sq. */
 	MLX5_TXQ_OBJ_TYPE_DEVX_HAIRPIN,
 	/* mlx5_txq_obj with mlx5_devx_tq and hairpin support. */
 };
@@ -349,6 +350,19 @@  struct mlx5_txq_obj {
 			/* DevX object for Sx queue. */
 			struct mlx5_devx_obj *tis; /* The TIS object. */
 		};
+		struct {
+			struct rte_eth_dev *dev;
+			struct mlx5_devx_obj *cq_devx;
+			struct mlx5dv_devx_umem *cq_umem;
+			void *cq_buf;
+			int64_t cq_dbrec_offset;
+			struct mlx5_devx_dbr_page *cq_dbrec_page;
+			struct mlx5_devx_obj *sq_devx;
+			struct mlx5dv_devx_umem *sq_umem;
+			void *sq_buf;
+			int64_t sq_dbrec_offset;
+			struct mlx5_devx_dbr_page *sq_dbrec_page;
+		};
 	};
 };
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ca25ad9..e324319 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -59,7 +59,9 @@ 
 		} else {
 			txq_alloc_elts(txq_ctrl);
 			txq_ctrl->obj = mlx5_txq_obj_new
-				(dev, i, MLX5_TXQ_OBJ_TYPE_IBV);
+				(dev, i, priv->txpp_en ?
+				MLX5_TXQ_OBJ_TYPE_DEVX_SQ :
+				MLX5_TXQ_OBJ_TYPE_IBV);
 		}
 		if (!txq_ctrl->obj) {
 			rte_errno = ENOMEM;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index e1fa24e..f9ed504 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -31,6 +31,7 @@ 
 #include <mlx5_devx_cmds.h>
 #include <mlx5_common.h>
 #include <mlx5_common_mr.h>
+#include <mlx5_common_os.h>
 
 #include "mlx5_defs.h"
 #include "mlx5_utils.h"
@@ -573,6 +574,297 @@ 
 }
 
 /**
+ * Destroy the Tx queue DevX object.
+ *
+ * @param txq_obj
+ *   Txq object to destroy
+ */
+static void
+txq_release_sq_resources(struct mlx5_txq_obj *txq_obj)
+{
+	MLX5_ASSERT(txq_obj->type == MLX5_TXQ_OBJ_TYPE_DEVX_SQ);
+
+	if (txq_obj->sq_devx)
+		claim_zero(mlx5_devx_cmd_destroy(txq_obj->sq_devx));
+	if (txq_obj->sq_dbrec_page)
+		claim_zero(mlx5_release_dbr
+				(&txq_obj->txq_ctrl->priv->dbrpgs,
+				mlx5_os_get_umem_id
+					(txq_obj->sq_dbrec_page->umem),
+				txq_obj->sq_dbrec_offset));
+	if (txq_obj->sq_umem)
+		claim_zero(mlx5_glue->devx_umem_dereg(txq_obj->sq_umem));
+	if (txq_obj->sq_buf)
+		rte_free(txq_obj->sq_buf);
+	if (txq_obj->cq_devx)
+		claim_zero(mlx5_devx_cmd_destroy(txq_obj->cq_devx));
+	if (txq_obj->cq_dbrec_page)
+		claim_zero(mlx5_release_dbr
+				(&txq_obj->txq_ctrl->priv->dbrpgs,
+				mlx5_os_get_umem_id
+					(txq_obj->cq_dbrec_page->umem),
+				txq_obj->cq_dbrec_offset));
+	if (txq_obj->cq_umem)
+		claim_zero(mlx5_glue->devx_umem_dereg(txq_obj->cq_umem));
+	if (txq_obj->cq_buf)
+		rte_free(txq_obj->cq_buf);
+}
+
+/**
+ * Create the Tx queue DevX object.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param idx
+ *   Queue index in DPDK Tx queue array
+ *
+ * @return
+ *   The DevX object initialised, NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_txq_obj *
+mlx5_txq_obj_devx_new(struct rte_eth_dev *dev, uint16_t idx)
+{
+#ifndef HAVE_MLX5DV_DEVX_UAR_OFFSET
+	DRV_LOG(ERR, "port %u Tx queue %u cannot create with DevX, no UAR",
+		     dev->data->port_id, idx);
+	rte_errno = ENOMEM;
+	return NULL;
+#else
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_txq_data *txq_data = (*priv->txqs)[idx];
+	struct mlx5_txq_ctrl *txq_ctrl =
+		container_of(txq_data, struct mlx5_txq_ctrl, txq);
+	struct mlx5_devx_create_sq_attr sq_attr = { 0 };
+	struct mlx5_devx_modify_sq_attr msq_attr = { 0 };
+	struct mlx5_devx_cq_attr cq_attr = { 0 };
+	struct mlx5_txq_obj *txq_obj = NULL;
+	size_t page_size = sysconf(_SC_PAGESIZE);
+	struct mlx5_cqe *cqe;
+	uint32_t i, nqe;
+	int ret = 0;
+
+	MLX5_ASSERT(txq_data);
+	MLX5_ASSERT(!txq_ctrl->obj);
+	txq_obj = rte_calloc_socket(__func__, 1,
+				    sizeof(struct mlx5_txq_obj), 0,
+				    txq_ctrl->socket);
+	if (!txq_obj) {
+		DRV_LOG(ERR,
+			"port %u Tx queue %u cannot allocate memory resources",
+			dev->data->port_id, txq_data->idx);
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	txq_obj->type = MLX5_TXQ_OBJ_TYPE_DEVX_SQ;
+	txq_obj->txq_ctrl = txq_ctrl;
+	txq_obj->dev = dev;
+	/* Create the Completion Queue. */
+	nqe = (1UL << txq_data->elts_n) / MLX5_TX_COMP_THRESH +
+	       1 + MLX5_TX_COMP_THRESH_INLINE_DIV;
+	nqe = 1UL << log2above(nqe);
+	if (nqe > UINT16_MAX) {
+		DRV_LOG(ERR,
+			"port %u Tx queue %u requests too many CQEs %u",
+			dev->data->port_id, txq_data->idx, nqe);
+		rte_errno = EINVAL;
+		goto error;
+	}
+	/* Allocate memory buffer for CQEs. */
+	txq_obj->cq_buf = rte_zmalloc_socket(__func__,
+					     nqe * sizeof(struct mlx5_cqe),
+					     MLX5_CQE_BUF_ALIGNMENT,
+					     sh->numa_node);
+	if (!txq_obj->cq_buf) {
+		DRV_LOG(ERR,
+			"port %u Tx queue %u cannot allocate memory (CQ)",
+			dev->data->port_id, txq_data->idx);
+		rte_errno = ENOMEM;
+		goto error;
+	}
+	txq_data->cqe_n = log2above(nqe);
+	txq_data->cqe_s = 1 << txq_data->cqe_n;
+	txq_data->cqe_m = txq_data->cqe_s - 1;
+	txq_data->cqes = (volatile struct mlx5_cqe *)txq_obj->cq_buf;
+	txq_data->cq_ci = 0;
+	txq_data->cq_pi = 0;
+	/* Register allocated buffer in user space with DevX. */
+	txq_obj->cq_umem = mlx5_glue->devx_umem_reg
+					(sh->ctx,
+					 (void *)txq_obj->cq_buf,
+					 nqe * sizeof(struct mlx5_cqe),
+					 IBV_ACCESS_LOCAL_WRITE);
+	if (!txq_obj->cq_umem) {
+		rte_errno = errno;
+		DRV_LOG(ERR,
+			"port %u Tx queue %u cannot register memory (CQ)",
+			dev->data->port_id, txq_data->idx);
+		goto error;
+	}
+	/* Allocate doorbell record for completion queue. */
+	txq_obj->cq_dbrec_offset = mlx5_get_dbr(sh->ctx,
+						&priv->dbrpgs,
+						&txq_obj->cq_dbrec_page);
+	if (txq_obj->cq_dbrec_offset < 0)
+		goto error;
+	txq_data->cq_db = (volatile uint32_t *)(txq_obj->cq_dbrec_page->dbrs +
+						txq_obj->cq_dbrec_offset);
+	*txq_data->cq_db = 0;
+	/* Create completion queue object with DevX. */
+	cq_attr.cqe_size = (sizeof(struct mlx5_cqe) == 128) ?
+			    MLX5_CQE_SIZE_128B : MLX5_CQE_SIZE_64B;
+	cq_attr.uar_page_id = sh->tx_uar->page_id;
+	cq_attr.eqn = sh->txpp.eqn;
+	cq_attr.q_umem_valid = 1;
+	cq_attr.q_umem_offset = (uintptr_t)txq_obj->cq_buf % page_size;
+	cq_attr.q_umem_id = txq_obj->cq_umem->umem_id;
+	cq_attr.db_umem_valid = 1;
+	cq_attr.db_umem_offset = txq_obj->cq_dbrec_offset;
+	cq_attr.db_umem_id = mlx5_os_get_umem_id(txq_obj->cq_dbrec_page->umem);
+	cq_attr.log_cq_size = rte_log2_u32(nqe);
+	cq_attr.log_page_size = rte_log2_u32(page_size);
+	txq_obj->cq_devx = mlx5_devx_cmd_create_cq(sh->ctx, &cq_attr);
+	if (!txq_obj->cq_devx) {
+		rte_errno = errno;
+		DRV_LOG(ERR, "port %u Tx queue %u CQ creation failure",
+			dev->data->port_id, idx);
+		goto error;
+	}
+	/* Initial fill CQ buffer with invalid CQE opcode. */
+	cqe = (struct mlx5_cqe *)txq_obj->cq_buf;
+	for (i = 0; i < txq_data->cqe_s; i++) {
+		cqe->op_own = (MLX5_CQE_INVALID << 4) | MLX5_CQE_OWNER_MASK;
+		++cqe;
+	}
+	/* Create the Work Queue. */
+	nqe = RTE_MIN(1UL << txq_data->elts_n,
+		      (uint32_t)sh->device_attr.max_qp_wr);
+	txq_obj->sq_buf = rte_zmalloc_socket(__func__,
+					     nqe * sizeof(struct mlx5_wqe),
+					     page_size,
+					     sh->numa_node);
+	if (!txq_obj->sq_buf) {
+		DRV_LOG(ERR,
+			"port %u Tx queue %u cannot allocate memory (SQ)",
+			dev->data->port_id, txq_data->idx);
+		rte_errno = ENOMEM;
+		goto error;
+	}
+	txq_data->wqe_n = log2above(nqe);
+	txq_data->wqe_s = 1 << txq_data->wqe_n;
+	txq_data->wqe_m = txq_data->wqe_s - 1;
+	txq_data->wqes = (struct mlx5_wqe *)txq_obj->sq_buf;
+	txq_data->wqes_end = txq_data->wqes + txq_data->wqe_s;
+	txq_data->wqe_ci = 0;
+	txq_data->wqe_pi = 0;
+	txq_data->wqe_comp = 0;
+	txq_data->wqe_thres = txq_data->wqe_s / MLX5_TX_COMP_THRESH_INLINE_DIV;
+	/* Register allocated buffer in user space with DevX. */
+	txq_obj->sq_umem = mlx5_glue->devx_umem_reg
+					(sh->ctx,
+					 (void *)txq_obj->sq_buf,
+					 nqe * sizeof(struct mlx5_wqe),
+					 IBV_ACCESS_LOCAL_WRITE);
+	if (!txq_obj->sq_umem) {
+		rte_errno = errno;
+		DRV_LOG(ERR,
+			"port %u Tx queue %u cannot register memory (SQ)",
+			dev->data->port_id, txq_data->idx);
+		goto error;
+	}
+	/* Allocate doorbell record for send queue. */
+	txq_obj->sq_dbrec_offset = mlx5_get_dbr(sh->ctx,
+						&priv->dbrpgs,
+						&txq_obj->sq_dbrec_page);
+	if (txq_obj->sq_dbrec_offset < 0)
+		goto error;
+	txq_data->qp_db = (volatile uint32_t *)
+					(txq_obj->sq_dbrec_page->dbrs +
+					 txq_obj->sq_dbrec_offset +
+					 MLX5_SND_DBR * sizeof(uint32_t));
+	*txq_data->qp_db = 0;
+	/* Create Send Queue object with DevX. */
+	sq_attr.tis_lst_sz = 1;
+	sq_attr.tis_num = sh->tis->id;
+	sq_attr.state = MLX5_SQC_STATE_RST;
+	sq_attr.cqn = txq_obj->cq_devx->id;
+	sq_attr.flush_in_error_en = 1;
+	sq_attr.allow_multi_pkt_send_wqe = !!priv->config.mps;
+	sq_attr.allow_swp = !!priv->config.swp;
+	sq_attr.min_wqe_inline_mode = priv->config.hca_attr.vport_inline_mode;
+	sq_attr.wq_attr.uar_page = sh->tx_uar->page_id;
+	sq_attr.wq_attr.wq_type = MLX5_WQ_TYPE_CYCLIC;
+	sq_attr.wq_attr.pd = sh->pdn;
+	sq_attr.wq_attr.log_wq_stride = rte_log2_u32(MLX5_WQE_SIZE);
+	sq_attr.wq_attr.log_wq_sz = txq_data->wqe_n;
+	sq_attr.wq_attr.dbr_umem_valid = 1;
+	sq_attr.wq_attr.dbr_addr = txq_obj->sq_dbrec_offset;
+	sq_attr.wq_attr.dbr_umem_id =
+			mlx5_os_get_umem_id(txq_obj->sq_dbrec_page->umem);
+	sq_attr.wq_attr.wq_umem_valid = 1;
+	sq_attr.wq_attr.wq_umem_id = txq_obj->sq_umem->umem_id;
+	sq_attr.wq_attr.wq_umem_offset = (uintptr_t)txq_obj->sq_buf % page_size;
+	txq_obj->sq_devx = mlx5_devx_cmd_create_sq(sh->ctx, &sq_attr);
+	if (!txq_obj->sq_devx) {
+		rte_errno = errno;
+		DRV_LOG(ERR, "port %u Tx queue %u SQ creation failure",
+			dev->data->port_id, idx);
+		goto error;
+	}
+	txq_data->qp_num_8s = txq_obj->sq_devx->id << 8;
+	/* Change Send Queue state to Ready-to-Send. */
+	msq_attr.sq_state = MLX5_SQC_STATE_RST;
+	msq_attr.state = MLX5_SQC_STATE_RDY;
+	ret = mlx5_devx_cmd_modify_sq(txq_obj->sq_devx, &msq_attr);
+	if (ret) {
+		rte_errno = errno;
+		DRV_LOG(ERR,
+			"port %u Tx queue %u SQ state to SQC_STATE_RDY failed",
+			dev->data->port_id, idx);
+		goto error;
+	}
+	txq_data->fcqs = rte_calloc_socket(__func__,
+					   txq_data->cqe_s,
+					   sizeof(*txq_data->fcqs),
+					   RTE_CACHE_LINE_SIZE,
+					   txq_ctrl->socket);
+	if (!txq_data->fcqs) {
+		DRV_LOG(ERR, "port %u Tx queue %u cannot allocate memory (FCQ)",
+			dev->data->port_id, idx);
+		rte_errno = ENOMEM;
+		goto error;
+	}
+#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+	/*
+	 * If using DevX need to query and store TIS transport domain value.
+	 * This is done once per port.
+	 * Will use this value on Rx, when creating matching TIR.
+	 */
+	if (priv->config.devx && !priv->sh->tdn)
+		priv->sh->tdn = priv->sh->td->id;
+#endif
+	MLX5_ASSERT(sh->tx_uar);
+	MLX5_ASSERT(sh->tx_uar->reg_addr);
+	txq_ctrl->bf_reg = sh->tx_uar->reg_addr;
+	txq_ctrl->uar_mmap_offset = sh->tx_uar->mmap_off;
+	rte_atomic32_set(&txq_obj->refcnt, 1);
+	txq_uar_init(txq_ctrl);
+	LIST_INSERT_HEAD(&priv->txqsobj, txq_obj, next);
+	return txq_obj;
+error:
+	ret = rte_errno; /* Save rte_errno before cleanup. */
+	txq_release_sq_resources(txq_obj);
+	if (txq_data->fcqs) {
+		rte_free(txq_data->fcqs);
+		txq_data->fcqs = NULL;
+	}
+	rte_free(txq_obj);
+	rte_errno = ret; /* Restore rte_errno. */
+	return NULL;
+#endif
+}
+
+/**
  * Create the Tx queue Verbs object.
  *
  * @param dev
@@ -609,6 +901,8 @@  struct mlx5_txq_obj *
 
 	if (type == MLX5_TXQ_OBJ_TYPE_DEVX_HAIRPIN)
 		return mlx5_txq_obj_hairpin_new(dev, idx);
+	if (type == MLX5_TXQ_OBJ_TYPE_DEVX_SQ)
+		return mlx5_txq_obj_devx_new(dev, idx);
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
 	/* If using DevX, need additional mask to read tisn value. */
 	if (priv->config.devx && !priv->sh->tdn)
@@ -817,8 +1111,10 @@  struct mlx5_txq_obj *
 		claim_zero(mlx5_glue->destroy_cq(tmpl.cq));
 	if (tmpl.qp)
 		claim_zero(mlx5_glue->destroy_qp(tmpl.qp));
-	if (txq_data->fcqs)
+	if (txq_data && txq_data->fcqs) {
 		rte_free(txq_data->fcqs);
+		txq_data->fcqs = NULL;
+	}
 	if (txq_obj)
 		rte_free(txq_obj);
 	priv->verbs_alloc_ctx.type = MLX5_VERBS_ALLOC_TYPE_NONE;
@@ -870,11 +1166,15 @@  struct mlx5_txq_obj *
 		if (txq_obj->type == MLX5_TXQ_OBJ_TYPE_DEVX_HAIRPIN) {
 			if (txq_obj->tis)
 				claim_zero(mlx5_devx_cmd_destroy(txq_obj->tis));
+		} else if (txq_obj->type == MLX5_TXQ_OBJ_TYPE_DEVX_SQ) {
+			txq_release_sq_resources(txq_obj);
 		} else {
 			claim_zero(mlx5_glue->destroy_qp(txq_obj->qp));
 			claim_zero(mlx5_glue->destroy_cq(txq_obj->cq));
-				if (txq_obj->txq_ctrl->txq.fcqs)
-					rte_free(txq_obj->txq_ctrl->txq.fcqs);
+		}
+		if (txq_obj->txq_ctrl->txq.fcqs) {
+			rte_free(txq_obj->txq_ctrl->txq.fcqs);
+			txq_obj->txq_ctrl->txq.fcqs = NULL;
 		}
 		LIST_REMOVE(txq_obj, next);
 		rte_free(txq_obj);