[2/4] net/mlx5: add support for Rx queue delay drop

Message ID 20211104112644.17278-3-bingz@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: Raslan Darawsheh
Headers
Series Add delay drop support for Rx queue

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Bing Zhao Nov. 4, 2021, 11:26 a.m. UTC
  For an Ethernet RQ, packets received when the receive WQEs are exhausted
are dropped. This behavior prevents slow or malicious software
entities on the host from affecting the network. For hairpin
cases, however, even though no software is involved in forwarding
packets from the Rx to the Tx side, a hiccup in the hardware or back
pressure from the Tx side may still exhaust the WQEs. In
certain scenarios it may be preferable to configure the device to
avoid such packet drops, assuming the posting of WQEs will resume
shortly.

To support this, a new devarg "delay_drop_en" is introduced. By
default, delay drop is enabled for hairpin Rx queues and
disabled for standard Rx queues. The value is used as a bit mask:
  - bit 0: enable delay drop for standard Rx queues
  - bit 1: enable delay drop for hairpin Rx queues
This attribute is applied to all Rx queues of a device.

If the hardware capabilities do not support delay drop, all Rx
queues will still be created, only without this attribute, and the
devarg setting will be ignored.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 11 +++++++++++
 drivers/net/mlx5/mlx5.c          |  7 +++++++
 drivers/net/mlx5/mlx5.h          |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c     |  5 +++++
 drivers/net/mlx5/mlx5_rx.h       |  1 +
 5 files changed, 33 insertions(+)
  

Comments

David Marchand Nov. 4, 2021, 2:01 p.m. UTC | #1
On Thu, Nov 4, 2021 at 12:27 PM Bing Zhao <bingz@nvidia.com> wrote:
>
> For an Ethernet RQ, packets received when receive WQEs are exhausted
> are dropped. This behavior prevents slow or malicious software
> entities at the host from affecting the network. While for hairpin
> cases, even if there is no software involved during the packet
> forwarding from Rx to Tx side, some hiccup in the hardware or back
> pressure from Tx side may still cause the WQEs to be exhausted. In
> certain scenarios it may be preferred to configure the device to
> avoid such packet drops, assuming the posting of WQEs will resume
> shortly.
>
> To support this, a new devarg "delay_drop_en" is introduced, by
> default, the delay drop is enabled for hairpin Rx queues and
> disabled for standard Rx queues. This value is used as a bit mask:
>   - bit 0: enablement of standard Rx queue
>   - bit 1: enablement of hairpin Rx queue
> And this attribute will be applied to all Rx queues of a device.

Rather than a devarg, why can't the driver use this option in the
identified use cases where it makes sense?
Here, hairpin.
  
Bing Zhao Nov. 4, 2021, 2:34 p.m. UTC | #2
Hi David,

Many thanks for these comments. My answers are inline.

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, November 4, 2021 10:01 PM
> To: Bing Zhao <bingz@nvidia.com>
> Cc: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> <matan@nvidia.com>; dev <dev@dpdk.org>; Raslan Darawsheh
> <rasland@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Ori Kam <orika@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx
> queue delay drop
> 
> On Thu, Nov 4, 2021 at 12:27 PM Bing Zhao <bingz@nvidia.com> wrote:
> >
> > For an Ethernet RQ, packets received when receive WQEs are
> exhausted
> > are dropped. This behavior prevents slow or malicious software
> > entities at the host from affecting the network. While for hairpin
> > cases, even if there is no software involved during the packet
> > forwarding from Rx to Tx side, some hiccup in the hardware or back
> > pressure from Tx side may still cause the WQEs to be exhausted. In
> > certain scenarios it may be preferred to configure the device to
> avoid
> > such packet drops, assuming the posting of WQEs will resume
> shortly.
> >
> > To support this, a new devarg "delay_drop_en" is introduced, by
> > default, the delay drop is enabled for hairpin Rx queues and
> disabled
> > for standard Rx queues. This value is used as a bit mask:
> >   - bit 0: enablement of standard Rx queue
> >   - bit 1: enablement of hairpin Rx queue And this attribute will
> be
> > applied to all Rx queues of a device.
> 
> Rather than a devargs, why can't the driver use this option in the
> identified usecases where it makes sense?
> Here, hairpin.

In the v2 patch set, the attribute is also disabled for hairpin queues by default, so the default behavior remains the same as today. This is only a minor change, but it may have some impact on the HW processing.
Turning this attribute ON for a specific queue has the following impact:

Pros: if there is a hiccup in the SW / HW, or a burst arrives that the SW is not fast enough to handle, then once the WQEs of the queue are exhausted, packets are not dropped immediately but are held in the NIC. This gives more tolerance and makes the queue behave like a dropless queue.

Cons: while some packets are waiting for available WQEs, new packets may be dropped if there is not enough space, or may see higher latency because the previous ones are waiting. If the traffic exceeds the line rate, or the SW is too slow to handle the incoming traffic, packets will eventually be dropped anyway. Some contexts are global, so waiting on one queue may have an impact on other queues.

So right now this devarg gives the application the flexibility to verify and decide whether this is needed in real life. Theoretically, it would help in most cases.

> 
> 
> --
> David Marchand

BR. Bing
  

Patch

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index f51da8c3a3..def2cca3cd 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1506,6 +1506,15 @@  mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2075,6 +2084,8 @@  mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 1;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dc15688f21..80a6692b94 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@ 
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP_EN "delay_drop_en"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2095,6 +2098,9 @@  mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP_EN, key) == 0) {
+		config->std_delay_drop = tmp & MLX5_DELAY_DROP_STANDARD;
+		config->hp_delay_drop = tmp & MLX5_DELAY_DROP_HAIRPIN;
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2157,6 +2163,7 @@  mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP_EN,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 74af88ec19..8d32d55c9a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@  enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@  struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index 424f77be79..2e1d849eab 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -280,6 +280,7 @@  mlx5_rxq_create_devx_rq_resources(struct rte_eth_dev *dev,
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	/* Create RQ using DevX API. */
 	return mlx5_devx_rq_create(cdev->ctx, &rxq_ctrl->obj->rq_obj, wqe_size,
 				   log_desc_n, &rq_attr, rxq_ctrl->socket);
@@ -443,6 +444,8 @@  mlx5_rxq_obj_hairpin_new(struct rte_eth_dev *dev, uint16_t idx)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_data->delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -503,6 +506,7 @@  mlx5_rxq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(dev, rxq_data);
 	if (ret) {
@@ -921,6 +925,7 @@  mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 	rxq_ctrl->priv = priv;
 	rxq_ctrl->obj = rxq;
 	rxq_data = &rxq_ctrl->rxq;
+	rxq_data->delay_drop = 0;
 	/* Create CQ using DevX API. */
 	ret = mlx5_rxq_create_devx_cq_resources(dev, rxq_data);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 69b1263339..05807764b8 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -92,6 +92,7 @@  struct mlx5_rxq_data {
 	unsigned int lro:1; /* Enable LRO. */
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;