[v2] mlx5/testpmd: fix crash on quit with avail thresh enabled

Message ID 20221102114425.2091580-1-spiked@nvidia.com (mailing list archive)
State Accepted, archived
Delegated to: Raslan Darawsheh

Checks

Context                         Check    Description
ci/checkpatch                   success  coding style OK
ci/Intel-compilation            success  Compilation OK
ci/iol-mellanox-Performance     success  Performance Testing PASS
ci/iol-aarch64-unit-testing     success  Testing PASS
ci/intel-Testing                success  Testing PASS
ci/iol-intel-Performance        success  Performance Testing PASS
ci/iol-x86_64-unit-testing      success  Testing PASS
ci/iol-intel-Functional         success  Functional Testing PASS
ci/iol-x86_64-compile-testing   success  Testing PASS
ci/iol-aarch64-compile-testing  success  Testing PASS
ci/github-robot: build          success  github build: passed

Commit Message

Spike Du Nov. 2, 2022, 11:44 a.m. UTC
When testpmd quits with mlx5 avail_thresh enabled, an rte timer handler
runs after a delay to reconfigure the Rx queue and re-arm the event.
At the same time, however, testpmd is destroying the Rx queues, which
leads to a crash.
This is not a valid use case for mlx5 avail_thresh: before testpmd quits,
the user should disable the avail_thresh configuration so that the events
are no longer handled. This is documented in the mlx5 driver guide.

To avoid the crash in this case, check the port status and do not
process the avail_thresh event if it is not RTE_PORT_STARTED.

Fixes: f41a5092e6ae ("app/testpmd: add host shaper command")
Cc: stable@dpdk.org

Signed-off-by: Spike Du <spiked@nvidia.com>
---
 drivers/net/mlx5/mlx5_testpmd.c | 8 ++++++++
 1 file changed, 8 insertions(+)
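
The guard described in the commit message can be sketched outside of DPDK as follows. This is only an illustration under stated assumptions: `demo_port`, `demo_port_state`, `demo_ports` and `demo_deferred_rearm` are hypothetical stand-ins, not testpmd or mlx5 symbols, and the "re-arm" step is reduced to a counter.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for testpmd's per-port state (not DPDK definitions). */
enum demo_port_state {
	DEMO_PORT_STOPPED,
	DEMO_PORT_STARTED,
	DEMO_PORT_CLOSED
};

struct demo_port {
	enum demo_port_state status;
	int rearm_count; /* number of times the deferred handler re-armed the event */
};

#define DEMO_MAX_PORTS 4
static struct demo_port demo_ports[DEMO_MAX_PORTS];

/*
 * Deferred handler: runs some time after the event was raised, so the
 * port may have been stopped or closed in the meantime.
 * Returns 1 if the event was re-armed, 0 if processing was skipped.
 */
static int
demo_deferred_rearm(uint16_t port_id, uint16_t qid)
{
	struct demo_port *port;

	if (port_id >= DEMO_MAX_PORTS)
		return 0;
	port = &demo_ports[port_id];
	/* The guard: touch queue state only while the port is started. */
	if (port->status != DEMO_PORT_STARTED) {
		printf("%s port %u status(%d) is not started, skip queue %u\n",
		       __func__, (unsigned int)port_id, (int)port->status,
		       (unsigned int)qid);
		return 0;
	}
	port->rearm_count++;
	printf("%s re-armed port %u queue %u\n", __func__,
	       (unsigned int)port_id, (unsigned int)qid);
	return 1;
}
```

Calling `demo_deferred_rearm(0, 1)` while `demo_ports[0].status` is `DEMO_PORT_STARTED` re-arms and returns 1; once the status changes to stopped or closed, the same call returns 0 and leaves the queue state alone. The check narrows the window in the handler itself; it does not replace disabling avail_thresh before quitting.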
  

Comments

Matan Azrad Nov. 6, 2022, 1:26 p.m. UTC | #1
Acked-by: Matan Azrad <matan@nvidia.com>
  
Raslan Darawsheh Nov. 6, 2022, 3:42 p.m. UTC | #2
Hi,


Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
  

Patch

diff --git a/drivers/net/mlx5/mlx5_testpmd.c b/drivers/net/mlx5/mlx5_testpmd.c
index ed84583..879ea28 100644
--- a/drivers/net/mlx5/mlx5_testpmd.c
+++ b/drivers/net/mlx5/mlx5_testpmd.c
@@ -39,7 +39,15 @@ 
 	uint16_t port_id = port_rxq_id & 0xffff;
 	uint16_t qid = (port_rxq_id >> 16) & 0xffff;
 	struct rte_eth_rxq_info qinfo;
+	struct rte_port *port;
 
+	port = &ports[port_id];
+	if (port->port_status != RTE_PORT_STARTED) {
+		printf("%s port_status(%d) is incorrect, stop avail_thresh "
+		       "event processing.\n",
+		       __func__, port->port_status);
+		return;
+	}
 	printf("%s disable shaper\n", __func__);
 	if (rte_eth_rx_queue_info_get(port_id, qid, &qinfo)) {
 		printf("rx_queue_info_get returns error\n");
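
The context lines of the hunk decode `port_rxq_id` as a port ID in the low 16 bits and a queue ID in the high 16 bits. A minimal sketch of that encoding is shown here; the helper names are hypothetical, and the packing side is an assumption inferred from the decode, since the producer code is not part of this patch.

```c
#include <stdint.h>

/* Hypothetical helper: pack a port ID (low 16 bits) and queue ID (high 16 bits). */
static inline uint32_t
demo_pack_port_rxq(uint16_t port_id, uint16_t qid)
{
	return ((uint32_t)qid << 16) | (uint32_t)port_id;
}

/* Mirrors "port_rxq_id & 0xffff" from the handler. */
static inline uint16_t
demo_unpack_port(uint32_t port_rxq_id)
{
	return (uint16_t)(port_rxq_id & 0xffff);
}

/* Mirrors "(port_rxq_id >> 16) & 0xffff" from the handler. */
static inline uint16_t
demo_unpack_queue(uint32_t port_rxq_id)
{
	return (uint16_t)((port_rxq_id >> 16) & 0xffff);
}
```

Packing both IDs into one 32-bit value lets a single word-sized argument carry them to a deferred callback without allocating a context structure.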