[v4] net/bnxt: fix crash after port stop/start

Message ID 20210823154453.403-1-somnath.kotur@broadcom.com (mailing list archive)
State Accepted, archived
Delegated to: Ajit Khaparde
Series [v4] net/bnxt: fix crash after port stop/start

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/github-robot: build           success  github build: passed
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS

Commit Message

Somnath Kotur Aug. 23, 2021, 3:44 p.m. UTC
On chips like Thor, a port stop/start sequence could result in a crash
in the application. The crash is caused by false detection of a bad
opaque in the Rx completion, which triggers the ring reset code to
recover from the condition.
The root cause is that the port stop/start results in the HW
starting with fresh values, while the driver internal tracker variable
`rx_next_cons` still holds a stale value.
Fix this by resetting rx_next_cons to 0 in bnxt_init_one_rx_ring().

Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion")
Cc: stable@dpdk.org

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
v4: Updated commit log as requested by Ferruh
v3: Updated commit log and summary as requested by Ferruh
v2: Updated commit log as requested by Ferruh
 drivers/net/bnxt/bnxt_rxr.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Ajit Khaparde Aug. 24, 2021, 1:51 a.m. UTC | #1
On Mon, Aug 23, 2021 at 8:49 AM Somnath Kotur
<somnath.kotur@broadcom.com> wrote:
>
> On chips like Thor, port stop/start sequence could result in a crash
> in the application. This is because of false detection of a bad
> opaque in the Rx completion and the subsequent kicking-in of the ring
> reset code to recover from the condition.
> The root cause being that the port stop/start would result in the HW
> starting with fresh values, while the driver internal tracker variable
> `rx_next_cons` is still pointing to a stale value.
> Fix this by resetting rx_next_cons to 0 in bnxt_init_one_rx_ring()
>
> Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion")
> Cc: stable@dpdk.org
>
> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Thanks Ferruh.
Patch applied to dpdk-next-net-brcm
  

Patch

diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c
index aea71703d1..73fbdd17d1 100644
--- a/drivers/net/bnxt/bnxt_rxr.c
+++ b/drivers/net/bnxt/bnxt_rxr.c
@@ -1379,6 +1379,9 @@  int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq)
 	}
 	PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n");
 
+	/* Explicitly reset this driver internal tracker on a ring init */
+	rxr->rx_next_cons = 0;
+
 	return 0;
 }
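
The failure mode described in the commit message can be illustrated with a
toy model. This is only a sketch under stated assumptions, not the driver's
code: every `toy_*` name, the ring size, and the `reset_tracker` flag are
hypothetical and exist only for illustration. The model assumes the HW stamps
each Rx completion with a running index (the opaque), that the driver accepts
a completion only when that index equals `rx_next_cons`, and that a port
stop/start makes the HW begin again from 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u /* hypothetical ring size */

/* Hypothetical stand-in for the driver's per-ring state. */
struct toy_rx_ring {
	uint16_t rx_next_cons; /* index the driver expects in the next completion */
};

/* Hypothetical stand-in for the hardware side of the ring. */
struct toy_hw {
	uint16_t next_opaque; /* index the HW stamps into its next completion */
};

/* The HW emits one completion and advances its own index. */
static uint16_t toy_hw_complete(struct toy_hw *hw)
{
	uint16_t opaque = hw->next_opaque;

	hw->next_opaque = (uint16_t)((hw->next_opaque + 1u) % TOY_RING_SIZE);
	return opaque;
}

/* The driver consumes one completion; false models a "bad opaque". */
static bool toy_rx_consume(struct toy_rx_ring *rxr, uint16_t opaque)
{
	if (opaque != rxr->rx_next_cons)
		return false;

	rxr->rx_next_cons = (uint16_t)((rxr->rx_next_cons + 1u) % TOY_RING_SIZE);
	return true;
}

/* Ring init on port start; reset_tracker == true models the fix. */
static void toy_init_one_rx_ring(struct toy_rx_ring *rxr, struct toy_hw *hw,
				 bool reset_tracker)
{
	hw->next_opaque = 0; /* assumed: the HW starts with fresh values */
	if (reset_tracker)
		rxr->rx_next_cons = 0;
}

/*
 * Start the port, receive three packets, stop/start the port, then
 * receive one more. Returns whether the first completion after the
 * restart is accepted by the driver.
 */
static bool toy_restart_ok(bool reset_tracker)
{
	struct toy_rx_ring rxr = { .rx_next_cons = 0 };
	struct toy_hw hw = { .next_opaque = 0 };
	int i;

	toy_init_one_rx_ring(&rxr, &hw, reset_tracker);
	for (i = 0; i < 3; i++)
		assert(toy_rx_consume(&rxr, toy_hw_complete(&hw)));

	/* Port stop/start: the ring is initialised again. */
	toy_init_one_rx_ring(&rxr, &hw, reset_tracker);
	return toy_rx_consume(&rxr, toy_hw_complete(&hw));
}
```

In this model, `toy_restart_ok(false)` leaves the tracker at 3 while the HW
restarts at 0, so the first completion after the restart is rejected as a bad
opaque. `toy_restart_ok(true)` resets the tracker during ring init, as the
patch does, and the completion is accepted.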