Segfault in rcu

Message ID 20250301022622.GA6940@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net (mailing list archive)
State Changes Requested, archived
Delegated to: Thomas Monjalon
Headers
Series Segfault in rcu |

Checks

Context Check Description
ci/Intel-compilation fail Compilation issues
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/github-robot: build success github build: passed
ci/iol-testing warning apply patch failure

Commit Message

Andre Muezerie March 1, 2025, 2:26 a.m. UTC
Hi,

Are there known issues with rcu lib?
I tried to remove the VLA from it by calling alloca() and gcc 14.2 on Linux immediately complains about "array subscript outside array bounds".

$ ninja -C build
ninja: Entering directory `build'
[1/144] Compiling C object lib/librte_rcu.a.p/rcu_rte_rcu_qsbr.c.o
In file included from ../lib/ring/rte_ring_elem.h:20,
                 from ../lib/rcu/rte_rcu_qsbr.c:17:
In function ‘__rte_ring_dequeue_elems_32’,
    inlined from ‘__rte_ring_do_dequeue_elems’ at ../lib/ring/rte_ring_elem_pvt.h:289:3,
    inlined from ‘__rte_ring_dequeue_elems’ at ../lib/ring/rte_ring_elem_pvt.h:298:2,
    inlined from ‘__rte_ring_do_dequeue_start’ at ../lib/ring/rte_ring_peek_elem_pvt.h:172:3,
    inlined from ‘rte_ring_dequeue_bulk_elem_start’ at ../lib/ring/rte_ring_peek.h:237:9,
    inlined from ‘rte_rcu_qsbr_dq_reclaim’ at ../lib/rcu/rte_rcu_qsbr.c:392:4:
../lib/ring/rte_ring_elem_pvt.h:176:36: warning: array subscript 2 is outside array bounds of ‘char[8]’ [-Warray-bounds=]
  176 |                         obj[i + 2] = ring[idx + 2];
      |                         ~~~~~~~~~~~^~~~~~~~~~~~~~~


If I stubbornly ignore the warning and run the test, rcu_qsbr_autotest segfaults (no surprise given the previous warning).
Wondering if somebody more familiar with that code would like to take a look.

These were the only changes I had made, on latest main (commit fab31a03ba98e7457284df95dd9eef2223a4ccaa)

$ cat 0001-rcu.patch 
From dc7776a7b918cd96ffce62dd3bfaca85fc449d1b Mon Sep 17 00:00:00 2001
From: Andre Muezerie <andremue@linux.microsoft.com>
Date: Fri, 28 Feb 2025 21:05:25 -0500
Subject: [PATCH] rcu

---
 lib/rcu/rte_rcu_qsbr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

Dmitry Kozlyuk March 1, 2025, 7:35 p.m. UTC | #1
Hi Andre,

> @@ -386,7 +386,7 @@ rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq, unsigned int n,
>
>      cnt = 0;
>
> -    char data[dq->esize];
> +    char *data = alloca(dq->esize);
>      /* Check reader threads quiescent state and reclaim resources */
>      while (cnt < n &&
>          rte_ring_dequeue_bulk_elem_start(dq->r, &data,

In the last line, "&data" was equivalent to "data" when "data" was an array.
This is no longer true when "data" is a pointer.
Removing "&" fixes the issue.
Maybe coccinelle can check for similar mistakes caused by mechanical
replacement of VLA to alloca()?
  
Andre Muezerie March 3, 2025, 9:37 p.m. UTC | #2
On Sat, Mar 01, 2025 at 10:35:48PM +0300, Dmitry Kozlyuk wrote:
> Hi Andre,
> 
> > @@ -386,7 +386,7 @@ rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq, unsigned int n,
> >
> >      cnt = 0;
> >
> > -    char data[dq->esize];
> > +    char *data = alloca(dq->esize);
> >      /* Check reader threads quiescent state and reclaim resources */
> >      while (cnt < n &&
> >          rte_ring_dequeue_bulk_elem_start(dq->r, &data,
> 
> In the last line, "&data" was equivalent to "data" when "data" was an array.
> This is no longer true when "data" is a pointer.
> Removing "&" fixes the issue.
> Maybe coccinelle can check for similar mistakes caused by mechanical
> replacement of VLA to alloca()?

That makes sense. Thanks Dmitry.
  

Patch

diff --git a/lib/rcu/rte_rcu_qsbr.c b/lib/rcu/rte_rcu_qsbr.c
index dbf31501a6..65037d0217 100644
--- a/lib/rcu/rte_rcu_qsbr.c
+++ b/lib/rcu/rte_rcu_qsbr.c
@@ -323,7 +323,7 @@  int rte_rcu_qsbr_dq_enqueue(struct rte_rcu_qsbr_dq *dq, void *e)
 		return 1;
 	}
 
-	char data[dq->esize];
+	char *data = alloca(dq->esize);
 	dq_elem = (__rte_rcu_qsbr_dq_elem_t *)data;
 	/* Start the grace period */
 	dq_elem->token = rte_rcu_qsbr_start(dq->v);
@@ -386,7 +386,7 @@  rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq, unsigned int n,
 
 	cnt = 0;
 
-	char data[dq->esize];
+	char *data = alloca(dq->esize);
 	/* Check reader threads quiescent state and reclaim resources */
 	while (cnt < n &&
 		rte_ring_dequeue_bulk_elem_start(dq->r, &data,