[v3,1/1] ring: enforce reading the tail before reading ring slots

Message ID: 1552409933-45684-2-git-send-email-gavin.hu@arm.com (mailing list archive)
State: Accepted, archived
Delegated to: Thomas Monjalon
Series: ring: enforce reading the tail before reading ring slots

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/intel-Performance-Testing     success  Performance Testing PASS
ci/mellanox-Performance-Testing  success  Performance Testing PASS
ci/Intel-compilation             success  Compilation OK

Commit Message

Gavin Hu March 12, 2019, 4:58 p.m. UTC
  From: gavin hu <gavin.hu@arm.com>

On weakly ordered architectures such as arm64, the read of prod.tail may be
reordered after the reads of the ring slots. The consumer then observes
stale slot data, which corrupts the ring.

This issue was reported by NXP on an 8-A72 DPAA2 board. The most likely
cause is the missing acquire semantics on the read of prod.tail (in SC
dequeue), which makes it possible to read a stale value from the ring
slots.

For the MP (and MC) case, rte_atomic32_cmpset() already provides the required
ordering. For the SP case, the control dependency between the if-statement
(which depends on the read of r->cons.tail) and the later stores to the ring
slots makes an RMB unnecessary. For more about control dependencies, see:
https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
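
The SP argument above can be illustrated with a minimal sketch in portable
C11 atomics. This is not DPDK code: sp_ring, sp_enqueue and sp_consume_one
are hypothetical names, and the C11 standard itself does not formalise
control dependencies, so the relaxed load only mirrors the hardware-level
reasoning.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SP_RING_SIZE 4u /* hypothetical size, power of two */
#define SP_RING_MASK (SP_RING_SIZE - 1u)

/* Hypothetical, simplified ring; not the rte_ring layout. */
struct sp_ring {
	_Atomic uint32_t prod_tail; /* published by the single producer */
	_Atomic uint32_t cons_tail; /* published by the consumer */
	void *slots[SP_RING_SIZE];
};

/* Single-producer enqueue of one object; returns 0, or -1 when full. */
static int sp_enqueue(struct sp_ring *r, void *obj)
{
	uint32_t head = atomic_load_explicit(&r->prod_tail,
					     memory_order_relaxed);
	/* Relaxed load of the consumer tail: no read barrier follows it. */
	uint32_t cons = atomic_load_explicit(&r->cons_tail,
					     memory_order_relaxed);

	/* The branch depends on the value that was just loaded. */
	if (head - cons >= SP_RING_SIZE)
		return -1;

	/*
	 * This store is reached only if the branch allows it. Hardware
	 * does not make speculative stores visible, so the store cannot
	 * become visible before the load of cons_tail has completed.
	 */
	r->slots[head & SP_RING_MASK] = obj;

	/* Publish the filled slot to the consumer. */
	atomic_store_explicit(&r->prod_tail, head + 1, memory_order_release);
	return 0;
}

/* Demo helper: the consumer releases one slot back to the producer. */
static void sp_consume_one(struct sp_ring *r)
{
	uint32_t tail = atomic_load_explicit(&r->cons_tail,
					     memory_order_relaxed);
	atomic_store_explicit(&r->cons_tail, tail + 1, memory_order_release);
}
```

The dependency here runs from a load to a store. A load followed by another
load gets no such ordering from a branch, which is why the SC dequeue path
needs an explicit barrier.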

This patch adds the required read barrier so that, in the SC case, the reads
of the ring slots cannot be reordered before the read of prod.tail.
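
As a rough illustration of the ordering this barrier enforces, here is a
minimal sketch in portable C11 atomics rather than DPDK code. The names
sc_ring, sc_enqueue and sc_dequeue are hypothetical, and
atomic_thread_fence(memory_order_acquire) stands in for rte_smp_rmb(); the
actual change is the diff to __rte_ring_move_cons_head().

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SC_RING_SIZE 8u /* hypothetical size, power of two */
#define SC_RING_MASK (SC_RING_SIZE - 1u)

/* Hypothetical, simplified ring; not the rte_ring layout. */
struct sc_ring {
	_Atomic uint32_t prod_tail; /* published by the producer */
	_Atomic uint32_t cons_tail; /* published by the consumer */
	uint32_t cons_head;         /* private to the single consumer */
	void *slots[SC_RING_SIZE];
};

/* Producer: fill the slot, then publish it with a release store. */
static int sc_enqueue(struct sc_ring *r, void *obj)
{
	uint32_t tail = atomic_load_explicit(&r->prod_tail,
					     memory_order_relaxed);
	uint32_t cons = atomic_load_explicit(&r->cons_tail,
					     memory_order_acquire);

	if (tail - cons == SC_RING_SIZE)
		return -1; /* full */
	r->slots[tail & SC_RING_MASK] = obj;
	atomic_store_explicit(&r->prod_tail, tail + 1, memory_order_release);
	return 0;
}

/* Single-consumer dequeue of one object; returns 0, or -1 when empty. */
static int sc_dequeue(struct sc_ring *r, void **obj)
{
	uint32_t head = r->cons_head;
	uint32_t prod = atomic_load_explicit(&r->prod_tail,
					     memory_order_relaxed);

	if (prod == head)
		return -1; /* empty */
	r->cons_head = head + 1;

	/*
	 * Read barrier: the slot load below must not be satisfied before
	 * the load of prod_tail above. Without it, a weakly ordered CPU
	 * may return the slot's old contents.
	 */
	atomic_thread_fence(memory_order_acquire);

	*obj = r->slots[head & SC_RING_MASK];
	atomic_store_explicit(&r->cons_tail, head + 1, memory_order_release);
	return 0;
}
```

An acquire load of prod_tail would give the same guarantee as the relaxed
load plus fence; the fence form is used here only because it resembles the
shape of the patch.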

Fixes: c9fb3c62896f ("ring: move code in a new header file")
Cc: stable@dpdk.org

Signed-off-by: gavin hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <Ola.Liljedahl@arm.com>
Tested-by: Nipun Gupta <nipun.gupta@nxp.com>
---
 lib/librte_ring/rte_ring_generic.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
  

Comments

Nipun Gupta March 13, 2019, 8:12 a.m. UTC | #1

Acked-by: Nipun Gupta <nipun.gupta@nxp.com>

Also, I have revalidated this updated patch.
  
Ananyev, Konstantin March 15, 2019, 1:26 p.m. UTC | #2

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

  
Thomas Monjalon March 28, 2019, 12:21 a.m. UTC | #3

Applied, thanks
  

Patch

diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
index ea7dbe5..953cdbb 100644
--- a/lib/librte_ring/rte_ring_generic.h
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -158,11 +158,14 @@  __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
 			return 0;
 
 		*new_head = *old_head + n;
-		if (is_sc)
-			r->cons.head = *new_head, success = 1;
-		else
+		if (is_sc) {
+			r->cons.head = *new_head;
+			rte_smp_rmb();
+			success = 1;
+		} else {
 			success = rte_atomic32_cmpset(&r->cons.head, *old_head,
 					*new_head);
+		}
 	} while (unlikely(success == 0));
 	return n;
 }