From patchwork Tue Nov  7 08:34:30 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jia He <hejianet@gmail.com>
X-Patchwork-Id: 31233
To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Cc: dev@dpdk.org, olivier.matz@6wind.com, konstantin.ananyev@intel.com,
	bruce.richardson@intel.com, jianbo.liu@arm.com, hemant.agrawal@nxp.com,
	jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com,
	jia.he@hxt-semitech.com
References: <1509612210-5499-1-git-send-email-hejianet@gmail.com>
	<20171102172337.GB1478@jerin>
	<25192429-8369-ac3d-44b0-c1b1d7182ef0@gmail.com>
	<20171103125616.GB20326@jerin>
	<7b7f3677-8313-9a2f-868f-b3a6231548d6@gmail.com>
	<20171107043655.GA3244@jerin>
From: Jia He <hejianet@gmail.com>
Message-ID: <c2ce8774-a1b6-edf6-444e-ee0981df7497@gmail.com>
Date: Tue, 7 Nov 2017 16:34:30 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
	Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <20171107043655.GA3244@jerin>
Subject: Re: [dpdk-dev] [PATCH v2] ring: guarantee ordering of cons/prod
	loading when doing

On 11/7/2017 12:36 PM, Jerin Jacob Wrote:
> -----Original Message-----
>
> One option could be to change the prototype of update_tail() and let the
> compiler accommodate it at zero cost for arm64 (which I think is the
> case, but you can check the generated instructions).
> If not, move __rte_ring_do_dequeue() and __rte_ring_do_enqueue() instead of
> __rte_ring_move_prod_head/__rte_ring_move_cons_head/update_tail()
>
>
> ➜ [master][dpdk.org] $ git diff
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 5e9b3b7b4..b32648825 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -358,8 +358,12 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
>   
>   static __rte_always_inline void
>   update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
> -               uint32_t single)
> +               uint32_t single, const uint32_t enqueue)
>   {
> +       if (enqueue)
> +               rte_smp_wmb();
> +       else
> +               rte_smp_rmb();
>          /*
>           * If there are other enqueues/dequeues in progress that preceded us,
>           * we need to wait for them to complete
> @@ -470,9 +474,8 @@ __rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
>                  goto end;
>   
>          ENQUEUE_PTRS(r, &r[1], prod_head, obj_table, n, void *);
> -       rte_smp_wmb();
>   
> -       update_tail(&r->prod, prod_head, prod_next, is_sp);
> +       update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
>   end:
>          if (free_space != NULL)
>                  *free_space = free_entries - n;
> @@ -575,9 +578,8 @@ __rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
>                  goto end;
>   
>          DEQUEUE_PTRS(r, &r[1], cons_head, obj_table, n, void *);
> -       rte_smp_rmb();
>   
> -       update_tail(&r->cons, cons_head, cons_next, is_sc);
> +       update_tail(&r->cons, cons_head, cons_next, is_sc, 0);
>   
>   end:
>          if (available != NULL)
>
>
>
Hi Jerin, yes, I understood this suggestion for update_tail().
But what I mean is the rte_smp_rmb() in __rte_ring_move_cons_head and 
__rte_ring_move_prod_head:
[option 1]
+        *old_head = r->cons.head;
+        rte_smp_rmb();
+        const uint32_t prod_tail = r->prod.tail;

[option 2]
+        *old_head = __atomic_load_n(&r->cons.head,
+                    __ATOMIC_ACQUIRE);
+        const uint32_t prod_tail = r->prod.tail;

i.e., I wonder what a suitable new config name would be to distinguish the 
two options above?
Thanks for the patience :-)
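
To make the difference between the two options concrete, here is a minimal standalone sketch of the consumer-side load sequence. It is only an illustration under stated assumptions, not the patch itself: struct ht_sketch and the cons_entries_*() helpers are hypothetical names, the acquire fence is an assumed stand-in for rte_smp_rmb(), and the GCC/Clang __atomic builtins are required. The self-check runs single-threaded, so it only confirms that both variants compute the same entry count (including across a 32-bit index wrap); it does not exercise the ordering guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_ring_headtail (illustration only). */
struct ht_sketch {
	volatile uint32_t head;
	volatile uint32_t tail;
};

/*
 * Option 1: plain load of cons.head followed by a load/load barrier.
 * The acquire fence below is an assumed stand-in for rte_smp_rmb().
 */
static inline uint32_t
cons_entries_rmb(struct ht_sketch *cons, struct ht_sketch *prod,
		uint32_t *old_head)
{
	*old_head = cons->head;
	__atomic_thread_fence(__ATOMIC_ACQUIRE);
	const uint32_t prod_tail = prod->tail;

	/* Unsigned subtraction wraps modulo 2^32. */
	return prod_tail - *old_head;
}

/*
 * Option 2: load-acquire of cons.head. The later load of prod.tail
 * may not be reordered before the acquire load.
 */
static inline uint32_t
cons_entries_acquire(struct ht_sketch *cons, struct ht_sketch *prod,
		uint32_t *old_head)
{
	*old_head = __atomic_load_n(&cons->head, __ATOMIC_ACQUIRE);
	const uint32_t prod_tail = prod->tail;

	return prod_tail - *old_head;
}

/* Single-threaded check: returns 1 if both variants agree on 5 entries. */
int
sketch_self_check(void)
{
	struct ht_sketch cons = { .head = 0xfffffffeu, .tail = 0xfffffffeu };
	struct ht_sketch prod = { .head = 3, .tail = 3 };
	uint32_t h1, h2;

	uint32_t e1 = cons_entries_rmb(&cons, &prod, &h1);
	uint32_t e2 = cons_entries_acquire(&cons, &prod, &h2);

	assert(h1 == h2);
	return e1 == 5 && e2 == 5;
}
```

Comparing the code generated for the two helpers on arm64 should show whether the barrier or the load-acquire form is cheaper on a given core.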

See my draft patch below; note the macro "PREFER":
+        *available = entries - n;
+    return n;
+}
+
+#endif /* _RTE_RING_GENERIC_H_ */
+

diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
new file mode 100644
index 0000000..22fe887
--- /dev/null
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -0,0 +1,305 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 hxt-semitech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of hxt-semitech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_RING_C11_MEM_H_
+#define _RTE_RING_C11_MEM_H_
+
+static __rte_always_inline void
+update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
+        uint32_t single, uint32_t enqueue)
+{
+    /* Don't need wmb/rmb when we prefer to use load_acquire/
+     * store_release barrier */
+#ifndef PREFER
+    if (enqueue)
+        rte_smp_wmb();
+    else
+        rte_smp_rmb();
+#endif
+
+    /*
+     * If there are other enqueues/dequeues in progress that preceded us,
+     * we need to wait for them to complete
+     */
+    if (!single)
+        while (unlikely(ht->tail != old_val))
+            rte_pause();
+
+#ifdef PREFER
+    __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
+#else
+    ht->tail = new_val;
+#endif
+}
+
+/**
+ * @internal This function updates the producer head for enqueue
+ *
+ * @param r
+ *   A pointer to the ring structure
+ * @param is_sp
+ *   Indicates whether multi-producer path is needed or not
+ * @param n
+ *   The number of elements we will want to enqueue, i.e. how far should the
+ *   head be moved
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items to a ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible to the ring
+ * @param old_head
+ *   Returns head value as it was before the move, i.e. where enqueue starts
+ * @param new_head
+ *   Returns the current/new head value i.e. where enqueue finishes
+ * @param free_entries
+ *   Returns the amount of free space in the ring BEFORE head was moved
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
+        unsigned int n, enum rte_ring_queue_behavior behavior,
+        uint32_t *old_head, uint32_t *new_head,
+        uint32_t *free_entries)
+{
+    const uint32_t capacity = r->capacity;
+    unsigned int max = n;
+    int success;
+
+    do {
+        /* Reset n to the initial burst count */
+        n = max;
+
+#ifdef PREFER
+        *old_head = __atomic_load_n(&r->prod.head,
+                    __ATOMIC_ACQUIRE);
+#else
+        *old_head = r->prod.head;
+        /* prevent reorder of load/load */
+        rte_smp_rmb();
+#endif
+        const uint32_t cons_tail = r->cons.tail;
+        /*
+         *  The subtraction is done between two unsigned 32bits value
+         * (the result is always modulo 32 bits even if we have
+         * *old_head > cons_tail). So 'free_entries' is always between 0
+         * and capacity (which is < size).
+         */
+        *free_entries = (capacity + cons_tail - *old_head);
+
+        /* check that we have enough room in ring */
+        if (unlikely(n > *free_entries))
+            n = (behavior == RTE_RING_QUEUE_FIXED) ?
+                    0 : *free_entries;
+
+        if (n == 0)
+            return 0;
+
+        *new_head = *old_head + n;
+        if (is_sp)
+            r->prod.head = *new_head, success = 1;
+        else
+#ifdef PREFER
+            success = arch_rte_atomic32_cmpset(&r->prod.head,
+                    old_head, *new_head,
+                    0, __ATOMIC_ACQUIRE,
+                    __ATOMIC_RELAXED);
+#else
+            success = rte_atomic32_cmpset(&r->prod.head,
+                    *old_head, *new_head);
+#endif
+    } while (unlikely(success == 0));
+    return n;
+}
+
+/**
+ * @internal Enqueue several objects on the ring
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items to a ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible to the ring
+ * @param is_sp
+ *   Indicates whether to use single producer or multi-producer head update
+ * @param free_space
+ *   returns the amount of space after the enqueue operation has finished
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
+         unsigned int n, enum rte_ring_queue_behavior behavior,
+         int is_sp, unsigned int *free_space)
+{
+    uint32_t prod_head, prod_next;
+    uint32_t free_entries;
+
+    n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
+            &prod_head, &prod_next, &free_entries);
+    if (n == 0)
+        goto end;
+
+    ENQUEUE_PTRS(r, &r[1], prod_head, obj_table, n, void *);
+
+    update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
+end:
+    if (free_space != NULL)
+        *free_space = free_entries - n;
+    return n;
+}
+
+/**
+ * @internal This function updates the consumer head for dequeue
+ *
+ * @param r
+ *   A pointer to the ring structure
+ * @param is_sc
+ *   Indicates whether multi-consumer path is needed or not
+ * @param n
+ *   The number of elements we will want to dequeue, i.e. how far should the
+ *   head be moved
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param old_head
+ *   Returns head value as it was before the move, i.e. where dequeue starts
+ * @param new_head
+ *   Returns the current/new head value i.e. where dequeue finishes
+ * @param entries
+ *   Returns the number of entries in the ring BEFORE head was moved
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
+        unsigned int n, enum rte_ring_queue_behavior behavior,
+        uint32_t *old_head, uint32_t *new_head,
+        uint32_t *entries)
+{
+    unsigned int max = n;
+    int success;
+
+    /* move cons.head atomically */
+    do {
+        /* Restore n as it may change every loop */
+        n = max;
+#ifdef PREFER
+        *old_head = __atomic_load_n(&r->cons.head,
+                    __ATOMIC_ACQUIRE);
+#else
+        *old_head = r->cons.head;
+        /*  prevent reorder of load/load */
+        rte_smp_rmb();
+#endif
+
+        const uint32_t prod_tail = r->prod.tail;
+        /* The subtraction is done between two unsigned 32bits value
+         * (the result is always modulo 32 bits even if we have
+         * cons_head > prod_tail). So 'entries' is always between 0
+         * and size(ring)-1. */
+        *entries = (prod_tail - *old_head);
+
+        /* Set the actual entries for dequeue */
+        if (n > *entries)
+            n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
+
+        if (unlikely(n == 0))
+            return 0;
+
+        *new_head = *old_head + n;
+        if (is_sc)
+            r->cons.head = *new_head, success = 1;
+        else
+#ifdef PREFER
+            success = arch_rte_atomic32_cmpset(&r->cons.head,
+                            old_head, *new_head,
+                            0, __ATOMIC_ACQUIRE,
+                            __ATOMIC_RELAXED);
+#else
+            success = rte_atomic32_cmpset(&r->cons.head, *old_head,
+                    *new_head);
+#endif
+    } while (unlikely(success == 0));
+    return n;
+}
+
+/**
+ * @internal Dequeue several objects from the ring
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to pull from the ring.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param is_sc
+ *   Indicates whether to use single consumer or multi-consumer head update
+ * @param available
+ *   returns the number of remaining ring entries after the dequeue has finished
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
+         unsigned int n, enum rte_ring_queue_behavior behavior,
+         int is_sc, unsigned int *available)
+{
+    uint32_t cons_head, cons_next;
+    uint32_t entries;
+
+    n = __rte_ring_move_cons_head(r, is_sc, n, behavior,
+            &cons_head, &cons_next, &entries);
+    if (n == 0)
+        goto end;
+
+    DEQUEUE_PTRS(r, &r[1], cons_head, obj_table, n, void *);
+
+    update_tail(&r->cons, cons_head, cons_next, is_sc, 0);
+
+end:
+    if (available != NULL)
+        *available = entries - n;
+    return n;
+}
+
+#endif /* _RTE_RING_C11_MEM_H_ */
+
diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
new file mode 100644
index 0000000..0ce6d57
--- /dev/null
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -0,0 +1,268 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 hxt-semitech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of hxt-semitech nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_RING_GENERIC_H_
+#define _RTE_RING_GENERIC_H_
+
+static __rte_always_inline void
+update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
+        uint32_t single, uint32_t enqueue)
+{
+    if (enqueue)
+        rte_smp_wmb();
+    else
+        rte_smp_rmb();
+    /*
+     * If there are other enqueues/dequeues in progress that preceded us,
+     * we need to wait for them to complete
+     */
+    if (!single)
+        while (unlikely(ht->tail != old_val))
+            rte_pause();
+
+    ht->tail = new_val;
+}
+
+/**
+ * @internal This function updates the producer head for enqueue
+ *
+ * @param r
+ *   A pointer to the ring structure
+ * @param is_sp
+ *   Indicates whether multi-producer path is needed or not
+ * @param n
+ *   The number of elements we will want to enqueue, i.e. how far should the
+ *   head be moved
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items to a ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible to the ring
+ * @param old_head
+ *   Returns head value as it was before the move, i.e. where enqueue starts
+ * @param new_head
+ *   Returns the current/new head value i.e. where enqueue finishes
+ * @param free_entries
+ *   Returns the amount of free space in the ring BEFORE head was moved
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
+        unsigned int n, enum rte_ring_queue_behavior behavior,
+        uint32_t *old_head, uint32_t *new_head,
+        uint32_t *free_entries)
+{
+    const uint32_t capacity = r->capacity;
+    unsigned int max = n;
+    int success;
+
+    do {
+        /* Reset n to the initial burst count */
+        n = max;
+
+        *old_head = r->prod.head;
+        const uint32_t cons_tail = r->cons.tail;
+        /*
+         *  The subtraction is done between two unsigned 32bits value
+         * (the result is always modulo 32 bits even if we have
+         * *old_head > cons_tail). So 'free_entries' is always between 0
+         * and capacity (which is < size).
+         */
+        *free_entries = (capacity + cons_tail - *old_head);
+
+        /* check that we have enough room in ring */
+        if (unlikely(n > *free_entries))
+            n = (behavior == RTE_RING_QUEUE_FIXED) ?
+                    0 : *free_entries;
+
+        if (n == 0)
+            return 0;
+
+        *new_head = *old_head + n;
+        if (is_sp)
+            r->prod.head = *new_head, success = 1;
+        else
+            success = rte_atomic32_cmpset(&r->prod.head,
+                    *old_head, *new_head);
+    } while (unlikely(success == 0));
+    return n;
+}
+
+/**
+ * @internal Enqueue several objects on the ring
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items to a ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible to the ring
+ * @param is_sp
+ *   Indicates whether to use single producer or multi-producer head update
+ * @param free_space
+ *   returns the amount of space after the enqueue operation has finished
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
+         unsigned int n, enum rte_ring_queue_behavior behavior,
+         int is_sp, unsigned int *free_space)
+{
+    uint32_t prod_head, prod_next;
+    uint32_t free_entries;
+
+    n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
+            &prod_head, &prod_next, &free_entries);
+    if (n == 0)
+        goto end;
+
+    ENQUEUE_PTRS(r, &r[1], prod_head, obj_table, n, void *);
+
+    update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
+end:
+    if (free_space != NULL)
+        *free_space = free_entries - n;
+    return n;
+}
+
+/**
+ * @internal This function updates the consumer head for dequeue
+ *
+ * @param r
+ *   A pointer to the ring structure
+ * @param is_sc
+ *   Indicates whether multi-consumer path is needed or not
+ * @param n
+ *   The number of elements we will want to dequeue, i.e. how far should the
+ *   head be moved
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param old_head
+ *   Returns head value as it was before the move, i.e. where dequeue starts
+ * @param new_head
+ *   Returns the current/new head value i.e. where dequeue finishes
+ * @param entries
+ *   Returns the number of entries in the ring BEFORE head was moved
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
+        unsigned int n, enum rte_ring_queue_behavior behavior,
+        uint32_t *old_head, uint32_t *new_head,
+        uint32_t *entries)
+{
+    unsigned int max = n;
+    int success;
+
+    /* move cons.head atomically */
+    do {
+        /* Restore n as it may change every loop */
+        n = max;
+
+        *old_head = r->cons.head;
+        const uint32_t prod_tail = r->prod.tail;
+        /* The subtraction is done between two unsigned 32bits value
+         * (the result is always modulo 32 bits even if we have
+         * cons_head > prod_tail). So 'entries' is always between 0
+         * and size(ring)-1. */
+        *entries = (prod_tail - *old_head);
+
+        /* Set the actual entries for dequeue */
+        if (n > *entries)
+            n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
+
+        if (unlikely(n == 0))
+            return 0;
+
+        *new_head = *old_head + n;
+        if (is_sc)
+            r->cons.head = *new_head, success = 1;
+        else
+            success = rte_atomic32_cmpset(&r->cons.head, *old_head,
+                    *new_head);
+    } while (unlikely(success == 0));
+    return n;
+}
+
+/**
+ * @internal Dequeue several objects from the ring
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to pull from the ring.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param is_sc
+ *   Indicates whether to use single consumer or multi-consumer head update
+ * @param available
+ *   returns the number of remaining ring entries after the dequeue has finished
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
+         unsigned int n, enum rte_ring_queue_behavior behavior,
+         int is_sc, unsigned int *available)
+{
+    uint32_t cons_head, cons_next;
+    uint32_t entries;
+
+    n = __rte_ring_move_cons_head(r, is_sc, n, behavior,
+            &cons_head, &cons_next, &entries);
+    if (n == 0)
+        goto end;
+
+    DEQUEUE_PTRS(r, &r[1], cons_head, obj_table, n, void *);
+
+    update_tail(&r->cons, cons_head, cons_next, is_sc, 0);
+
+end:
+    if (available != NULL)