[dpdk-dev,v5,18/19] ring: add sched_yield to avoid spin forever

Message ID 1423728996-3004-19-git-send-email-cunming.liang@intel.com (mailing list archive)
State Superseded, archived
Headers

Commit Message

Cunming Liang Feb. 12, 2015, 8:16 a.m. UTC
  Add a sched_yield() syscall if the thread spins for too long, waiting other thread to finish its operations on the ring.
That gives pre-empted thread a chance to proceed and finish with ring enqnue/dequeue operation.
The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't shows additional perf penalty.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   add RTE_RING_PAUSE_REP to config file

 v4 changes:
   update and add more comments on sched_yield()

 v3 changes:
   new patch adding sched_yield() in rte_ring to avoid long spin

 config/common_bsdapp       |  1 +
 config/common_linuxapp     |  1 +
 lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
 3 files changed, 29 insertions(+), 4 deletions(-)
  

Comments

Olivier Matz Feb. 12, 2015, 11:16 a.m. UTC | #1
Hi,

On 02/12/2015 09:16 AM, Cunming Liang wrote:
> Add a sched_yield() syscall if the thread spins for too long, waiting other thread to finish its operations on the ring.
> That gives pre-empted thread a chance to proceed and finish with ring enqnue/dequeue operation.
> The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't shows additional perf penalty.
>
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>   v5 changes:
>     add RTE_RING_PAUSE_REP to config file
>
>   v4 changes:
>     update and add more comments on sched_yield()
>
>   v3 changes:
>     new patch adding sched_yield() in rte_ring to avoid long spin
>
>   config/common_bsdapp       |  1 +
>   config/common_linuxapp     |  1 +
>   lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
>   3 files changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/config/common_bsdapp b/config/common_bsdapp
> index 57bacb8..52c5143 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
>   CONFIG_RTE_LIBRTE_RING=y
>   CONFIG_RTE_LIBRTE_RING_DEBUG=n
>   CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> +CONFIG_RTE_RING_PAUSE_REP=n

Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
If I understand well, it has to be set to an integer value to
enable it, am I correct?

Thanks,
Olivier
  
Cunming Liang Feb. 12, 2015, 1:05 p.m. UTC | #2
Hi,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Thursday, February 12, 2015 7:16 PM
> To: Liang, Cunming; dev@dpdk.org
> Cc: Ananyev, Konstantin
> Subject: Re: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> 
> Hi,
> 
> On 02/12/2015 09:16 AM, Cunming Liang wrote:
> > Add a sched_yield() syscall if the thread spins for too long, waiting other thread
> to finish its operations on the ring.
> > That gives pre-empted thread a chance to proceed and finish with ring
> enqnue/dequeue operation.
> > The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't
> shows additional perf penalty.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >   v5 changes:
> >     add RTE_RING_PAUSE_REP to config file
> >
> >   v4 changes:
> >     update and add more comments on sched_yield()
> >
> >   v3 changes:
> >     new patch adding sched_yield() in rte_ring to avoid long spin
> >
> >   config/common_bsdapp       |  1 +
> >   config/common_linuxapp     |  1 +
> >   lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
> >   3 files changed, 29 insertions(+), 4 deletions(-)
> >
> > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > index 57bacb8..52c5143 100644
> > --- a/config/common_bsdapp
> > +++ b/config/common_bsdapp
> > @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
> >   CONFIG_RTE_LIBRTE_RING=y
> >   CONFIG_RTE_LIBRTE_RING_DEBUG=n
> >   CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > +CONFIG_RTE_RING_PAUSE_REP=n
> 
> Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
> If I understand well, it has to be set to an integer value to
> enable it, am I correct?
[LCM] If RTE_RING_PAUSE_REP=N (no define), by default will use 0. If it's set to 'y'(=1), will issue yield in the most frequent rate.
It also can set as integer to assign any number. All cases works for this configure.
One point is in configure file, just demonstrate the default way to use it.
It can't prevent to use anything unexpected. Except we rule the n & y illegal for this option.
The meaningful value of it can write in the doc.
> 
> Thanks,
> Olivier
  
Ananyev, Konstantin Feb. 12, 2015, 1:08 p.m. UTC | #3
> -----Original Message-----
> From: Liang, Cunming
> Sent: Thursday, February 12, 2015 1:05 PM
> To: Olivier MATZ; dev@dpdk.org
> Cc: Ananyev, Konstantin
> Subject: RE: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> 
> Hi,
> 
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, February 12, 2015 7:16 PM
> > To: Liang, Cunming; dev@dpdk.org
> > Cc: Ananyev, Konstantin
> > Subject: Re: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> >
> > Hi,
> >
> > On 02/12/2015 09:16 AM, Cunming Liang wrote:
> > > Add a sched_yield() syscall if the thread spins for too long, waiting other thread
> > to finish its operations on the ring.
> > > That gives pre-empted thread a chance to proceed and finish with ring
> > enqnue/dequeue operation.
> > > The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't
> > shows additional perf penalty.
> > >
> > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > ---
> > >   v5 changes:
> > >     add RTE_RING_PAUSE_REP to config file
> > >
> > >   v4 changes:
> > >     update and add more comments on sched_yield()
> > >
> > >   v3 changes:
> > >     new patch adding sched_yield() in rte_ring to avoid long spin
> > >
> > >   config/common_bsdapp       |  1 +
> > >   config/common_linuxapp     |  1 +
> > >   lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
> > >   3 files changed, 29 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > index 57bacb8..52c5143 100644
> > > --- a/config/common_bsdapp
> > > +++ b/config/common_bsdapp
> > > @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
> > >   CONFIG_RTE_LIBRTE_RING=y
> > >   CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > >   CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > > +CONFIG_RTE_RING_PAUSE_REP=n
> >
> > Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
> > If I understand well, it has to be set to an integer value to
> > enable it, am I correct?
> [LCM] If RTE_RING_PAUSE_REP=N (no define), by default will use 0. If it's set to 'y'(=1), will issue yield in the most frequent rate.
> It also can set as integer to assign any number. All cases works for this configure.
> One point is in configure file, just demonstrate the default way to use it.
> It can't prevent to use anything unexpected. Except we rule the n & y illegal for this option.
> The meaningful value of it can write in the doc.

I also think that it is better avoid 'y' and 'n' (though as you said it would work), but use integers for all cases.
Less confusion for the users.
Konstantin 

> >
> > Thanks,
> > Olivier
  
Bruce Richardson Feb. 12, 2015, 1:11 p.m. UTC | #4
On Thu, Feb 12, 2015 at 01:08:43PM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Liang, Cunming
> > Sent: Thursday, February 12, 2015 1:05 PM
> > To: Olivier MATZ; dev@dpdk.org
> > Cc: Ananyev, Konstantin
> > Subject: RE: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> > 
> > Hi,
> > 
> > > -----Original Message-----
> > > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > > Sent: Thursday, February 12, 2015 7:16 PM
> > > To: Liang, Cunming; dev@dpdk.org
> > > Cc: Ananyev, Konstantin
> > > Subject: Re: [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
> > >
> > > Hi,
> > >
> > > On 02/12/2015 09:16 AM, Cunming Liang wrote:
> > > > Add a sched_yield() syscall if the thread spins for too long, waiting other thread
> > > to finish its operations on the ring.
> > > > That gives pre-empted thread a chance to proceed and finish with ring
> > > enqnue/dequeue operation.
> > > > The purpose is to reduce contention on the ring. By ring_perf_test, it doesn't
> > > shows additional perf penalty.
> > > >
> > > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > > ---
> > > >   v5 changes:
> > > >     add RTE_RING_PAUSE_REP to config file
> > > >
> > > >   v4 changes:
> > > >     update and add more comments on sched_yield()
> > > >
> > > >   v3 changes:
> > > >     new patch adding sched_yield() in rte_ring to avoid long spin
> > > >
> > > >   config/common_bsdapp       |  1 +
> > > >   config/common_linuxapp     |  1 +
> > > >   lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
> > > >   3 files changed, 29 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > > index 57bacb8..52c5143 100644
> > > > --- a/config/common_bsdapp
> > > > +++ b/config/common_bsdapp
> > > > @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
> > > >   CONFIG_RTE_LIBRTE_RING=y
> > > >   CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > > >   CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > > > +CONFIG_RTE_RING_PAUSE_REP=n
> > >
> > > Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
> > > If I understand well, it has to be set to an integer value to
> > > enable it, am I correct?
> > [LCM] If RTE_RING_PAUSE_REP=N (no define), by default will use 0. If it's set to 'y'(=1), will issue yield in the most frequent rate.
> > It also can set as integer to assign any number. All cases works for this configure.
> > One point is in configure file, just demonstrate the default way to use it.
> > It can't prevent to use anything unexpected. Except we rule the n & y illegal for this option.
> > The meaningful value of it can write in the doc.
> 
> I also think that it is better avoid 'y' and 'n' (though as you said it would work), but use integers for all cases.
> Less confusion for the users.
> Konstantin 
> 
+1
Also expand "REP" to "REP_COUNT" to make it clear that it's a numeric value.

/Bruce
  

Patch

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..52c5143 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -234,6 +234,7 @@  CONFIG_RTE_PMD_PACKET_PREFETCH=y
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP=n
 
 #
 # Compile librte_mempool
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..0b4eb3c 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -242,6 +242,7 @@  CONFIG_RTE_PMD_PACKET_PREFETCH=y
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP=n
 
 #
 # Compile librte_mempool
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 553a880..eecba0b 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -127,6 +127,11 @@  struct rte_ring_debug_stats {
 #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
 #define RTE_RING_MZ_PREFIX "RG_"
 
+#ifndef RTE_RING_PAUSE_REP
+#define RTE_RING_PAUSE_REP 0 /**< yield after pause num of times,
+			      * no yield if RTE_RING_PAUSE_REP not defined. */
+#endif
+
 /**
  * An RTE ring structure.
  *
@@ -410,7 +415,7 @@  __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep = 0;
 	uint32_t mask = r->prod.mask;
 	int ret;
 
@@ -468,9 +473,18 @@  __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head))
+	while (unlikely(r->prod.tail != prod_head)) {
 		rte_pause();
 
+		/* Set RTE_RING_PAUSE_REP to avoid spin too long waiting for
+		 * other thread finish. It gives pre-empted thread a chance
+		 * to proceed and finish with ring denqnue operation. */
+		if (RTE_RING_PAUSE_REP &&
+		    ++rep == RTE_RING_PAUSE_REP) {
+			rep = 0;
+			sched_yield();
+		}
+	}
 	r->prod.tail = prod_next;
 	return ret;
 }
@@ -589,7 +603,7 @@  __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep = 0;
 	uint32_t mask = r->prod.mask;
 
 	/* move cons.head atomically */
@@ -634,9 +648,18 @@  __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head))
+	while (unlikely(r->cons.tail != cons_head)) {
 		rte_pause();
 
+		/* Set RTE_RING_PAUSE_REP to avoid spin too long waiting for
+		 * other thread finish. It gives pre-empted thread a chance
+		 * to proceed and finish with ring denqnue operation. */
+		if (RTE_RING_PAUSE_REP &&
+		    ++rep == RTE_RING_PAUSE_REP) {
+			rep = 0;
+			sched_yield();
+		}
+	}
 	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;