[dpdk-dev] Improve the shaper accuracy for large packets

Message ID 1518172716-7649-1-git-send-email-alanrobertsonatt@gmail.com (mailing list archive)
State Rejected, archived
Delegated to: Cristian Dumitrescu
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Alan Robertson Feb. 9, 2018, 10:38 a.m. UTC
  From: Alan Robertson <ar771e@att.com>

There were 2 issues, the first was time could be lost whilst updating
the traffic-class period, the second was a frame could be delayed if
not enough tokens were available for the full frame.  By allowing the
shaper to borrow credit from the next period the throughput is improved.

Signed-off-by: Alan Robertson <alan.robertson@att.com>
---
 lib/librte_sched/rte_sched.c | 60 +++++++++++++++++++++++++++++++-------------
 1 file changed, 43 insertions(+), 17 deletions(-)
  

Comments

Ferruh Yigit April 5, 2019, 3:54 p.m. UTC | #1
On 2/9/2018 10:38 AM, alanrobertsonatt@gmail.com wrote:
> From: Alan Robertson <ar771e@att.com>
> 
> There were 2 issues, the first was time could be lost whilst updating
> the traffic-class period, the second was a frame could be delayed if
> not enough tokens were available for the full frame.  By allowing the
> shaper to borrow credit from the next period the throughput is improved.
> 
> Signed-off-by: Alan Robertson <alan.robertson@att.com>


Hi Cristian,

Another 'librte_sched' patch waiting for a review for almost a year now, do you
know status of this patch?

Thanks,
ferruh
  
Cristian Dumitrescu April 8, 2019, 1:36 p.m. UTC | #2
> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, April 5, 2019 4:54 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Cc: alanrobertsonatt@gmail.com; dev@dpdk.org; Alan Robertson
> <ar771e@att.com>; Alan Robertson <alan.robertson@att.com>; Thomas
> Monjalon <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH] Improve the shaper accuracy for large
> packets
> 
> On 2/9/2018 10:38 AM, alanrobertsonatt@gmail.com wrote:
> > From: Alan Robertson <ar771e@att.com>
> >
> > There were 2 issues, the first was time could be lost whilst updating
> > the traffic-class period, the second was a frame could be delayed if
> > not enough tokens were available for the full frame.  By allowing the
> > shaper to borrow credit from the next period the throughput is improved.
> >
> > Signed-off-by: Alan Robertson <alan.robertson@att.com>
> 
> 
> Hi Cristian,
> 
> Another 'librte_sched' patch waiting for a review for almost a year now, do
> you
> know status of this patch?
> 
> Thanks,
> ferruh

As previously announced, we are planning to do some improvements on this part of the code [1] and we are planning improve this behavior in a way that is less intrusive and has slightly better performance (due to no branches).

Regards,
Cristian

[1] https://mails.dpdk.org/archives/dev/2018-November/120035.html
  
Stephen Hemminger June 11, 2023, 2:45 a.m. UTC | #3
On Mon, 8 Apr 2019 13:36:14 +0000
"Dumitrescu, Cristian" <cristian.dumitrescu@intel.com> wrote:

> > -----Original Message-----
> > From: Yigit, Ferruh
> > Sent: Friday, April 5, 2019 4:54 PM
> > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> > Cc: alanrobertsonatt@gmail.com; dev@dpdk.org; Alan Robertson
> > <ar771e@att.com>; Alan Robertson <alan.robertson@att.com>; Thomas
> > Monjalon <thomas@monjalon.net>
> > Subject: Re: [dpdk-dev] [PATCH] Improve the shaper accuracy for large
> > packets
> > 
> > On 2/9/2018 10:38 AM, alanrobertsonatt@gmail.com wrote:  
> > > From: Alan Robertson <ar771e@att.com>
> > >
> > > There were 2 issues, the first was time could be lost whilst updating
> > > the traffic-class period, the second was a frame could be delayed if
> > > not enough tokens were available for the full frame.  By allowing the
> > > shaper to borrow credit from the next period the throughput is improved.
> > >
> > > Signed-off-by: Alan Robertson <alan.robertson@att.com>  
> > 
> > 
> > Hi Cristian,
> > 
> > Another 'librte_sched' patch waiting for a review for almost a year now, do
> > you
> > know status of this patch?
> > 
> > Thanks,
> > ferruh  
> 
> As previously announced, we are planning to do some improvements on this part of the code [1] and we are planning improve this behavior in a way that is less intrusive and has slightly better performance (due to no branches).
> 
> Regards,
> Cristian
> 
> [1] https://mails.dpdk.org/archives/dev/2018-November/120035.html
> 

This patch is quite old. It looks the original issue was that rte_sched_port_dequeue
is racy if port is shared by multiple lcore's.  The only update to port->time
happens in rte_sched_port_time_resync() frin rte_sched_port_dequeue().

The warnings about thread safety in qos_framework.rst could be improved, but that
is something for another patch in future.
  

Patch

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 634486c..7b06b0b 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -57,7 +57,7 @@  struct rte_sched_subport {
 	/* Traffic classes (TCs) */
 	uint64_t tc_time; /* time of next update */
 	uint32_t tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	int32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint32_t tc_period;
 
 	/* TC oversubscription */
@@ -98,7 +98,7 @@  struct rte_sched_pipe {
 
 	/* Traffic classes (TCs) */
 	uint64_t tc_time; /* time of next update */
-	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	int32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 
 	/* Weighted Round Robin (WRR) */
 	uint8_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
@@ -1451,6 +1451,8 @@  grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t tc;
+	uint64_t lapsed;
 
 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1466,20 +1468,44 @@  grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 
 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
-		subport->tc_time = port->time + subport->tc_period;
+		for (tc = 0; tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc++) {
+			if (subport->tc_credits[tc] < 0)
+				subport->tc_credits[tc] +=
+					subport->tc_credits_per_period[tc];
+			else
+				subport->tc_credits[tc] =
+					subport->tc_credits_per_period[tc];
+		}
+		/* If we've run into the next period only update the clock to
+		 * the time + tc_period so we'll replenish the tc tokens early
+		 * in the next tc_period to compensate.
+		 */
+		lapsed = port->time - subport->tc_time;
+		if (lapsed < subport->tc_period)
+			subport->tc_time += subport->tc_period;
+		else
+			subport->tc_time = port->time + subport->tc_period;
 	}
 
 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
-		pipe->tc_time = port->time + params->tc_period;
+		for (tc = 0; tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc++) {
+			if (pipe->tc_credits[tc] < 0)
+				pipe->tc_credits[tc] +=
+					params->tc_credits_per_period[tc];
+			else
+				pipe->tc_credits[tc] =
+					params->tc_credits_per_period[tc];
+		}
+		/* If we've run into the next period only update the clock to
+		 * the time + tc_period so we'll replenish the tc tokens early
+		 * in the next tc_period to compensate.
+		 */
+		lapsed = port->time - pipe->tc_time;
+		if (lapsed < params->tc_period)
+			pipe->tc_time += params->tc_period;
+		else
+			pipe->tc_time = port->time + params->tc_period;
 	}
 }
 
@@ -1586,16 +1612,16 @@  grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	uint32_t tc_index = grinder->tc_index;
 	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
 	uint32_t subport_tb_credits = subport->tb_credits;
-	uint32_t subport_tc_credits = subport->tc_credits[tc_index];
+	int32_t subport_tc_credits = subport->tc_credits[tc_index];
 	uint32_t pipe_tb_credits = pipe->tb_credits;
-	uint32_t pipe_tc_credits = pipe->tc_credits[tc_index];
+	int32_t pipe_tc_credits = pipe->tc_credits[tc_index];
 	int enough_credits;
 
 	/* Check queue credits */
 	enough_credits = (pkt_len <= subport_tb_credits) &&
-		(pkt_len <= subport_tc_credits) &&
+		(subport_tc_credits > 0) &&
 		(pkt_len <= pipe_tb_credits) &&
-		(pkt_len <= pipe_tc_credits);
+		(pipe_tc_credits > 0);
 
 	if (!enough_credits)
 		return 0;
@@ -1603,8 +1629,8 @@  grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	/* Update port credits */
 	subport->tb_credits -= pkt_len;
 	subport->tc_credits[tc_index] -= pkt_len;
-	pipe->tb_credits -= pkt_len;
 	pipe->tc_credits[tc_index] -= pkt_len;
+	pipe->tb_credits -= pkt_len;
 
 	return 1;
 }