Message ID | CAHVfvh7ggGB_q1Rs1c3-9PRwDr_GKA+etaMXRSeKCfUKoUx8hQ@mail.gmail.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | From: jigsaw <jigsaw@gmail.com>; To: Bruce Richardson <bruce.richardson@intel.com>; Cc: "dev@dpdk.org" <dev@dpdk.org>; Date: Fri, 7 Nov 2014 16:31:18 +0200; Subject: Re: [dpdk-dev] Reply: [PATCH] Add user defined tag calculation callback tolibrte_distributor.; In-Reply-To: <20141107135303.GB12092@bricha3-MOBL3>; List-Id: patches and discussions about DPDK <dev.dpdk.org> |
Commit Message
Qinglai Xiao
Nov. 7, 2014, 2:31 p.m. UTC
Hi Bruce,

Please have a quick look at the diff to see if this is exactly what you mean
about the bitmask.
I just wrote it without even compiling, just to express the idea, so it may
leave some places unpatched.
If this is agreed, I will write a proper test to verify it before sending
the patch as an RFC.

 	union rte_distributor_buffer bufs[RTE_MAX_LCORE];
@@ -188,6 +190,7 @@ static inline void
 handle_worker_shutdown(struct rte_distributor *d, unsigned wkr)
 {
 	d->in_flight_tags[wkr] = 0;
+	d->in_flight_bitmask &= ~(1 << wkr);
 	d->bufs[wkr].bufptr64 = 0;
 	if (unlikely(d->backlog[wkr].count != 0)) {
 		/* On return of a packet, we need to move the
@@ -241,6 +244,7 @@ process_returns(struct rte_distributor *d)
 			else {
 				d->bufs[wkr].bufptr64 = RTE_DISTRIB_GET_BUF;
 				d->in_flight_tags[wkr] = 0;
+				d->in_flight_bitmask &= ~(1 << wkr);
 			}
 			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
 		} else if (data & RTE_DISTRIB_RETURN_BUF) {
@@ -282,12 +286,13 @@ rte_distributor_process(struct rte_distributor *d,
 			next_mb = mbufs[next_idx++];
 			next_value = (((int64_t)(uintptr_t)next_mb)
 					<< RTE_DISTRIB_FLAG_BITS);
-			new_tag = (next_mb->hash.rss | 1);
+			new_tag = next_mb->hash.rss;

 			uint32_t match = 0;
 			unsigned i;
 			for (i = 0; i < d->num_workers; i++)
-				match |= (!(d->in_flight_tags[i] ^ new_tag)
+				match |= (((!(d->in_flight_tags[i] ^ new_tag)) &
+					(d->in_flight_bitmask >> i))
 					<< i);

 			if (match) {
@@ -309,6 +314,7 @@ rte_distributor_process(struct rte_distributor *d,
 			else {
 				d->bufs[wkr].bufptr64 = next_value;
 				d->in_flight_tags[wkr] = new_tag;
+				d->in_flight_bitmask |= 1 << wkr;
 				next_mb = NULL;
 			}
 			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;

On Fri, Nov 7, 2014 at 3:53 PM, Bruce Richardson <bruce.richardson@intel.com> wrote:

> On Fri, Nov 07, 2014 at 02:38:13PM +0200, jigsaw wrote:
> > Hi Bruce,
> >
> > >> If a tag value of zero is ever passed in, then it will start matching
> > >> against cores which are not doing any processing.
> >
> > Yes, this is true according to the current bookkeeping of in-flight tags.
> >
> > But if the slot in in_flight_tags is not a uint32_t but a struct which has
> > a field as an indication of "on/off", and also with corresponding changes in
> > looking for a matched tag, then the need for the 1-bit mask can be eliminated.
> > Of course this change requires a little bit more, O(n), memory space and
> > costs O(n) more branch misses. But the benefit is a freer interface to the
> > user app.
> >
> > This is just another trade-off. Since I am in need of such freedom, I am
> > more interested in the free use of all 32 bits.
>
> If you do implement such a change, I would suggest you simply add a bitmask
> to the distributor indicating valid workers. Then when we do the check
> for tag matches, we just need an extra "and" instruction to eliminate invalid
> workers from the match.
>
> /Bruce
>
> > thx &
> > rgds,
> > -qinglai
> >
> > On Fri, Nov 7, 2014 at 11:45 AM, Bruce Richardson <bruce.richardson@intel.com> wrote:
> >
> > > On Thu, Nov 06, 2014 at 09:52:25PM +0200, jigsaw wrote:
> > > > Hi Bruce,
> > > >
> > > > Actually IMHO it is good to leave the freedom to the user to decide how to
> > > > interpret the tag value, i.e. remove the OR 1 bit.
> > > > If the tag value is zero, then we assume the programmer knows what he is
> > > > doing. Of course this shall be clearly documented in the comment/doxygen.
> > > >
> > > > thx &
> > > > rgds,
> > > > -qinglai
> > >
> > > I don't believe that will work. If a tag value of zero is ever passed
> > > in, then it will start matching against cores which are not doing any
> > > processing. Then it will get queued up to get sent to those cores, and so
> > > never get processed.
> > > We need a bit somewhere inside the tag to permanently set - though it can
> > > be configurable.
> > >
> > > /Bruce
> > >
> > > > On Thu, Nov 6, 2014 at 8:01 PM, jigsaw <jigsaw@gmail.com> wrote:
> > > >
> > > > > Hi Bruce,
> > > > >
> > > > > In my use case, unfortunately the tag is not a hash. And the tag can be
> > > > > on either the low or the high bits, depending on configuration.
> > > > > I wonder if it is possible to let the user decide which bit to mask,
> > > > > i.e. to add another param to rte_distributor_create to define the mask.
> > > > >
> > > > > thx &
> > > > > rgds,
> > > > > -qinglai
> > > > >
> > > > > On Thu, Nov 6, 2014 at 3:59 PM, Bruce Richardson <bruce.richardson@intel.com> wrote:
> > > > >
> > > > > > On Thu, Nov 06, 2014 at 02:36:09PM +0200, Qinglai Xiao wrote:
> > > > > > > Hi Bruce,
> > > > > > >
> > > > > > > There is a subtle case in which tag values are 2 and 3, respectively.
> > > > > > > Then these two tags cannot be distinguished. There should be a better
> > > > > > > way so as to handle this situation.
> > > > > >
> > > > > > It's not just in that case, it's in any case where a pair of tags differ by
> > > > > > only a single bit. I've been assuming that the tag is likely to be a hash
> > > > > > value in most cases - given that it's only 32-bit - in which case it just
> > > > > > doesn't matter which bit we chose to permanently set to 1, but if there are
> > > > > > scenarios where it's likely that the low bits are used but the high ones not
> > > > > > so, we can look to change which bit is set to 1. Either way, the distributor
> > > > > > just uses a 31-bit tag rather than a 32-bit one.
> > > > > >
> > > > > > /Bruce
> > > > > >
> > > > > > > thx &
> > > > > > > rgds
> > > > > > > -qinglai
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: "Thomas Monjalon" <thomas.monjalon@6wind.com>
> > > > > > > Sent: 2014/11/6 12:36
> > > > > > > To: "Bruce Richardson" <bruce.richardson@intel.com>
> > > > > > > Cc: "dev@dpdk.org" <dev@dpdk.org>; "jigsaw" <jigsaw@gmail.com>
> > > > > > > Subject: Re: [dpdk-dev] [PATCH] Add user defined tag calculation callback
> > > > > > > tolibrte_distributor.
> > > > > > >
> > > > > > > 2014-11-06 09:22, Bruce Richardson:
> > > > > > > > On Wed, Nov 05, 2014 at 07:24:13PM +0200, jigsaw wrote:
> > > > > > > > > http://dpdk.org/browse/dpdk/tree/lib/librte_distributor/rte_distributor.c#n285
> > > > > > > > >
> > > > > > > > > new_tag = (next_mb->hash.rss | 1);
> > > > > > > > >
> > > > > > > > > Why the logical OR is needed?
> > > > > > > >
> > > > > > > > That's needed to ensure that we never track a tag with an actual
> > > > > > > > value of zero.
> > > > > > > > We instead always force the low bit to be 1, so that we can use zero
> > > > > > > > as an "empty" value.
> > > > > > >
> > > > > > > Bruce, could you check how this code may be better commented please?
> > > > > > > This discussion shows that the distributor library probably needs more
> > > > > > > explanations in the code or doxygen.
> > > > > > >
> > > > > > > Thanks
> > > > > > > --
> > > > > > > Thomas
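The two concerns raised in the quoted thread can be illustrated with a small sketch. This is not the DPDK code: `force_low_bit` and `matches_idle_slot` are hypothetical helpers that only model the `| 1` trick, under the assumption (taken from the thread) that a stored tag of zero means "this worker is idle".

```c
#include <stdint.h>

/* Hypothetical helper modelling `new_tag = (hash | 1)`:
 * the low bit is forced on, so a stored tag is never zero. */
static inline uint32_t force_low_bit(uint32_t tag)
{
	return tag | 1u;
}

/* Hypothetical helper: returns 1 when `tag` compares equal to an idle
 * slot, assuming idle slots hold the sentinel value 0. Uses the same
 * XOR-and-negate comparison as the loop in the diff. */
static inline int matches_idle_slot(uint32_t tag)
{
	const uint32_t idle_slot = 0;

	return !(idle_slot ^ tag);
}
```

With these helpers, `force_low_bit(2)` and `force_low_bit(3)` both yield 3, which is the collision described for tags 2 and 3. Conversely, `matches_idle_slot(0)` is 1, which is why an unmodified zero tag would appear to be in flight on every idle worker unless some other validity information is kept.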
Comments
On Fri, Nov 07, 2014 at 04:31:18PM +0200, jigsaw wrote:
> @@ -282,12 +286,13 @@ rte_distributor_process(struct rte_distributor *d,
>  			next_mb = mbufs[next_idx++];
>  			next_value = (((int64_t)(uintptr_t)next_mb)
>  					<< RTE_DISTRIB_FLAG_BITS);
> -			new_tag = (next_mb->hash.rss | 1);
> +			new_tag = next_mb->hash.rss;
>
>  			uint32_t match = 0;
>  			unsigned i;
>  			for (i = 0; i < d->num_workers; i++)
> -				match |= (!(d->in_flight_tags[i] ^ new_tag)
> +				match |= (((!(d->in_flight_tags[i] ^ new_tag)) &
> +					(d->in_flight_bitmask >> i))

I would not do the bitmask comparison here, as that's an extra instruction in
the loop. Instead, because it's a bitmask, build up the match variable as it
was before, and then just do a single AND operation afterwards, outside the
loop body.

/Bruce

>  					<< i);
>
>  			if (match) {
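The suggestion to move the AND out of the loop can be checked with a standalone sketch. The function names `match_in_loop` and `match_hoisted`, and the flat parameter list, are assumptions made for illustration; only the expressions inside the loops follow the diff. Both variants assume at most 32 workers, since the mask is a `uint32_t`.

```c
#include <stdint.h>

/* Variant from the first draft: the validity bit is tested inside the
 * loop, costing an extra shift and AND per worker. */
static uint32_t match_in_loop(const uint32_t *tags, uint32_t valid,
		unsigned num_workers, uint32_t new_tag)
{
	uint32_t match = 0;
	unsigned i;

	for (i = 0; i < num_workers; i++)
		match |= ((uint32_t)!(tags[i] ^ new_tag) & (valid >> i)) << i;
	return match;
}

/* Variant suggested in the review: build the raw match bits as before,
 * then drop idle workers with a single AND after the loop. */
static uint32_t match_hoisted(const uint32_t *tags, uint32_t valid,
		unsigned num_workers, uint32_t new_tag)
{
	uint32_t match = 0;
	unsigned i;

	for (i = 0; i < num_workers; i++)
		match |= (uint32_t)!(tags[i] ^ new_tag) << i;
	return match & valid;
}
```

Because `!(a ^ b)` is either 0 or 1, ANDing it with bit `i` of the mask before the shift gives the same result as shifting first and ANDing with the whole mask once, so the two functions should return identical values for any input with `num_workers <= 32`.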
Yeah that's better. As below, right?

@@ -290,6 +294,7 @@ rte_distributor_process(struct rte_distributor *d,
 				match |= (!(d->in_flight_tags[i] ^ new_tag)
 					<< i);

+			match &= d->in_flight_bitmask;
 			if (match) {
 				next_mb = NULL;
 				unsigned worker = __builtin_ctz(match);
On Fri, Nov 07, 2014 at 04:52:46PM +0200, jigsaw wrote:
> Yeah that's better. As below, right?

Yep.

> @@ -290,6 +294,7 @@ rte_distributor_process(struct rte_distributor *d,
>  				match |= (!(d->in_flight_tags[i] ^ new_tag)
>  					<< i);
>
> +			match &= d->in_flight_bitmask;
>  			if (match) {
>  				next_mb = NULL;
>  				unsigned worker = __builtin_ctz(match);
OK, thanks Bruce.
I will get the patch done in the coming week.

-qinglai
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_di
index 585ff88..d606bcf 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -92,6 +92,8 @@ struct rte_distributor {
 	unsigned num_workers;                 /**< Number of workers polling */

 	uint32_t in_flight_tags[RTE_MAX_LCORE];
+	uint32_t in_flight_bitmask;
+
 	struct rte_distributor_backlog backlog[RTE_MAX_LCORE];
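To show how the new field is meant to be kept in step with `in_flight_tags`, here is a minimal model of the bookkeeping. It is only a sketch under stated assumptions: `struct mini_distributor`, `MINI_MAX_WORKERS`, and the `mini_*` functions are hypothetical names, not part of librte_distributor, and the model ignores buffers and backlogs entirely. It uses `__builtin_ctz`, a GCC/Clang builtin that also appears in the thread.

```c
#include <stdint.h>

/* Hypothetical limit: one bit per worker in a 32-bit mask. */
#define MINI_MAX_WORKERS 32

struct mini_distributor {
	unsigned num_workers;
	uint32_t in_flight_tags[MINI_MAX_WORKERS];
	uint32_t in_flight_bitmask; /* bit i set => worker i holds a packet */
};

/* Record that worker `wkr` now processes a packet carrying `tag`. */
static void mini_assign(struct mini_distributor *d, unsigned wkr, uint32_t tag)
{
	d->in_flight_tags[wkr] = tag;
	d->in_flight_bitmask |= UINT32_C(1) << wkr;
}

/* Record that worker `wkr` is idle again (packet returned or shutdown). */
static void mini_release(struct mini_distributor *d, unsigned wkr)
{
	d->in_flight_tags[wkr] = 0;
	d->in_flight_bitmask &= ~(UINT32_C(1) << wkr);
}

/* Return the lowest-numbered busy worker holding `tag`, or -1 if none. */
static int mini_find_worker(const struct mini_distributor *d, uint32_t tag)
{
	uint32_t match = 0;
	unsigned i;

	for (i = 0; i < d->num_workers; i++)
		match |= (uint32_t)!(d->in_flight_tags[i] ^ tag) << i;
	match &= d->in_flight_bitmask; /* ignore idle workers */

	/* __builtin_ctz is undefined for 0, so test match first. */
	return match ? __builtin_ctz(match) : -1;
}
```

In this model a tag of zero is an ordinary value: it matches a worker only while that worker's bit is set in the mask. One design point worth noting is that a `uint32_t` mask covers at most 32 workers, so a larger worker count would need a wider mask type.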