[dpdk-dev,v3,1/2] mbuf: provide rte_pktmbuf_alloc_bulk API

Message ID 1450801074-29361-2-git-send-email-huawei.xie@intel.com (mailing list archive)
State Superseded, archived
Headers

Commit Message

Huawei Xie Dec. 22, 2015, 4:17 p.m. UTC
  v3 changes:
 move while after case 0
 add context about duff's device and why we use while loop in the commit
message

v2 changes:
 unroll the loop a bit to help the performance

rte_pktmbuf_alloc_bulk allocates a bulk of packet mbufs.

There is related thread about this bulk API.
http://dpdk.org/dev/patchwork/patch/4718/
Thanks to Konstantin's loop unrolling.

Attached the wiki page about duff's device. It explains the performance
optimization through loop unwinding, and also the most dramatic use of
case label fall-through.
https://en.wikipedia.org/wiki/Duff%27s_device

In our implementation, we use while() loop rather than do{} while() loop
because we could not assume count is strictly positive. Using while()
loop saves one line of check if count is zero.

Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_mbuf/rte_mbuf.h | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)
  

Comments

Stephen Hemminger Dec. 23, 2015, 6:37 p.m. UTC | #1
On Wed, 23 Dec 2015 00:17:53 +0800
Huawei Xie <huawei.xie@intel.com> wrote:

> +
> +	rc = rte_mempool_get_bulk(pool, (void **)mbufs, count);
> +	if (unlikely(rc))
> +		return rc;
> +
> +	switch (count % 4) {
> +	case 0: while (idx != count) {
> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> +			rte_pktmbuf_reset(mbufs[idx]);
> +			idx++;
> +	case 3:
> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> +			rte_pktmbuf_reset(mbufs[idx]);
> +			idx++;
> +	case 2:
> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> +			rte_pktmbuf_reset(mbufs[idx]);
> +			idx++;
> +	case 1:
> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> +			rte_pktmbuf_reset(mbufs[idx]);
> +			idx++;
> +	}
> +	}
> +	return 0;
> +}

Since function will not work if count can not be 0 (otherwise rte_mempool_get_bulk will fail),
why not:
	1. Document that assumption
	2. Use that assumption to speed up code.



	switch(count % 4) {
		do {
			case 0:
			...
			case 1:
			...
		} while (idx != count);
	}

Also you really need to add a big block comment about this loop, to explain
what it does and why.
  
Ananyev, Konstantin Dec. 23, 2015, 6:49 p.m. UTC | #2
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
> Sent: Wednesday, December 23, 2015 6:38 PM
> To: Xie, Huawei
> Cc: dev@dpdk.org; dprovan@bivio.net
> Subject: Re: [dpdk-dev] [PATCH v3 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
> 
> On Wed, 23 Dec 2015 00:17:53 +0800
> Huawei Xie <huawei.xie@intel.com> wrote:
> 
> > +
> > +	rc = rte_mempool_get_bulk(pool, (void **)mbufs, count);
> > +	if (unlikely(rc))
> > +		return rc;
> > +
> > +	switch (count % 4) {
> > +	case 0: while (idx != count) {
> > +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> > +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> > +			rte_pktmbuf_reset(mbufs[idx]);
> > +			idx++;
> > +	case 3:
> > +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> > +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> > +			rte_pktmbuf_reset(mbufs[idx]);
> > +			idx++;
> > +	case 2:
> > +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> > +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> > +			rte_pktmbuf_reset(mbufs[idx]);
> > +			idx++;
> > +	case 1:
> > +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
> > +			rte_mbuf_refcnt_set(mbufs[idx], 1);
> > +			rte_pktmbuf_reset(mbufs[idx]);
> > +			idx++;
> > +	}
> > +	}
> > +	return 0;
> > +}
> 
> Since function will not work if count can not be 0 (otherwise rte_mempool_get_bulk will fail),

As I understand, rte_mempool_get_bulk() will work correctly and return 0, if count==0.
That's why Huawei prefers while() {}, instead of do {} while() - to avoid extra check for
(count != 0) at the start. 
Konstantin


> why not:
> 	1. Document that assumption
> 	2. Use that assumption to speed up code.
> 
> 
> 
> 	switch(count % 4) {
> 		do {
> 			case 0:
> 			...
> 			case 1:
> 			...
> 		} while (idx != count);
> 	}
> 
> Also you really need to add a big block comment about this loop, to explain
> what it does and why.
  
Huawei Xie Dec. 24, 2015, 1:33 a.m. UTC | #3
On 12/24/2015 2:49 AM, Ananyev, Konstantin wrote:
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
>> Sent: Wednesday, December 23, 2015 6:38 PM
>> To: Xie, Huawei
>> Cc: dev@dpdk.org; dprovan@bivio.net
>> Subject: Re: [dpdk-dev] [PATCH v3 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
>>
>> On Wed, 23 Dec 2015 00:17:53 +0800
>> Huawei Xie <huawei.xie@intel.com> wrote:
>>
>>> +
>>> +	rc = rte_mempool_get_bulk(pool, (void **)mbufs, count);
>>> +	if (unlikely(rc))
>>> +		return rc;
>>> +
>>> +	switch (count % 4) {
>>> +	case 0: while (idx != count) {
>>> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
>>> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
>>> +			rte_pktmbuf_reset(mbufs[idx]);
>>> +			idx++;
>>> +	case 3:
>>> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
>>> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
>>> +			rte_pktmbuf_reset(mbufs[idx]);
>>> +			idx++;
>>> +	case 2:
>>> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
>>> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
>>> +			rte_pktmbuf_reset(mbufs[idx]);
>>> +			idx++;
>>> +	case 1:
>>> +			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
>>> +			rte_mbuf_refcnt_set(mbufs[idx], 1);
>>> +			rte_pktmbuf_reset(mbufs[idx]);
>>> +			idx++;
>>> +	}
>>> +	}
>>> +	return 0;
>>> +}
>> Since function will not work if count can not be 0 (otherwise rte_mempool_get_bulk will fail),
> As I understand, rte_mempool_get_bulk() will work correctly and return 0, if count==0.
> That's why Huawei prefers while() {}, instead of do {} while() - to avoid extra check for
> (count != 0) at the start. 
> Konstantin

Yes.

>
>
>> why not:
>> 	1. Document that assumption
>> 	2. Use that assumption to speed up code.
>>
>>
>>
>> 	switch(count % 4) {
>> 		do {
>> 			case 0:
>> 			...
>> 			case 1:
>> 			...
>> 		} while (idx != count);
>> 	}
>>
>> Also you really need to add a big block comment about this loop, to explain
>> what it does and why.

Since we change duff's implementation a bit, and for people who don't
know duff's device, we could add comment.
Is comment like this enough?
Use Duff's device to unroll the loop a bit to gain more performance
Use while() rather than do {} while() as count could be zero.
  

Patch

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index f234ac9..3381c28 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1336,6 +1336,55 @@  static inline struct rte_mbuf *rte_pktmbuf_alloc(struct rte_mempool *mp)
 }
 
 /**
+ * Allocate a bulk of mbufs, initialize refcnt and reset the fields to default
+ * values.
+ *
+ *  @param pool
+ *    The mempool from which mbufs are allocated.
+ *  @param mbufs
+ *    Array of pointers to mbufs
+ *  @param count
+ *    Array size
+ *  @return
+ *   - 0: Success
+ */
+static inline int rte_pktmbuf_alloc_bulk(struct rte_mempool *pool,
+	 struct rte_mbuf **mbufs, unsigned count)
+{
+	unsigned idx = 0;
+	int rc;
+
+	rc = rte_mempool_get_bulk(pool, (void **)mbufs, count);
+	if (unlikely(rc))
+		return rc;
+
+	switch (count % 4) {
+	case 0: while (idx != count) {
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
+			rte_mbuf_refcnt_set(mbufs[idx], 1);
+			rte_pktmbuf_reset(mbufs[idx]);
+			idx++;
+	case 3:
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
+			rte_mbuf_refcnt_set(mbufs[idx], 1);
+			rte_pktmbuf_reset(mbufs[idx]);
+			idx++;
+	case 2:
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
+			rte_mbuf_refcnt_set(mbufs[idx], 1);
+			rte_pktmbuf_reset(mbufs[idx]);
+			idx++;
+	case 1:
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0);
+			rte_mbuf_refcnt_set(mbufs[idx], 1);
+			rte_pktmbuf_reset(mbufs[idx]);
+			idx++;
+	}
+	}
+	return 0;
+}
+
+/**
  * Attach packet mbuf to another packet mbuf.
  *
  * After attachment we refer the mbuf we attached as 'indirect',