[dpdk-dev,memnic,5/7] pmd: packet receiving optimization with prefetch
Commit Message
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Prefetch the next packet area to reduce memory stall cycles.
Prefetching the next packet area can hide memory stalls, because that area
is accessed immediately after the current packet has been processed.
memnic-tester shows a performance improvement.
Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
 size | before   | after
   64 | 4.59Mpps | 5.54Mpps
  128 | 4.87Mpps | 5.46Mpps
  256 | 4.72Mpps | 5.21Mpps
  512 | 4.41Mpps | 4.50Mpps
 1024 | 3.64Mpps | 3.71Mpps
 1280 | 3.15Mpps | 3.21Mpps
 1518 | 2.87Mpps | 2.92Mpps
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Reviewed-by: Hayato Momma <h-momma@ce.jp.nec.com>
---
pmd/pmd_memnic.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
@@ -286,7 +286,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
         uint16_t nr;
         uint64_t pkts, bytes, errs;
         uint32_t framesz = adapter->framesz;
-        int idx;
+        int idx, next;
         struct rte_eth_stats *st = &adapter->stats[rte_lcore_id()];
         if (!adapter->nic->hdr.valid)
@@ -298,6 +298,11 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
                 p = &data->packets[idx];
                 if (p->status != MEMNIC_PKT_ST_FILLED)
                         break;
+                /* prefetch the next area */
+                next = idx;
+                if (++next >= MEMNIC_NR_PACKET)
+                        next = 0;
+                rte_prefetch0(&data->packets[next]);
                 if (p->len > framesz) {
                         errs++;
                         goto drop;
@@ -318,9 +323,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 drop:
                 rte_compiler_barrier();
                 p->status = MEMNIC_PKT_ST_FREE;
-
-                if (++idx >= MEMNIC_NR_PACKET)
-                        idx = 0;
+                idx = next;
         }
         adapter->up_idx = idx;
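
For readers without the full source at hand, the pattern this patch introduces can be sketched outside of DPDK as below. This is only an illustration, not the driver code: struct slot, NR_SLOTS, SLOT_FILLED, SLOT_FREE and ring_consume() are hypothetical stand-ins for the memnic packet area, MEMNIC_NR_PACKET, the MEMNIC_PKT_ST_* states and memnic_recv_pkts(). The GCC/Clang builtin __builtin_prefetch() is used in place of rte_prefetch0(), and an empty asm statement with a "memory" clobber in place of rte_compiler_barrier(), on the assumption that a GCC-compatible compiler is used.

```c
#include <stdint.h>

#define NR_SLOTS 8	/* hypothetical ring size (stands in for MEMNIC_NR_PACKET) */

enum { SLOT_FREE = 0, SLOT_FILLED = 1 };	/* hypothetical slot states */

struct slot {
	volatile uint32_t status;	/* written by the producer and the consumer */
	uint32_t len;
	uint8_t data[64];
};

/*
 * Consume up to 'burst' filled slots starting at *idx_io.
 * The index of the next slot is computed early so that the slot can be
 * prefetched while the current one is still being processed; the same
 * value is then reused to advance the ring index.
 */
static unsigned ring_consume(struct slot *ring, int *idx_io,
			     unsigned burst, uint64_t *bytes)
{
	int idx = *idx_io, next;
	unsigned nr;

	for (nr = 0; nr < burst; nr++) {
		struct slot *p = &ring[idx];

		if (p->status != SLOT_FILLED)
			break;

		/* prefetch the next area (read access, keep in all cache levels) */
		next = idx + 1;
		if (next >= NR_SLOTS)
			next = 0;
		__builtin_prefetch(&ring[next], 0, 3);

		/* "process" the current packet; here only its length is counted */
		*bytes += p->len;

		/* keep the compiler from moving the release above the processing */
		__asm__ __volatile__("" ::: "memory");
		p->status = SLOT_FREE;

		idx = next;
	}

	*idx_io = idx;
	return nr;
}
```

Because next is computed before the current slot is processed, the wrap-around check runs once per iteration and serves both the prefetch address and the index update, which is why the patch can replace the old increment at the end of the loop with a plain assignment.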