Message ID: 20210319205718.1436-1-pbhagavatula@marvell.com (mailing list archive)
Subject: [dpdk-dev] [PATCH v4 0/8] Introduce event vectorization
Series: Introduce event vectorization
Message
Pavan Nikhilesh Bhagavatula
March 19, 2021, 8:57 p.m. UTC
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
In the traditional event programming model, events are identified by a
flow-id and a uintptr_t. The flow-id uniquely identifies a given event
and determines the scheduling order based on the schedule type; the
uintptr_t holds a single object.
Event devices also support burst mode with a configurable dequeue depth,
i.e. each dequeue call returns multiple events, and each event
might be at a different stage of the pipeline.
A dequeue burst containing events that belong to different stages is not
only difficult to vectorize but also further increases the scheduler
overhead and the application overhead of pipelining events.
Using event vectors we see a performance gain of ~628%, as shown in [1].
By introducing event vectorization, each event becomes capable of holding
multiple uintptr_t values of the same flow, thereby allowing applications
to vectorize their pipelines and reducing the complexity of pipelining
events across multiple stages. This also reduces the complexity of handling
enqueue and dequeue on an event device.
Since event devices are transparent to the events they schedule,
event producers such as the eth_rx_adapter, crypto_adapter, etc.
are responsible for vectorizing buffers of the same flow into a single
event.
The series also breaks ABI in patch [8/8], which is targeted at the
v21.11 release.
The dpdk-test-eventdev application has been updated with options to test
multiple vector sizes and timeouts.
[1]
As for the performance improvement: with an ARM Cortex-A72-equivalent
processor, the software event device (--vdev=event_sw0), a single worker
core, a single stage, and one service core for the Rx adapter, Tx adapter,
and scheduling.
Without event vectorization:
./build/app/dpdk-test-eventdev -l 7-23 -s 0x700 --vdev="event_sw0" --
--prod_type_ethdev --nb_pkts=0 --verbose 2 --test=pipeline_queue
--stlist=a --wlcores=20
Port[0] using Rx adapter[0] configured
Port[0] using Tx adapter[0] Configured
4.728 mpps avg 4.728 mpps
With event vectorization:
./build/app/dpdk-test-eventdev -l 7-23 -s 0x700 --vdev="event_sw0" --
--prod_type_ethdev --nb_pkts=0 --verbose 2 --test=pipeline_queue
--stlist=a --wlcores=20 --enable_vector --nb_eth_queues 1
--vector_size 256
Port[0] using Rx adapter[0] configured
Port[0] using Tx adapter[0] Configured
34.383 mpps avg 34.383 mpps
Having dedicated service cores for each Rx queue and tuning the vector
and dequeue burst sizes would further improve performance.
API usage is shown below:
Configuration:
struct rte_event_eth_rx_adapter_event_vector_config vec_conf;

vector_pool = rte_event_vector_pool_create("vector_pool",
		nb_elem, 0, vector_size, socket_id);

rte_event_eth_rx_adapter_create(id, event_id, &adptr_conf);
rte_event_eth_rx_adapter_queue_add(id, eth_id, -1, &queue_conf);
if (cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR) {
	vec_conf.vector_sz = vector_size;
	vec_conf.vector_timeout_ns = vector_tmo_nsec;
	vec_conf.vector_mp = vector_pool;
	rte_event_eth_rx_adapter_queue_event_vector_config(id,
			eth_id, -1, &vec_conf);
}
Fastpath:

num = rte_event_dequeue_burst(event_id, port_id, &ev, 1, 0);
if (!num)
	continue;

if (ev.event_type & RTE_EVENT_TYPE_VECTOR) {
	switch (ev.event_type) {
	case RTE_EVENT_TYPE_ETHDEV_VECTOR:
	case RTE_EVENT_TYPE_ETH_RX_ADAPTER_VECTOR: {
		struct rte_mbuf **mbufs;

		mbufs = ev.vector_ev->mbufs;
		for (i = 0; i < ev.vector_ev->nb_elem; i++)
			; /* Process mbufs[i]. */
		break;
	}
	case ...
	}
}
...
v4 Changes:
- Fix missing event vector structure in event structure. (Jay)
v3 Changes:
- Fix unintended formatting changes.
v2 Changes:
- Multiple grammatical and style fixes. (Jerin)
- Add parameter to define vector size as a power of 2. (Jerin)
- Redo patch series w/o breaking ABI till the last patch. (David)
- Add deprecation notice to announce ABI break in 21.11. (David)
- Add vector limits validation to app/test-eventdev.
Pavan Nikhilesh (8):
eventdev: introduce event vector capability
eventdev: introduce event vector Rx capability
eventdev: introduce event vector Tx capability
eventdev: add Rx adapter event vector support
eventdev: add Tx adapter event vector support
app/eventdev: add event vector mode in pipeline test
doc: announce event Rx adapter config changes
eventdev: simplify Rx adapter event vector config
app/test-eventdev/evt_common.h | 4 +
app/test-eventdev/evt_options.c | 52 +++
app/test-eventdev/evt_options.h | 4 +
app/test-eventdev/test_pipeline_atq.c | 310 +++++++++++++++--
app/test-eventdev/test_pipeline_common.c | 105 +++++-
app/test-eventdev/test_pipeline_common.h | 18 +
app/test-eventdev/test_pipeline_queue.c | 320 ++++++++++++++++--
.../prog_guide/event_ethernet_rx_adapter.rst | 38 +++
.../prog_guide/event_ethernet_tx_adapter.rst | 12 +
doc/guides/prog_guide/eventdev.rst | 36 +-
doc/guides/rel_notes/deprecation.rst | 9 +
doc/guides/tools/testeventdev.rst | 28 ++
lib/librte_eventdev/eventdev_pmd.h | 31 +-
.../rte_event_eth_rx_adapter.c | 305 ++++++++++++++++-
.../rte_event_eth_rx_adapter.h | 68 ++++
.../rte_event_eth_tx_adapter.c | 66 +++-
lib/librte_eventdev/rte_eventdev.c | 11 +-
lib/librte_eventdev/rte_eventdev.h | 144 +++++++-
lib/librte_eventdev/version.map | 4 +
19 files changed, 1479 insertions(+), 86 deletions(-)
--
2.17.1
Comments
On Sat, Mar 20, 2021 at 2:27 AM <pbhagavatula@marvell.com> wrote:
> [cover letter and diffstat quoted in full; snipped]

Please update the release notes (doc/guides/rel_notes/release_21_05.rst)
for this feature. If there are no more comments on this series from
others, IMO it is good to merge the next series for RC1.