[v4,00/29] graph: introduce graph subsystem
Message ID | 20200405085613.1336841-1-jerinj@marvell.com (mailing list archive) |
---|---|
Jerin Jacob
April 5, 2020, 8:55 a.m. UTC
From: Jerin Jacob <jerinj@marvell.com>
Using graph traversal for packet processing is a proven architecture
that has been implemented in various open source libraries.
A graph architecture for packet processing abstracts the data
processing functions as “nodes” and “links” them together into a
complex “graph”, which makes those functions reusable and modular.
This patchset also brings performance enhancements and modularity
to DPDK, as discussed in more detail below.
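To make the node/link/graph idea concrete, here is a minimal standalone sketch. Every name in it (demo_node, demo_graph_walk and so on) is hypothetical and chosen for illustration; it is not the rte_graph API from this series, and it moves one packet at a time rather than bursts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the rte_graph API. */
#define DEMO_MAX_EDGES 2

struct demo_pkt {
	uint32_t dst_ip;
	int ttl;
	int dropped;
};

/* Process one packet; return the edge to follow, or -1 to stop here. */
typedef int (*demo_process_t)(struct demo_pkt *p);

struct demo_node {
	const char *name;
	demo_process_t process;
	struct demo_node *next[DEMO_MAX_EDGES]; /* "links" to next nodes */
	unsigned int visits;
};

static int lookup_process(struct demo_pkt *p)
{
	return p->dst_ip != 0 ? 0 : 1; /* edge 0: rewrite, edge 1: drop */
}

static int rewrite_process(struct demo_pkt *p)
{
	p->ttl--;
	return -1;
}

static int drop_process(struct demo_pkt *p)
{
	p->dropped = 1;
	return -1;
}

/* Walk the graph from a source node until a node terminates the packet. */
static void demo_graph_walk(struct demo_node *n, struct demo_pkt *p)
{
	while (n != NULL) {
		int edge;

		n->visits++;
		edge = n->process(p);
		if (edge < 0)
			break;
		assert(edge < DEMO_MAX_EDGES);
		n = n->next[edge];
	}
}

/* Build lookup -> {rewrite, drop}, push two packets, return 1 on success. */
int demo_graph_run(void)
{
	struct demo_node rewrite = { "rewrite", rewrite_process, { NULL, NULL }, 0 };
	struct demo_node drop = { "drop", drop_process, { NULL, NULL }, 0 };
	struct demo_node lookup = { "lookup", lookup_process, { &rewrite, &drop }, 0 };
	struct demo_pkt good = { 0x0a000001u, 64, 0 };
	struct demo_pkt bad = { 0, 64, 0 };

	demo_graph_walk(&lookup, &good);
	demo_graph_walk(&lookup, &bad);

	return good.ttl == 63 && good.dropped == 0 &&
	       bad.ttl == 64 && bad.dropped == 1 &&
	       lookup.visits == 2 && rewrite.visits == 1 && drop.visits == 1;
}
```

The point of the sketch is that each processing step only knows its outgoing edges, so nodes can be rearranged or reused without touching the walker.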
v4..v3:
-------
Addressed the following review comments from Wang, Xiao W
1) Remove unnecessary line from rte_graph.h
2) Fix a typo from rte_graph.h
3) Move NODE_ID_CHECK to 3rd patch where it is first used.
4) Fixed bug in edge_update()
v3..v2:
-------
1) refactor ipv4 node lookup by moving SSE and NEON specific code to
lib/librte_node/ip4_lookup_sse.h and lib/librte_node/ip4_lookup_neon.h
2) Add scalar version of process() function for ipv4 lookup to make
the node work on machines other than x86 and arm64.
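The split into SSE, NEON and scalar variants can be pictured as a compile-time dispatch with a portable fallback. The sketch below is an assumption-laden illustration, not code from the patches: it byte-swaps four IPv4 addresses, keys off the compiler-defined __ARM_NEON and __SSE2__ macros rather than DPDK's own build macros, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#elif defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Portable reference: byte-swap four 32-bit words one at a time. */
void demo_bswap4_scalar(const uint32_t in[4], uint32_t out[4])
{
	size_t i;

	assert(in != NULL && out != NULL);
	for (i = 0; i < 4; i++) {
		uint32_t x = in[i];

		out[i] = (x >> 24) | ((x >> 8) & 0x0000ff00u) |
			 ((x << 8) & 0x00ff0000u) | (x << 24);
	}
}

/* Same operation; the implementation is selected at compile time. */
void demo_bswap4(const uint32_t in[4], uint32_t out[4])
{
#if defined(__ARM_NEON)
	/* Reverse the bytes inside each 32-bit lane. */
	vst1q_u8((uint8_t *)out, vrev32q_u8(vld1q_u8((const uint8_t *)in)));
#elif defined(__SSE2__)
	__m128i v = _mm_loadu_si128((const __m128i *)in);

	/* Swap bytes within each 16-bit half, then swap the two halves. */
	v = _mm_or_si128(_mm_slli_epi16(v, 8), _mm_srli_epi16(v, 8));
	v = _mm_or_si128(_mm_slli_epi32(v, 16), _mm_srli_epi32(v, 16));
	_mm_storeu_si128((__m128i *)out, v);
#else
	demo_bswap4_scalar(in, out);
#endif
}

/* Return 1 if the selected implementation agrees with the scalar one. */
int demo_bswap4_selfcheck(void)
{
	const uint32_t in[4] = { 0x0a000001u, 0xc0a80101u, 0u, 0xaabbccddu };
	uint32_t fast[4], ref[4];
	size_t i;

	demo_bswap4(in, fast);
	demo_bswap4_scalar(in, ref);
	for (i = 0; i < 4; i++)
		if (fast[i] != ref[i])
			return 0;
	return ref[0] == 0x0100000au && ref[3] == 0xddccbbaau;
}
```

Keeping the scalar routine around also gives the SIMD paths a reference to be checked against.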
v2..v1:
------
1) Added programmer guide/implementation documentation and l3fwd-graph doc
Hosted in netlify for easy reference:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Programmer’s Guide:
https://dpdk-graph.netlify.com/doc/html/guides/prog_guide/graph_lib.html
l3fwd-graph doc:
https://dpdk-graph.netlify.com/doc/html/guides/sample_app_ug/l3_forward_graph.html
API doc:
https://dpdk-graph.netlify.com/doc/html/api/rte__graph_8h.html
https://dpdk-graph.netlify.com/doc/html/api/rte__graph__worker_8h.html
https://dpdk-graph.netlify.com/doc/html/api/rte__node__eth__api_8h.html
https://dpdk-graph.netlify.com/doc/html/api/rte__node__ip4__api_8h.html
2) Added the release notes for this feature
3) Fix build issues reported by CI for v1:
http://mails.dpdk.org/archives/test-report/2020-March/121326.html
RFC..v1:
--------
1) Split the patch into more logical ones for review.
2) Added doxygen comments for the API
3) Code cleanup
4) Additional performance improvements.
Delta between l3fwd and l3fwd-graph is negligible now
(~1%) on octeontx2.
5) Added SIMD routines for x86 in addition to arm64.
Additional nodes planned for v20.08
----------------------------------
1) Packet classification node
2) Support for IPV6 LPM node
This patchset contains
-----------------------------
1) The API definition to "create" nodes and "link" together to create a
"graph" for packet processing. See, lib/librte_graph/rte_graph.h
2) The Fast path API definition for the graph walker and enqueue
function used by the workers. See, lib/librte_graph/rte_graph_worker.h
3) Optimized SW implementation for (1) and (2). See, lib/librte_graph/
4) Test case to verify the graph infrastructure functionality
See, app/test/test_graph.c
5) Performance test cases to evaluate the cost of graph walker and nodes
enqueue fast-path function for various combinations.
See app/test/test_graph_perf.c
6) Packet processing nodes (Null, Rx, Tx, Pkt drop, IPv4 rewrite, IPv4
lookup) using the graph infrastructure. See lib/librte_node/*
7) An example application to showcase l3fwd (same functionality as the
existing examples/l3fwd) using the graph infrastructure and the packet
processing nodes from item (6). See examples/l3fwd-graph/.
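For readers unfamiliar with how a library can collect nodes defined in separate source files, one common pattern is constructor-time registration into a global list. The sketch below assumes that pattern purely for illustration; this cover letter does not describe the mechanism used by the series, all demo_* names are hypothetical, and __attribute__((constructor)) is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registration record; not the structure used by the series. */
struct demo_node_reg {
	const char *name;
	struct demo_node_reg *next;
};

static struct demo_node_reg *demo_registry; /* head of the registered list */

void demo_node_register(struct demo_node_reg *reg)
{
	assert(reg != NULL && reg->name != NULL);
	reg->next = demo_registry;
	demo_registry = reg;
}

/* Find a registered node by name, or return NULL. */
const struct demo_node_reg *demo_node_lookup(const char *name)
{
	const struct demo_node_reg *r;

	for (r = demo_registry; r != NULL; r = r->next)
		if (strcmp(r->name, name) == 0)
			return r;
	return NULL;
}

/* Register a node before main() runs (GCC/Clang constructor attribute). */
#define DEMO_NODE_REGISTER(reg)                                   \
	__attribute__((constructor)) static void demo_ctor_##reg(void) \
	{                                                              \
		demo_node_register(&reg);                                  \
	}

static struct demo_node_reg demo_ip4_lookup = { "demo_ip4_lookup", NULL };
DEMO_NODE_REGISTER(demo_ip4_lookup)

static struct demo_node_reg demo_pkt_drop = { "demo_pkt_drop", NULL };
DEMO_NODE_REGISTER(demo_pkt_drop)
```

With this shape, an application can later look nodes up by name and link them into a graph without the library knowing the node set at build time.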
Performance
-----------
1) Graph walk and node enqueue overhead can be tested with performance
test case application [1]
# If all packets go from one node to the same next node (we call this a
"homerun"), moving the burst is just a pointer swap.
# In the worst case, it costs a handful of cycles to move an object from
one node to another.
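The difference between the "homerun" case and the per-object case can be sketched as follows. This is a simplified model under stated assumptions, not the fast-path code from rte_graph_worker.h: the demo_stream type and function names are hypothetical, and the swap is taken only when the destination holds no pending objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node burst of object pointers. */
struct demo_stream {
	void **objs;
	uint16_t nb;  /* objects currently held */
	uint16_t cap; /* capacity of objs[] */
};

/* Per-object path: copy a single pointer into the next node's stream. */
void demo_enqueue_one(struct demo_stream *dst, void *obj)
{
	assert(dst->nb < dst->cap);
	dst->objs[dst->nb++] = obj;
}

/* Move a whole burst from src to dst. */
void demo_stream_move(struct demo_stream *src, struct demo_stream *dst)
{
	uint16_t i;

	if (dst->nb == 0) {
		/* Homerun: exchange the arrays, no per-object work. */
		struct demo_stream tmp = *dst;

		*dst = *src;
		src->objs = tmp.objs;
		src->cap = tmp.cap;
		src->nb = 0;
		return;
	}
	/* Destination already has objects: append one by one. */
	for (i = 0; i < src->nb; i++)
		demo_enqueue_one(dst, src->objs[i]);
	src->nb = 0;
}

/* Return 1 if an empty destination ends up owning the source array. */
int demo_homerun_check(void)
{
	int pkts[4] = { 1, 2, 3, 4 };
	void *a[4] = { &pkts[0], &pkts[1], &pkts[2], &pkts[3] };
	void *b[4] = { NULL, NULL, NULL, NULL };
	struct demo_stream src = { a, 4, 4 };
	struct demo_stream dst = { b, 0, 4 };

	demo_stream_move(&src, &dst);
	return dst.objs == a && dst.nb == 4 && src.objs == b && src.nb == 0;
}
```

The swap costs the same regardless of burst size, which is why the homerun case is so cheap compared with copying each pointer.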
2) Performance comparison of the existing l3fwd (completely static code
without any nodes) vs. the modular l3fwd-graph with 5 nodes
(ip4_lookup, ip4_rewrite, ethdev_tx, ethdev_rx, pkt_drop).
Here is a graphical representation of l3fwd-graph as a Graphviz dot
file:
http://bit.ly/39UPPGm
# l3fwd-graph performance is -1.2% wrt static l3fwd.
# We have simulated a similar test with the existing librte_pipeline
application [4].
The ip_pipeline application is -48.62% wrt static l3fwd.
The above results are on octeontx2. They may vary on other platforms.
Platforms with larger L1 and L2 caches are expected to perform even
better.
Tested architectures:
--------------------
1) AArch64
2) X86
Identified tweaking for better performance on different targets
---------------------------------------------------------------
1) Test with various burst size values (256, 128, 64, 32) using the
CONFIG_RTE_GRAPH_BURST_SIZE config option.
Based on our testing, the sweet spot on x86 and arm64 servers is a
burst size of 256, while on arm64 embedded SoCs it is either 64 or 128.
2) Disable node statistics (using the CONFIG_RTE_LIBRTE_GRAPH_STATS
config option) if not needed.
3) Use the arm64-optimized memory copy on arm64 by selecting
CONFIG_RTE_ARCH_ARM64_MEMCPY.
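The burst-size and statistics knobs above are build-time options. As a rough illustration of how such knobs typically reach the code, the sketch below uses two hypothetical macros, DEMO_GRAPH_BURST_SIZE and DEMO_GRAPH_STATS, standing in for the real config options; the library's actual use of its options may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the build-time options named above. */
#ifndef DEMO_GRAPH_BURST_SIZE
#define DEMO_GRAPH_BURST_SIZE 256
#endif
#ifndef DEMO_GRAPH_STATS
#define DEMO_GRAPH_STATS 1
#endif

struct demo_node_stats {
	uint64_t calls;
	uint64_t objs;
};

#if DEMO_GRAPH_STATS
#define DEMO_STATS_UPDATE(s, n) \
	do { (s)->calls++; (s)->objs += (n); } while (0)
#else
/* Statistics compiled out: no fast-path cost. */
#define DEMO_STATS_UPDATE(s, n) \
	do { (void)(s); (void)(n); } while (0)
#endif

/* Feed "total" objects through one node in bursts; return process calls. */
size_t demo_process_all(size_t total, struct demo_node_stats *stats)
{
	size_t calls = 0;

	assert(stats != NULL);
	while (total > 0) {
		size_t n = total < DEMO_GRAPH_BURST_SIZE ?
			   total : DEMO_GRAPH_BURST_SIZE;

		DEMO_STATS_UPDATE(stats, n);
		total -= n;
		calls++;
	}
	return calls;
}
```

Building with, for example, -DDEMO_GRAPH_BURST_SIZE=64 -DDEMO_GRAPH_STATS=0 changes both the number of process calls per batch and whether the counters are touched at all, which is the kind of trade-off the tuning notes refer to.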
Commands to run tests
---------------------
[1]
perf test:
echo "graph_perf_autotest" | sudo ./build/app/test/dpdk-test -c 0x30
[2]
functionality test:
echo "graph_autotest" | sudo ./build/app/test/dpdk-test -c 0x30
[3]
l3fwd-graph:
./l3fwd-graph -c 0x100 -- -p 0x3 --config="(0, 0, 8)" -P
[4]
# ./ip_pipeline -c 0xff0000 -- -s route.cli
Route.cli: (Copy paste to the shell to avoid dos format issues)
https://pastebin.com/raw/B4Ktx7TT
Jerin Jacob (13):
graph: define the public API for graph support
graph: implement node registration
graph: implement node operations
graph: implement node debug routines
graph: implement internal graph operation helpers
graph: populate fastpath memory for graph reel
graph: implement create and destroy APIs
graph: implement graph operation APIs
graph: implement Graphviz export
graph: implement debug routines
graph: implement stats support
graph: implement fastpath API routines
doc: add graph library programmer's guide guide
Kiran Kumar K (2):
graph: add unit test case
node: add ipv4 rewrite node
Nithin Dabilpuram (11):
node: add log infra and null node
node: add ethdev Rx node
node: add ethdev Tx node
node: add ethdev Rx and Tx node ctrl API
node: ipv4 lookup for arm64
node: add ipv4 rewrite and lookup ctrl API
node: add packet drop node
l3fwd-graph: add graph based l3fwd skeleton
l3fwd-graph: add ethdev configuration changes
l3fwd-graph: add graph config and main loop
doc: add l3fwd graph application user guide
Pavan Nikhilesh (3):
graph: add performance testcase
node: add generic ipv4 lookup node
node: ipv4 lookup for x86
MAINTAINERS | 14 +
app/test/Makefile | 7 +
app/test/meson.build | 12 +-
app/test/test_graph.c | 819 ++++
app/test/test_graph_perf.c | 1057 ++++++
config/common_base | 12 +
config/rte_config.h | 4 +
doc/api/doxy-api-index.md | 5 +
doc/api/doxy-api.conf.in | 2 +
doc/guides/prog_guide/graph_lib.rst | 397 ++
.../prog_guide/img/anatomy_of_a_node.svg | 1078 ++++++
.../prog_guide/img/graph_mem_layout.svg | 702 ++++
doc/guides/prog_guide/img/link_the_nodes.svg | 3330 +++++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/rel_notes/release_20_05.rst | 32 +
doc/guides/sample_app_ug/index.rst | 1 +
doc/guides/sample_app_ug/intro.rst | 4 +
doc/guides/sample_app_ug/l3_forward_graph.rst | 327 ++
examples/Makefile | 3 +
examples/l3fwd-graph/Makefile | 58 +
examples/l3fwd-graph/main.c | 1111 ++++++
examples/l3fwd-graph/meson.build | 13 +
examples/meson.build | 6 +-
lib/Makefile | 6 +
lib/librte_graph/Makefile | 28 +
lib/librte_graph/graph.c | 589 +++
lib/librte_graph/graph_debug.c | 84 +
lib/librte_graph/graph_ops.c | 169 +
lib/librte_graph/graph_populate.c | 234 ++
lib/librte_graph/graph_private.h | 347 ++
lib/librte_graph/graph_stats.c | 406 ++
lib/librte_graph/meson.build | 11 +
lib/librte_graph/node.c | 421 +++
lib/librte_graph/rte_graph.h | 785 ++++
lib/librte_graph/rte_graph_version.map | 47 +
lib/librte_graph/rte_graph_worker.h | 542 +++
lib/librte_node/Makefile | 32 +
lib/librte_node/ethdev_ctrl.c | 116 +
lib/librte_node/ethdev_rx.c | 221 ++
lib/librte_node/ethdev_rx_priv.h | 81 +
lib/librte_node/ethdev_tx.c | 86 +
lib/librte_node/ethdev_tx_priv.h | 62 +
lib/librte_node/ip4_lookup.c | 216 ++
lib/librte_node/ip4_lookup_neon.h | 238 ++
lib/librte_node/ip4_lookup_sse.h | 244 ++
lib/librte_node/ip4_rewrite.c | 326 ++
lib/librte_node/ip4_rewrite_priv.h | 77 +
lib/librte_node/log.c | 14 +
lib/librte_node/meson.build | 10 +
lib/librte_node/node_private.h | 96 +
lib/librte_node/null.c | 23 +
lib/librte_node/pkt_drop.c | 26 +
lib/librte_node/rte_node_eth_api.h | 70 +
lib/librte_node/rte_node_ip4_api.h | 87 +
lib/librte_node/rte_node_version.map | 9 +
lib/meson.build | 5 +-
meson.build | 1 +
mk/rte.app.mk | 2 +
58 files changed, 14701 insertions(+), 5 deletions(-)
create mode 100644 app/test/test_graph.c
create mode 100644 app/test/test_graph_perf.c
create mode 100644 doc/guides/prog_guide/graph_lib.rst
create mode 100644 doc/guides/prog_guide/img/anatomy_of_a_node.svg
create mode 100644 doc/guides/prog_guide/img/graph_mem_layout.svg
create mode 100644 doc/guides/prog_guide/img/link_the_nodes.svg
create mode 100644 doc/guides/sample_app_ug/l3_forward_graph.rst
create mode 100644 examples/l3fwd-graph/Makefile
create mode 100644 examples/l3fwd-graph/main.c
create mode 100644 examples/l3fwd-graph/meson.build
create mode 100644 lib/librte_graph/Makefile
create mode 100644 lib/librte_graph/graph.c
create mode 100644 lib/librte_graph/graph_debug.c
create mode 100644 lib/librte_graph/graph_ops.c
create mode 100644 lib/librte_graph/graph_populate.c
create mode 100644 lib/librte_graph/graph_private.h
create mode 100644 lib/librte_graph/graph_stats.c
create mode 100644 lib/librte_graph/meson.build
create mode 100644 lib/librte_graph/node.c
create mode 100644 lib/librte_graph/rte_graph.h
create mode 100644 lib/librte_graph/rte_graph_version.map
create mode 100644 lib/librte_graph/rte_graph_worker.h
create mode 100644 lib/librte_node/Makefile
create mode 100644 lib/librte_node/ethdev_ctrl.c
create mode 100644 lib/librte_node/ethdev_rx.c
create mode 100644 lib/librte_node/ethdev_rx_priv.h
create mode 100644 lib/librte_node/ethdev_tx.c
create mode 100644 lib/librte_node/ethdev_tx_priv.h
create mode 100644 lib/librte_node/ip4_lookup.c
create mode 100644 lib/librte_node/ip4_lookup_neon.h
create mode 100644 lib/librte_node/ip4_lookup_sse.h
create mode 100644 lib/librte_node/ip4_rewrite.c
create mode 100644 lib/librte_node/ip4_rewrite_priv.h
create mode 100644 lib/librte_node/log.c
create mode 100644 lib/librte_node/meson.build
create mode 100644 lib/librte_node/node_private.h
create mode 100644 lib/librte_node/null.c
create mode 100644 lib/librte_node/pkt_drop.c
create mode 100644 lib/librte_node/rte_node_eth_api.h
create mode 100644 lib/librte_node/rte_node_ip4_api.h
create mode 100644 lib/librte_node/rte_node_version.map