From patchwork Wed Sep 30 09:18:50 2020
X-Patchwork-Submitter: Gregory Etelson
X-Patchwork-Id: 79305
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Gregory Etelson
To: dev@dpdk.org
CC: Gregory Etelson, Ori Kam, Viacheslav Ovsiienko, Thomas Monjalon,
    Ferruh Yigit, Andrew Rybchenko
Date: Wed, 30 Sep 2020 12:18:50 +0300
Message-ID: <20200930091854.19768-2-getelson@nvidia.com>
In-Reply-To: <20200930091854.19768-1-getelson@nvidia.com>
References: <20200625160348.26220-1-getelson@mellanox.com>
 <20200930091854.19768-1-getelson@nvidia.com>
Subject: [dpdk-dev] [PATCH v3 1/4] ethdev: allow negative values in flow
 rule types

From: Gregory Etelson

RTE flow items and actions use positive values for the item and action
types. Negative values are reserved for PMD private types. PMD items
and actions are usually not exposed to applications and are not used to
create RTE flows.

This patch allows applications that have access to PMD flow items and
actions to combine RTE and PMD items and actions and use them to create
a flow rule.

RTE flow library functions cannot work with PMD private items and
actions (elements), because the RTE flow library has no API to query
the size of a PMD flow object. With this patch, PMD flow elements use
an object pointer: the RTE flow library functions handle the size of a
PMD element as the size of a pointer, and the PMD handles its objects
internally.
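For illustration only, a minimal sketch of how an application that
obtained a private action from a PMD-specific API could embed it in a
rule. The "pmd_action" handle and the way it is obtained are
hypothetical; rte_flow_create() and the standard action types are the
real API:

    /*
     * Sketch: "pmd_action" is a PMD private action obtained through a
     * hypothetical PMD-specific call; its type value is negative, so
     * rte_flow_conv() copies it as a pointer-sized object.
     */
    struct rte_flow_action_jump jump = { .group = 3 };
    struct rte_flow_action actions[] = {
        *pmd_action,                            /* PMD private action */
        { .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };
    struct rte_flow *flow = rte_flow_create(port_id, &attr, pattern,
                                            actions, &error);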
Signed-off-by: Gregory Etelson
Acked-by: Ori Kam
Acked-by: Viacheslav Ovsiienko
---
 lib/librte_ethdev/rte_flow.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index f8fdd68fe9..c8c6d62a8b 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -564,7 +564,11 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		off = rte_flow_desc_item[item->type].size;
+		/**
+		 * allow PMD private flow item
+		 */
+		off = (int)item->type >= 0 ?
+		      rte_flow_desc_item[item->type].size : sizeof(void *);
 		rte_memcpy(buf, data, (size > off ? off : size));
 		break;
 	}
@@ -667,7 +671,11 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		off = rte_flow_desc_action[action->type].size;
+		/**
+		 * allow PMD private flow action
+		 */
+		off = (int)action->type >= 0 ?
+		      rte_flow_desc_action[action->type].size : sizeof(void *);
 		rte_memcpy(buf, action->conf, (size > off ? off : size));
 		break;
 	}
@@ -709,8 +717,12 @@ rte_flow_conv_pattern(struct rte_flow_item *dst,
 	unsigned int i;
 
 	for (i = 0, off = 0; !num || i != num; ++i, ++src, ++dst) {
-		if ((size_t)src->type >= RTE_DIM(rte_flow_desc_item) ||
-		    !rte_flow_desc_item[src->type].name)
+		/**
+		 * allow PMD private flow item
+		 */
+		if (((int)src->type >= 0) &&
+		    ((size_t)src->type >= RTE_DIM(rte_flow_desc_item) ||
+		     !rte_flow_desc_item[src->type].name))
 			return rte_flow_error_set
 				(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM,
 				 src, "cannot convert unknown item type");
@@ -798,8 +810,12 @@ rte_flow_conv_actions(struct rte_flow_action *dst,
 	unsigned int i;
 
	for (i = 0, off = 0; !num || i != num; ++i, ++src, ++dst) {
-		if ((size_t)src->type >= RTE_DIM(rte_flow_desc_action) ||
-		    !rte_flow_desc_action[src->type].name)
+		/**
+		 * allow PMD private flow action
+		 */
+		if (((int)src->type >= 0) &&
+		    ((size_t)src->type >= RTE_DIM(rte_flow_desc_action) ||
+		     !rte_flow_desc_action[src->type].name))
 			return rte_flow_error_set
 				(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION,
 				 src, "cannot convert unknown action type");

From patchwork Wed Sep 30 09:18:51 2020
X-Patchwork-Submitter: Gregory Etelson
X-Patchwork-Id: 79306
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Gregory Etelson
To: dev@dpdk.org
CC: Eli Britstein, Ori Kam, Viacheslav Ovsiienko, John McNamara,
    Marko Kovacevic, Ray Kinsella, Neil Horman, Thomas Monjalon,
    Ferruh Yigit, Andrew Rybchenko
Date: Wed, 30 Sep 2020 12:18:51 +0300
Message-ID: <20200930091854.19768-3-getelson@nvidia.com>
In-Reply-To: <20200930091854.19768-1-getelson@nvidia.com>
References: <20200625160348.26220-1-getelson@mellanox.com>
 <20200930091854.19768-1-getelson@nvidia.com>
Subject: [dpdk-dev] [PATCH v3 2/4] ethdev: tunnel offload model

From: Eli Britstein

The rte_flow API provides the building blocks for vendor-agnostic flow
classification offloads. The rte_flow match and action primitives are
fine grained, giving DPDK applications the flexibility to offload
network stacks and complex pipelines. Applications wishing to offload
complex data structures (e.g. tunnel virtual ports) are required to use
the rte_flow primitives, such as group, meta, mark, tag and others, to
model their high-level objects.

The hardware model design for high-level software objects is not
trivial. Furthermore, an optimal design is often vendor specific.

The goal of this API is to provide applications with a hardware offload
model for common high-level software objects that is optimal with
regard to the underlying hardware. Tunnel ports are the first of such
objects.

Tunnel ports
------------

Ingress processing of tunneled traffic requires the classification of
the tunnel type followed by a decap action. In software, once a packet
is decapsulated the in_port field is changed to a virtual port
representing the tunnel type. The outer header fields are stored as
packet metadata members and may be matched by subsequent flows.
Open vSwitch, for example, uses two flows:

1. Classification flow - sets the virtual port representing the tunnel
   type. For example:
   match on udp port 4789
   actions=tnl_pop(vxlan_vport)
2. Steering flow according to outer and inner header matches:
   match on in_port=vxlan_vport and outer/inner header matches
   actions=forward to port X

The benefits of multi-flow tables are described in [1].

Offloading tunnel ports
-----------------------

Tunnel ports introduce a new stateless field that can be matched on.
Currently the rte_flow library provides an API to encap, decap and
match on tunnel headers. However, there is no rte_flow primitive to set
and match tunnel virtual ports. There are several possible hardware
models for offloading virtual tunnel port flows including, but not
limited to, the following:

1. Setting the virtual port on a HW register using the
   rte_flow_action_mark / rte_flow_action_tag / rte_flow_action_set_meta
   objects (see the sketch after this list).
2. Mapping a virtual port to an rte_flow group.
3. Avoiding the need to match on transient objects by merging
   multi-table flows to a single rte_flow rule.
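A rough sketch of approach 1, for illustration only; VXLAN_VPORT and
STEER_GROUP are hypothetical values, while the rte_flow structures and
type constants are the real API:

    #define VXLAN_VPORT 0x1234  /* hypothetical virtual port value */
    #define STEER_GROUP 3       /* hypothetical steering group */

    /* Classification rule, group 0: encode the virtual port in a HW
     * register with MARK, then jump to the steering group. */
    struct rte_flow_action_mark set_vport = { .id = VXLAN_VPORT };
    struct rte_flow_action_jump jump = { .group = STEER_GROUP };
    struct rte_flow_action classify_actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &set_vport },
        { .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    /* Steering rule, STEER_GROUP: match the virtual port value. */
    struct rte_flow_item_mark vport_spec = { .id = VXLAN_VPORT };
    struct rte_flow_item_mark vport_mask = { .id = UINT32_MAX };
    struct rte_flow_item steer_items[] = {
        { .type = RTE_FLOW_ITEM_TYPE_MARK,
          .spec = &vport_spec, .mask = &vport_mask },
        /* ... outer/inner header matches ... */
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };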
Every approach has its pros and cons. The preferred approach should
take into account the entire system architecture and is very often
vendor specific.

The proposed rte_flow_tunnel_decap_set helper function (drafted below)
is designed to provide a common, vendor-agnostic API for setting the
virtual port value. The helper API enables PMD implementations to
return a vendor-specific combination of rte_flow actions realizing the
vendor's hardware model for setting a tunnel port. Applications may
append the list of actions returned from the helper function when
creating an rte_flow rule in hardware.

Similarly, the rte_flow_tunnel_match helper (drafted below) allows
multiple hardware implementations to return a list of rte_flow items.

Miss handling
-------------

Packets going through multiple rte_flow groups are exposed to hw misses
due to partial packet processing. In such cases, the software should
continue the packet's processing from the point where the hardware
missed.

We propose a generic rte_flow_restore structure providing the state
that was stored in hardware when the packet missed. Currently, the
structure will provide the tunnel state of the packet that missed,
namely:

1. The group id that missed
2. The tunnel port that missed
3. Tunnel information that was stored in memory (due to decap action).

In the future, we may add additional fields as more state may be
stored in the device memory (e.g. ct_state).

Applications may query the state via a new
rte_flow_tunnel_get_restore_info(mbuf) API, thus allowing a vendor
specific implementation.

VXLAN code example:

Assume an application needs to do inner NAT on a VXLAN packet. The
first rule in group 0:

  flow create ingress group 0
    pattern eth / ipv4 / udp dst is 4789 / vxlan / end
    actions {pmd actions} / jump group 3 / end

The first VXLAN packet that arrives matches the rule in group 0 and
jumps to group 3. In group 3 the packet will miss, since there is no
flow to match, and will be delivered to the application. The
application will call rte_flow_get_restore_info() to get the packet
outer header and will insert a new rule in group 3 to match outer and
inner headers:

  flow create ingress group 3
    pattern {pmd items} / eth / ipv4 dst is 172.10.10.1 /
            udp dst 4789 / vxlan vni is 10 /
            ipv4 dst is 184.1.2.3 / end
    actions set_ipv4_dst 186.1.1.1 / queue index 3 / end

As a result of these rules, a VXLAN packet with vni=10, outer IPv4
dst=172.10.10.1 and inner IPv4 dst=184.1.2.3 will be received
decapsulated on queue 3, with IPv4 dst=186.1.1.1.

Note: a packet in group 3 is considered decapsulated. All actions in
that group will be done on the header that was inner before decap. The
application may specify the outer header to be matched on. It is the
PMD's responsibility to translate these items to outer metadata.

API usage:

/**
 * 1. Initiate RTE flow tunnel object
 */
const struct rte_flow_tunnel tunnel = {
	.type = RTE_FLOW_ITEM_TYPE_VXLAN,
	.tun_id = 10,
};

/**
 * 2. Obtain PMD tunnel actions
 *
 * pmd_actions is an intermediate variable the application uses to
 * compile the actions array
 */
struct rte_flow_action *pmd_actions;
rte_flow_tunnel_decap_set(port_id, &tunnel, &pmd_actions,
			  &num_pmd_actions, &error);

/**
 * 3. Offload the first rule:
 * it matches VXLAN traffic and jumps to group 3
 * (implicitly decaps the packet)
 */
app_actions  = jump group 3
rule_items   = app_items;	/** eth / ipv4 / udp / vxlan */
rule_actions = { pmd_actions, app_actions };
attr.group = 0;
flow_1 = rte_flow_create(port_id, &attr,
			 rule_items, rule_actions, &error);
/**
 * 4. After flow creation the application does not need to keep the
 * tunnel action resources.
 */
rte_flow_tunnel_action_decap_release(port_id, pmd_actions,
				     num_pmd_actions, &error);

/**
 * 5. Handle a miss on group 3: the partially offloaded packet missed
 * because there was no matching rule.
 */
struct rte_flow_restore_info info;
rte_flow_get_restore_info(port_id, mbuf, &info, &error);

/**
 * 6. Offload the NAT rule:
 */
app_items = { eth / ipv4 dst is 172.10.10.1 / udp dst 4789 /
	      vxlan vni is 10 / ipv4 dst is 184.1.2.3 }
app_actions = { set_ipv4_dst 186.1.1.1 / queue index 3 }
rte_flow_tunnel_match(port_id, &info.tunnel, &pmd_items,
		      &num_pmd_items, &error);
rule_items = { pmd_items, app_items };
rule_actions = app_actions;
attr.group = info.group_id;
flow_2 = rte_flow_create(port_id, &attr,
			 rule_items, rule_actions, &error);

/**
 * 7. Release PMD items after rule creation
 */
rte_flow_tunnel_item_release(port_id, pmd_items, num_pmd_items, &error);

References
1. https://mails.dpdk.org/archives/dev/2020-June/index.html

Signed-off-by: Eli Britstein
Signed-off-by: Gregory Etelson
Acked-by: Ori Kam
Acked-by: Viacheslav Ovsiienko
---
 doc/guides/prog_guide/rte_flow.rst       | 105 ++++++++++++
 doc/guides/rel_notes/release_20_11.rst   |  10 ++
 lib/librte_ethdev/rte_ethdev_version.map |   6 +
 lib/librte_ethdev/rte_flow.c             | 112 +++++++++++++
 lib/librte_ethdev/rte_flow.h             | 195 +++++++++++++++++++++++
 lib/librte_ethdev/rte_flow_driver.h      |  32 ++++
 6 files changed, 460 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 119b128739..e62030150e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3031,6 +3031,111 @@ operations include:
 - Duplication of a complete flow rule description.
 - Pattern item or action name retrieval.
 
+Tunneled traffic offload
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Provides applications with a unified rule model for tunneled traffic,
+regardless of the underlying hardware.
+
+ - The model introduces the concept of a virtual tunnel port (VTP).
+ - The model uses VTPs to offload ingress tunneled network traffic
+   with RTE flow rules.
+ - The model is implemented as a set of helper functions. Each PMD
+   implements VTP offload according to the underlying hardware offload
+   capabilities. Applications must query the PMD for VTP flow
+   items / actions before using them to create a VTP flow rule.
+
+The model components:
+
+- Virtual Tunnel Port (VTP) is a stateless software object that
+  describes tunneled network traffic. A VTP object usually contains
+  descriptions of outer headers, tunnel headers and inner headers.
+- Tunnel Steering flow Rule (TSR) detects tunneled packets and
+  delegates them to the tunnel processing infrastructure, implemented
+  in the PMD for optimal hardware utilization, for further processing.
+- Tunnel Matching flow Rule (TMR) verifies the packet configuration and
+  runs offload actions in case of a match.
+
+Application actions:
+
+1 Initialize VTP object according to tunnel network parameters.
+
+2 Create TSR flow rule.
+
+2.1 Query PMD for VTP actions. Application can query for VTP actions
+    more than once.
+
+  .. code-block:: c
+
+    int
+    rte_flow_tunnel_decap_set(uint16_t port_id,
+                              struct rte_flow_tunnel *tunnel,
+                              struct rte_flow_action **pmd_actions,
+                              uint32_t *num_of_pmd_actions,
+                              struct rte_flow_error *error);
+
+2.2 Integrate PMD actions into TSR actions list.
+
+2.3 Create TSR flow rule.
+
+  .. code-block:: console
+
+    flow create group 0 match {tunnel items} / end actions {PMD actions} / {App actions} / end
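+For illustration, a possible TSR creation sequence in C (a sketch
+only: error handling is omitted, and ``tunnel``, ``tunnel_items``,
+``app_actions`` and ``num_app_actions`` stand for application-prepared
+objects):
+
+  .. code-block:: c
+
+    struct rte_flow_action *pmd_actions;
+    uint32_t num_pmd_actions;
+    struct rte_flow_action rule_actions[16];
+    struct rte_flow_attr attr = { .group = 0, .ingress = 1 };
+    struct rte_flow_error error;
+    struct rte_flow *flow;
+
+    /* 2.1: query the PMD for the actions implementing the VTP decap */
+    rte_flow_tunnel_decap_set(port_id, &tunnel, &pmd_actions,
+                              &num_pmd_actions, &error);
+    /* 2.2: PMD actions first, application actions (ending with END) after */
+    memcpy(rule_actions, pmd_actions,
+           num_pmd_actions * sizeof(rule_actions[0]));
+    memcpy(rule_actions + num_pmd_actions, app_actions,
+           num_app_actions * sizeof(rule_actions[0]));
+    /* 2.3: create the TSR flow rule in group 0 */
+    flow = rte_flow_create(port_id, &attr, tunnel_items,
+                           rule_actions, &error);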
+3 Create TMR flow rule.
+
+3.1 Query PMD for VTP items. Application can query for VTP items more
+    than once.
+
+  .. code-block:: c
+
+    int
+    rte_flow_tunnel_match(uint16_t port_id,
+                          struct rte_flow_tunnel *tunnel,
+                          struct rte_flow_item **pmd_items,
+                          uint32_t *num_of_pmd_items,
+                          struct rte_flow_error *error);
+
+3.2 Integrate PMD items into TMR items list.
+
+3.3 Create TMR flow rule.
+
+  .. code-block:: console
+
+    flow create group 0 match {PMD items} / {APP items} / end actions {offload actions} / end
+
+The model provides a helper function call to restore packets that miss
+tunnel TMR rules to their original state:
+
+.. code-block:: c
+
+  int
+  rte_flow_get_restore_info(uint16_t port_id,
+                            struct rte_mbuf *mbuf,
+                            struct rte_flow_restore_info *info,
+                            struct rte_flow_error *error);
+
+The ``rte_flow_tunnel`` object filled by the call inside the
+``rte_flow_restore_info *info`` parameter can be used by the
+application to create a new TMR rule for that tunnel.
+
+The model requirements:
+
+The application must initialize the ``rte_flow_tunnel`` object with
+tunnel parameters before calling rte_flow_tunnel_decap_set() and
+rte_flow_tunnel_match().
+
+The PMD actions array obtained with rte_flow_tunnel_decap_set() must be
+released by the application with the
+rte_flow_tunnel_action_decap_release() call. The application can
+release the actions after the TSR rule was created.
+
+The PMD items array obtained with rte_flow_tunnel_match() must be
+released by the application with the rte_flow_tunnel_item_release()
+call. The application can release the items after the rule was created.
+However, if the application needs to create an additional TMR rule for
+the same tunnel, it will need to obtain the PMD items again.
+
+The application cannot destroy the ``rte_flow_tunnel`` object before it
+releases all PMD actions and PMD items referencing that tunnel.
+
 Caveats
 -------

diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index c6642f5f94..802d9e74d6 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,16 @@ New Features
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* **Flow rules allowed to use private PMD items / actions.**
+
+  * Flow rule verification was updated to accept private PMD
+    items and actions.
+
+* **Added generic API to offload tunneled traffic and restore missed packets.**
+
+  * Added a new hardware-independent helper API to the RTE flow library
+    that offloads tunneled traffic and restores missed packets.
+
 * **Updated Cisco enic driver.**
 
   * Added support for VF representors with single-queue Tx/Rx and flow API

diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index c95ef5157a..9832c138a2 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -226,6 +226,12 @@ EXPERIMENTAL {
 	rte_tm_wred_profile_add;
 	rte_tm_wred_profile_delete;
 
+	rte_flow_tunnel_decap_set;
+	rte_flow_tunnel_match;
+	rte_flow_get_restore_info;
+	rte_flow_tunnel_action_decap_release;
+	rte_flow_tunnel_item_release;
+
 	# added in 20.11
 	rte_eth_link_speed_to_str;
 	rte_eth_link_to_str;

diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index c8c6d62a8b..181c02792d 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -1267,3 +1267,115 @@ rte_flow_get_aged_flows(uint16_t port_id, void **contexts,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
 }
+
+int
+rte_flow_tunnel_decap_set(uint16_t port_id,
+			  struct rte_flow_tunnel *tunnel,
+			  struct rte_flow_action **actions,
+			  uint32_t *num_of_actions,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->tunnel_decap_set)) {
+		return flow_err(port_id,
+				ops->tunnel_decap_set(dev, tunnel, actions,
+						      num_of_actions, error),
+				error);
+	}
+	return rte_flow_error_set(error, ENOTSUP,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				  NULL, rte_strerror(ENOTSUP));
+}
+
+int
+rte_flow_tunnel_match(uint16_t port_id,
+		      struct rte_flow_tunnel *tunnel,
+		      struct rte_flow_item **items,
+		      uint32_t *num_of_items,
+		      struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->tunnel_match)) {
+		return flow_err(port_id,
+				ops->tunnel_match(dev, tunnel, items,
+						  num_of_items, error),
+				error);
+	}
+	return rte_flow_error_set(error, ENOTSUP,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				  NULL, rte_strerror(ENOTSUP));
+}
+
+int
+rte_flow_get_restore_info(uint16_t port_id,
+			  struct rte_mbuf *m,
+			  struct rte_flow_restore_info *restore_info,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->get_restore_info)) {
+		return flow_err(port_id,
+				ops->get_restore_info(dev, m, restore_info,
+						      error),
+				error);
+	}
+	return rte_flow_error_set(error, ENOTSUP,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				  NULL, rte_strerror(ENOTSUP));
+}
+
+int
+rte_flow_tunnel_action_decap_release(uint16_t port_id,
+				     struct rte_flow_action *actions,
+				     uint32_t num_of_actions,
+				     struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->action_release)) {
+		return flow_err(port_id,
+				ops->action_release(dev, actions,
+						    num_of_actions, error),
+				error);
+	}
+	return rte_flow_error_set(error, ENOTSUP,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				  NULL, rte_strerror(ENOTSUP));
+}
+
+int
+rte_flow_tunnel_item_release(uint16_t port_id,
+			     struct rte_flow_item *items,
+			     uint32_t num_of_items,
+			     struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->item_release)) {
+		return flow_err(port_id,
+				ops->item_release(dev, items,
+						  num_of_items, error),
+				error);
+	}
+	return rte_flow_error_set(error, ENOTSUP,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				  NULL, rte_strerror(ENOTSUP));
+}

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5489..2f12d3ea1a 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -3357,6 +3357,201 @@ int
 rte_flow_get_aged_flows(uint16_t port_id, void **contexts,
 			uint32_t nb_contexts, struct rte_flow_error *error);
 
+/* Tunnel has a type and the key information. */
+struct rte_flow_tunnel {
+	/**
+	 * Tunnel type, for example RTE_FLOW_ITEM_TYPE_VXLAN,
+	 * RTE_FLOW_ITEM_TYPE_NVGRE etc.
+	 */
+	enum rte_flow_item_type type;
+	uint64_t tun_id; /**< Tunnel identification. */
+
+	RTE_STD_C11
+	union {
+		struct {
+			rte_be32_t src_addr; /**< IPv4 source address. */
+			rte_be32_t dst_addr; /**< IPv4 destination address. */
+		} ipv4;
+		struct {
+			uint8_t src_addr[16]; /**< IPv6 source address. */
+			uint8_t dst_addr[16]; /**< IPv6 destination address. */
+		} ipv6;
+	};
+	rte_be16_t tp_src; /**< Tunnel port source. */
+	rte_be16_t tp_dst; /**< Tunnel port destination. */
+	uint16_t tun_flags; /**< Tunnel flags. */
+
+	bool is_ipv6; /**< True for valid IPv6 fields. Otherwise IPv4. */
+
+	/**
+	 * The following members are required to restore a packet
+	 * after a miss.
+	 */
+	uint8_t tos; /**< TOS for IPv4, TC for IPv6. */
+	uint8_t ttl; /**< TTL for IPv4, HL for IPv6. */
+	uint32_t label; /**< Flow Label for IPv6. */
+};
+
+/**
+ * Indicate that the packet has a tunnel.
+ */
+#define RTE_FLOW_RESTORE_INFO_TUNNEL (1ULL << 0)
+
+/**
+ * Indicate that the packet has a non-decapsulated tunnel header.
+ */
+#define RTE_FLOW_RESTORE_INFO_ENCAPSULATED (1ULL << 1)
+
+/**
+ * Indicate that the packet has a group_id.
+ */
+#define RTE_FLOW_RESTORE_INFO_GROUP_ID (1ULL << 2)
+
+/**
+ * Restore information structure to communicate the current packet processing
+ * state when some of the processing pipeline is done in hardware and should
+ * continue in software.
+ */
+struct rte_flow_restore_info {
+	/**
+	 * Bitwise flags (RTE_FLOW_RESTORE_INFO_*) to indicate validation of
+	 * other fields in struct rte_flow_restore_info.
+	 */
+	uint64_t flags;
+	uint32_t group_id; /**< Group ID where the packet missed. */
+	struct rte_flow_tunnel tunnel; /**< Tunnel information. */
+};
+
+/**
+ * Allocate an array of actions to be used in rte_flow_create, to implement
+ * tunnel-decap-set for the given tunnel.
+ * Sample usage:
+ *   actions vxlan_decap / tunnel-decap-set(tunnel properties) /
+ *           jump group 0 / end
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] tunnel
+ *   Tunnel properties.
+ * @param[out] actions
+ *   Array of actions to be allocated by the PMD. This array should be
+ *   concatenated with the actions array provided to rte_flow_create.
+ * @param[out] num_of_actions
+ *   Number of actions allocated.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_tunnel_decap_set(uint16_t port_id,
+			  struct rte_flow_tunnel *tunnel,
+			  struct rte_flow_action **actions,
+			  uint32_t *num_of_actions,
+			  struct rte_flow_error *error);
+
+/**
+ * Allocate an array of items to be used in rte_flow_create, to implement
+ * tunnel-match for the given tunnel.
+ * Sample usage:
+ *   pattern tunnel-match(tunnel properties) / outer-header-matches /
+ *           inner-header-matches / end
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] tunnel
+ *   Tunnel properties.
+ * @param[out] items
+ *   Array of items to be allocated by the PMD. This array should be
+ *   concatenated with the items array provided to rte_flow_create.
+ * @param[out] num_of_items
+ *   Number of items allocated.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_tunnel_match(uint16_t port_id,
+		      struct rte_flow_tunnel *tunnel,
+		      struct rte_flow_item **items,
+		      uint32_t *num_of_items,
+		      struct rte_flow_error *error);
+
+/**
+ * Populate the current packet processing state, if it exists, for the
+ * given mbuf.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] m
+ *   Mbuf struct.
+ * @param[out] info
+ *   Restore information. Upon success contains the HW state.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_get_restore_info(uint16_t port_id,
+			  struct rte_mbuf *m,
+			  struct rte_flow_restore_info *info,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the action array as allocated by rte_flow_tunnel_decap_set.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] actions
+ *   Array of actions to be released.
+ * @param[in] num_of_actions
+ *   Number of elements in actions array.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_tunnel_action_decap_release(uint16_t port_id,
+				     struct rte_flow_action *actions,
+				     uint32_t num_of_actions,
+				     struct rte_flow_error *error);
+
+/**
+ * Release the item array as allocated by rte_flow_tunnel_match.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] items
+ *   Array of items to be released.
+ * @param[in] num_of_items
+ *   Number of elements in the items array.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_tunnel_item_release(uint16_t port_id,
+			     struct rte_flow_item *items,
+			     uint32_t num_of_items,
+			     struct rte_flow_error *error);
 #ifdef __cplusplus
 }
 #endif

diff --git a/lib/librte_ethdev/rte_flow_driver.h b/lib/librte_ethdev/rte_flow_driver.h
index 3ee871d3eb..9d87407203 100644
--- a/lib/librte_ethdev/rte_flow_driver.h
+++ b/lib/librte_ethdev/rte_flow_driver.h
@@ -108,6 +108,38 @@ struct rte_flow_ops {
 		 void **context,
 		 uint32_t nb_contexts,
 		 struct rte_flow_error *err);
+	/** See rte_flow_tunnel_decap_set() */
+	int (*tunnel_decap_set)
+		(struct rte_eth_dev *dev,
+		 struct rte_flow_tunnel *tunnel,
+		 struct rte_flow_action **pmd_actions,
+		 uint32_t *num_of_actions,
+		 struct rte_flow_error *err);
+	/** See rte_flow_tunnel_match() */
+	int (*tunnel_match)
+		(struct rte_eth_dev *dev,
+		 struct rte_flow_tunnel *tunnel,
+		 struct rte_flow_item **pmd_items,
+		 uint32_t *num_of_items,
+		 struct rte_flow_error *err);
+	/** See rte_flow_get_restore_info() */
+	int (*get_restore_info)
+		(struct rte_eth_dev *dev,
+		 struct rte_mbuf *m,
+		 struct rte_flow_restore_info *info,
+		 struct rte_flow_error *err);
+	/** See rte_flow_tunnel_action_decap_release() */
+	int (*action_release)
+		(struct rte_eth_dev *dev,
+		 struct rte_flow_action *pmd_actions,
+		 uint32_t num_of_actions,
+		 struct rte_flow_error *err);
+	/** See rte_flow_tunnel_item_release() */
+	int (*item_release)
+		(struct rte_eth_dev *dev,
+		 struct rte_flow_item *pmd_items,
+		 uint32_t num_of_items,
+		 struct rte_flow_error *err);
 };
 
 /**

From patchwork Wed Sep 30 09:18:52 2020
X-Patchwork-Submitter: Gregory Etelson
X-Patchwork-Id: 79307
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: Gregory Etelson
To: dev@dpdk.org
CC: Viacheslav Ovsiienko, Matan Azrad, Shahaf Shuler, John McNamara,
    Marko Kovacevic
Date: Wed, 30 Sep 2020 12:18:52 +0300
Message-ID: <20200930091854.19768-4-getelson@nvidia.com>
In-Reply-To: <20200930091854.19768-1-getelson@nvidia.com>
References: <20200625160348.26220-1-getelson@mellanox.com>
 <20200930091854.19768-1-getelson@nvidia.com>
Subject: [dpdk-dev] [PATCH v3 3/4] net/mlx5: implement tunnel offload API

Tunnel Offload API provides a hardware-independent, unified model to
offload tunneled traffic. Key model elements are:
- apply matches to both outer and inner packet headers during the
  entire offload procedure;
- restore the outer header of a partially offloaded packet;
- the model is implemented as a set of helper functions.

Implementation details:
* Tunnel offload mode is engaged with the ``dv_xmeta_en=3`` device
  parameter (see the mlx5 documentation update below).
* The application cannot use MARK and META flow actions with tunnels.
* The offload JUMP action is restricted to steering tunnel rules only.

Signed-off-by: Gregory Etelson
Acked-by: Viacheslav Ovsiienko
---
v2:
* introduce MLX5 PMD API implementation
v3:
* bug fixes
---
 doc/guides/nics/mlx5.rst         |   3 +
 drivers/net/mlx5/linux/mlx5_os.c |  18 +
 drivers/net/mlx5/mlx5.c          |   8 +-
 drivers/net/mlx5/mlx5.h          |   3 +
 drivers/net/mlx5/mlx5_defs.h     |   2 +
 drivers/net/mlx5/mlx5_flow.c     | 678 ++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_flow.h     | 173 +++++++-
 drivers/net/mlx5/mlx5_flow_dv.c  | 241 +++++++++--
 8 files changed, 1080 insertions(+), 46 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 211c0c5a6c..287fd23b43 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -815,6 +815,9 @@ Driver options
   24 bits. The actual supported width can be retrieved in runtime by
   series of rte_flow_validate() trials.
 
+  - 3, this engages tunnel offload mode. In E-Switch configuration, that
+    mode implicitly activates ``dv_xmeta_en=1``.
+
   +------+-----------+-----------+-------------+-------------+
   | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
   +======+===========+===========+=============+=============+

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 81a2e99e71..ecf4d2f2a6 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -298,6 +298,12 @@ mlx5_alloc_shared_dr(struct mlx5_priv *priv)
 		sh->esw_drop_action = mlx5_glue->dr_create_flow_action_drop();
 	}
 #endif
+	if (!sh->tunnel_hub)
+		err = mlx5_alloc_tunnel_hub(sh);
+	if (err) {
+		DRV_LOG(ERR, "mlx5_alloc_tunnel_hub failed err=%d", err);
+		goto error;
+	}
 	if (priv->config.reclaim_mode == MLX5_RCM_AGGR) {
 		mlx5_glue->dr_reclaim_domain_memory(sh->rx_domain, 1);
 		mlx5_glue->dr_reclaim_domain_memory(sh->tx_domain, 1);
@@ -344,6 +350,10 @@ mlx5_alloc_shared_dr(struct mlx5_priv *priv)
 		mlx5_hlist_destroy(sh->tag_table, NULL, NULL);
 		sh->tag_table = NULL;
 	}
+	if (sh->tunnel_hub) {
+		mlx5_release_tunnel_hub(sh, priv->dev_port);
+		sh->tunnel_hub = NULL;
+	}
 	mlx5_free_table_hash_list(priv);
 	return err;
 }
@@ -405,6 +415,10 @@ mlx5_os_free_shared_dr(struct mlx5_priv *priv)
 		mlx5_hlist_destroy(sh->tag_table, NULL, NULL);
 		sh->tag_table = NULL;
 	}
+	if (sh->tunnel_hub) {
+		mlx5_release_tunnel_hub(sh, priv->dev_port);
+		sh->tunnel_hub = NULL;
+	}
 	mlx5_free_table_hash_list(priv);
 }
@@ -658,6 +672,10 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			strerror(rte_errno));
 		goto error;
 	}
+	if (config->dv_miss_info) {
+		if (switch_info->master || switch_info->representor)
+			config->dv_xmeta_en = MLX5_XMETA_MODE_META16;
+	}
 	mlx5_malloc_mem_select(config->sys_mem_en);
 	sh = mlx5_alloc_shared_dev_ctx(spawn, config);
 	if (!sh)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 4a807fb4fd..569a865da8 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1590,13 +1590,17 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 	} else if (strcmp(MLX5_DV_XMETA_EN, key) == 0) {
 		if (tmp != MLX5_XMETA_MODE_LEGACY &&
 		    tmp != MLX5_XMETA_MODE_META16 &&
-		    tmp != MLX5_XMETA_MODE_META32) {
+		    tmp != MLX5_XMETA_MODE_META32 &&
+		    tmp != MLX5_XMETA_MODE_MISS_INFO) {
 			DRV_LOG(ERR, "invalid extensive "
 				     "metadata parameter");
 			rte_errno = EINVAL;
 			return -rte_errno;
 		}
-		config->dv_xmeta_en = tmp;
+		if (tmp != MLX5_XMETA_MODE_MISS_INFO)
+			config->dv_xmeta_en = tmp;
+		else
+			config->dv_miss_info = 1;
 	} else if (strcmp(MLX5_LACP_BY_USER, key) == 0) {
 		config->lacp_by_user = !!tmp;
 	} else if (strcmp(MLX5_MR_EXT_MEMSEG_EN, key) == 0) {

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 0907506755..e12c4cee4b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -206,6 +206,7 @@ struct mlx5_dev_config {
 	unsigned int rt_timestamp:1; /* realtime timestamp format. */
 	unsigned int sys_mem_en:1; /* The default memory allocator. */
 	unsigned int decap_en:1; /* Whether decap will be used or not. */
+	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
@@ -632,6 +633,8 @@ struct mlx5_dev_ctx_shared {
 	/* UAR same-page access control required in 32bit implementations. */
 #endif
 	struct mlx5_hlist *flow_tbls;
+	struct rte_hash *flow_tbl_map; /* app group-to-flow table map */
+	struct mlx5_flow_tunnel_hub *tunnel_hub;
 	/* Direct Rules tables for FDB, NIC TX+RX */
 	void *esw_drop_action; /* Pointer to DR E-Switch drop action. */
 	void *pop_vlan_action; /* Pointer to DR pop VLAN action. */

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 0df47391ee..41a7537d5e 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -165,6 +165,8 @@
 #define MLX5_XMETA_MODE_LEGACY 0
 #define MLX5_XMETA_MODE_META16 1
 #define MLX5_XMETA_MODE_META32 2
+/* Provide info on partial hw miss. Implies MLX5_XMETA_MODE_META16 */
+#define MLX5_XMETA_MODE_MISS_INFO 3
 
 /* MLX5_TX_DB_NC supported values. */
 #define MLX5_TXDB_CACHED 0

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 416505f1c8..26625e0bd8 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -30,6 +31,18 @@
 #include "mlx5_flow_os.h"
 #include "mlx5_rxtx.h"
 
+static struct mlx5_flow_tunnel *
+mlx5_find_tunnel_id(struct rte_eth_dev *dev, uint32_t id);
+static void
+mlx5_flow_tunnel_free(struct rte_eth_dev *dev, struct mlx5_flow_tunnel *tunnel);
+static uint32_t
+tunnel_flow_group_to_flow_table(struct rte_eth_dev *dev,
+				const struct mlx5_flow_tunnel *tunnel,
+				uint32_t group, uint32_t *table,
+				struct rte_flow_error *error);
+static const struct mlx5_flow_tbl_data_entry *
+tunnel_mark_decode(struct rte_eth_dev *dev, uint32_t mark);
+
 /** Device flow drivers. */
 extern const struct mlx5_flow_driver_ops mlx5_flow_verbs_drv_ops;
 
@@ -220,6 +233,171 @@ static const struct rte_flow_expand_node mlx5_support_expansion[] = {
 	},
 };
 
+struct tunnel_validation {
+	bool verdict;
+	const char *msg;
+};
+
+static inline struct tunnel_validation
+mlx5_flow_tunnel_validate(struct rte_eth_dev *dev,
+			  struct rte_flow_tunnel *tunnel)
+{
+	struct tunnel_validation tv;
+
+	if (!is_tunnel_offload_active(dev)) {
+		tv.msg = "tunnel offload was not activated";
+		goto err;
+	} else if (!tunnel) {
+		tv.msg = "no application tunnel";
+		goto err;
+	}
+
+	switch (tunnel->type) {
+	default:
+		tv.msg = "unsupported tunnel type";
+		goto err;
+	case RTE_FLOW_ITEM_TYPE_VXLAN:
+		break;
+	}
+
+	tv.verdict = true;
+	return tv;
+
+err:
+	tv.verdict = false;
+	return tv;
+}
+
+static int
+mlx5_flow_tunnel_decap_set(struct rte_eth_dev *dev,
+			   struct rte_flow_tunnel *app_tunnel,
+			   struct rte_flow_action **actions,
+			   uint32_t *num_of_actions,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct mlx5_flow_tunnel *tunnel;
+	struct tunnel_validation tv;
+
+	tv = mlx5_flow_tunnel_validate(dev, app_tunnel);
+	if (!tv.verdict)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION_CONF, NULL,
+					  tv.msg);
+	ret = mlx5_get_flow_tunnel(dev, app_tunnel, &tunnel);
+	if (ret < 0) {
+		return rte_flow_error_set(error, ret,
+					  RTE_FLOW_ERROR_TYPE_ACTION_CONF, NULL,
+					  "failed to initialize pmd tunnel");
+	}
+	*actions = &tunnel->action;
+	*num_of_actions = 1;
+	return 0;
+}
+
+static int
+mlx5_flow_tunnel_match(struct rte_eth_dev *dev,
+		       struct rte_flow_tunnel *app_tunnel,
+		       struct rte_flow_item **items,
+		       uint32_t *num_of_items,
+		       struct rte_flow_error *error)
+{
+	int ret;
+	struct mlx5_flow_tunnel *tunnel;
+	struct tunnel_validation tv;
+
+	tv = mlx5_flow_tunnel_validate(dev, app_tunnel);
+	if (!tv.verdict)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+					  tv.msg);
+	ret = mlx5_get_flow_tunnel(dev, app_tunnel, &tunnel);
+	if (ret < 0) {
+		return rte_flow_error_set(error, ret,
+					  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+					  "failed to initialize pmd tunnel");
+	}
+	*items = &tunnel->item;
+	*num_of_items = 1;
+	return 0;
+}
+
+static int
+mlx5_flow_item_release(struct rte_eth_dev *dev,
+		       struct rte_flow_item *pmd_items,
+		       uint32_t num_items, struct rte_flow_error *err)
+{
+	struct mlx5_flow_tunnel_hub *thub = mlx5_tunnel_hub(dev);
+	struct mlx5_flow_tunnel *tun;
+
+	LIST_FOREACH(tun, &thub->tunnels, chain) {
+		if (&tun->item == pmd_items)
+			break;
+	}
+	if (!tun || num_items != 1)
+		return rte_flow_error_set(err, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+					  "invalid argument");
+	if (!__atomic_sub_fetch(&tun->refctn, 1, __ATOMIC_RELAXED))
+		mlx5_flow_tunnel_free(dev, tun);
+	return 0;
+}
+
+static int
+mlx5_flow_action_release(struct rte_eth_dev *dev,
+			 struct rte_flow_action *pmd_actions,
+			 uint32_t num_actions, struct rte_flow_error *err)
+{
+	struct mlx5_flow_tunnel_hub *thub = mlx5_tunnel_hub(dev);
+	struct mlx5_flow_tunnel *tun;
+
+	LIST_FOREACH(tun, &thub->tunnels, chain) {
+		if (&tun->action == pmd_actions)
+			break;
+	}
+	if (!tun || num_actions != 1)
+		return rte_flow_error_set(err, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+					  "invalid argument");
+	if (!__atomic_sub_fetch(&tun->refctn, 1, __ATOMIC_RELAXED))
+		mlx5_flow_tunnel_free(dev, tun);
+
+	return 0;
+}
+
+static int
+mlx5_flow_tunnel_get_restore_info(struct rte_eth_dev *dev,
+				  struct rte_mbuf *m,
+				  struct rte_flow_restore_info *info,
+				  struct rte_flow_error *err)
+{
+	uint64_t ol_flags = m->ol_flags;
+	const struct mlx5_flow_tbl_data_entry *tble;
+	const uint64_t mask = PKT_RX_FDIR | PKT_RX_FDIR_ID;
+
+	if ((ol_flags & mask) != mask)
+		goto err;
+	tble = tunnel_mark_decode(dev, m->hash.fdir.hi);
+	if (!tble) {
+		DRV_LOG(DEBUG, "port %u invalid miss tunnel mark %#x",
+			dev->data->port_id, m->hash.fdir.hi);
+		goto err;
+	}
+	MLX5_ASSERT(tble->tunnel);
+	memcpy(&info->tunnel, &tble->tunnel->app_tunnel, sizeof(info->tunnel));
+	info->group_id = tble->group_id;
+	info->flags = RTE_FLOW_RESTORE_INFO_TUNNEL |
+		      RTE_FLOW_RESTORE_INFO_GROUP_ID |
+		      RTE_FLOW_RESTORE_INFO_ENCAPSULATED;
+
+	return 0;
+
+err:
+	return rte_flow_error_set(err, EINVAL,
+				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				  "failed to get restore info");
+}
+
 static const struct rte_flow_ops mlx5_flow_ops = {
 	.validate = mlx5_flow_validate,
 	.create = mlx5_flow_create,
@@ -229,6 +407,11 @@ static const struct rte_flow_ops mlx5_flow_ops = {
 	.query = mlx5_flow_query,
 	.dev_dump = mlx5_flow_dev_dump,
 	.get_aged_flows = mlx5_flow_get_aged_flows,
+	.tunnel_decap_set = mlx5_flow_tunnel_decap_set,
+	.tunnel_match = mlx5_flow_tunnel_match,
+	.action_release = mlx5_flow_action_release,
+	.item_release = mlx5_flow_item_release,
+	.get_restore_info = mlx5_flow_tunnel_get_restore_info,
 };
 
 /* Convert FDIR request to Generic flow. */
@@ -3524,6 +3707,136 @@ flow_hairpin_split(struct rte_eth_dev *dev,
 	return 0;
 }
 
+__extension__
+union tunnel_offload_mark {
+	uint32_t val;
+	struct {
+		uint32_t app_reserve:8;
+		uint32_t table_id:15;
+		uint32_t transfer:1;
+		uint32_t _unused_:8;
+	};
+};
+
+struct tunnel_default_miss_ctx {
+	uint16_t *queue;
+	__extension__
+	union {
+		struct rte_flow_action_rss action_rss;
+		struct rte_flow_action_queue miss_queue;
+		struct rte_flow_action_jump miss_jump;
+		uint8_t raw[0];
+	};
+};
+
+static int
+flow_tunnel_add_default_miss(struct rte_eth_dev *dev,
+			     struct rte_flow *flow,
+			     const struct rte_flow_attr *attr,
+			     const struct rte_flow_action *app_actions,
+			     uint32_t flow_idx,
+			     struct tunnel_default_miss_ctx *ctx,
+			     struct rte_flow_error *error)
+{
+	struct mlx5_flow *dev_flow;
+	struct rte_flow_attr miss_attr = *attr;
+	const struct mlx5_flow_tunnel *tunnel = app_actions[0].conf;
+	const struct rte_flow_item miss_items[2] = {
+		{
+			.type = RTE_FLOW_ITEM_TYPE_ETH,
+			.spec = NULL,
+			.last = NULL,
+			.mask = NULL
+		},
+		{
+			.type = RTE_FLOW_ITEM_TYPE_END,
+			.spec = NULL,
+			.last = NULL,
+			.mask = NULL
+		}
+	};
+	union tunnel_offload_mark mark_id;
+	struct rte_flow_action_mark miss_mark;
+	struct rte_flow_action miss_actions[3] = {
+		[0] = { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &miss_mark },
+		[2] = { .type = RTE_FLOW_ACTION_TYPE_END, .conf = NULL }
+	};
+	const struct rte_flow_action_jump *jump_data;
+	uint32_t i, flow_table = 0; /* prevent compilation warning */
+	int ret;
+
+	if (!attr->transfer) {
+		struct mlx5_priv *priv = dev->data->dev_private;
+		uint32_t q_size;
+
+		miss_actions[1].type = RTE_FLOW_ACTION_TYPE_RSS;
+		q_size = priv->reta_idx_n * sizeof(ctx->queue[0]);
+		ctx->queue = mlx5_malloc(MLX5_MEM_SYS | MLX5_MEM_ZERO, q_size,
+					 0, SOCKET_ID_ANY);
+		if (!ctx->queue)
+			return rte_flow_error_set
+				(error, ENOMEM,
+				 RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+				 NULL, "invalid default miss RSS");
+		ctx->action_rss.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
+		ctx->action_rss.level = 0,
+		ctx->action_rss.types = priv->rss_conf.rss_hf,
+		ctx->action_rss.key_len = priv->rss_conf.rss_key_len,
+		ctx->action_rss.queue_num = priv->reta_idx_n,
+		ctx->action_rss.key = priv->rss_conf.rss_key,
+		ctx->action_rss.queue = ctx->queue;
+		if (!priv->reta_idx_n || !priv->rxqs_n)
+			return rte_flow_error_set
+				(error, EINVAL,
+				 RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+				 NULL, "invalid port configuration");
+		if (!(dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG))
+			ctx->action_rss.types = 0;
+		for (i = 0; i != priv->reta_idx_n; ++i)
+			ctx->queue[i] = (*priv->reta_idx)[i];
+	} else {
+		miss_actions[1].type = RTE_FLOW_ACTION_TYPE_JUMP;
+		ctx->miss_jump.group = MLX5_TNL_MISS_FDB_JUMP_GRP;
+	}
+	miss_actions[1].conf = (typeof(miss_actions[1].conf))ctx->raw;
+	for (; app_actions->type != RTE_FLOW_ACTION_TYPE_JUMP; app_actions++);
+	jump_data = app_actions->conf;
+	miss_attr.priority = MLX5_TNL_MISS_RULE_PRIORITY;
+	miss_attr.group = jump_data->group;
+	ret = tunnel_flow_group_to_flow_table(dev, tunnel, jump_data->group,
+					      &flow_table, error);
+	if (ret)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+					  NULL, "invalid tunnel id");
+	mark_id.app_reserve = 0;
+	mark_id.table_id = tunnel_flow_tbl_to_id(flow_table);
+	mark_id.transfer = !!attr->transfer;
+	mark_id._unused_ = 0;
+	miss_mark.id = mark_id.val;
+	dev_flow = flow_drv_prepare(dev, flow, &miss_attr,
+				    miss_items, miss_actions, flow_idx, error);
+	if (!dev_flow)
+		return -rte_errno;
+	dev_flow->flow = flow;
+	dev_flow->external = true;
+	dev_flow->tunnel = tunnel;
+	/* Subflow object was created, we must include one in the list. */
+	SILIST_INSERT(&flow->dev_handles, dev_flow->handle_idx,
+		      dev_flow->handle, next);
+	DRV_LOG(DEBUG,
+		"port %u tunnel type=%d id=%u miss rule priority=%u group=%u",
+		dev->data->port_id, tunnel->app_tunnel.type,
+		tunnel->tunnel_id, miss_attr.priority, miss_attr.group);
+	ret = flow_drv_translate(dev, dev_flow, &miss_attr, miss_items,
+				 miss_actions, error);
+	if (!ret)
+		ret = flow_mreg_update_copy_table(dev, flow, miss_actions,
+						  error);
+
+	return ret;
+}
+
 /**
  * The last stage of splitting chain, just creates the subflow
  * without any modification.
@@ -4296,6 +4609,27 @@ flow_create_split_outer(struct rte_eth_dev *dev,
 	return ret;
 }
 
+static struct mlx5_flow_tunnel *
+flow_tunnel_from_rule(struct rte_eth_dev *dev,
+		      const struct rte_flow_attr *attr,
+		      const struct rte_flow_item items[],
+		      const struct rte_flow_action actions[])
+{
+	struct mlx5_flow_tunnel *tunnel;
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wcast-qual"
+	if (is_flow_tunnel_match_rule(dev, attr, items, actions))
+		tunnel = (struct mlx5_flow_tunnel *)items[0].spec;
+	else if (is_flow_tunnel_steer_rule(dev, attr, items, actions))
+		tunnel = (struct mlx5_flow_tunnel *)actions[0].conf;
+	else
+		tunnel = NULL;
+#pragma GCC diagnostic pop
+
+	return tunnel;
+}
+
 /**
  * Create a flow and add it to @p list.
  *
@@ -4356,6 +4690,8 @@ flow_list_create(struct rte_eth_dev *dev, uint32_t *list,
 	int hairpin_flow;
 	uint32_t hairpin_id = 0;
 	struct rte_flow_attr attr_tx = { .priority = 0 };
+	struct mlx5_flow_tunnel *tunnel;
+	struct tunnel_default_miss_ctx default_miss_ctx = { 0, };
 	int ret;
 
 	hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
@@ -4430,6 +4766,19 @@ flow_list_create(struct rte_eth_dev *dev, uint32_t *list,
 					      error);
 		if (ret < 0)
 			goto error;
+		if (is_flow_tunnel_steer_rule(dev, attr,
+					      buf->entry[i].pattern,
+					      p_actions_rx)) {
+			ret = flow_tunnel_add_default_miss(dev, flow, attr,
+							   p_actions_rx,
+							   idx,
+							   &default_miss_ctx,
+							   error);
+			if (ret < 0) {
+				mlx5_free(default_miss_ctx.queue);
+				goto error;
+			}
+		}
 	}
 	/* Create the tx flow. */
 	if (hairpin_flow) {
@@ -4484,6 +4833,13 @@ flow_list_create(struct rte_eth_dev *dev, uint32_t *list,
 	priv->flow_idx = priv->flow_nested_idx;
 	if (priv->flow_nested_idx)
 		priv->flow_nested_idx = 0;
+	tunnel = flow_tunnel_from_rule(dev, attr, items, actions);
+	if (tunnel) {
+		flow->tunnel = 1;
+		flow->tunnel_id = tunnel->tunnel_id;
+		__atomic_add_fetch(&tunnel->refctn, 1, __ATOMIC_RELAXED);
+		mlx5_free(default_miss_ctx.queue);
+	}
 	return idx;
 error:
 	MLX5_ASSERT(flow);
@@ -4603,6 +4959,7 @@ mlx5_flow_create(struct rte_eth_dev *dev,
 				   "port not started");
 		return NULL;
 	}
+
 	return (void *)(uintptr_t)flow_list_create(dev, &priv->flows, attr,
 						   items, actions, true, error);
 }
@@ -4657,6 +5014,13 @@ flow_list_destroy(struct rte_eth_dev *dev, uint32_t *list,
 		}
 	}
+	if (flow->tunnel) {
+		struct mlx5_flow_tunnel *tunnel;
+
+		tunnel = mlx5_find_tunnel_id(dev, flow->tunnel_id);
+		RTE_VERIFY(tunnel);
+		if (!__atomic_sub_fetch(&tunnel->refctn, 1, __ATOMIC_RELAXED))
+			mlx5_flow_tunnel_free(dev, tunnel);
+	}
 	mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_RTE_FLOW], flow_idx);
 }
 
 /**
@@ -6131,19 +6495,122 @@ mlx5_flow_async_pool_query_handle(struct mlx5_dev_ctx_shared *sh,
 	sh->cmng.pending_queries--;
 }
 
+static const struct mlx5_flow_tbl_data_entry *
+tunnel_mark_decode(struct rte_eth_dev *dev, uint32_t mark)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_dev_ctx_shared *sh = priv->sh;
+	struct mlx5_hlist_entry *he;
+	union tunnel_offload_mark mbits = { .val = mark };
+	union mlx5_flow_tbl_key table_key = {
+		{
+			.table_id = tunnel_id_to_flow_tbl(mbits.table_id),
+			.reserved = 0,
+			.domain = !!mbits.transfer,
+			.direction = 0,
+		}
+	};
+	he = mlx5_hlist_lookup(sh->flow_tbls, table_key.v64);
+	return he ?
+	       container_of(he, struct mlx5_flow_tbl_data_entry, entry) : NULL;
+}
+
+static uint32_t
+tunnel_flow_group_to_flow_table(struct rte_eth_dev *dev,
+				const struct mlx5_flow_tunnel *tunnel,
+				uint32_t group, uint32_t *table,
+				struct rte_flow_error *error)
+{
+	struct mlx5_hlist_entry *he;
+	struct tunnel_tbl_entry *tte;
+	union tunnel_tbl_key key = {
+		.tunnel_id = tunnel ? tunnel->tunnel_id : 0,
+		.group = group
+	};
+	struct mlx5_flow_tunnel_hub *thub = mlx5_tunnel_hub(dev);
+	struct mlx5_hlist *group_hash;
+
+	group_hash = tunnel ? tunnel->groups : thub->groups;
+	he = mlx5_hlist_lookup(group_hash, key.val);
+	if (!he) {
+		int ret;
+		tte = mlx5_malloc(MLX5_MEM_SYS | MLX5_MEM_ZERO,
+				  sizeof(*tte), 0,
+				  SOCKET_ID_ANY);
+		if (!tte)
+			goto err;
+		tte->hash.key = key.val;
+		ret = mlx5_flow_id_get(thub->table_ids, &tte->flow_table);
+		if (ret) {
+			mlx5_free(tte);
+			goto err;
+		}
+		tte->flow_table = tunnel_id_to_flow_tbl(tte->flow_table);
+		mlx5_hlist_insert(group_hash, &tte->hash);
+	} else {
+		tte = container_of(he, typeof(*tte), hash);
+	}
+	*table = tte->flow_table;
+	DRV_LOG(DEBUG, "port %u tunnel %u group=%#x table=%#x",
+		dev->data->port_id, key.tunnel_id, group, *table);
+	return 0;
+
+err:
+	return rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+				  NULL, "tunnel group index not supported");
+}
+
+static int
+flow_group_to_table(uint32_t port_id, uint32_t group, uint32_t *table,
+		    struct flow_grp_info grp_info, struct rte_flow_error *error)
+{
+	if (grp_info.transfer && grp_info.external && grp_info.fdb_def_rule) {
+		if (group == UINT32_MAX)
+			return rte_flow_error_set
+				(error, EINVAL,
+				 RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+				 NULL,
+				 "group index not supported");
+		*table = group + 1;
+	} else {
+		*table = group;
+	}
+	DRV_LOG(DEBUG, "port %u group=%#x table=%#x", port_id, group, *table);
+	return 0;
+}
+
 /**
  * Translate the rte_flow group index to HW table value.
  *
- * @param[in] attributes
- *   Pointer to flow attributes
- * @param[in] external
- *   Value is part of flow rule created by request external to PMD.
+ * If tunnel offload is disabled, all group ids are converted to flow
+ * table ids using the standard method.
+ * If tunnel offload is enabled, a group id can be converted using either
+ * the standard or the tunnel conversion method. The group conversion
+ * method selection depends on the flags in the `grp_info` parameter:
+ * - Internal (grp_info.external == 0) groups are converted using the
+ *   standard method.
+ * - Group ids in JUMP actions are converted with the tunnel conversion.
+ * - A group id in a rule attribute is converted depending on the rule
+ *   type and the group id value:
+ *   ** a non-zero group attribute is converted with the tunnel method
+ *   ** a zero group attribute in a non-tunnel rule is converted using the
+ *      standard method - there's only one root table
+ *   ** a zero group attribute in a steer tunnel rule is converted with the
+ *      standard method - single root table
+ *   ** a zero group attribute in a match tunnel rule is a special OvS
+ *      case: that value is used for portability reasons. That group
+ *      id is converted with the tunnel conversion method.
+ *
+ * @param[in] dev
+ *   Port device
+ * @param[in] tunnel
+ *   PMD tunnel offload object
 * @param[in] group
 *   rte_flow group index value.
- * @param[out] fdb_def_rule
- *   Whether fdb jump to table 1 is configured.
 * @param[out] table
 *   HW table value.
+ * @param[in] grp_info
+ *   flags used for conversion
 * @param[out] error
 *   Pointer to error structure.
 *
 * @return
 *   0 on success, a negative errno value otherwise and rte_errno is set.
 */
 int
-mlx5_flow_group_to_table(const struct rte_flow_attr *attributes, bool external,
-			 uint32_t group, bool fdb_def_rule, uint32_t *table,
+mlx5_flow_group_to_table(struct rte_eth_dev *dev,
+			 const struct mlx5_flow_tunnel *tunnel,
+			 uint32_t group, uint32_t *table,
+			 struct flow_grp_info grp_info,
 			 struct rte_flow_error *error)
 {
-	if (attributes->transfer && external && fdb_def_rule) {
-		if (group == UINT32_MAX)
-			return rte_flow_error_set
-				(error, EINVAL,
-				 RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
-				 NULL,
-				 "group index not supported");
-		*table = group + 1;
+	int ret;
+	bool standard_translation;
+
+	if (is_tunnel_offload_active(dev)) {
+		standard_translation = !grp_info.external ||
+					grp_info.std_tbl_fix;
 	} else {
-		*table = group;
+		standard_translation = true;
 	}
-	return 0;
+	DRV_LOG(DEBUG,
+		"port %u group=%#x transfer=%d external=%d fdb_def_rule=%d translate=%s",
+		dev->data->port_id, group, grp_info.transfer,
+		grp_info.external, grp_info.fdb_def_rule,
+		standard_translation ? "STANDARD" : "TUNNEL");
+	if (standard_translation)
+		ret = flow_group_to_table(dev->data->port_id, group, table,
+					  grp_info, error);
+	else
+		ret = tunnel_flow_group_to_flow_table(dev, tunnel, group,
+						      table, error);
+
+	return ret;
 }
 
 /**
@@ -6305,3 +6784,166 @@ mlx5_flow_get_aged_flows(struct rte_eth_dev *dev, void **contexts,
 		dev->data->port_id);
 	return -ENOTSUP;
 }
+
+static void
+mlx5_flow_tunnel_free(struct rte_eth_dev *dev,
+		      struct mlx5_flow_tunnel *tunnel)
+{
+	struct mlx5_flow_tunnel_hub *thub = mlx5_tunnel_hub(dev);
+	struct mlx5_flow_id_pool *id_pool = thub->tunnel_ids;
+
+	DRV_LOG(DEBUG, "port %u release pmd tunnel id=0x%x",
+		dev->data->port_id, tunnel->tunnel_id);
+	RTE_VERIFY(!__atomic_load_n(&tunnel->refctn, __ATOMIC_RELAXED));
+	LIST_REMOVE(tunnel, chain);
+	mlx5_flow_id_release(id_pool, tunnel->tunnel_id);
+	mlx5_hlist_destroy(tunnel->groups, NULL, NULL);
+	mlx5_free(tunnel);
+}
+
+static struct mlx5_flow_tunnel *
+mlx5_find_tunnel_id(struct rte_eth_dev *dev, uint32_t id)
+{
+	struct mlx5_flow_tunnel_hub *thub = mlx5_tunnel_hub(dev);
+	struct mlx5_flow_tunnel *tun;
+
+	LIST_FOREACH(tun, &thub->tunnels, chain) {
+		if (tun->tunnel_id == id)
+			break;
+	}
+
+	return tun;
+}
+
+static struct mlx5_flow_tunnel *
+mlx5_flow_tunnel_allocate(struct rte_eth_dev *dev,
+			  const struct rte_flow_tunnel *app_tunnel)
+{
+	int ret;
+	struct mlx5_flow_tunnel *tunnel;
+	struct mlx5_flow_tunnel_hub *thub = mlx5_tunnel_hub(dev);
+	struct mlx5_flow_id_pool *id_pool = thub->tunnel_ids;
+	uint32_t id;
+
+	ret = mlx5_flow_id_get(id_pool, &id);
+	if (ret)
+		return NULL;
+	/**
+	 * mlx5 flow tunnel is an auxiliary data structure.
No need to allocate it from + huge-page pools dedicated for IO + */ + tunnel = mlx5_malloc(MLX5_MEM_SYS | MLX5_MEM_ZERO, sizeof(*tunnel), + 0, SOCKET_ID_ANY); + if (!tunnel) { + mlx5_flow_id_release(id_pool, id); + return NULL; + } + tunnel->groups = mlx5_hlist_create("tunnel groups", 1024); + if (!tunnel->groups) { + mlx5_flow_id_release(id_pool, id); + mlx5_free(tunnel); + return NULL; + } + /* initialize new PMD tunnel */ + memcpy(&tunnel->app_tunnel, app_tunnel, sizeof(*app_tunnel)); + tunnel->tunnel_id = id; + tunnel->action.type = MLX5_RTE_FLOW_ACTION_TYPE_TUNNEL_SET; + tunnel->action.conf = tunnel; + tunnel->item.type = MLX5_RTE_FLOW_ITEM_TYPE_TUNNEL; + tunnel->item.spec = tunnel; + tunnel->item.last = NULL; + tunnel->item.mask = NULL; + + DRV_LOG(DEBUG, "port %u new pmd tunnel id=0x%x", + dev->data->port_id, tunnel->tunnel_id); + + return tunnel; +} + +int +mlx5_get_flow_tunnel(struct rte_eth_dev *dev, + const struct rte_flow_tunnel *app_tunnel, + struct mlx5_flow_tunnel **tunnel) +{ + int ret = 0; + struct mlx5_flow_tunnel_hub *thub = mlx5_tunnel_hub(dev); + struct mlx5_flow_tunnel *tun; + + LIST_FOREACH(tun, &thub->tunnels, chain) { + if (!memcmp(app_tunnel, &tun->app_tunnel, + sizeof(*app_tunnel))) { + *tunnel = tun; + ret = 0; + break; + } + } + if (!tun) { + tun = mlx5_flow_tunnel_allocate(dev, app_tunnel); + if (tun) { + LIST_INSERT_HEAD(&thub->tunnels, tun, chain); + *tunnel = tun; + } else { + ret = -ENOMEM; + } + } + if (tun) + __atomic_add_fetch(&tun->refctn, 1, __ATOMIC_RELAXED); + + return ret; +} + +void mlx5_release_tunnel_hub(struct mlx5_dev_ctx_shared *sh, uint16_t port_id) +{ + struct mlx5_flow_tunnel_hub *thub = sh->tunnel_hub; + + if (!thub) + return; + if (!LIST_EMPTY(&thub->tunnels)) + DRV_LOG(WARNING, "port %u tunnels present", port_id); + mlx5_flow_id_pool_release(thub->tunnel_ids); + mlx5_flow_id_pool_release(thub->table_ids); + mlx5_hlist_destroy(thub->groups, NULL, NULL); + mlx5_free(thub); +} + +int mlx5_alloc_tunnel_hub(struct mlx5_dev_ctx_shared *sh) +{ + int err; + struct mlx5_flow_tunnel_hub *thub; + + thub = mlx5_malloc(MLX5_MEM_SYS | MLX5_MEM_ZERO, sizeof(*thub), + 0, SOCKET_ID_ANY); + if (!thub) + return -ENOMEM; + LIST_INIT(&thub->tunnels); + thub->tunnel_ids = mlx5_flow_id_pool_alloc(MLX5_MAX_TUNNELS); + if (!thub->tunnel_ids) { + err = -rte_errno; + goto err; + } + thub->table_ids = mlx5_flow_id_pool_alloc(MLX5_MAX_TABLES); + if (!thub->table_ids) { + err = -rte_errno; + goto err; + } + thub->groups = mlx5_hlist_create("flow groups", MLX5_MAX_TABLES); + if (!thub->groups) { + err = -rte_errno; + goto err; + } + sh->tunnel_hub = thub; + + return 0; + +err: + if (thub->groups) + mlx5_hlist_destroy(thub->groups, NULL, NULL); + if (thub->table_ids) + mlx5_flow_id_pool_release(thub->table_ids); + if (thub->tunnel_ids) + mlx5_flow_id_pool_release(thub->tunnel_ids); + if (thub) + mlx5_free(thub); + return err; +} diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index 279daf21f5..8691db16ab 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -26,6 +26,7 @@ enum mlx5_rte_flow_item_type { MLX5_RTE_FLOW_ITEM_TYPE_TAG, MLX5_RTE_FLOW_ITEM_TYPE_TX_QUEUE, MLX5_RTE_FLOW_ITEM_TYPE_VLAN, + MLX5_RTE_FLOW_ITEM_TYPE_TUNNEL, }; /* Private (internal) rte flow actions. */ @@ -35,6 +36,7 @@ enum mlx5_rte_flow_action_type { MLX5_RTE_FLOW_ACTION_TYPE_MARK, MLX5_RTE_FLOW_ACTION_TYPE_COPY_MREG, MLX5_RTE_FLOW_ACTION_TYPE_DEFAULT_MISS, + MLX5_RTE_FLOW_ACTION_TYPE_TUNNEL_SET, }; /* Matches on selected register.
*/ @@ -196,6 +198,8 @@ enum mlx5_feature_name { #define MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33) #define MLX5_FLOW_ACTION_AGE (1ull << 34) #define MLX5_FLOW_ACTION_DEFAULT_MISS (1ull << 35) +#define MLX5_FLOW_ACTION_TUNNEL_SET (1ull << 36) +#define MLX5_FLOW_ACTION_TUNNEL_MATCH (1ull << 37) #define MLX5_FLOW_FATE_ACTIONS \ (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \ @@ -517,6 +521,10 @@ struct mlx5_flow_tbl_data_entry { struct mlx5_flow_dv_jump_tbl_resource jump; /**< jump resource, at most one for each table created. */ uint32_t idx; /**< index for the indexed mempool. */ + /**< tunnel offload */ + const struct mlx5_flow_tunnel *tunnel; + uint32_t group_id; + bool external; }; /* Verbs specification header. */ @@ -695,6 +703,7 @@ struct mlx5_flow { }; struct mlx5_flow_handle *handle; uint32_t handle_idx; /* Index of the mlx5 flow handle memory. */ + const struct mlx5_flow_tunnel *tunnel; }; /* Flow meter state. */ @@ -840,6 +849,112 @@ struct mlx5_fdir_flow { #define HAIRPIN_FLOW_ID_BITS 28 +#define MLX5_MAX_TUNNELS 256 +#define MLX5_TNL_MISS_RULE_PRIORITY 3 +#define MLX5_TNL_MISS_FDB_JUMP_GRP 0xfaac + +/* + * When tunnel offload is active, all JUMP group ids are converted + * using the same method. That conversion is applied both to tunnel and + * regular rule types. + * Group ids used in tunnel rules are relative to its tunnel (!). + * Application can create a number of steer rules, using the same + * tunnel, with different group id in each rule. + * Each tunnel stores its groups internally in PMD tunnel object. + * Groups used in regular rules do not belong to any tunnel and are stored + * in tunnel hub. + */ + +struct mlx5_flow_tunnel { + LIST_ENTRY(mlx5_flow_tunnel) chain; + struct rte_flow_tunnel app_tunnel; /** app tunnel copy */ + uint32_t tunnel_id; /** unique tunnel ID */ + uint32_t refctn; + struct rte_flow_action action; + struct rte_flow_item item; + struct mlx5_hlist *groups; /** tunnel groups */ +}; + +/** PMD tunnel related context */ +struct mlx5_flow_tunnel_hub { + LIST_HEAD(, mlx5_flow_tunnel) tunnels; + struct mlx5_flow_id_pool *tunnel_ids; + struct mlx5_flow_id_pool *table_ids; + struct mlx5_hlist *groups; /** non tunnel groups */ +}; + +/* convert jump group to flow table ID in tunnel rules */ +struct tunnel_tbl_entry { + struct mlx5_hlist_entry hash; + uint32_t flow_table; +}; + +static inline uint32_t +tunnel_id_to_flow_tbl(uint32_t id) +{ + return id | (1u << 16); +} + +static inline uint32_t +tunnel_flow_tbl_to_id(uint32_t flow_tbl) +{ + return flow_tbl & ~(1u << 16); +} + +union tunnel_tbl_key { + uint64_t val; + struct { + uint32_t tunnel_id; + uint32_t group; + }; +}; + +static inline struct mlx5_flow_tunnel_hub * +mlx5_tunnel_hub(struct rte_eth_dev *dev) +{ + struct mlx5_priv *priv = dev->data->dev_private; + return priv->sh->tunnel_hub; +} + +static inline bool +is_tunnel_offload_active(struct rte_eth_dev *dev) +{ + struct mlx5_priv *priv = dev->data->dev_private; + return !!priv->config.dv_miss_info; +} + +static inline bool +is_flow_tunnel_match_rule(__rte_unused struct rte_eth_dev *dev, + __rte_unused const struct rte_flow_attr *attr, + __rte_unused const struct rte_flow_item items[], + __rte_unused const struct rte_flow_action actions[]) +{ + return (items[0].type == (typeof(items[0].type)) + MLX5_RTE_FLOW_ITEM_TYPE_TUNNEL); +} + +static inline bool +is_flow_tunnel_steer_rule(__rte_unused struct rte_eth_dev *dev, + __rte_unused const struct rte_flow_attr *attr, + __rte_unused const struct rte_flow_item items[], + __rte_unused const struct
rte_flow_action actions[]) +{ + return (actions[0].type == (typeof(actions[0].type)) + MLX5_RTE_FLOW_ACTION_TYPE_TUNNEL_SET); +} + +static inline const struct mlx5_flow_tunnel * +flow_actions_to_tunnel(const struct rte_flow_action actions[]) +{ + return actions[0].conf; +} + +static inline const struct mlx5_flow_tunnel * +flow_items_to_tunnel(const struct rte_flow_item items[]) +{ + return items[0].spec; +} + /* Flow structure. */ struct rte_flow { ILIST_ENTRY(uint32_t)next; /**< Index to the next flow structure. */ @@ -847,12 +962,14 @@ struct rte_flow { /**< Device flow handles that are part of the flow. */ uint32_t drv_type:2; /**< Driver type. */ uint32_t fdir:1; /**< Identifier of associated FDIR if any. */ + uint32_t tunnel:1; /**< Tunnel offload flow. */ uint32_t hairpin_flow_id:HAIRPIN_FLOW_ID_BITS; /**< The flow id used for hairpin. */ uint32_t copy_applied:1; /**< The MARK copy Flow is applied. */ uint32_t rix_mreg_copy; /**< Index to metadata register copy table resource. */ uint32_t counter; /**< Holds flow counter. */ + uint32_t tunnel_id; /**< Tunnel id */ uint16_t meter; /**< Holds flow meter id. */ } __rte_packed; @@ -935,9 +1052,54 @@ void mlx5_flow_id_pool_release(struct mlx5_flow_id_pool *pool); uint32_t mlx5_flow_id_get(struct mlx5_flow_id_pool *pool, uint32_t *id); uint32_t mlx5_flow_id_release(struct mlx5_flow_id_pool *pool, uint32_t id); -int mlx5_flow_group_to_table(const struct rte_flow_attr *attributes, - bool external, uint32_t group, bool fdb_def_rule, - uint32_t *table, struct rte_flow_error *error); +__extension__ +struct flow_grp_info { + uint64_t external:1; + uint64_t transfer:1; + uint64_t fdb_def_rule:1; + /* force standard group translation */ + uint64_t std_tbl_fix:1; +}; + +static inline bool +tunnel_use_standard_attr_group_translate + (struct rte_eth_dev *dev, + const struct mlx5_flow_tunnel *tunnel, + const struct rte_flow_attr *attr, + const struct rte_flow_item items[], + const struct rte_flow_action actions[]) +{ + bool verdict; + + if (!is_tunnel_offload_active(dev)) + /* no tunnel offload API */ + verdict = true; + else if (tunnel) { + /* + * OvS will use jump to group 0 in tunnel steer rule. + * If tunnel steer rule starts from group 0 (attr.group == 0) + * that 0 group must be translated with standard method.
+ * attr.group == 0 in tunnel match rule translated with tunnel + * method + */ + verdict = !attr->group && + is_flow_tunnel_steer_rule(dev, attr, items, actions); + } else { + /* + * non-tunnel group translation uses standard method for + * root group only: attr.group == 0 + */ + verdict = !attr->group; + } + + return verdict; +} + +int mlx5_flow_group_to_table(struct rte_eth_dev *dev, + const struct mlx5_flow_tunnel *tunnel, + uint32_t group, uint32_t *table, + struct flow_grp_info flags, + struct rte_flow_error *error); uint64_t mlx5_flow_hashfields_adjust(struct mlx5_flow_rss_desc *rss_desc, int tunnel, uint64_t layer_types, uint64_t hash_fields); @@ -1069,4 +1231,9 @@ int mlx5_flow_destroy_policer_rules(struct rte_eth_dev *dev, const struct rte_flow_attr *attr); int mlx5_flow_meter_flush(struct rte_eth_dev *dev, struct rte_mtr_error *error); +int mlx5_get_flow_tunnel(struct rte_eth_dev *dev, + const struct rte_flow_tunnel *app_tunnel, + struct mlx5_flow_tunnel **tunnel); +void mlx5_release_tunnel_hub(struct mlx5_dev_ctx_shared *sh, uint16_t port_id); +int mlx5_alloc_tunnel_hub(struct mlx5_dev_ctx_shared *sh); #endif /* RTE_PMD_MLX5_FLOW_H_ */ diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index 3819cdb266..2e6cb779fb 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -3702,14 +3702,21 @@ flow_dv_validate_action_modify_ttl(const uint64_t action_flags, * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int -flow_dv_validate_action_jump(const struct rte_flow_action *action, +flow_dv_validate_action_jump(struct rte_eth_dev *dev, + const struct mlx5_flow_tunnel *tunnel, + const struct rte_flow_action *action, uint64_t action_flags, const struct rte_flow_attr *attributes, bool external, struct rte_flow_error *error) { uint32_t target_group, table; int ret = 0; - + struct flow_grp_info grp_info = { + .external = !!external, + .transfer = !!attributes->transfer, + .fdb_def_rule = 1, + .std_tbl_fix = 0 + }; if (action_flags & (MLX5_FLOW_FATE_ACTIONS | MLX5_FLOW_FATE_ESWITCH_ACTIONS)) return rte_flow_error_set(error, EINVAL, @@ -3726,11 +3733,13 @@ flow_dv_validate_action_jump(const struct rte_flow_action *action, NULL, "action configuration not set"); target_group = ((const struct rte_flow_action_jump *)action->conf)->group; - ret = mlx5_flow_group_to_table(attributes, external, target_group, - true, &table, error); + ret = mlx5_flow_group_to_table(dev, tunnel, target_group, &table, + grp_info, error); if (ret) return ret; - if (attributes->group == target_group) + if (attributes->group == target_group && + !(action_flags & (MLX5_FLOW_ACTION_TUNNEL_SET | + MLX5_FLOW_ACTION_TUNNEL_MATCH))) return rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ACTION, NULL, "target group must be other than" @@ -4982,8 +4991,9 @@ flow_dv_counter_release(struct rte_eth_dev *dev, uint32_t counter) */ static int flow_dv_validate_attributes(struct rte_eth_dev *dev, + const struct mlx5_flow_tunnel *tunnel, const struct rte_flow_attr *attributes, - bool external __rte_unused, + struct flow_grp_info grp_info, struct rte_flow_error *error) { struct mlx5_priv *priv = dev->data->dev_private; @@ -4999,9 +5009,8 @@ flow_dv_validate_attributes(struct rte_eth_dev *dev, #else uint32_t table = 0; - ret = mlx5_flow_group_to_table(attributes, external, - attributes->group, !!priv->fdb_def_rule, - &table, error); + ret = mlx5_flow_group_to_table(dev, tunnel, attributes->group, &table, + grp_info, error); if (ret) return ret; if 
(!table) @@ -5123,10 +5132,28 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr, const struct rte_flow_item_vlan *vlan_m = NULL; int16_t rw_act_num = 0; uint64_t is_root; + const struct mlx5_flow_tunnel *tunnel; + struct flow_grp_info grp_info = { + .external = !!external, + .transfer = !!attr->transfer, + .fdb_def_rule = !!priv->fdb_def_rule, + }; if (items == NULL) return -1; - ret = flow_dv_validate_attributes(dev, attr, external, error); + if (is_flow_tunnel_match_rule(dev, attr, items, actions)) { + tunnel = flow_items_to_tunnel(items); + action_flags |= MLX5_FLOW_ACTION_TUNNEL_MATCH | + MLX5_FLOW_ACTION_DECAP; + } else if (is_flow_tunnel_steer_rule(dev, attr, items, actions)) { + tunnel = flow_actions_to_tunnel(actions); + action_flags |= MLX5_FLOW_ACTION_TUNNEL_SET; + } else { + tunnel = NULL; + } + grp_info.std_tbl_fix = tunnel_use_standard_attr_group_translate + (dev, tunnel, attr, items, actions); + ret = flow_dv_validate_attributes(dev, tunnel, attr, grp_info, error); if (ret < 0) return ret; is_root = (uint64_t)ret; @@ -5139,6 +5166,15 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr, RTE_FLOW_ERROR_TYPE_ITEM, NULL, "item not supported"); switch (type) { + case MLX5_RTE_FLOW_ITEM_TYPE_TUNNEL: + if (items[0].type != (typeof(items[0].type)) + MLX5_RTE_FLOW_ITEM_TYPE_TUNNEL) + return rte_flow_error_set + (error, EINVAL, + RTE_FLOW_ERROR_TYPE_ITEM, + NULL, "MLX5 private items " + "must be the first"); + break; case RTE_FLOW_ITEM_TYPE_VOID: break; case RTE_FLOW_ITEM_TYPE_PORT_ID: @@ -5703,7 +5739,7 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr, rw_act_num += MLX5_ACT_NUM_MDF_TTL; break; case RTE_FLOW_ACTION_TYPE_JUMP: - ret = flow_dv_validate_action_jump(actions, + ret = flow_dv_validate_action_jump(dev, tunnel, actions, action_flags, attr, external, error); @@ -5803,6 +5839,17 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr, action_flags |= MLX5_FLOW_ACTION_SET_IPV6_DSCP; rw_act_num += MLX5_ACT_NUM_SET_DSCP; break; + case MLX5_RTE_FLOW_ACTION_TYPE_TUNNEL_SET: + if (actions[0].type != (typeof(actions[0].type)) + MLX5_RTE_FLOW_ACTION_TYPE_TUNNEL_SET) + return rte_flow_error_set + (error, EINVAL, + RTE_FLOW_ERROR_TYPE_ACTION, + NULL, "MLX5 private action " + "must be the first"); + + action_flags |= MLX5_FLOW_ACTION_TUNNEL_SET; + break; default: return rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION, @@ -5810,6 +5857,54 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr, "action not supported"); } } + /* + * Validate actions in flow rules + * - Explicit decap action is prohibited by the tunnel offload API. + * - Drop action in tunnel steer rule is prohibited by the API. + * - Application cannot use MARK action because its value can mask + * the tunnel default miss notification. + * - JUMP in tunnel match rule has no support in current PMD + * implementation. + * - TAG & META are reserved for future uses.
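+ * + * For reference (an illustration taken from the testpmd patch later + * in this series), a steer rule that satisfies these constraints is: + * pattern eth / ipv4 / udp dst is 4789 / vxlan / end + * actions jump group 0 / end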
+ */ + if (action_flags & MLX5_FLOW_ACTION_TUNNEL_SET) { + uint64_t bad_actions_mask = MLX5_FLOW_ACTION_DECAP | + MLX5_FLOW_ACTION_MARK | + MLX5_FLOW_ACTION_SET_TAG | + MLX5_FLOW_ACTION_SET_META | + MLX5_FLOW_ACTION_DROP; + + if (action_flags & bad_actions_mask) + return rte_flow_error_set + (error, EINVAL, + RTE_FLOW_ERROR_TYPE_ACTION, NULL, + "Invalid RTE action in tunnel " + "set decap rule"); + if (!(action_flags & MLX5_FLOW_ACTION_JUMP)) + return rte_flow_error_set + (error, EINVAL, + RTE_FLOW_ERROR_TYPE_ACTION, NULL, + "tunnel set decap rule must terminate " + "with JUMP"); + if (!attr->ingress) + return rte_flow_error_set + (error, EINVAL, + RTE_FLOW_ERROR_TYPE_ACTION, NULL, + "tunnel flows for ingress traffic only"); + } + if (action_flags & MLX5_FLOW_ACTION_TUNNEL_MATCH) { + uint64_t bad_actions_mask = MLX5_FLOW_ACTION_JUMP | + MLX5_FLOW_ACTION_MARK | + MLX5_FLOW_ACTION_SET_TAG | + MLX5_FLOW_ACTION_SET_META; + + if (action_flags & bad_actions_mask) + return rte_flow_error_set + (error, EINVAL, + RTE_FLOW_ERROR_TYPE_ACTION, NULL, + "Invalid RTE action in tunnel " + "set match rule"); + } /* * Validate the drop action mutual exclusion with other actions. * Drop action is mutually-exclusive with any other action, except for @@ -7616,6 +7711,9 @@ static struct mlx5_flow_tbl_resource * flow_dv_tbl_resource_get(struct rte_eth_dev *dev, uint32_t table_id, uint8_t egress, uint8_t transfer, + bool external, + const struct mlx5_flow_tunnel *tunnel, + uint32_t group_id, struct rte_flow_error *error) { struct mlx5_priv *priv = dev->data->dev_private; @@ -7652,6 +7750,9 @@ flow_dv_tbl_resource_get(struct rte_eth_dev *dev, return NULL; } tbl_data->idx = idx; + tbl_data->tunnel = tunnel; + tbl_data->group_id = group_id; + tbl_data->external = external; tbl = &tbl_data->tbl; pos = &tbl_data->entry; if (transfer) @@ -7715,6 +7816,41 @@ flow_dv_tbl_resource_release(struct rte_eth_dev *dev, mlx5_flow_os_destroy_flow_tbl(tbl->obj); tbl->obj = NULL; + if (is_tunnel_offload_active(dev) && tbl_data->external) { + struct mlx5_hlist_entry *he; + struct mlx5_hlist *tunnel_grp_hash; + struct mlx5_flow_tunnel_hub *thub = + mlx5_tunnel_hub(dev); + union tunnel_tbl_key tunnel_key = { + .tunnel_id = tbl_data->tunnel ? + tbl_data->tunnel->tunnel_id : 0, + .group = tbl_data->group_id + }; + union mlx5_flow_tbl_key table_key = { + .v64 = pos->key + }; + uint32_t table_id = table_key.table_id; + + tunnel_grp_hash = tbl_data->tunnel ? + tbl_data->tunnel->groups : + thub->groups; + he = mlx5_hlist_lookup(tunnel_grp_hash, tunnel_key.val); + if (he) { + struct tunnel_tbl_entry *tte; + tte = container_of(he, typeof(*tte), hash); + MLX5_ASSERT(tte->flow_table == table_id); + mlx5_hlist_remove(tunnel_grp_hash, he); + mlx5_free(tte); + } + mlx5_flow_id_release(mlx5_tunnel_hub(dev)->table_ids, + tunnel_flow_tbl_to_id(table_id)); + DRV_LOG(DEBUG, + "port %u release table_id %#x tunnel %u group %u", + dev->data->port_id, table_id, + tbl_data->tunnel ? + tbl_data->tunnel->tunnel_id : 0, + tbl_data->group_id); + } /* remove the entry from the hash list and free memory. 
*/ mlx5_hlist_remove(sh->flow_tbls, pos); mlx5_ipool_free(priv->sh->ipool[MLX5_IPOOL_JUMP], @@ -7760,7 +7896,7 @@ flow_dv_matcher_register(struct rte_eth_dev *dev, int ret; tbl = flow_dv_tbl_resource_get(dev, key->table_id, key->direction, - key->domain, error); + key->domain, false, NULL, 0, error); if (!tbl) return -rte_errno; /* No need to refill the error info */ tbl_data = container_of(tbl, struct mlx5_flow_tbl_data_entry, tbl); @@ -8215,11 +8351,23 @@ __flow_dv_translate(struct rte_eth_dev *dev, struct rte_vlan_hdr vlan = { 0 }; uint32_t table; int ret = 0; - + const struct mlx5_flow_tunnel *tunnel; + struct flow_grp_info grp_info = { + .external = !!dev_flow->external, + .transfer = !!attr->transfer, + .fdb_def_rule = !!priv->fdb_def_rule, + }; + tunnel = is_flow_tunnel_match_rule(dev, attr, items, actions) ? + flow_items_to_tunnel(items) : + is_flow_tunnel_steer_rule(dev, attr, items, actions) ? + flow_actions_to_tunnel(actions) : + dev_flow->tunnel ? dev_flow->tunnel : NULL; mhdr_res->ft_type = attr->egress ? MLX5DV_FLOW_TABLE_TYPE_NIC_TX : MLX5DV_FLOW_TABLE_TYPE_NIC_RX; - ret = mlx5_flow_group_to_table(attr, dev_flow->external, attr->group, - !!priv->fdb_def_rule, &table, error); + grp_info.std_tbl_fix = tunnel_use_standard_attr_group_translate + (dev, tunnel, attr, items, actions); + ret = mlx5_flow_group_to_table(dev, tunnel, attr->group, &table, + grp_info, error); if (ret) return ret; dev_flow->dv.group = table; @@ -8229,6 +8377,45 @@ __flow_dv_translate(struct rte_eth_dev *dev, priority = dev_conf->flow_prio - 1; /* number of actions must be set to 0 in case of dirty stack. */ mhdr_res->actions_num = 0; + if (is_flow_tunnel_match_rule(dev, attr, items, actions)) { + /* + * do not add decap action if the match rule drops the packet; + * HW rejects rules with decap & drop + */ + bool add_decap = true; + const struct rte_flow_action *ptr = actions; + struct mlx5_flow_tbl_resource *tbl; + + for (; ptr->type != RTE_FLOW_ACTION_TYPE_END; ptr++) { + if (ptr->type == RTE_FLOW_ACTION_TYPE_DROP) { + add_decap = false; + break; + } + } + if (add_decap) { + if (flow_dv_create_action_l2_decap(dev, dev_flow, + attr->transfer, + error)) + return -rte_errno; + dev_flow->dv.actions[actions_n++] = + dev_flow->dv.encap_decap->action; + action_flags |= MLX5_FLOW_ACTION_DECAP; + } + /* + * bind table_id for the tunnel match rule. + * The tunnel set rule establishes that binding in the JUMP + * action handler. + * Required for the scenario when the application creates the + * tunnel match rule before the tunnel set rule.
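+ * (Ordering illustration: if the application creates the match + * rule for tunnel group X first, the registration below + * allocates the table id, so a tunnel set rule that later + * jumps to group X resolves to the same table.)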
+ */ + tbl = flow_dv_tbl_resource_get(dev, table, attr->egress, + attr->transfer, + !!dev_flow->external, tunnel, + attr->group, error); + if (!tbl) + return rte_flow_error_set + (error, EINVAL, RTE_FLOW_ERROR_TYPE_ACTION, + actions, "cannot register tunnel group"); + } for (; !actions_end ; actions++) { const struct rte_flow_action_queue *queue; const struct rte_flow_action_rss *rss; @@ -8249,6 +8436,9 @@ __flow_dv_translate(struct rte_eth_dev *dev, actions, "action not supported"); switch (action_type) { + case MLX5_RTE_FLOW_ACTION_TYPE_TUNNEL_SET: + action_flags |= MLX5_FLOW_ACTION_TUNNEL_SET; + break; case RTE_FLOW_ACTION_TYPE_VOID: break; case RTE_FLOW_ACTION_TYPE_PORT_ID: @@ -8480,16 +8670,19 @@ __flow_dv_translate(struct rte_eth_dev *dev, action_flags |= MLX5_FLOW_ACTION_DECAP; break; case RTE_FLOW_ACTION_TYPE_JUMP: + grp_info.std_tbl_fix = 0; jump_data = action->conf; - ret = mlx5_flow_group_to_table(attr, dev_flow->external, + ret = mlx5_flow_group_to_table(dev, tunnel, jump_data->group, - !!priv->fdb_def_rule, - &table, error); + &table, + grp_info, error); if (ret) return ret; - tbl = flow_dv_tbl_resource_get(dev, table, - attr->egress, - attr->transfer, error); + tbl = flow_dv_tbl_resource_get(dev, table, attr->egress, + attr->transfer, + !!dev_flow->external, + tunnel, jump_data->group, + error); if (!tbl) return rte_flow_error_set (error, errno, @@ -9681,7 +9874,8 @@ flow_dv_prepare_mtr_tables(struct rte_eth_dev *dev, dtb = &mtb->ingress; /* Create the meter table with METER level. */ dtb->tbl = flow_dv_tbl_resource_get(dev, MLX5_FLOW_TABLE_LEVEL_METER, - egress, transfer, &error); + egress, transfer, false, NULL, 0, + &error); if (!dtb->tbl) { DRV_LOG(ERR, "Failed to create meter policer table."); return -1; @@ -9689,7 +9883,8 @@ flow_dv_prepare_mtr_tables(struct rte_eth_dev *dev, /* Create the meter suffix table with SUFFIX level. 
*/ dtb->sfx_tbl = flow_dv_tbl_resource_get(dev, MLX5_FLOW_TABLE_LEVEL_SUFFIX, - egress, transfer, &error); + egress, transfer, false, NULL, 0, + &error); if (!dtb->sfx_tbl) { DRV_LOG(ERR, "Failed to create meter suffix table."); return -1; From patchwork Wed Sep 30 09:18:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gregory Etelson X-Patchwork-Id: 79308 X-Patchwork-Delegate: ferruh.yigit@amd.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 8E40FA04B5; Wed, 30 Sep 2020 11:20:50 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id EEEA11DB23; Wed, 30 Sep 2020 11:20:04 +0200 (CEST) Received: from hqnvemgate25.nvidia.com (hqnvemgate25.nvidia.com [216.228.121.64]) by dpdk.org (Postfix) with ESMTP id F40B81DB1F for ; Wed, 30 Sep 2020 11:20:02 +0200 (CEST) Received: from hqmail.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate25.nvidia.com (using TLS: TLSv1.2, AES256-SHA) id ; Wed, 30 Sep 2020 02:19:10 -0700 Received: from nvidia.com (10.124.1.5) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Wed, 30 Sep 2020 09:19:39 +0000 From: Gregory Etelson To: CC: , , , Ori Kam , Wenzhuo Lu , Beilei Xing , Bernard Iremonger , John McNamara , Marko Kovacevic Date: Wed, 30 Sep 2020 12:18:53 +0300 Message-ID: <20200930091854.19768-5-getelson@nvidia.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200930091854.19768-1-getelson@nvidia.com> References: <20200625160348.26220-1-getelson@mellanox.com> <20200930091854.19768-1-getelson@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [10.124.1.5] X-ClientProxiedBy: HQMAIL105.nvidia.com (172.20.187.12) To HQMAIL107.nvidia.com (172.20.187.13) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1601457550; bh=WLIzA8AfAITgS/Ye+dEHJJmP3wz++5WeSU2lD2IgBF4=; h=From:To:CC:Subject:Date:Message-ID:X-Mailer:In-Reply-To: References:MIME-Version:Content-Transfer-Encoding:Content-Type: X-Originating-IP:X-ClientProxiedBy; b=JzMNqhZf7+3wpjks3+MnpQxoY87SfzTXULfbrAcgFpqGO0d24o2QrfahsAlsCh+1x LJ7BRPSYrHxhpWdc7LDq9LgF37Vn4FF386foMeVj4ePY0RsoJBxGcnfI1OTeEIs65e eq1PMoZyO5pUViVYIEgReUUIoTriO6HZGoX/RVAeLfqSkJUvaw1nJykY/9zCBxz/mh I5BH+xBTDxfTaRlrh9Q2LEFYS60Ib10/Q38ZRbRj6Oqbs1SU8YLvKYTdTgDGh4BM31 oqIBeRrRV8mRF1p9RordUBxBMRBpYY11xgaO2yduv8OjyFrhLeQ0h92bWSXr4AmkjD OYXMBxueoMKGA== Subject: [dpdk-dev] [PATCH v3 4/4] app/testpmd: add commands for tunnel offload API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Tunnel Offload API provides hardware independent, unified model to offload tunneled traffic. Key model elements are: - apply matches to both outer and inner packet headers during entire offload procedure; - restore outer header of partially offloaded packet; - model is implemented as a set of helper functions. Implementation details: * Create application tunnel: flow tunnel create type On success, the command creates application tunnel object and returns the tunnel descriptor. Tunnel descriptor is used in subsequent flow creation commands to reference the tunnel. * Create tunnel steering flow rule: tunnel_set parameter used with steering rule template. 
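  Under the hood the command is expected to expand the rule with PMD-provided elements, roughly as follows (a sketch, not verbatim from the patch; it mirrors port_flow_tunnel_offload_cmd_prep() added to app/test-pmd/config.c in this patch, with error handling omitted):

	struct rte_flow_action *pmd_actions;
	uint32_t num_pmd_actions;
	struct rte_flow_error error;

	/* ask the PMD for the actions that implement tunnel decap */
	rte_flow_tunnel_decap_set(port_id, &tunnel, &pmd_actions,
				  &num_pmd_actions, &error);
	/* pmd_actions are prepended to the application actions and the
	 * combined array is passed to rte_flow_create() */

  Here `tunnel` is the rte_flow_tunnel of the referenced tunnel stub.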
* Create tunnel matching flow rule: tunnel_match used with matching rule template. * If tunnel steering rule was offloaded, outer header of a partially offloaded packet is restored after miss. Example: test packet= >>>>>> >>> len(packet) 92 testpmd> flow flush 0 testpmd> port 0/queue 0: received 1 packets src=50:6B:4B:CC:FC:E2 - dst=24:8A:07:8D:AE:D6 - type=0x0800 - length=92 testpmd> flow tunnel 0 type vxlan port 0: flow tunnel #1 type vxlan testpmd> flow create 0 ingress group 0 tunnel_set 1 pattern eth / ipv4 / udp dst is 4789 / vxlan / end actions jump group 0 / end Flow rule #0 created testpmd> port 0/queue 0: received 1 packets tunnel restore info: - vxlan tunnel - outer header present # <-- src=50:6B:4B:CC:FC:E2 - dst=24:8A:07:8D:AE:D6 - type=0x0800 - length=92 testpmd> flow create 0 ingress group 0 tunnel_match 1 pattern eth / ipv4 / udp dst is 4789 / vxlan / eth / ipv4 / end actions set_mac_dst mac_addr 02:CA:FE:CA:FA:80 / queue index 0 / end Flow rule #1 created testpmd> port 0/queue 0: received 1 packets src=50:BB:BB:BB:BB:E2 - dst=02:CA:FE:CA:FA:80 - type=0x0800 - length=42 * Destroy flow tunnel flow tunnel destroy id * Show existing flow tunnels flow tunnel list Signed-off-by: Gregory Etelson --- v2: * introduce testpmd support for tunnel offload API v3: * update flow tunnel commands --- app/test-pmd/cmdline_flow.c | 170 ++++++++++++- app/test-pmd/config.c | 253 +++++++++++++++++++- app/test-pmd/testpmd.c | 5 +- app/test-pmd/testpmd.h | 34 ++- app/test-pmd/util.c | 35 ++- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 49 ++++ 6 files changed, 533 insertions(+), 13 deletions(-) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index 6263d307ed..0fb61860cd 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -69,6 +69,14 @@ enum index { LIST, AGED, ISOLATE, + TUNNEL, + + /* Tunnel arguments. */ + TUNNEL_CREATE, + TUNNEL_CREATE_TYPE, + TUNNEL_LIST, + TUNNEL_DESTROY, + TUNNEL_DESTROY_ID, /* Destroy arguments. */ @@ -88,6 +96,8 @@ enum index { INGRESS, EGRESS, TRANSFER, + TUNNEL_SET, + TUNNEL_MATCH, /* Validate/create pattern. */ PATTERN, @@ -653,6 +663,7 @@ struct buffer { union { struct { struct rte_flow_attr attr; + struct tunnel_ops tunnel_ops; struct rte_flow_item *pattern; struct rte_flow_action *actions; uint32_t pattern_n; @@ -713,10 +724,32 @@ static const enum index next_vc_attr[] = { INGRESS, EGRESS, TRANSFER, + TUNNEL_SET, + TUNNEL_MATCH, PATTERN, ZERO, }; +static const enum index tunnel_create_attr[] = { + TUNNEL_CREATE, + TUNNEL_CREATE_TYPE, + END, + ZERO, +}; + +static const enum index tunnel_destroy_attr[] = { + TUNNEL_DESTROY, + TUNNEL_DESTROY_ID, + END, + ZERO, +}; + +static const enum index tunnel_list_attr[] = { + TUNNEL_LIST, + END, + ZERO, +}; + static const enum index next_destroy_attr[] = { DESTROY_RULE, END, @@ -1516,6 +1549,9 @@ static int parse_aged(struct context *, const struct token *, const char *, unsigned int, void *, unsigned int); +static int parse_tunnel(struct context *, const struct token *, + const char *, unsigned int, + void *, unsigned int); static int parse_int(struct context *, const struct token *, const char *, unsigned int, void *, unsigned int); @@ -1698,7 +1734,8 @@ static const struct token token_list[] = { LIST, AGED, QUERY, - ISOLATE)), + ISOLATE, + TUNNEL)), .call = parse_init, }, /* Sub-level commands.
*/ @@ -1772,6 +1809,49 @@ static const struct token token_list[] = { ARGS_ENTRY(struct buffer, port)), .call = parse_isolate, }, + [TUNNEL] = { + .name = "tunnel", + .help = "new tunnel API", + .next = NEXT(NEXT_ENTRY + (TUNNEL_CREATE, TUNNEL_LIST, TUNNEL_DESTROY)), + .call = parse_tunnel, + }, + /* Tunnel arguments. */ + [TUNNEL_CREATE] = { + .name = "create", + .help = "create new tunnel object", + .next = NEXT(tunnel_create_attr, NEXT_ENTRY(PORT_ID)), + .args = ARGS(ARGS_ENTRY(struct buffer, port)), + .call = parse_tunnel, + }, + [TUNNEL_CREATE_TYPE] = { + .name = "type", + .help = "create new tunnel", + .next = NEXT(tunnel_create_attr, NEXT_ENTRY(FILE_PATH)), + .args = ARGS(ARGS_ENTRY(struct tunnel_ops, type)), + .call = parse_tunnel, + }, + [TUNNEL_DESTROY] = { + .name = "destroy", + .help = "destroy tunnel", + .next = NEXT(tunnel_destroy_attr, NEXT_ENTRY(PORT_ID)), + .args = ARGS(ARGS_ENTRY(struct buffer, port)), + .call = parse_tunnel, + }, + [TUNNEL_DESTROY_ID] = { + .name = "id", + .help = "tunnel identifier to destroy", + .next = NEXT(tunnel_destroy_attr, NEXT_ENTRY(UNSIGNED)), + .args = ARGS(ARGS_ENTRY(struct tunnel_ops, id)), + .call = parse_tunnel, + }, + [TUNNEL_LIST] = { + .name = "list", + .help = "list existing tunnels", + .next = NEXT(tunnel_list_attr, NEXT_ENTRY(PORT_ID)), + .args = ARGS(ARGS_ENTRY(struct buffer, port)), + .call = parse_tunnel, + }, /* Destroy arguments. */ [DESTROY_RULE] = { .name = "rule", @@ -1835,6 +1915,20 @@ static const struct token token_list[] = { .next = NEXT(next_vc_attr), .call = parse_vc, }, + [TUNNEL_SET] = { + .name = "tunnel_set", + .help = "tunnel steer rule", + .next = NEXT(next_vc_attr, NEXT_ENTRY(UNSIGNED)), + .args = ARGS(ARGS_ENTRY(struct tunnel_ops, id)), + .call = parse_vc, + }, + [TUNNEL_MATCH] = { + .name = "tunnel_match", + .help = "tunnel match rule", + .next = NEXT(next_vc_attr, NEXT_ENTRY(UNSIGNED)), + .args = ARGS(ARGS_ENTRY(struct tunnel_ops, id)), + .call = parse_vc, + }, /* Validate/create pattern. */ [PATTERN] = { .name = "pattern", @@ -4054,12 +4148,28 @@ parse_vc(struct context *ctx, const struct token *token, return len; } ctx->objdata = 0; - ctx->object = &out->args.vc.attr; + switch (ctx->curr) { + default: + ctx->object = &out->args.vc.attr; + break; + case TUNNEL_SET: + case TUNNEL_MATCH: + ctx->object = &out->args.vc.tunnel_ops; + break; + } ctx->objmask = NULL; switch (ctx->curr) { case GROUP: case PRIORITY: return len; + case TUNNEL_SET: + out->args.vc.tunnel_ops.enabled = 1; + out->args.vc.tunnel_ops.actions = 1; + return len; + case TUNNEL_MATCH: + out->args.vc.tunnel_ops.enabled = 1; + out->args.vc.tunnel_ops.items = 1; + return len; case INGRESS: out->args.vc.attr.ingress = 1; return len; @@ -5597,6 +5707,47 @@ parse_isolate(struct context *ctx, const struct token *token, return len; } +static int +parse_tunnel(struct context *ctx, const struct token *token, + const char *str, unsigned int len, + void *buf, unsigned int size) +{ + struct buffer *out = buf; + + /* Token name must match. */ + if (parse_default(ctx, token, str, len, NULL, 0) < 0) + return -1; + /* Nothing else to do if there is no buffer.
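+ * (A parse pass may be executed without an output buffer, e.g. to + * validate or complete tokens; in that case only token matching is + * performed.)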
*/ + if (!out) + return len; + if (!out->command) { + if (ctx->curr != TUNNEL) + return -1; + if (sizeof(*out) > size) + return -1; + out->command = ctx->curr; + ctx->objdata = 0; + ctx->object = out; + ctx->objmask = NULL; + } else { + switch (ctx->curr) { + default: + break; + case TUNNEL_CREATE: + case TUNNEL_DESTROY: + case TUNNEL_LIST: + out->command = ctx->curr; + break; + case TUNNEL_CREATE_TYPE: + case TUNNEL_DESTROY_ID: + ctx->object = &out->args.vc.tunnel_ops; + break; + } + } + + return len; +} + /** * Parse signed/unsigned integers 8 to 64-bit long. * @@ -6543,11 +6694,13 @@ cmd_flow_parsed(const struct buffer *in) switch (in->command) { case VALIDATE: port_flow_validate(in->port, &in->args.vc.attr, - in->args.vc.pattern, in->args.vc.actions); + in->args.vc.pattern, in->args.vc.actions, + &in->args.vc.tunnel_ops); break; case CREATE: port_flow_create(in->port, &in->args.vc.attr, - in->args.vc.pattern, in->args.vc.actions); + in->args.vc.pattern, in->args.vc.actions, + &in->args.vc.tunnel_ops); break; case DESTROY: port_flow_destroy(in->port, in->args.destroy.rule_n, @@ -6573,6 +6726,15 @@ cmd_flow_parsed(const struct buffer *in) case AGED: port_flow_aged(in->port, in->args.aged.destroy); break; + case TUNNEL_CREATE: + port_flow_tunnel_create(in->port, &in->args.vc.tunnel_ops); + break; + case TUNNEL_DESTROY: + port_flow_tunnel_destroy(in->port, in->args.vc.tunnel_ops.id); + break; + case TUNNEL_LIST: + port_flow_tunnel_list(in->port); + break; default: break; } diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index 2d9a456467..d0f86230d0 100644 --- a/app/test-pmd/config.c +++ b/app/test-pmd/config.c @@ -1339,6 +1339,115 @@ port_mtu_set(portid_t port_id, uint16_t mtu) /* Generic flow management functions. */ +static struct port_flow_tunnel * +port_flow_locate_tunnel_id(struct rte_port *port, uint32_t port_tunnel_id) +{ + struct port_flow_tunnel *flow_tunnel; + + LIST_FOREACH(flow_tunnel, &port->flow_tunnel_list, chain) { + if (flow_tunnel->id == port_tunnel_id) + goto out; + } + flow_tunnel = NULL; + +out: + return flow_tunnel; +} + +const char * +port_flow_tunnel_type(struct rte_flow_tunnel *tunnel) +{ + const char *type; + switch (tunnel->type) { + default: + type = "unknown"; + break; + case RTE_FLOW_ITEM_TYPE_VXLAN: + type = "vxlan"; + break; + } + + return type; +} + +struct port_flow_tunnel * +port_flow_locate_tunnel(uint16_t port_id, struct rte_flow_tunnel *tun) +{ + struct rte_port *port = &ports[port_id]; + struct port_flow_tunnel *flow_tunnel; + + LIST_FOREACH(flow_tunnel, &port->flow_tunnel_list, chain) { + if (!memcmp(&flow_tunnel->tunnel, tun, sizeof(*tun))) + goto out; + } + flow_tunnel = NULL; + +out: + return flow_tunnel; +} + +void port_flow_tunnel_list(portid_t port_id) +{ + struct rte_port *port = &ports[port_id]; + struct port_flow_tunnel *flt; + + LIST_FOREACH(flt, &port->flow_tunnel_list, chain) { + printf("port %u tunnel #%u type=%s", + port_id, flt->id, port_flow_tunnel_type(&flt->tunnel)); + if (flt->tunnel.tun_id) + printf(" id=%lu", flt->tunnel.tun_id); + printf("\n"); + } +} + +void port_flow_tunnel_destroy(portid_t port_id, uint32_t tunnel_id) +{ + struct rte_port *port = &ports[port_id]; + struct port_flow_tunnel *flt; + + LIST_FOREACH(flt, &port->flow_tunnel_list, chain) { + if (flt->id == tunnel_id) + break; + } + if (flt) { + LIST_REMOVE(flt, chain); + free(flt); + printf("port %u: flow tunnel #%u destroyed\n", + port_id, tunnel_id); + } +} + +void port_flow_tunnel_create(portid_t port_id, const struct tunnel_ops *ops) +{ + struct 
rte_port *port = &ports[port_id]; + enum rte_flow_item_type type; + struct port_flow_tunnel *flt; + + if (!strcmp(ops->type, "vxlan")) + type = RTE_FLOW_ITEM_TYPE_VXLAN; + else { + printf("cannot offload \"%s\" tunnel type\n", ops->type); + return; + } + LIST_FOREACH(flt, &port->flow_tunnel_list, chain) { + if (flt->tunnel.type == type) + break; + } + if (!flt) { + flt = calloc(1, sizeof(*flt)); + if (!flt) { + printf("failed to allocate port flt object\n"); + return; + } + flt->tunnel.type = type; + flt->id = LIST_EMPTY(&port->flow_tunnel_list) ? 1 : + LIST_FIRST(&port->flow_tunnel_list)->id + 1; + LIST_INSERT_HEAD(&port->flow_tunnel_list, flt, chain); + } + printf("port %d: flow tunnel #%u type %s\n", + port_id, flt->id, ops->type); +} + /** Generate a port_flow entry from attributes/pattern/actions. */ static struct port_flow * port_flow_new(const struct rte_flow_attr *attr, @@ -1463,19 +1572,137 @@ rss_config_display(struct rte_flow_action_rss *rss_conf) } } +static struct port_flow_tunnel * +port_flow_tunnel_offload_cmd_prep(portid_t port_id, + const struct rte_flow_item *pattern, + const struct rte_flow_action *actions, + const struct tunnel_ops *tunnel_ops) +{ + int ret; + struct rte_port *port; + struct port_flow_tunnel *pft; + struct rte_flow_error error; + + port = &ports[port_id]; + pft = port_flow_locate_tunnel_id(port, tunnel_ops->id); + if (!pft) { + printf("failed to locate port flow tunnel #%u\n", + tunnel_ops->id); + return NULL; + } + if (tunnel_ops->actions) { + uint32_t num_actions; + const struct rte_flow_action *aptr; + + ret = rte_flow_tunnel_decap_set(port_id, &pft->tunnel, + &pft->pmd_actions, + &pft->num_pmd_actions, + &error); + if (ret) { + port_flow_complain(&error); + return NULL; + } + for (aptr = actions, num_actions = 1; + aptr->type != RTE_FLOW_ACTION_TYPE_END; + aptr++, num_actions++); + pft->actions = malloc( + (num_actions + pft->num_pmd_actions) * + sizeof(actions[0])); + if (!pft->actions) { + rte_flow_tunnel_action_decap_release( + port_id, pft->actions, + pft->num_pmd_actions, &error); + return NULL; + } + rte_memcpy(pft->actions, pft->pmd_actions, + pft->num_pmd_actions * sizeof(actions[0])); + rte_memcpy(pft->actions + pft->num_pmd_actions, actions, + num_actions * sizeof(actions[0])); + } + if (tunnel_ops->items) { + uint32_t num_items; + const struct rte_flow_item *iptr; + + ret = rte_flow_tunnel_match(port_id, &pft->tunnel, + &pft->pmd_items, + &pft->num_pmd_items, + &error); + if (ret) { + port_flow_complain(&error); + return NULL; + } + for (iptr = pattern, num_items = 1; + iptr->type != RTE_FLOW_ITEM_TYPE_END; + iptr++, num_items++); + pft->items = malloc((num_items + pft->num_pmd_items) * + sizeof(pattern[0])); + if (!pft->items) { + rte_flow_tunnel_item_release( + port_id, pft->pmd_items, + pft->num_pmd_items, &error); + return NULL; + } + rte_memcpy(pft->items, pft->pmd_items, + pft->num_pmd_items * sizeof(pattern[0])); + rte_memcpy(pft->items + pft->num_pmd_items, pattern, + num_items * sizeof(pattern[0])); + } + + return pft; +} + +static void +port_flow_tunnel_offload_cmd_release(portid_t port_id, + const struct tunnel_ops *tunnel_ops, + struct port_flow_tunnel *pft) +{ + struct rte_flow_error error; + + if (tunnel_ops->actions) { + free(pft->actions); + rte_flow_tunnel_action_decap_release( + port_id, pft->pmd_actions, + pft->num_pmd_actions, &error); + pft->actions = NULL; + pft->pmd_actions = NULL; + } + if (tunnel_ops->items) { + free(pft->items); + rte_flow_tunnel_item_release(port_id, pft->pmd_items, + pft->num_pmd_items, + 
&error); + pft->items = NULL; + pft->pmd_items = NULL; + } +} + /** Validate flow rule. */ int port_flow_validate(portid_t port_id, const struct rte_flow_attr *attr, const struct rte_flow_item *pattern, - const struct rte_flow_action *actions) + const struct rte_flow_action *actions, + const struct tunnel_ops *tunnel_ops) { struct rte_flow_error error; + struct port_flow_tunnel *pft = NULL; /* Poisoning to make sure PMDs update it in case of error. */ memset(&error, 0x11, sizeof(error)); + if (tunnel_ops->enabled) { + pft = port_flow_tunnel_offload_cmd_prep(port_id, pattern, + actions, tunnel_ops); + if (!pft) + return -ENOENT; + if (pft->items) + pattern = pft->items; + if (pft->actions) + actions = pft->actions; + } if (rte_flow_validate(port_id, attr, pattern, actions, &error)) return port_flow_complain(&error); + if (tunnel_ops->enabled) + port_flow_tunnel_offload_cmd_release(port_id, tunnel_ops, pft); printf("Flow rule validated\n"); return 0; } @@ -1505,13 +1732,15 @@ int port_flow_create(portid_t port_id, const struct rte_flow_attr *attr, const struct rte_flow_item *pattern, - const struct rte_flow_action *actions) + const struct rte_flow_action *actions, + const struct tunnel_ops *tunnel_ops) { struct rte_flow *flow; struct rte_port *port; struct port_flow *pf; uint32_t id = 0; struct rte_flow_error error; + struct port_flow_tunnel *pft = NULL; port = &ports[port_id]; if (port->flow_list) { @@ -1522,6 +1751,16 @@ port_flow_create(portid_t port_id, } id = port->flow_list->id + 1; } + if (tunnel_ops->enabled) { + pft = port_flow_tunnel_offload_cmd_prep(port_id, pattern, + actions, tunnel_ops); + if (!pft) + return -ENOENT; + if (pft->items) + pattern = pft->items; + if (pft->actions) + actions = pft->actions; + } pf = port_flow_new(attr, pattern, actions, &error); if (!pf) return port_flow_complain(&error); @@ -1537,6 +1776,8 @@ port_flow_create(portid_t port_id, pf->id = id; pf->flow = flow; port->flow_list = pf; + if (tunnel_ops->enabled) + port_flow_tunnel_offload_cmd_release(port_id, tunnel_ops, pft); printf("Flow rule #%u created\n", pf->id); return 0; } @@ -1831,7 +2072,9 @@ port_flow_list(portid_t port_id, uint32_t n, const uint32_t group[n]) pf->rule.attr->egress ? 'e' : '-', pf->rule.attr->transfer ? 't' : '-'); while (item->type != RTE_FLOW_ITEM_TYPE_END) { - if (rte_flow_conv(RTE_FLOW_CONV_OP_ITEM_NAME_PTR, + if ((uint32_t)item->type > INT_MAX) + name = "PMD_INTERNAL"; + else if (rte_flow_conv(RTE_FLOW_CONV_OP_ITEM_NAME_PTR, &name, sizeof(name), (void *)(uintptr_t)item->type, NULL) <= 0) @@ -1842,7 +2085,9 @@ port_flow_list(portid_t port_id, uint32_t n, const uint32_t group[n]) } printf("=>"); while (action->type != RTE_FLOW_ACTION_TYPE_END) { - if (rte_flow_conv(RTE_FLOW_CONV_OP_ACTION_NAME_PTR, + if ((uint32_t)action->type > INT_MAX) + name = "PMD_INTERNAL"; + else if (rte_flow_conv(RTE_FLOW_CONV_OP_ACTION_NAME_PTR, &name, sizeof(name), (void *)(uintptr_t)action->type, NULL) <= 0) diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index fe6450cc0d..e484079147 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -3588,6 +3588,8 @@ init_port_dcb_config(portid_t pid, static void init_port(void) { + int i; + /* Configuration of Ethernet ports. 
*/ ports = rte_zmalloc("testpmd: ports", sizeof(struct rte_port) * RTE_MAX_ETHPORTS, @@ -3597,7 +3599,8 @@ init_port(void) "rte_zmalloc(%d struct rte_port) failed\n", RTE_MAX_ETHPORTS); } - + for (i = 0; i < RTE_MAX_ETHPORTS; i++) + LIST_INIT(&ports[i].flow_tunnel_list); /* Initialize ports NUMA structures */ memset(port_numa, NUMA_NO_CONFIG, RTE_MAX_ETHPORTS); memset(rxring_numa, NUMA_NO_CONFIG, RTE_MAX_ETHPORTS); diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index f139fe7a0a..1dd8cfd3d8 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -12,6 +12,7 @@ #include #include #include +#include #define RTE_PORT_ALL (~(portid_t)0x0) @@ -142,6 +143,26 @@ struct port_flow { uint8_t data[]; /**< Storage for flow rule description */ }; +struct port_flow_tunnel { + LIST_ENTRY(port_flow_tunnel) chain; + struct rte_flow_action *pmd_actions; + struct rte_flow_item *pmd_items; + uint32_t id; + uint32_t num_pmd_actions; + uint32_t num_pmd_items; + struct rte_flow_tunnel tunnel; + struct rte_flow_action *actions; + struct rte_flow_item *items; +}; + +struct tunnel_ops { + uint32_t id; + char type[16]; + uint32_t enabled:1; + uint32_t actions:1; + uint32_t items:1; +}; + /** * The data structure associated with each port. */ @@ -172,6 +193,7 @@ struct rte_port { uint32_t mc_addr_nb; /**< nb. of addr. in mc_addr_pool */ uint8_t slave_flag; /**< bonding slave port */ struct port_flow *flow_list; /**< Associated flows. */ + LIST_HEAD(, port_flow_tunnel) flow_tunnel_list; const struct rte_eth_rxtx_callback *rx_dump_cb[RTE_MAX_QUEUES_PER_PORT+1]; const struct rte_eth_rxtx_callback *tx_dump_cb[RTE_MAX_QUEUES_PER_PORT+1]; /**< metadata value to insert in Tx packets. */ @@ -749,11 +771,13 @@ void port_reg_set(portid_t port_id, uint32_t reg_off, uint32_t value); int port_flow_validate(portid_t port_id, const struct rte_flow_attr *attr, const struct rte_flow_item *pattern, - const struct rte_flow_action *actions); + const struct rte_flow_action *actions, + const struct tunnel_ops *tunnel_ops); int port_flow_create(portid_t port_id, const struct rte_flow_attr *attr, const struct rte_flow_item *pattern, - const struct rte_flow_action *actions); + const struct rte_flow_action *actions, + const struct tunnel_ops *tunnel_ops); void update_age_action_context(const struct rte_flow_action *actions, struct port_flow *pf); int port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule); @@ -763,6 +787,12 @@ int port_flow_query(portid_t port_id, uint32_t rule, const struct rte_flow_action *action); void port_flow_list(portid_t port_id, uint32_t n, const uint32_t *group); void port_flow_aged(portid_t port_id, uint8_t destroy); +const char *port_flow_tunnel_type(struct rte_flow_tunnel *tunnel); +struct port_flow_tunnel * +port_flow_locate_tunnel(uint16_t port_id, struct rte_flow_tunnel *tun); +void port_flow_tunnel_list(portid_t port_id); +void port_flow_tunnel_destroy(portid_t port_id, uint32_t tunnel_id); +void port_flow_tunnel_create(portid_t port_id, const struct tunnel_ops *ops); int port_flow_isolate(portid_t port_id, int set); void rx_ring_desc_display(portid_t port_id, queueid_t rxq_id, uint16_t rxd_id); diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c index 8488fa1a8f..781a813759 100644 --- a/app/test-pmd/util.c +++ b/app/test-pmd/util.c @@ -48,18 +48,49 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[], is_rx ? 
"received" : "sent", (unsigned int) nb_pkts); for (i = 0; i < nb_pkts; i++) { + int ret; + struct rte_flow_error error; + struct rte_flow_restore_info info = { 0, }; + mb = pkts[i]; eth_hdr = rte_pktmbuf_read(mb, 0, sizeof(_eth_hdr), &_eth_hdr); eth_type = RTE_BE_TO_CPU_16(eth_hdr->ether_type); - ol_flags = mb->ol_flags; packet_type = mb->packet_type; is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type); - + ret = rte_flow_get_restore_info(port_id, mb, &info, &error); + if (!ret) { + printf("restore info:"); + if (info.flags & RTE_FLOW_RESTORE_INFO_TUNNEL) { + struct port_flow_tunnel *port_tunnel; + + port_tunnel = port_flow_locate_tunnel + (port_id, &info.tunnel); + printf(" - tunnel"); + if (port_tunnel) + printf(" #%u", port_tunnel->id); + else + printf(" %s", "-none-"); + printf(" type %s", + port_flow_tunnel_type(&info.tunnel)); + } else { + printf(" - no tunnel info"); + } + if (info.flags & RTE_FLOW_RESTORE_INFO_ENCAPSULATED) + printf(" - outer header present"); + else + printf(" - no outer header"); + if (info.flags & RTE_FLOW_RESTORE_INFO_GROUP_ID) + printf(" - miss group %u", info.group_id); + else + printf(" - no miss group"); + printf("\n"); + } print_ether_addr(" src=", ð_hdr->s_addr); print_ether_addr(" - dst=", ð_hdr->d_addr); printf(" - type=0x%04x - length=%u - nb_segs=%d", eth_type, (unsigned int) mb->pkt_len, (int)mb->nb_segs); + ol_flags = mb->ol_flags; if (ol_flags & PKT_RX_RSS_HASH) { printf(" - RSS hash=0x%x", (unsigned int) mb->hash.rss); printf(" - RSS queue=0x%x", (unsigned int) queue); diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst index a972ef8951..97fcbfd329 100644 --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst @@ -3720,6 +3720,45 @@ following sections. flow aged {port_id} [destroy] +- Tunnel offload - create a tunnel stub:: + + flow tunnel create {port_id} type {tunnel_type} + +- Tunnel offload - destroy a tunnel stub:: + + flow tunnel destroy {port_id} id {tunnel_id} + +- Tunnel offload - list port tunnel stubs:: + + flow tunnel list {port_id} + +Creating a tunnel stub for offload +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``flow tunnel create`` setup a tunnel stub for tunnel offload flow rules:: + + flow tunnel create {port_id} type {tunnel_type} + +If successful, it will return a tunnel stub ID usable with other commands:: + + port [...]: flow tunnel #[...] type [...] + +Tunnel stub ID is relative to a port. + +Destroying tunnel offload stub +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``flow tunnel destroy`` destroy port tunnel stub:: + + flow tunnel destroy {port_id} id {tunnel_id} + +Listing tunnel offload stubs +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``flow tunnel list`` list port tunnel offload stubs:: + + flow tunnel list {port_id} + Validating flow rules ~~~~~~~~~~~~~~~~~~~~~ @@ -3766,6 +3805,7 @@ to ``rte_flow_create()``:: flow create {port_id} [group {group_id}] [priority {level}] [ingress] [egress] [transfer] + [tunnel_set {tunnel_id}] [tunnel_match {tunnel_id}] pattern {item} [/ {item} [...]] / end actions {action} [/ {action} [...]] / end @@ -3780,6 +3820,7 @@ Otherwise it will show an error message of the form:: Parameters describe in the following order: - Attributes (*group*, *priority*, *ingress*, *egress*, *transfer* tokens). +- Tunnel offload specification (tunnel_set, tunnel_match) - A matching pattern, starting with the *pattern* token and terminated by an *end* pattern item. 
- Actions, starting with the *actions* token and terminated by an *end* action. @@ -3823,6 +3864,14 @@ Most rules affect RX therefore contain the ``ingress`` token:: testpmd> flow create 0 ingress pattern [...] +Tunnel offload +^^^^^^^^^^^^^^ + +Indicate the tunnel offload rule type: + +- ``tunnel_set {tunnel_id}``: mark rule as tunnel offload decap_set type. +- ``tunnel_match {tunnel_id}``: mark rule as tunnel offload match type. + Matching pattern ^^^^^^^^^^^^^^^^